UT Dallas CS 6375 - exam_sample

Artificial Neural Networks, written examination
Monday, May 15, 2006, 9:00-14:00

Olle Gällmo, Universitetsadjunkt (Lecturer)
Institutionen för informationsteknologi (Department of Information Technology)
Address: Lägerhyddsvägen 2, Box 337, SE-751 05 Uppsala, SWEDEN
Telephone: +46 18 471 10 09
Telefax: +46 18 51 19 25
Web site: user.it.uu.se/crwth
E-mail: olle.gallmo@it.uu.se

Allowed help material: pen, paper and rubber (eraser), and a dictionary.

Please answer, in Swedish or English, the following questions to the best of your ability. Any assumptions made which are not already part of the problem formulation must be stated clearly in your answer. Write your name on top of each page. Don't forget to hand in the last page (your answers to question 10).

The maximum number of points is 40. To get the grade G (pass), a total of 20 points is required. The grade VG (pass with distinction) requires approximately 30 points, but also depends on the results on the lab course (labs + project). Your teacher will drop in sometime between 10:00 and 11:00 to answer questions.

In this exam, some concepts may be called by different names than the ones used in the book. Here is a list of useful synonyms and acronyms:

Perceptron: summation unit (SU), conventional neuron
Binary perceptron: summation unit with a binary step activation function
Multilayer perceptron (MLP): feedforward network of summation units
RBF: Radial Basis Functions
Standard competitive learning: LVQ-I without a neighbourhood function
Objective function: the function to be minimized or maximized (error function, fitness function)

Now sit back, relax and enjoy the exam. Good luck!

Results: 21 students attended this exam, of which 9 failed, 7 passed (G) and 5 passed with 30 points or more (which may become pass with distinction, depending on the results from the lab course). The best result was 37 points.

1. Why is it impossible for a single binary perceptron to solve the XOR problem? (2 p)

Answer: Because XOR is not a linearly separable problem. Perceptrons solve classification tasks by adjusting a hyperplane in the input space, i.e. in 2D (as in this case) a line.

Comments: Most students got this. Some failed to mention that the discriminant is a hyperplane (a line), though, which is the main point here. 21 answers, 15 with max credit. Average: 1.5.

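For illustration (this sketch is not part of the original exam), the following Python snippet runs the perceptron learning rule on AND and on XOR; the function name and learning parameters are arbitrary choices. On the linearly separable AND problem the rule converges, while on XOR some pattern is always misclassified, since no single line separates {(0,1), (1,0)} from {(0,0), (1,1)}.

```python
import numpy as np

def perceptron_train(X, t, epochs=100, lr=0.1):
    """Perceptron learning rule for a single binary threshold unit.
    Returns the weights and the error count of the last epoch."""
    X = np.hstack([X, np.ones((len(X), 1))])   # append a constant bias input
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        errors = 0
        for x, target in zip(X, t):
            y = 1 if x @ w > 0 else 0          # binary step activation
            w += lr * (target - y) * x         # update only on mistakes
            errors += int(y != target)
        if errors == 0:                        # converged: a separating line found
            break
    return w, errors

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
for name, t in [("AND", [0, 0, 0, 1]), ("XOR", [0, 1, 1, 0])]:
    w, errors = perceptron_train(X, np.array(t))
    print(f"{name}: {errors} misclassified after training")
```
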
2. Neural networks require lots of data to be trained properly. If you have too little data (too few input/target pairs), the first thing to try is to get more. However, sometimes this is simply not possible, and then splitting up the few data you have into a training set and a test set might be considered wasteful. Describe how K-fold cross-validation can be used to deal with this problem. (Note: this is not early stopping.) (3 p)

Answer: Split the data into K sets of N/K patterns each, where N is the total number of patterns. Train on all sets but one and test on the one left out. Do that for each of the K sets. Report the average error over the K tests. Alternative: select N/K patterns at random, train on the rest, test on the ones selected, and run this K times. Report the result as above.

Comments: K-fold cross-validation is not an early stopping technique, though, as was clearly pointed out in the question. 1 point was deducted for answers which did not mention what to do with the K test results. 18 answers, 5 with max credit. Average: 1.9.

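As an illustration of the procedure described in the answer (not part of the original exam), here is a minimal K-fold cross-validation sketch in Python; train_model and error are hypothetical placeholders for whatever model and error measure are being evaluated.

```python
import numpy as np

def k_fold_cv(X, t, k, train_model, error):
    """K-fold cross-validation: train on K-1 folds, test on the held-out
    fold, repeat for each fold, and report the average test error.
    train_model(X, t) and error(model, X, t) are caller-supplied."""
    indices = np.random.permutation(len(X))   # shuffle before splitting
    folds = np.array_split(indices, k)        # K folds of ~N/K patterns each
    test_errors = []
    for i in range(k):
        test_idx = folds[i]                   # the fold left out for testing
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = train_model(X[train_idx], t[train_idx])
        test_errors.append(error(model, X[test_idx], t[test_idx]))
    return np.mean(test_errors)               # the reported estimate
```
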
3. What is weight decay? What is it good for, and how can it be implemented? (2 p)

Answer: Weight decay is to let each weight in a neural network strive for 0 (in addition to the change given by the training algorithm, of course). There are several reasons for wanting to do this. For example (the most common answer): so that we can remove unnecessary weights after training, since they will be very close to 0. (By the way, if you do this, you should retrain the network afterwards.) Another reason is to avoid numerical problems with too large weights: since the weighted sum is in the exponent of the sigmoid, large weights may quickly lead to numerical problems. Related to the previous point: large weights also mean that the sigmoids are likely to bottom out at either end, where the derivative is close to 0. This makes the network rigid, since this derivative is multiplied in the weight update formula. So weight decay gives the network more flexibility and can speed up learning, since it tends to move the weighted sums closer to the region of the sigmoid where the derivative is the largest.

Implementation: after updating the weights according to the update rule, update them again by w ← (1 - ε)w, where ε is the decay rate.

Comments: Some students associated weight decay with Ant Colony Optimization instead of neural networks. Indeed, the decay of pheromones can be viewed as a form of weight decay, but since the concept has not been discussed in those terms on this course, only partial credit was given for such answers. 21 answers, 7 with max credit. Average: 1.1.

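To make the implementation step concrete (this sketch is not part of the original exam), here is the decay step from the answer applied after an ordinary gradient update; the gradient argument and the parameter values are assumptions for illustration.

```python
import numpy as np

def update_with_weight_decay(w, gradient, lr=0.1, decay=0.001):
    """One training step with weight decay: first the ordinary update,
    then pull every weight toward 0: w <- (1 - epsilon) * w."""
    w = w - lr * gradient        # change given by the training algorithm
    return (1.0 - decay) * w     # decay step with epsilon = decay

# With a zero gradient, the weights just shrink toward 0:
w = np.array([2.0, -3.0, 0.5])
for _ in range(1000):
    w = update_with_weight_decay(w, gradient=np.zeros_like(w))
print(w)  # each weight scaled by (1 - 0.001)**1000, about 0.37 of its start value
```
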
4. Write down the back-propagation algorithm. For full credit, your description must be clear enough for someone who knows what a multilayer perceptron is to implement the algorithm. (5 p)

Comments: Considering how much of this course focuses on MLPs and backprop, I was very surprised to see how many students failed this question. Only two students got max credit, and only two more were close; most got less than half of that. 20 answers, 2 with max credit. Average: 1.9.

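The preview does not include the model answer to this question. For reference only (a sketch of my own, not the examiner's), here is one stochastic back-propagation step for an MLP with a single hidden layer of sigmoid units; the architecture, the omission of bias terms, and the learning rate are simplifying assumptions.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def backprop_step(x, t, W1, W2, lr=0.1):
    """One stochastic back-propagation step for a 1-hidden-layer MLP.
    x: input vector, t: target vector, W1/W2: weight matrices."""
    # Forward pass: weighted sums fed through sigmoids, layer by layer.
    h = sigmoid(W1 @ x)                           # hidden activations
    y = sigmoid(W2 @ h)                           # output activations
    # Backward pass: local error terms (deltas), output layer first.
    delta_out = (t - y) * y * (1 - y)             # sigmoid derivative is y(1-y)
    delta_hid = (W2.T @ delta_out) * h * (1 - h)  # errors propagated backwards
    # Update each weight by learning rate * delta * input to that layer.
    W2 += lr * np.outer(delta_out, h)
    W1 += lr * np.outer(delta_hid, x)
    return W1, W2
```
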
5. How is the hidden layer of an RBF network different from the hidden layer of an MLP? Explain this difference in terms of:

a) what the hidden nodes compute when feeding data to the network. (2 p)

Answer: MLP hidden nodes compute weighted sums of the input and feed that through a sigmoid. RBF nodes compute the distance between the input vector and the weight vector, and feed that through a Gaussian or similar function.

b) how this affects the shape of the discriminant when using the networks for classification. (2 p)

Answer: MLP hidden nodes form hyperplanes; RBF nodes form hyperspheres (or hyperellipses). Sidenote: it is not the activation function which decides the shape of the discriminant. The weighted sum forms the hyperplane in MLPs; the sigmoid only decides what to output, given the distance from that hyperplane. Similarly for RBFs, it is the distance calculation between the input vector and the weight vector which forms the hypersphere, not the Gaussian.

c) how the hidden nodes are trained. (2 p)

Answer: MLPs are usually trained by some form of backprop (see question 4). The hidden layer of an RBF network is usually trained by some form of unsupervised learning, e.g. competitive learning or K-means.

Comments: Some students got minor deductions for only describing RBF, not comparing to MLP. A few ...

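As an illustration of part a (not from the original exam), here is a sketch of the two hidden-node computations; the parameter names and the Gaussian width are my own choices.

```python
import numpy as np

def mlp_hidden_node(x, w, b):
    """MLP hidden node: a weighted sum of the input (which defines a
    hyperplane), fed through a sigmoid."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def rbf_hidden_node(x, c, width=1.0):
    """RBF hidden node: the distance between the input vector and the
    weight (centre) vector (which defines a hypersphere), fed through
    a Gaussian."""
    d2 = np.sum((x - c) ** 2)                # squared Euclidean distance
    return np.exp(-d2 / (2.0 * width ** 2))  # Gaussian activation

x = np.array([0.5, -1.0])
print(mlp_hidden_node(x, w=np.array([1.0, 2.0]), b=0.1))  # sigmoid response
print(rbf_hidden_node(x, c=np.array([0.0, 0.0])))          # Gaussian response
```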
