CMU CS 10701 - Neural Networks

Neural Networks
Joseph E. Gonzalez
Printed by Mathematica for Students (neural_networks_slide_show.nb)

We are going to go through neural networks and review the process of back propagation. This is an experimental Mathematica-based presentation.

Single Perceptron

The perceptron takes inputs X_0 = 1, X_1, X_2 with weights w_0, w_1, w_2 and outputs

    g[ Σ_i w_i X_i ]

There are several parts:
1. A link function g[x]
2. Weights w_i
3. A bias term X_0 = 1

Link Function

(1)  g(x) = 1 / (1 + e^(-x))

    g = Function[x, 1 / (1 + Exp[-x])];
    Plot[g[x], {x, -8, 8}]

[Plot: the logistic curve over x ∈ (−8, 8), rising from 0 toward 1.]

Demo

    Manipulate[
      g = Function[x, 1 / (1 + Exp[-x])];
      Plot3D[g[w0 + w1 x1 + w2 x2], {x1, -3, 3}, {x2, -3, 3}],
      {{w0, 0}, -3, 3}, {{w1, 2}, -3, 3}, {{w2, -2}, -3, 3}]

Neural Network with Multiple Hidden Layers

Let's consider what this network looks like:

    out(x) = g[ Σ_{i=0}^{3} u_i Z_i ]
    Z_j = g[ Σ_{i=0}^{2} w_ij X_i ]   for j = 1, 2, 3,   with Z_0 = 1 and X_0 = 1

Matlab-Style Forward Propagation

Let's define a matrix W as:

(2)  W = ( w_01  w_11  w_21
           w_02  w_12  w_22
           w_03  w_13  w_23 )

We can multiply this matrix by X after prepending a 1:

(3)  W · (1, X_1, X_2)^T = ( w_01 + w_11 X_1 + w_21 X_2,
                             w_02 + w_12 X_1 + w_22 X_2,
                             w_03 + w_13 X_1 + w_23 X_2 )^T

Let's define function application as elementwise.
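As a cross-check of the matrix-form forward step, here is a minimal sketch in plain Python rather than the notebook's Mathematica (the names `g` and `forward_hidden` are mine, not from the notebook):

```python
import math

def g(x):
    """Link function: g(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward_hidden(W, x1, x2):
    """Compute g(W . (1, X1, X2)) with g applied elementwise.

    W is a list of three rows; row j holds (w0j, w1j, w2j),
    matching the matrix W defined above."""
    x = [1.0, x1, x2]
    return [g(sum(w * xi for w, xi in zip(row, x))) for row in W]

# With all-zero weights every pre-activation is 0, so each Z is g(0) = 0.5.
Z = forward_hidden([[0.0, 0.0, 0.0]] * 3, 1.0, 2.0)
```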
Then we obtain:

(4)  g[ W · (1, X_1, X_2)^T ] = ( g[w_01 + w_11 X_1 + w_21 X_2],
                                  g[w_02 + w_12 X_1 + w_22 X_2],
                                  g[w_03 + w_13 X_1 + w_23 X_2] )^T
                              = (Z_1, Z_2, Z_3)^T

We can then prepend a 1 to the result to obtain:

(5)  out(X) = g[ (u_0, u_1, u_2, u_3) · (1, Z_1, Z_2, Z_3)^T ] = g[ u_0 + Σ_{i=1}^{3} u_i Z_i ]

Forward Propagation (Example #1)

Take X_1 = 3, X_2 = 2, with output weights (u_0, u_1, u_2, u_3) = (2, 1, -3, -2) and hidden weights
(w_01, w_11, w_21) = (1, 1, -3), (w_02, w_12, w_22) = (2, -2, 2), (w_03, w_13, w_23) = (3, 1, 1).

What is the value of Z_1?     Z_1 = g[1 - 6 + 3] = g[-2] ≈ 0.12
What is the value of Z_2?     Z_2 = g[2 + 4 - 6] = g[0] = 0.5
What is the value of Z_3?     Z_3 = g[3 + 2 + 3] = g[8] ≈ 1
What is the value of out(X)?  out(X) = g[2 + 0.12 - 1.5 - 2] = g[-1.38] ≈ 0.20

Done!

Demo

    dynamicDemo[...]

Generalized Back Propagation

Suppose we want to find the best model out(x; U, W) with respect to the parameters W and U.
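The forward-propagation example can be replayed in a short Python sketch. The weights here are my reading of the example tree's edge labels: u = (2, 1, -3, -2) and hidden-weight rows (1, 1, -3), (2, -2, 2), (3, 1, 1), which reproduce the sums 1-6+3, 2+4-6, and 3+2+3 shown in the example (note g(1-6+3) = g(-2) ≈ 0.12):

```python
import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))

# Inputs and weights as read off the example tree (my decoding of the labels).
x1, x2 = 3.0, 2.0
u = [2.0, 1.0, -3.0, -2.0]    # (u0, u1, u2, u3)
W = [[1.0, 1.0, -3.0],        # (w01, w11, w21)
     [2.0, -2.0, 2.0],        # (w02, w12, w22)
     [3.0, 1.0, 1.0]]         # (w03, w13, w23)

X = [1.0, x1, x2]
Z1, Z2, Z3 = (g(sum(w * xi for w, xi in zip(row, X))) for row in W)
# Z1 = g(-2) ~ 0.12, Z2 = g(0) = 0.5, Z3 = g(8) ~ 1
out = g(sum(ui * zi for ui, zi in zip(u, [1.0, Z1, Z2, Z3])))
# out = g(2 + 0.12 - 1.5 - 2) = g(-1.38) ~ 0.20
```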
How can we quantify "best"? Let's consider mean squared error:

(6)  E = Σ_{i=1}^{n} ( out(X_i) - Y_i )²

There are many ways to minimize this. One of the most common (and least effective) methods is gradient descent. This corresponds to the update rules:

(7)  u_i^(t+1)  ← u_i^(t)  - η (∂E/∂u_i)|_{u_i^(t)}
(8)  w_ij^(t+1) ← w_ij^(t) - η (∂E/∂w_ij)|_{w_ij^(t)}

Recall the network graph above: out(x) = g[ Σ_{i=0}^{3} u_i Z_i ] with Z_j = g[ Σ_{i=0}^{2} w_ij X_i ].

Let's first derive the update rule for U, working with a single example:

(9)  E = ( out(X) - Y )²

Taking the derivative we get (stuck?):

(10) ∂E/∂u_k = ∂/∂u_k ( out(X) - Y )²

Applying the infamous chain rule,

(11) ∂/∂x f(g(x)) = f'(g(x)) g'(x),

together with ∂/∂x x² = 2x, gives:

(12) ∂E/∂u_k = 2 ( out(X) - Y ) ∂out(X)/∂u_k

Now we need to take the derivative of the neural network. Let's first replace out with the function computed by the top perceptron:

(13) ∂E/∂u_k = 2 ( out(X) - Y ) ∂/∂u_k g[ Σ_{i=0}^{3} u_i Z_i ]

Chain rule again:

(14) ∂E/∂u_k = 2 ( out(X) - Y ) g'[ Σ_{i=0}^{3} u_i Z_i ] Σ_{i=0}^{3} ∂(u_i Z_i)/∂u_k

We know that only one term of the sum will remain, the one with i = k:

(15) ∂E/∂u_k = 2 ( out(X) - Y ) g'[ Σ_{i=0}^{3} u_i Z_i ] Z_k

Done, that's it! Sort of.
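The chain-rule gradient in eq. (15) can be sanity-checked numerically. This is a sketch with hypothetical helper names (`E`, `dE_duk`), comparing the analytic form against a finite difference of E; g' is approximated numerically here, since its closed form comes next:

```python
import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))

def E(u, Z, Y):
    """Squared error of eq. (9) for one example, hidden values Z held fixed."""
    return (g(sum(ui * zi for ui, zi in zip(u, Z))) - Y) ** 2

def dE_duk(u, Z, Y, k, h=1e-6):
    """Eq. (15): dE/du_k = 2 (out - Y) g'(sum_i u_i Z_i) Z_k."""
    s = sum(ui * zi for ui, zi in zip(u, Z))
    gprime = (g(s + h) - g(s - h)) / (2.0 * h)  # numeric g' for now
    return 2.0 * (g(s) - Y) * gprime * Z[k]

# Compare against a centered finite difference of E itself.
u, Z, Y, k, h = [2.0, 1.0, -3.0, -2.0], [1.0, 0.12, 0.5, 1.0], 1.0, 1, 1e-6
up, um = u[:], u[:]
up[k] += h
um[k] -= h
numeric = (E(up, Z, Y) - E(um, Z, Y)) / (2.0 * h)
analytic = dE_duk(u, Z, Y, k)
```

The two values agree to well within floating-point finite-difference error, which is exactly what eq. (15) predicts.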
Let's look at the derivative of g(x) = 1 / (1 + e^(-x)):

(16) g'(x) = ∂/∂x (1 + e^(-x))^(-1)
(17)       = -(1 + e^(-x))^(-2) ∂/∂x (1 + e^(-x))
(18)       = -(1 + e^(-x))^(-2) ( ∂/∂x 1 + ∂/∂x e^(-x) )
(19)       = -(1 + e^(-x))^(-2) ( 0 + e^(-x) ∂/∂x (-x) )
(20)       = -(1 + e^(-x))^(-2) ( 0 - e^(-x) )
(21)       = (1 + e^(-x))^(-2) e^(-x)

With some manipulation we get:

(22) g'(x) = [ e^(-x) / (1 + e^(-x)) ] · [ 1 / (1 + e^(-x)) ]
(23)       = [ e^(-x) / (1 + e^(-x)) ] g(x)
(24)       = [ (1 + e^(-x) - 1) / (1 + e^(-x)) ] g(x)
(25)       = [ (1 + e^(-x)) / (1 + e^(-x)) + (-1) / (1 + e^(-x)) ] g(x)
(26)       = [ 1 - 1 / (1 + e^(-x)) ] g(x)
(27) g'(x) = ( 1 - g(x) ) g(x)

Recall that we earlier had:

(28) ∂E/∂u_k = 2 ( out(X) - Y ) g'[ Σ_{i=0}^{3} u_i Z_i ] Z_k

We can make a simple substitution to get:

(29) ∂E/∂u_k = 2 ( out(X) - Y ) ( 1 - g[ Σ_{i=0}^{3} u_i Z_i ] ) g[ Σ_{i=0}^{3} u_i Z_i ] Z_k
(30) ∂E/∂u_k = 2 ( out(X) - Y ) ( 1 - out(X) ) out(X) Z_k

Gradient of W

That wasn't too bad. How about the next layer? We again start with:

(31) E = ( out(X) - Y )²

Taking the derivative with respect to w_kr (and applying the chain rule):

(32) ∂E/∂w_kr = (∂E/∂out(X)) (∂out(X)/∂w_kr) = 2 ( out(X) - Y ) ∂out(X)/∂w_kr

Expanding out we get:

(33) ∂E/∂w_kr = 2 ( out(X) - Y ) ∂/∂w_kr g[ Σ_{i=0}^{3} u_i Z_i ]

Chain rule:

(34) ∂E/∂w_kr = 2 ( out(X) - Y ) g'[ Σ_{i=0}^{3} u_i Z_i ] Σ_{i=0}^{3} ∂(u_i Z_i)/∂w_kr

Recall that g'(x) = (1 - g(x)) g(x):

(35) ∂E/∂w_kr = 2 ( out(X) - Y ) ( 1 - out(X) ) out(X) Σ_{i=0}^{3} u_i ∂Z_i/∂w_kr

Remember that each of the Z_i is connected to all the perceptrons from the lower level, so we must take the derivative of the Z_i with respect to w_kr as well.
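The preview ends before the W gradient is finished, but under the standard chain-rule completion — only Z_r depends on w_kr, and by eq. (27), ∂Z_r/∂w_kr = (1 - Z_r) Z_r X_k — both layers' gradients can be verified against finite differences. This is my sketch, not code from the notebook:

```python
import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(W, u, x1, x2):
    """W[r] holds (w0r, w1r, w2r); returns (Z with bias prepended, out)."""
    X = [1.0, x1, x2]
    Z = [1.0] + [g(sum(w * xi for w, xi in zip(row, X))) for row in W]
    return Z, g(sum(ui * zi for ui, zi in zip(u, Z)))

def gradients(W, u, x1, x2, Y):
    """Eq. (30) for u; for w_kr, the (standard, here hypothesized) completion:
    dE/dw_kr = 2(out - Y)(1 - out) out * u_r * (1 - Z_r) Z_r * X_k."""
    X = [1.0, x1, x2]
    Z, out = forward(W, u, x1, x2)
    delta = 2.0 * (out - Y) * (1.0 - out) * out
    du = [delta * zk for zk in Z]
    dW = [[delta * u[r + 1] * (1.0 - Z[r + 1]) * Z[r + 1] * X[k]
           for k in range(3)] for r in range(3)]
    return du, dW

# Check both analytic gradients against centered finite differences,
# using the weights from the forward-propagation example.
W = [[1.0, 1.0, -3.0], [2.0, -2.0, 2.0], [3.0, 1.0, 1.0]]
u = [2.0, 1.0, -3.0, -2.0]
x1, x2, Y, h = 3.0, 2.0, 1.0, 1e-6

def E(W, u):
    _, out = forward(W, u, x1, x2)
    return (out - Y) ** 2

du, dW = gradients(W, u, x1, x2, Y)
for k in range(4):
    up, um = u[:], u[:]
    up[k] += h
    um[k] -= h
    assert abs(du[k] - (E(W, up) - E(W, um)) / (2 * h)) < 1e-5
for r in range(3):
    for k in range(3):
        Wp = [row[:] for row in W]
        Wm = [row[:] for row in W]
        Wp[r][k] += h
        Wm[r][k] -= h
        assert abs(dW[r][k] - (E(Wp, u) - E(Wm, u)) / (2 * h)) < 1e-5
```

Every analytic entry matches its finite-difference estimate, which is a useful habit whenever you derive backprop gradients by hand.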

