UW-Madison ECE 539 - Kalman Filter Based Algorithms for Fast Training of Multilayer Perceptrons

Kalman Filter Based Algorithms for Fast Training of Multilayer Perceptrons: Implementation and Applications

ECE 539 Project
• Dan Li
• Spring, 2000

Introduction

• Multilayer perceptron (MLP)
  – A feedforward neural network model
  – Extensively used in pattern classification
  – Essential issue: the training/learning algorithm
• MLP training algorithms
  – Error backpropagation (EBP)
    • A conventional iterative gradient algorithm
    • Easy to implement
    • Long and uncertain training process
  – An algorithm proposed by Scalero and Tepedelenlioglu [1]: the S.T. algorithm (based on Kalman filter techniques)
  – A modified S.T. algorithm proposed by Wang and Chen [2]: the layer-by-layer (LBL) algorithm (based on Kalman filter techniques)

EBP Algorithm

[Figure: a two-layer MLP with inputs x_1 … x_M (plus bias), hidden nonlinearities F_h(·) with summation signals u_1 … u_H and outputs y_1 … y_H, and output nonlinearities F_o(·) with summation signals v_1 … v_N and outputs z_1 … z_N.]

For the output layer:
$\delta_j^{(o)}(n) = [t_j(n) - z_j(n)]\,F_o'(v_j(n))$
$w_{ji}^{(o)}(n+1) = w_{ji}^{(o)}(n) + \eta\,\delta_j^{(o)}(n)\,y_i(n) + \alpha\,[w_{ji}^{(o)}(n) - w_{ji}^{(o)}(n-1)]$

For the hidden layer:
$\delta_j^{(h)}(n) = F_h'(u_j(n))\,\sum_{k=1}^{N}\delta_k^{(o)}(n)\,w_{kj}^{(o)}(n)$
$w_{ji}^{(h)}(n+1) = w_{ji}^{(h)}(n) + \eta\,\delta_j^{(h)}(n)\,x_i(n) + \alpha\,[w_{ji}^{(h)}(n) - w_{ji}^{(h)}(n-1)]$

S.T. Algorithm

[Figure: the same MLP, augmented with inverse output nonlinearities F_o^{-1}(·) that map the targets t_1 … t_N to desired summation signals v_1^* … v_N^*; errors e are formed at the summation layers, and desired hidden summation signals u_j^* are formed analogously.]

For the output layer, the weights are driven toward the desired summation signal $v_j^*(n) = F_o^{-1}(t_j(n))$ using a Kalman gain vector $\mathbf{k}^{(o)}(n)$ computed recursively from the correlation matrix of the layer's input:
$e_j(n) = v_j^*(n) - v_j(n)$
$\mathbf{w}_j^{(o)}(n) = \mathbf{w}_j^{(o)}(n-1) + \mathbf{k}^{(o)}(n)\,e_j(n)$

For the hidden layer, the error is first backpropagated through the output weights, then the same Kalman-gain update is applied:
$\delta_k^{(o)}(n) = F_o'(v_k(n))\,[t_k(n) - z_k(n)]$
$e_j^{(h)}(n) = F_h'(u_j(n))\,\sum_{k=1}^{N}\delta_k^{(o)}(n)\,w_{kj}^{(o)}(n)$
$\mathbf{w}_j^{(h)}(n) = \mathbf{w}_j^{(h)}(n-1) + \mathbf{k}^{(h)}(n)\,e_j^{(h)}(n)$

LBL Algorithm

[Figure: the same network with inverse nonlinearities at both layers, F_h^{-1}(·) and F_o^{-1}(·), so that desired summation signals u^*(n) and v^*(n) and a desired hidden output y^*(n) are available layer by layer.]

For the hidden layer:
$u(n) = W^{(h)}(n-1)\,x(n)$, $u^*(n) = F_h^{-1}(y^*(n))$, $e^{(h)}(n) = u^*(n) - u(n)$
$W^{(h)}(n) = W^{(h)}(n-1) + e^{(h)}(n)\,\mathbf{k}^{(h)T}(n)$

For the output layer:
$v(n) = W^{(o)}(n-1)\,y(n)$, $v^*(n) = F_o^{-1}(t(n))$, $e^{(o)}(n) = v^*(n) - v(n)$
$W^{(o)}(n) = W^{(o)}(n-1) + e^{(o)}(n)\,\mathbf{k}^{(o)T}(n)$

Here the desired hidden output $y^*(n)$ is obtained from the desired output summation signal via the (pseudo)inverse of the output weight matrix (see the Conclusions).
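Before the experiments, here is a minimal runnable sketch (not the project's code) contrasting the two kinds of single-layer update described above: the plain EBP gradient step, and an RLS-style Kalman-gain step of the kind the S.T. and LBL algorithms apply after pulling the target back through the inverse nonlinearity. The sigmoid choice, the initialization, and all variable names are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def sigmoid_inv(z, eps=1e-6):
    z = np.clip(z, eps, 1.0 - eps)        # keep the inverse finite
    return np.log(z / (1.0 - z))

rng = np.random.default_rng(0)
M, N = 4, 4                                # layer input / output sizes
W = rng.normal(scale=0.1, size=(N, M))     # weights for the EBP copy
W_kf = W.copy()                            # weights for the Kalman copy
P = np.eye(M) * 100.0                      # inverse input-correlation estimate
eta, b = 0.3, 0.9                          # learning rate, forgetting factor

x = rng.random(M)                          # one input pattern
t = rng.random(N) * 0.8 + 0.1              # its target, kept inside (0, 1)

# --- EBP step: gradient of the squared output error ---
z = sigmoid(W @ x)
delta = (t - z) * z * (1.0 - z)            # (t - z) * F'(v) for a sigmoid
W += eta * np.outer(delta, x)

# --- S.T./LBL-style step: solve at the summation layer instead ---
v = W_kf @ x
e = sigmoid_inv(t) - v                     # error against v* = F_o^{-1}(t)
k = (P @ x) / (b + x @ P @ x)              # Kalman gain (RLS recursion)
P = (P - np.outer(k, x @ P)) / b           # update inverse correlation matrix
W_kf += np.outer(e, k)                     # rank-one weight correction e k^T

print("EBP error:   ", np.abs(t - sigmoid(W @ x)).max())
print("Kalman error:", np.abs(t - sigmoid(W_kf @ x)).max())
```

Note that the Kalman step maintains an M × M inverse-correlation matrix P in addition to the weights; that bookkeeping is the per-iteration overhead the Conclusions slide refers to.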
Experiment #1: 4-4 Encoding/Decoding

• Training set: all sixteen 4-bit binary patterns from 0000 to 1111, with the target identical to the input (auto-encoding through the hidden bottleneck)

[Figure: learning curves (MSE vs. epoch, up to 1000 epochs) for EBP, S.T., and LBL.]

           # of epochs   CPU time (s)   Learning error   Correct rate
EBP        1000          73.906         >1.4             62.50%
S.T.       1000          85.473         >1.4             75.00%
LBL        42            4.164          1.2              87.50%

• MLP structure: 4-3-4; convergence threshold: 0.16
• EBP: learning rate 0.3, momentum 0.8; S.T.: step size 0.3, forgetting factors (hidden and output) 0.9; LBL: step size 0.15, forgetting factors 0.9

Experiment #2: Pattern Classification (IRIS)

• 4 input features
• 3 classes (coded 001, 010, 100)
• 75 training patterns, 75 testing patterns

[Figure: learning curves (MSE vs. epoch, up to 800 epochs) for EBP and S.T.]

           # of epochs   CPU time (s)   Correct rate (training)   Correct rate (testing)
EBP        800           339.178        96.00%                    88.00%
S.T.       800           393.045        98.67%                    93.33%

• MLP structure: 4-3-3; convergence threshold: 0.01
• EBP: learning rate 0.3, momentum 0.8; S.T.: step size 20, forgetting factors 0.9

Experiment #3: Pattern Classification (wine)

• 13 input features
• 3 classes (coded 001, 010, 100)
• 60 training patterns, 118 testing patterns
• MLP structure: 13-15-3
• EBP: learning rate 0.3, momentum 0.8; S.T.: step size 20, forgetting factors 0.9; LBL: step size 0.2, forgetting factors 0.9

[Figure: learning curves (MSE vs. epoch, up to 500 epochs) for EBP, S.T., and LBL.]

           # of epochs   CPU time (s)   Correct rate (training)   Correct rate (testing)
EBP        500           201.469        30.00%                    27.97%
S.T.       500           254.156        100.00%                   70.34%
LBL        500           301.814        55.00%                    48.33%

Experiment #4: Image Restoration

• Raw image: 64 × 64, 8-bit
• MLP structure: 64-16-64
• EBP: learning rate 0.3, momentum 0.8; S.T.: step size 0.3, forgetting factors 0.9; LBL: step size 0.15, forgetting factors 0.9

[Figures: learning curves (MSE vs. epoch, up to 500 epochs) for EBP and LBL in batch (bat) and sequential (seq) modes, and the corresponding restored images: LBL (bat), LBL (seq), EBP (bat), EBP (seq).]

Experiment #5: Image Reconstruction (I)

• Original image: 256 × 256, 8-bit
• Schemes for selecting training subsets (the shaded areas in the original slide):
  – Scheme A: 32 input features
  – Scheme B: 64 input features

[Figure: diagrams of the two training-subset selection schemes over the 256 × 256 image.]

Experiment #5: Image Reconstruction (II) — Scheme A

• MLP structure: 32-16-32
• Convergence threshold: MSE = 5
• EBP: learning rate 0.3, momentum 0.8; LBL: step size 0.15, forgetting factors 0.9

            # of epochs   CPU time (s)
EBP (seq)   200           1237.789
EBP (bat)   200           66.806
LBL (seq)   200           2406.661
LBL (bat)   7             4.376

[Figure: learning curves (MSE vs. epoch) and restored images for LBL (bat), LBL (seq), and EBP (seq).]

Experiment #5: Image Reconstruction (III) — Scheme B

• MLP structure: 64-32-64
• Convergence threshold: MSE = 5
• EBP: learning rate 0.3, momentum 0.8; LBL: step size 0.15, forgetting factors 0.9

            # of epochs   CPU time (s)
EBP (seq)   200           2587.361
EBP (bat)   200           166.039
LBL (seq)   200           7058.199
LBL (bat)   6             9.123

[Figure: learning curves (MSE vs. epoch) and restored images for LBL (bat), LBL (seq), and EBP (seq).]

Experiment #5: Image Reconstruction (IV) — Scheme A, noisy image for training

• MLP structure: 32-16-32
• Convergence threshold: MSE = 5
• EBP: learning rate 0.3, momentum 0.8; LBL: step size 0.15, forgetting factors 0.9

[Figure: learning curves (MSE vs. epoch, up to 100 epochs) for S.T. (seq), EBP (seq), EBP (bat), LBL (seq), and LBL (bat), and restored images for LBL (seq), S.T. (seq), and EBP (seq).]

Conclusions

• Compared with the EBP algorithm, the Kalman-filter-based S.T. and LBL algorithms generally reach a lower MSE during training, and do so in a significantly smaller number of epochs.
• However, the CPU time needed for one iteration is longer for the S.T. and LBL algorithms, owing to the computation of the Kalman gain, the inverses of the correlation matrices, and the (pseudo)inverse of the output of each layer. LBL often requires even more computation per iteration than the S.T. algorithm.
• Therefore, the total computation time required depends on whether the savings in epochs outweigh the higher per-iteration cost; in the batch experiments above, LBL converged in far less total time, while in sequential mode it could be slower than EBP overall.
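To make the cost argument in the second conclusion concrete, a rough per-pattern, per-layer operation count (my own estimate, not from the slides) for a layer with M inputs and N outputs is:

```latex
% Illustrative per-pattern cost of one layer update (rough estimate):
%   EBP gradient step: one outer product, eta * delta * x^T
%   Kalman-gain step:  gain and inverse-correlation update,
%                      plus the rank-one weight correction e * k^T
\[
C_{\text{EBP}} = O(MN),
\qquad
C_{\text{Kalman}} = O(M^{2} + MN).
\]
```

On this accounting the Kalman-based methods pay an extra O(M²) per pattern to maintain the inverse correlation matrix, which is why they win on total time only when the epoch count drops sharply (e.g., 7 epochs for LBL vs. 200 for EBP in the batch runs of Experiment #5).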

