Lip-recognition Software using a Kohonen Algorithm for Image Compression
ECE 539 Final Project, Fall 2003
Demetz Clément

Outline
- Problem and motivation
- Data creation: preprocessing
- Kohonen self-organizing map (SOM)
- Multi-layer perceptron
- Final results
- Conclusion
- References

Problem
- Problem of voice recognition: a combined approach always leads to better results.
- For cell phones and PDAs: voice recognition and visual recognition.
[Figure labels: Lip-recognition, Voice-recognition, Combined recognition]

Problem of lip-recognition software
- Needs high computational power.
- Needs to be implemented on low-power systems (PDA, cell phone).
How can we reduce the size of the information?
Problem: find a way to implement such an algorithm with little computation.

Motivation
Reduce the size of the image with a Kohonen self-organizing map.
[Figure labels: Image of a cell phone digital camera, Filter, Contour of the mouth, Kohonen SOM, Multi-Layer perceptron]

Preprocessing
- Start with low-quality JPEG pictures.
- Gradient filters are applied to keep only the contour of the mouth.
- The opening of the mouth is a relevant input: it has to follow a certain pattern to pronounce a given sound.
Problem: a contour corresponds to thousands of points, which is still too many for a low computation time.
[Figure labels: JPEG picture of the mouth, Dark part of the mouth, Contour of the dark part]

Kohonen Self-Organizing Map (SOM)
- Idea: use a Kohonen self-organizing map to reduce the information to 12 neurons.
- Problems:
  • Initialization
  • Bad stretching or turning of the SOM
We want to keep all the information: in the pictured example, the left part is lost.

Kohonen SOM
- A way to avoid these problems:
  • Link the first and the last neurons.
[Figure labels: Neurons (n); Kohonen process: next iteration; Vector n+2]

Kohonen SOM
- Result of the Kohonen map: we keep 12 points representing the contour.

Multi-Layer Perceptron
- The 12 points given by the SOM are taken as inputs. The SOM is applied many times to each picture to create the database.
- 3 classes of pictures: only 3 sounds, because lip recognition is a support for voice recognition.
- Training on 15 pictures, testing on 3 pictures.

Multi-Layer Perceptron: Result

Layers  Alpha  Momentum  Configuration (hidden layers)  Testing classification rate (%)  Training classification rate (%)
2       0.1    0.8       10                             27                               33
2       0.05   0.05      10                             73.33                            93
2       0.01   0.01      10                             92                               100
3       0.1    0.8       10 10                          52                               76
3       0.01   0.01      10 10                          100                              100

A 100% classification rate is obtained with 400 training iterations.

Conclusion
• The Kohonen SOM reduces the problem to a 12-dimensional problem (working directly on the pictures means thousands of dimensions).
• The multi-layer perceptron needs training, but once it is trained the computations are very fast.
• We can obtain a 100% classification rate with 3 sounds.
• Problem: in Matlab, transforming a picture into a matrix requires extra computation (solution: use another language that is more oriented toward picture processing).

Some references
- Image compression by self-organized Kohonen map. Christophe Amerijckx, Philippe Thissen. IEEE Transactions on Neural Networks, 1998. http://www.dice.ucl.ac.be/~verleyse/papers/ieeetnn98ca.pdf
- SRAM bitmap shape recognition and sorting using neural networks. Randall S. Collica. IEEE. http://www.ibexprocess.com/solutions/wp_SRAM.pdf
- From your lips to your printer. James Fallow.
- SRAM bitmap shape recognition and sorting using neural networks. Collica, R.S., Card, J.P., and Martin, W. ISSN 0894-6507.
- A Kohonen neural network controlled all-optical router system. E.E.E Frietman, M.T. Hill, G.D.
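
Illustration: closed Kohonen map
The closed Kohonen map from the slides (12 neurons, with the first and the last neuron linked) can be sketched roughly as follows. This is only a minimal sketch, not the project's Matlab code: it is written in Python with NumPy, and the function name `fit_ring_som`, the initialization on a small circle, the linear decay of the learning rate and neighbourhood width, and the synthetic elliptical "contour" are all assumptions made for the example.

```python
import numpy as np


def fit_ring_som(points, n_neurons=12, n_iter=2000, lr0=0.5, sigma0=3.0, seed=0):
    """Fit a closed (ring-shaped) 1-D Kohonen map to 2-D contour points.

    All parameter values and schedules are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)

    # Assumed initialization: a small circle around the centroid of the contour.
    center = points.mean(axis=0)
    radius = 0.1 * points.std(axis=0).mean() + 1e-9
    angles = 2.0 * np.pi * np.arange(n_neurons) / n_neurons
    weights = center + radius * np.column_stack([np.cos(angles), np.sin(angles)])

    idx = np.arange(n_neurons)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood width shrinks

        # Pick one contour point at random and find the closest neuron.
        x = points[rng.integers(len(points))]
        winner = np.argmin(np.sum((weights - x) ** 2, axis=1))

        # Distance along the ring: neuron 0 and neuron n-1 are neighbours,
        # which is the "link the first and the last neurons" idea.
        d = np.abs(idx - winner)
        d = np.minimum(d, n_neurons - d)
        h = np.exp(-(d ** 2) / (2.0 * sigma ** 2))

        # Move the winner and its ring neighbours toward the sample.
        weights += lr * h[:, None] * (x - weights)
    return weights


# Synthetic stand-in for a mouth contour: 1000 points on an ellipse.
theta = np.linspace(0.0, 2.0 * np.pi, 1000, endpoint=False)
contour = np.column_stack([40.0 * np.cos(theta), 15.0 * np.sin(theta)])

features = fit_ring_som(contour)
print(features.shape)  # (12, 2): 12 points summarising the contour
```

Because the neighbourhood wraps around, no neuron sits at a free end of the chain, which is one plausible reason the linked map is less prone to leaving part of the contour uncovered. The 12 resulting points would then be the inputs passed on to the multi-layer perceptron.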