HMM and Neural Network
Xi Chen (HMM modified based on Amr's recitation)

HMM

Factorization
For hidden states $y_{1:T}$ and observations $x_{1:T}$, the joint distribution factorizes as
$p(x_{1:T}, y_{1:T}) = p(y_1) \prod_{t=2}^{T} p(y_t \mid y_{t-1}) \prod_{t=1}^{T} p(x_t \mid y_t)$.

Shorthand: with $K$ hidden states and $M$ observation symbols, write $\pi_k = p(y_1 = k)$ for the initial distribution, $a_{ij} = p(y_t = j \mid y_{t-1} = i)$ for the transitions, and $b_k(m) = p(x_t = m \mid y_t = k)$ for the emissions.

Number of parameters:
- Initial state: $K - 1$
- Transition: $K(K - 1)$
- Emission: $K(M - 1)$

Tasks
- Inference (known parameters): MAP, Viterbi
- Learning (learn parameters):
  - Fully observed data: count and normalize
  - Partially observed data (hidden states unobserved): EM

Inference: MAP (given the parameters)
Find, at each position $t$, the most probable state under the posterior marginal:
$y_t^* = \arg\max_k \, p(y_t = k \mid x_{1:T})$.
The forward and backward algorithms below compute exactly the quantities this requires.

Forward Algorithm
Trick: add a variable and marginalize over it to enable the recursion. Define $\alpha_t(k) = p(x_{1:t}, y_t = k)$; then
$\alpha_1(k) = \pi_k \, b_k(x_1)$ and $\alpha_t(k) = b_k(x_t) \sum_i \alpha_{t-1}(i) \, a_{ik}$,
with $p(x_{1:T}) = \sum_k \alpha_T(k)$. (A code sketch follows at the end of this HMM section.)

Backward Algorithm
Compute $\beta_t(k) = p(x_{t+1:T} \mid y_t = k)$:
$\beta_T(k) = 1$ and $\beta_t(k) = \sum_j a_{kj} \, b_j(x_{t+1}) \, \beta_{t+1}(j)$.
Combining the two gives the posterior marginal $p(y_t = k \mid x_{1:T}) = \alpha_t(k)\,\beta_t(k) / p(x_{1:T})$.

Viterbi Algorithm
Find the globally maximal posterior sequence, $\arg\max_{y_{1:T}} p(y_{1:T} \mid x_{1:T})$.
Goal: the maximal probability of ending in state $k$ at time $t$, where we maximize over the earlier states $y_{1:t-1}$:
$V_t(k) = \max_{y_{1:t-1}} p(x_{1:t}, y_{1:t-1}, y_t = k) = b_k(x_t) \, \max_i V_{t-1}(i) \, a_{ik}$.
Keeping track of the maximizing $i$ (a backpointer) at every step lets us recover the hidden state sequence by backtracing from $\arg\max_k V_T(k)$.
Scaling: for long sequences the products underflow, so rescale at each step or work in log space (as the sketch below does).

Learning: Fully Observed
With fully observed data, the maximum-likelihood estimates are relative frequencies over the $N$ training sequences:
- Initial state: $\hat\pi_k = \#\{y_1 = k\} / N$
- Transition: $\hat a_{ij} = \#\{y_{t-1} = i,\, y_t = j\} / \#\{y_{t-1} = i\}$
- Emission: $\hat b_k(m) = \#\{y_t = k,\, x_t = m\} / \#\{y_t = k\}$

Why this works: in the complete-data log-likelihood all parameters are decoupled, so taking the gradient w.r.t. each parameter and setting it to zero reduces to simple counting and normalization.

Learning: Partially Observed (EM)
E-step: with the hidden states unobserved, compute their expected sufficient statistics under the current parameters via the forward-backward algorithm:
$\gamma_t(k) = p(y_t = k \mid x_{1:T}) = \alpha_t(k)\,\beta_t(k) / p(x_{1:T})$,
$\xi_t(i, j) = p(y_t = i,\, y_{t+1} = j \mid x_{1:T}) = \alpha_t(i)\, a_{ij}\, b_j(x_{t+1})\, \beta_{t+1}(j) / p(x_{1:T})$.
M-step: solve the MLE as in the fully observed case, replacing the hard counts with these expected counts.

EM Summary for HMM Learning
Alternate E-steps (forward-backward to obtain expected counts) and M-steps (count and normalize) until the likelihood converges; EM is only guaranteed to find a local optimum. Code sketches of all of the above follow.
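To make the forward recursion concrete, here is a minimal NumPy sketch. The names and shapes (`pi` for the initial distribution, `A` for the transition matrix, `B` for the emission table, `x` as a list of symbol indices) are my own conventions, not from the slides.

```python
import numpy as np

def forward(pi, A, B, x):
    """Forward algorithm: alpha[t, k] = p(x_1..x_t, y_t = k).

    pi: (K,) initial distribution; A: (K, K) with A[i, j] = a_ij;
    B: (K, M) with B[k, m] = b_k(m); x: length-T list of symbol indices.
    """
    T, K = len(x), len(pi)
    alpha = np.zeros((T, K))
    alpha[0] = pi * B[:, x[0]]                 # alpha_1(k) = pi_k b_k(x_1)
    for t in range(1, T):
        # marginalize over the previous state: sum_i alpha_{t-1}(i) a_ik
        alpha[t] = B[:, x[t]] * (alpha[t - 1] @ A)
    return alpha                               # p(x_{1:T}) = alpha[-1].sum()
```

As the scaling remark above warns, this unscaled version underflows on long sequences; a production version would normalize each `alpha[t]` and accumulate the log of the normalizers.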
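The backward recursion, under the same assumed conventions:

```python
import numpy as np

def backward(A, B, x):
    """Backward algorithm: beta[t, k] = p(x_{t+1}..x_T | y_t = k)."""
    T, K = len(x), A.shape[0]
    beta = np.ones((T, K))                     # base case: beta_T(k) = 1
    for t in range(T - 2, -1, -1):
        # beta_t(k) = sum_j a_kj b_j(x_{t+1}) beta_{t+1}(j)
        beta[t] = A @ (B[:, x[t + 1]] * beta[t + 1])
    return beta
```

Posterior marginals then come from `alpha[t] * beta[t] / p(x)`, which is exactly what per-position MAP decoding maximizes.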
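A log-space sketch of Viterbi, which also addresses the scaling issue the slides mention; it assumes all parameters are strictly positive so the logs are finite.

```python
import numpy as np

def viterbi(pi, A, B, x):
    """Globally most probable state sequence, computed in log space."""
    T, K = len(x), len(pi)
    logV = np.zeros((T, K))
    back = np.zeros((T, K), dtype=int)         # backpointers: the maximizing i
    logV[0] = np.log(pi) + np.log(B[:, x[0]])
    for t in range(1, T):
        scores = logV[t - 1][:, None] + np.log(A)   # scores[i, k]
        back[t] = scores.argmax(axis=0)             # keep track of i
        logV[t] = scores.max(axis=0) + np.log(B[:, x[t]])
    path = [int(logV[-1].argmax())]            # best final state
    for t in range(T - 1, 0, -1):              # backtrace the hidden states
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```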
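Count-and-normalize for the fully observed case; the input format (a list of `(x, y)` index-sequence pairs) is a hypothetical choice for the sketch.

```python
import numpy as np

def mle_fully_observed(pairs, K, M):
    """MLE by counting and normalizing over fully observed (x, y) sequences."""
    pi, A, B = np.zeros(K), np.zeros((K, K)), np.zeros((K, M))
    for x, y in pairs:
        pi[y[0]] += 1                          # initial-state count
        for t in range(1, len(y)):
            A[y[t - 1], y[t]] += 1             # transition counts
        for t in range(len(y)):
            B[y[t], x[t]] += 1                 # emission counts
    # normalize each table of counts into a distribution
    return (pi / pi.sum(),
            A / A.sum(axis=1, keepdims=True),
            B / B.sum(axis=1, keepdims=True))
```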
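Finally, one EM (Baum-Welch) iteration on a single sequence, reusing the `forward` and `backward` functions sketched above. This is an unscaled, didactic sketch; real implementations rescale alpha/beta for long sequences and sum counts over all training sequences.

```python
import numpy as np

def em_step(pi, A, B, x):
    """One E-step + M-step on one sequence (no scaling; sketch only)."""
    T = len(x)
    alpha, beta = forward(pi, A, B, x), backward(A, B, x)
    px = alpha[-1].sum()                       # p(x_{1:T})
    # E-step: expected counts
    gamma = alpha * beta / px                  # gamma[t, k] = p(y_t = k | x)
    xi = (alpha[:-1, :, None] * A[None, :, :]
          * (B[:, x[1:]].T * beta[1:])[:, None, :]) / px
    # M-step: MLE as in the fully observed case, with expected counts
    pi_new = gamma[0]
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.zeros_like(B)
    for t in range(T):
        B_new[:, x[t]] += gamma[t]             # expected emission counts
    return pi_new, A_new, B_new / gamma.sum(axis=0)[:, None]
```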
Neural Network

Sigmoid Unit
A single unit computes a weighted sum of its inputs $\mathbf{x} = (x_1, \ldots, x_n)$ and passes it through an activation function $f$:
$net = w_0 + \sum_{i=1}^{n} w_i x_i$, so the output is $o = f(net) = f\big(w_0 + \sum_{i=1}^{n} w_i x_i\big)$.

Activation Function
The common choices (sketched in code at the end of this section):
- Linear activation: $f(net) = net$
- Threshold activation: $f(net) = \mathrm{sign}(net) = \begin{cases} 1 & \text{if } net \ge 0, \\ -1 & \text{if } net < 0 \end{cases}$
- Hyperbolic tangent activation: $f(net) = \tanh(net) = \frac{1 - e^{-2\,net}}{1 + e^{-2\,net}}$
- Sigmoid activation: $f(net) = \frac{1}{1 + e^{-net}}$

[Figure: plots of the four activation functions]

Multilayer and Multiple Output NN
Units are composed into layers: stacking layers of sigmoid units gives a multilayer network, and using several units in the final layer gives a multiple-output network (see the forward-pass sketch at the end of this section).

Back Propagation
Training runs a forward pass to compute the network's outputs, then propagates the error gradient backward through the layers to update the weights.
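The four activations from the slides as NumPy functions, plus a single unit's output $o = f(net)$; the function names are my own.

```python
import numpy as np

def linear(net):
    return net                                 # f(net) = net

def threshold(net):
    return np.where(net >= 0, 1.0, -1.0)       # sign(net): +1 if net >= 0, else -1

def tanh_act(net):
    return np.tanh(net)                        # equals (1 - e^{-2 net}) / (1 + e^{-2 net})

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))          # 1 / (1 + e^{-net})

def unit_output(w0, w, x, f=sigmoid):
    """Single unit: o = f(net), where net = w_0 + sum_i w_i x_i."""
    return f(w0 + w @ x)
```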
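The multilayer and multiple-output slides survive only as titles, so the following is merely an illustration of how sigmoid units compose into layers; the layer sizes and the `(W, b)` representation are invented for the example.

```python
import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def forward_pass(x, layers):
    """Forward pass: each layer computes sigmoid(W @ a + b).

    layers: list of (W, b) weight/bias pairs; several rows in the
    final W give a multiple-output network.
    """
    a = x
    for W, b in layers:
        a = sigmoid(W @ a + b)
    return a

# illustrative shapes: 3 inputs -> 4 hidden units -> 2 outputs
rng = np.random.default_rng(0)
layers = [(rng.normal(size=(4, 3)), np.zeros(4)),
          (rng.normal(size=(2, 4)), np.zeros(2))]
print(forward_pass(np.array([1.0, -1.0, 0.5]), layers))
```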