HMM POS Tagging
HMM and Part of Speech Tagging. Viterbi Algorithm. Limits of Sequence Labeling.
POS Tagger Tool
Penn Treebank POS Tag Set
SBAR: Clause introduced by a (possibly empty) subordinating conjunction.
Examples:
[S I can't believe [SBAR that John went without me.]]
[S I can't believe [SBAR __ John went without me.]]
HMM Viterbi Algorithm
Training States
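Transition Probability / Prior Probability
$$\text{Trans\_Prob}(\text{TagA} \to \text{TagB}) = \frac{\text{Count}(\text{TagB\_following\_TagA})}{\text{Count}(\text{TagA})}$$
Emission Probability / Likelihood
$$\text{Emis\_Prob}(\text{TokenA}, \text{TagA}) = \frac{\text{Count}(\text{TokenA\_being\_TagA})}{\text{Count}(\text{TagA})}$$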
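The two count ratios above can be sketched in a few lines of Python. This is a minimal illustration, not a full trainer: the tiny `tagged_sentences` corpus, the `<s>` start pseudo-tag, and the helper names `trans_prob` and `emis_prob` are all assumptions made for this example, and no smoothing is applied.

```python
from collections import Counter

# Hypothetical toy corpus: each sentence is a list of (token, tag) pairs.
tagged_sentences = [
    [("fish", "noun"), ("sleep", "verb")],
    [("fish", "verb"), ("fish", "noun")],
]

START = "<s>"  # assumed pseudo-tag marking the start of a sentence

tag_counts = Counter()         # Count(TagA)
transition_counts = Counter()  # Count(TagB_following_TagA), keyed by (TagA, TagB)
emission_counts = Counter()    # Count(TokenA_being_TagA), keyed by (token, tag)

for sentence in tagged_sentences:
    previous_tag = START
    tag_counts[START] += 1
    for token, tag in sentence:
        tag_counts[tag] += 1
        transition_counts[(previous_tag, tag)] += 1
        emission_counts[(token, tag)] += 1
        previous_tag = tag


def trans_prob(tag_a, tag_b):
    """Relative frequency of tag_b directly following tag_a."""
    return transition_counts[(tag_a, tag_b)] / tag_counts[tag_a]


def emis_prob(token, tag):
    """Relative frequency of token among all occurrences of tag."""
    return emission_counts[(token, tag)] / tag_counts[tag]


print(trans_prob(START, "noun"))  # 1 of 2 sentences starts with a noun
print(emis_prob("fish", "noun"))  # both noun occurrences are "fish"
```

Both helpers divide by `Count(TagA)`, so a tag that never occurs in the training data raises `ZeroDivisionError`. A practical tagger would typically add smoothing or an unknown-word fallback.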
Viterbi Algorithm
Each step:
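$$\text{Step\_Prob}(\text{TokenA}, \text{TagA}) = \max_{\text{TagX}} \big[ \text{Step\_Prob}(\text{Last\_Token}, \text{TagX}) \cdot \text{Trans\_Prob}(\text{TagX} \to \text{TagA}) \big] \cdot \text{Emis\_Prob}(\text{TokenA}, \text{TagA})$$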
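The step recurrence can be turned into a short dynamic-programming routine. The sketch below is one possible reading of it, with several assumptions: probabilities are stored in plain dictionaries, a `<s>` pseudo-tag supplies the initial transition, missing entries count as probability 0, the input has at least one token, and no end-of-sentence transition is modelled. The emission values are the ones used in the "Fish sleep." example, but the transition values are illustrative numbers chosen for this sketch and are not taken from the source's transition figure.

```python
def viterbi(tokens, tags, trans_prob, emis_prob, start="<s>"):
    """Return (best_tag_sequence, probability) for a non-empty token list.

    trans_prob maps (tag_x, tag_a) to a probability and emis_prob maps
    (token, tag) to a probability; missing keys are treated as 0.
    """
    # step_prob[i][tag]: probability of the best path ending in `tag` at token i.
    step_prob = [{}]
    backpointer = [{}]
    for tag in tags:
        step_prob[0][tag] = (trans_prob.get((start, tag), 0.0)
                             * emis_prob.get((tokens[0], tag), 0.0))
        backpointer[0][tag] = None

    for i in range(1, len(tokens)):
        step_prob.append({})
        backpointer.append({})
        for tag in tags:
            # The max over TagX from the recurrence.
            best_prev = max(
                tags,
                key=lambda x: step_prob[i - 1][x] * trans_prob.get((x, tag), 0.0),
            )
            best = step_prob[i - 1][best_prev] * trans_prob.get((best_prev, tag), 0.0)
            step_prob[i][tag] = best * emis_prob.get((tokens[i], tag), 0.0)
            backpointer[i][tag] = best_prev

    # Follow the backpointers from the best final tag.
    last = max(tags, key=lambda t: step_prob[-1][t])
    path = [last]
    for i in range(len(tokens) - 1, 0, -1):
        path.append(backpointer[i][path[-1]])
    path.reverse()
    return path, step_prob[-1][last]


# Assumed, illustrative transition probabilities (not from the source).
example_trans = {
    ("<s>", "noun"): 0.8, ("<s>", "verb"): 0.2,
    ("noun", "noun"): 0.1, ("noun", "verb"): 0.8,
    ("verb", "noun"): 0.2, ("verb", "verb"): 0.1,
}
# Emission probabilities as given in the "Fish sleep." example.
example_emis = {
    ("fish", "noun"): 0.8, ("sleep", "noun"): 0.2,
    ("fish", "verb"): 0.5, ("sleep", "verb"): 0.5,
}

best_path, best_prob = viterbi(["fish", "sleep"], ["noun", "verb"],
                               example_trans, example_emis)
print(best_path, best_prob)
```

Under these assumed transitions the best path tags "fish" as a noun and "sleep" as a verb. For long sentences the raw products underflow, so implementations commonly sum log-probabilities instead of multiplying.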
Example: Fish sleep.
Transition Probability
[Figure: Transition Probability for "Fish sleep."]
Emission Probability
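$$\text{Emis\_Prob}(\text{fish}, \text{noun}) = \frac{8}{8+2} = 0.8$$
$$\text{Emis\_Prob}(\text{fish}, \text{verb}) = \frac{5}{5+5} = 0.5$$
$$\text{Emis\_Prob}(\text{sleep}, \text{noun}) = \frac{2}{8+2} = 0.2$$
$$\text{Emis\_Prob}(\text{sleep}, \text{verb}) = \frac{5}{5+5} = 0.5$$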
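The arithmetic above can be checked with a few lines of Python. The counts are read off the fractions (8 and 2 for noun, 5 and 5 for verb); treating each tag's total as the sum over only these two tokens is an assumption that matches the denominators shown, and the names `example_counts` and `example_emis_prob` are hypothetical.

```python
# Counts implied by the fractions above: tag -> {token: count}.
example_counts = {
    "noun": {"fish": 8, "sleep": 2},
    "verb": {"fish": 5, "sleep": 5},
}

# Emis_Prob(token, tag) = Count(token being tag) / Count(tag)
example_emis_prob = {
    (token, tag): count / sum(token_counts.values())
    for tag, token_counts in example_counts.items()
    for token, count in token_counts.items()
}

for (token, tag), prob in sorted(example_emis_prob.items()):
    print(f"Emis_Prob({token}, {tag}) = {prob}")
```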
[Figure: Steps result for "Fish sleep."]