During my Diploma thesis I read most literature about Hidden Markov Models from the speech community. Since I arrived at Tech and started reading more about the Machine Learning perspective on these models I noticed a difference in depiction of Hidden Markov Models. In this post I want to argue why I think the "old" speech depiction is a bit more expressive.

The old depiction is a state machine. Each state of the HMM is depicted as a circle and the arrows

show how one can "travel" through the model. I saw this depiction in the classic Rabiner papers.

show how one can "travel" through the model. I saw this depiction in the classic Rabiner papers.

In the second representation is a Hidden Markov Model as a Bayes Network. There are two kinds of nodes. The one at the top show a discrete variable, only dependent on it's successor. Encoded in this node are the transition probabilities. Or in other words the state machine is encoded in each of these nodes. There are as many transition nodes as there are samples in the time series. The nodes at the bottom are the observation probabilities which are not seen in the model from the speech community.

I think if one only publishes the second depiction it does not tell much about the model itself. One already assumes that there are observation distributions in an HMM so why depict them. Second just drawing as many nodes as there are samples is weird for me for two reasons. The first reason is that we do not depict anything about the transition model, which is the only thing to model in an HMM. The second reason is that the number of samples or time slices does not matter at all for an HMM. So I think drawing the state machine is a way better way of showing your model then making the ML depiction that does not tell much of your specific model except that it is an HMM. Just to make my point more clear. The second model, even when specifying in the text that we use 3 states could be the top model or the one at the bottom, without the reader knowing.

However, explaining inference algorithms such as variable elimination is easier in the middle model, but for specific models I prefer the state machine depiction.

## No comments:

## Post a Comment