0 votes
2 views
in AI and Deep Learning by (50.2k points)

Recently, I stumbled across this article, and I was wondering what the difference between the results you would get from a recurrent neural net, like the ones described above, and a simple Markov chain would be.

I don't understand the linear algebra happening under the hood in an RNN, but it seems that you are just designing a super convoluted way of making a statistical model for what the next letter is going to be based on the previous letters, something that is done very simply in a Markov Chain.

Why are RNNs interesting? Is it just because they are a more generalizable solution, or is there something happening that I am missing?

1 Answer

0 votes
by (108k points)

A Markov chain relies on the Markov property: it is "memoryless". The probability of the next symbol is computed from only the k previous symbols. In practice, k is limited to small values (say 3-5), because the transition table grows exponentially with k. As a result, sentences generated by an order-k Markov model tend to be very incoherent.
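To make this concrete, here is a minimal sketch (in plain Python, names of my own choosing) of an order-k character-level Markov model. Note that the number of possible contexts is |alphabet|^k, which is why k must stay small:

```python
import random
from collections import defaultdict, Counter

def build_model(text, k):
    # Transition counts: each k-character context maps to a count of
    # the characters observed immediately after it. The number of
    # possible contexts grows as |alphabet|**k.
    model = defaultdict(Counter)
    for i in range(len(text) - k):
        context = text[i:i + k]
        model[context][text[i + k]] += 1
    return model

def generate(model, seed, length):
    # Sample the next character given only the last k characters;
    # anything earlier in the output has no influence (memorylessness).
    k = len(seed)
    out = seed
    for _ in range(length):
        choices = model.get(out[-k:])
        if not choices:
            break
        chars, counts = zip(*choices.items())
        out += random.choices(chars, weights=counts)[0]
    return out
```

For example, trained on "abcabcabd" with k=2, the context "ab" is followed by 'c' twice and 'd' once, so generation after "ab" picks 'c' with probability 2/3 regardless of anything seen before those two characters.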

On the other hand, RNNs (e.g. with LSTM units) are not restricted by the Markov property. Their rich internal state enables them to keep track of long-distance dependencies.
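The contrast can be seen in the RNN recurrence itself. The sketch below (a bare vanilla-RNN step in NumPy, with made-up sizes and random weights for illustration) shows that each new hidden state mixes the current input with the entire previous state, so information from arbitrarily far back can, in principle, persist:

```python
import numpy as np

def rnn_step(x, h, Wxh, Whh, b):
    # One recurrence step: the new hidden state depends on the current
    # input AND the previous hidden state, so there is no fixed-k window.
    return np.tanh(Wxh @ x + Whh @ h + b)

rng = np.random.default_rng(0)
vocab, hidden = 4, 8
Wxh = rng.normal(scale=0.1, size=(hidden, vocab))  # input-to-hidden weights
Whh = rng.normal(scale=0.1, size=(hidden, hidden)) # hidden-to-hidden weights
b = np.zeros(hidden)

# Feed a sequence of one-hot symbols through the recurrence.
h = np.zeros(hidden)
for idx in [0, 1, 2, 3]:
    x = np.eye(vocab)[idx]
    h = rnn_step(x, h, Wxh, Whh, b)
```

Unlike the Markov model's lookup table, the state h is a fixed-size real-valued vector, so its capacity does not blow up with context length; LSTM units add gating on top of this recurrence to make long-distance dependencies easier to learn in practice.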

The foremost advantages of a recurrent neural network (RNN) over Markov chains and hidden Markov models are the greater representational power of neural networks and their ability to perform effective smoothing by taking syntactic and semantic features into account.
