Recently, I stumbled across __this article__, and I was wondering how the results you would get from a recurrent neural net, like the ones described above, would differ from those of a simple Markov chain.

I don't understand the linear algebra happening under the hood in an RNN, but it seems that you are just designing a super convoluted way of building a statistical model of what the next letter is going to be given the previous letters, something that a Markov chain does very simply.
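To make the comparison concrete, here is a rough sketch of the kind of character-level Markov chain I have in mind. It is only an illustration under my own assumptions: the names `build_chain`, `generate`, and `order` are made up for this post, and the model simply counts which character follows each fixed-length context and then samples from those counts.

```python
import random
from collections import Counter, defaultdict


def build_chain(text, order=2):
    """Count which character follows each context of `order` characters."""
    chain = defaultdict(Counter)
    for i in range(len(text) - order):
        context = text[i:i + order]
        chain[context][text[i + order]] += 1
    return chain


def generate(chain, seed, length=50, rng=None):
    """Extend `seed` by sampling next characters in proportion to their counts.

    Assumes len(seed) equals the `order` the chain was built with.
    """
    rng = rng or random.Random()
    order = len(seed)
    out = seed
    for _ in range(length):
        counts = chain.get(out[-order:])
        if not counts:
            # This context never appeared in the training text, so stop early.
            break
        chars, weights = zip(*counts.items())
        out += rng.choices(chars, weights=weights)[0]
    return out
```

Something like `generate(build_chain(corpus, order=3), corpus[:3])` would then produce text where each character depends only on the three characters before it. That fixed window is what I understand the "simple" approach to be.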

Why are RNNs interesting? Is it just because they are a more generalizable solution, or is there something happening that I am missing?