Use train_on_batch() when you need finer control over the state of the LSTM.
This matters with a stateful LSTM, where you have to decide when model.reset_states() is called. For example, if you have multi-series data, you will usually want to reset the state after each series, which is straightforward with train_on_batch(). If you instead call .fit() once on all the series, the network is trained across them without the state being reset in between, so state from one series carries over into the next. Which approach is right depends on your data and on how you want the network to behave.
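A minimal sketch of the per-series loop described above. The helper name train_on_series and the layout of series_list are assumptions made for illustration, not part of Keras. It also assumes model is an already compiled Keras model containing a stateful LSTM, and that your Keras version exposes model.reset_states() (true for Keras 2.x / tf.keras; newer releases may differ).

```python
def train_on_series(model, series_list, epochs=1):
    """Train a stateful model one series at a time.

    Hypothetical helper, not part of Keras.
    `series_list` is assumed to be a list of (x_batches, y_batches) pairs,
    where the batches belonging to one series are in time order.
    """
    history = []
    for _ in range(epochs):
        for x_batches, y_batches in series_list:
            for x, y in zip(x_batches, y_batches):
                # The LSTM state carries over between consecutive
                # batches of the same series.
                result = model.train_on_batch(x, y)
                # Stored as returned: a scalar loss, or a list if the
                # model was compiled with metrics.
                history.append(result)
            # Clear the state before moving on to the next, unrelated series.
            model.reset_states()
    return history
```

With a stateful layer the batch size is fixed when the model is built, so every x passed in here needs to have that same batch size.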
Hope this answer helps.