I have spent a day looking for a library that can do the following:
- Retrieve the full contents of a web page in the background, without rendering the result to a view.
- Support pages that fire off AJAX requests to load additional result data after the initial HTML has loaded.
- Let me select elements from the resulting HTML using XPath or CSS selectors.
- In the future, possibly navigate to the next page (firing events, clicking buttons/links, etc.).
Here is what I have tried without success:
- Jsoup: Works great, but it has no support for JavaScript/AJAX, so it does not load the full page.
- Android's built-in HttpEntity: The same JavaScript/AJAX problem as Jsoup.
- HtmlUnit: Looks like exactly what I need, but after hours of trying I cannot get it to work on Android. (Other users failed when trying to load the 12MB+ of jar files. I loaded the full source code and referenced it as a project library, only to find that things such as Applets and java.awt, which HtmlUnit uses, do not exist on Android.)
- Rhino: I find this very confusing. I don't know how to get it working on Android, or even whether it is what I am looking for.
- Selenium Driver: Looks like it could work, but there is no straightforward way to run it headless, so that the actual HTML is not displayed in a view.
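
To make the JavaScript/AJAX limitation concrete, here is a minimal sketch in plain Java (not Android-specific, and not using any of the libraries above). The page markup, the `loadResultsViaAjax()` script call, and the class name `StaticParseDemo` are all made up for illustration. It parses only the initial markup, the way a static parser would, and then runs an XPath query for the result items:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class StaticParseDemo {
    // Hypothetical page: in a real browser the script would fill #results
    // after the initial HTML has loaded. Kept well-formed so the XML parser
    // accepts it.
    static final String PAGE =
        "<html><body>"
      + "<div id=\"results\"></div>"
      + "<script>loadResultsViaAjax();</script>"
      + "</body></html>";

    // Counts the child elements of the #results container using XPath.
    // No script is executed, so only the initial markup is visible.
    static int countResultItems(String xhtml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xhtml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList items = (NodeList) xpath.evaluate(
            "//div[@id='results']/*", doc, XPathConstants.NODESET);
        return items.getLength();
    }

    public static void main(String[] args) throws Exception {
        // The script never runs, so the container is still empty.
        System.out.println("result items found: " + countResultItems(PAGE));
    }
}
```

The query finds no result items because the script is never executed. That is the gap I am trying to close: I need something that runs the page's scripts first and only then lets me query the DOM.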
I would really like HtmlUnit to work, as it seems best suited to my needs. Is there any way to get it running on Android, or is there another library I have missed that would fit these requirements?
I am currently using Android Studio 0.1.7 and can move to Eclipse if needed.
Thanks in advance!