Well, I found the answer, which @BalusC gave in a different thread:
- If you just want to use an XML-based tool to traverse it: JTidy.
- If you like to unit test the HTML: HtmlUnit.
- If you like to extract specific data from the HTML: Jsoup.
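For the third case, here is a minimal sketch of what extracting data with Jsoup can look like. It assumes the jsoup library is on the classpath; the class name `LinkExtractor`, the method `extractLinks`, and the sample HTML are made up for illustration and are not from the original thread.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical example class; requires the jsoup library on the classpath.
public class LinkExtractor {

    // Returns the href value of every <a> element that has one, in document order.
    static List<String> extractLinks(String html) {
        // Jsoup.parse accepts real-world (possibly malformed) HTML and builds a DOM.
        Document doc = Jsoup.parse(html);
        List<String> links = new ArrayList<>();
        // "a[href]" is a CSS selector: anchor elements that carry an href attribute.
        for (Element a : doc.select("a[href]")) {
            links.add(a.attr("href"));
        }
        return links;
    }

    public static void main(String[] args) {
        // Sample input, invented for this sketch.
        String html = "<html><body><p>See <a href=\"https://example.com/docs\">the docs</a>"
                + " and <a href=\"https://example.com/faq\">the FAQ</a>.</p></body></html>";
        for (String link : extractLinks(html)) {
            System.out.println(link);
        }
    }
}
```

Any other CSS selector can be swapped in for `a[href]` to pull out a different piece of the page.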
Thank you @BalusC.