How to parse malformed HTML in Python, using standard libraries
Parsing HTML reliably is a relatively modern development, odd though that may seem. As a result, there is nothing in the standard library that does it well. HTMLParser may look like a way to handle HTML, but it is not: it fails on a lot of very common HTML, and though you can work around those failures there …
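To make the HTMLParser point concrete, here is a minimal sketch, assuming Python 3, where the class lives in the `html.parser` module. `TagLogger` and `tag_events` are hypothetical names made up for this illustration. The sketch only records the tag events the parser emits for a deliberately broken snippet. It suggests that the parser reports tags in the order it sees them and does not repair the structure, so balancing unclosed or stray tags is left to the caller.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Hypothetical helper: records the tag events HTMLParser emits."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Called for every opening tag, whether or not it is ever closed.
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        # Called for every closing tag, even one that closes nothing.
        self.events.append(("end", tag))


def tag_events(markup):
    """Feed markup to a fresh TagLogger and return its recorded events."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


# Malformed input: neither <p> is closed, and </b> has no matching <b>.
events = tag_events("<div><p>one<p>two</b></div>")
print(events)
```

The recorded events contain two `p` start tags with no matching end tags and a stray `b` end tag, exactly as written in the input. Any tree building or error recovery would have to be layered on top of these callbacks.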