Python and BeautifulSoup encoding issues [duplicate]
In your case the page contains invalid UTF-8 data, which confuses BeautifulSoup and makes it think the page uses windows-1252. You can use this trick: soup = BeautifulSoup.BeautifulSoup(content.decode('utf-8', 'ignore')). This discards any invalid bytes from the page source, so BeautifulSoup will guess the encoding correctly. You can replace 'ignore' by 'replace' …
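As a rough illustration of the trick above, here is a minimal sketch of how the two error handlers behave on a byte string with one invalid byte. The sample bytes and the helper name decode_lenient are made up for this example, and the parsing step assumes the newer bs4 package is installed, which is not the older BeautifulSoup module used in the answer.

```python
# Hypothetical sample: valid UTF-8 markup with one stray invalid byte (0xff).
content = b"<p>caf\xc3\xa9 \xff ok</p>"


def decode_lenient(raw, errors="ignore"):
    """Decode bytes as UTF-8, handling invalid bytes per the `errors` mode."""
    return raw.decode("utf-8", errors)


# 'ignore' silently drops the invalid byte.
ignored = decode_lenient(content, "ignore")

# 'replace' substitutes U+FFFD (the replacement character) for it instead.
replaced = decode_lenient(content, "replace")

# Optional: hand the already-decoded text to BeautifulSoup.
# Assumption: bs4 is available; the step is skipped if it is not.
try:
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(ignored, "html.parser")
    paragraph_text = soup.p.get_text()
except ImportError:
    paragraph_text = None
```

With 'ignore' the bad byte disappears without a trace, while 'replace' leaves a visible marker where data was lost, which can make damaged pages easier to spot later.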