How do I avoid HTTP error 403 when web scraping with Python?
This is probably because of mod_security or a similar server security feature that blocks known spider/bot user agents. urllib identifies itself with something like Python-urllib/3.3, which is easily detected. Try setting a known browser user agent:

    from urllib.request import Request, urlopen

    req = Request(
        url='http://www.cmegroup.com/trading/products/#sortField=oi&sortAsc=false&venues=3&page=1&cleared=1&group=1',
        headers={'User-Agent': 'Mozilla/5.0'}
    )
    webpage = urlopen(req).read()

This works for me.
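If the same header is needed for several requests, the idea can be wrapped in a small helper. The following is only a sketch: the names `build_request`, `fetch`, and `BROWSER_USER_AGENT`, and the 10-second timeout, are illustrative assumptions and not part of the original answer. It also catches `HTTPError`, so a remaining 403 is reported rather than raised.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Assumed default; any current browser User-Agent string could be used instead.
BROWSER_USER_AGENT = "Mozilla/5.0"


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Return a Request that sends a browser-like User-Agent header."""
    return Request(url=url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=BROWSER_USER_AGENT, timeout=10):
    """Fetch url and return the body as bytes, or None on an HTTP error."""
    req = build_request(url, user_agent)
    try:
        with urlopen(req, timeout=timeout) as response:
            return response.read()
    except HTTPError as err:
        # A 403 at this point suggests the server checks more than the User-Agent.
        print(f"Request failed: HTTP {err.code} {err.reason}")
        return None
```

Calling `fetch(url)` with the URL from the answer should behave like the original snippet when the request succeeds. If the server still answers 403, the block is likely based on something other than the user agent, and changing this header alone will not help.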