Get JSON data via URL and use in Python (simplejson)
Try

    f = opener.open(req)
    simplejson.load(f)

without running f.read() first. When you run f.read(), the filehandle's contents are slurped, so there is nothing left when you call simplejson.load(f).
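For context, a minimal end-to-end sketch of this fix (the URL and request setup here are illustrative assumptions, not from the original question):

    import urllib2
    import simplejson

    # Hypothetical JSON endpoint, used only for illustration
    req = urllib2.Request('http://example.com/data.json')
    opener = urllib2.build_opener()
    f = opener.open(req)
    # Pass the file-like response straight to simplejson; no f.read() first
    data = simplejson.load(f)
    print data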
I'd use xmltodict to make a Python dictionary out of the XML data structure and pass this dictionary to the template inside the context:

    import urllib2
    import xmltodict
    from django.shortcuts import render_to_response  # render_to_response comes from Django's shortcuts

    def homepage(request):
        file = urllib2.urlopen('https://www.goodreads.com/review/list/20990068.xml?key=nGvCqaQ6tn9w4HNpW8kquw&v=2&shelf=toread')
        data = file.read()
        file.close()
        data = xmltodict.parse(data)
        return render_to_response('my_template.html', {'data': data})
I'll preface this by saying I haven't handled logins this way for a while, so I could be missing some of the more 'accepted' ways to do it. I'm not sure if this is what you're after, but without a library like mechanize or a more robust framework like selenium, in the basic …
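Since the excerpt is cut off, here is a minimal sketch of the basic standard-library approach it alludes to; the login URL and form field names are assumptions for illustration only:

    import urllib
    import urllib2
    import cookielib

    # Keep cookies across requests so the session survives the login
    jar = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))

    # Hypothetical login form endpoint and field names
    login_data = urllib.urlencode({'username': 'me', 'password': 'secret'})
    opener.open('http://example.com/login', login_data)

    # Subsequent requests through the same opener carry the session cookie
    page = opener.open('http://example.com/members-only').read()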
    u = urllib2.urlopen('http://myserver/inout-tracker', data)
    h.request('POST', '/inout-tracker/index.php', data, headers)

Using the path /inout-tracker without a trailing / doesn't fetch index.php. Instead the server will issue a 302 redirect to the version with the trailing /, and a 302 will typically cause clients to convert a POST to a GET request.
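A minimal sketch of the workaround, posting straight to the full script path so no redirect (and no POST-to-GET downgrade) occurs; the host and form field are placeholders:

    import urllib
    import urllib2

    data = urllib.urlencode({'status': 'in'})  # hypothetical form field
    # Hitting index.php directly avoids the 302 on the missing trailing slash
    u = urllib2.urlopen('http://myserver/inout-tracker/index.php', data)
    print u.read()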
I would use a retry decorator. There are other ones out there, but this one works pretty well. Here's how you can use it:

    @retry(urllib2.URLError, tries=4, delay=3, backoff=2)
    def urlopen_with_retry():
        return urllib2.urlopen("http://example.com")

This will retry the function if URLError is raised. Check the link above for documentation on the parameters, but basically it will retry …
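The decorator itself isn't shown in the excerpt; a minimal sketch of one matching that signature (exponential backoff between attempts, letting the last failure propagate) might look like this:

    import time
    import functools

    def retry(exc, tries=4, delay=3, backoff=2):
        """Retry the wrapped function on `exc`, sleeping `delay` seconds
        between attempts and multiplying the delay by `backoff` each time."""
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                wait = delay
                for _ in range(tries - 1):
                    try:
                        return func(*args, **kwargs)
                    except exc:
                        time.sleep(wait)
                        wait *= backoff
                return func(*args, **kwargs)  # final attempt propagates the error
            return wrapper
        return decorator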
Here's how you'd write the accepted answer's example using python-requests:

    import requests

    post_data = {'name': 'Gladys'}
    response = requests.post('http://example.com', data=post_data)
    content = response.content

Much more intuitive. See the Quickstart for more simple examples.
You need to use urllib2, which supersedes urllib in the Python standard library, in order to change the user agent. Straight from the examples:

    import urllib2

    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    infile = opener.open('http://en.wikipedia.org/w/index.php?title=Albert_Einstein&printable=yes')
    page = infile.read()
You need to seek to the beginning of compressedFile after writing to it but before passing it to gzip.GzipFile(). Otherwise it will be read from the end by the gzip module and will appear as an empty file to it. See below:

    #! /usr/bin/env python
    import urllib2
    import StringIO
    import gzip

    baseURL = "https://www.kernel.org/pub/linux/docs/man-pages/"
    filename = …
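The excerpt cuts off at the filename; a self-contained sketch of the seek-before-read pattern it describes (using an in-memory gzip payload as a stand-in for the downloaded file, not the original script) could be:

    import StringIO
    import gzip

    # Stand-in for the downloaded .gz content: build one in memory
    compressedFile = StringIO.StringIO()
    writer = gzip.GzipFile(fileobj=compressedFile, mode='wb')
    writer.write('example data')
    writer.close()

    # The crucial step: rewind before handing the buffer to GzipFile
    compressedFile.seek(0)
    decompressed = gzip.GzipFile(fileobj=compressedFile, mode='rb').read()
    print decompressed  # 'example data'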
Is urllib.urlencode() not enough?

    >>> import urllib
    >>> urllib.urlencode({'foo': 'bar', 'bla': 'blah'})
    'foo=bar&bla=blah'

EDIT: You can also update the existing url:

    >>> import urlparse, urllib
    >>> url_dict = urlparse.parse_qs('a=b&c=d')
    >>> url_dict
    {'a': ['b'], 'c': ['d']}
    >>> url_dict['a'].append('x')
    >>> url_dict
    {'a': ['b', 'x'], 'c': ['d']}
    >>> urllib.urlencode(url_dict, True)
    'a=b&a=x&c=d'

Note that the parse_qs function was in cgi …
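To put the updated query string back into a complete URL, a short sketch using urlparse (the example URL here is a placeholder) might look like:

    import urlparse
    import urllib

    url = 'http://example.com/path?a=b&c=d'
    parts = urlparse.urlparse(url)
    query = urlparse.parse_qs(parts.query)
    query['a'].append('x')
    # doseq=True expands list values into repeated parameters
    new_query = urllib.urlencode(query, True)
    print urlparse.urlunparse(parts._replace(query=new_query))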
Use the urlgrabber library. This includes an HTTP handler for urllib2 that supports HTTP 1.1 and keepalive:

    >>> import urllib2
    >>> from urlgrabber.keepalive import HTTPHandler
    >>> keepalive_handler = HTTPHandler()
    >>> opener = urllib2.build_opener(keepalive_handler)
    >>> urllib2.install_opener(opener)
    >>>
    >>> fo = urllib2.urlopen('http://www.python.org')

Note: you should use urlgrabber version 3.9.0 or earlier, as the keepalive module has been …