Get JSON data via URL and use in Python (simplejson)
Try

    f = opener.open(req)
    simplejson.load(f)

without running f.read() first. When you run f.read(), the filehandle's contents are slurped, so there is nothing left when you call simplejson.load(f).
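For context, a minimal end-to-end sketch of this fix (the URL and request setup here are illustrative assumptions, not from the original question):

    import urllib2
    import simplejson

    # Hypothetical JSON endpoint, used only for illustration
    req = urllib2.Request('http://example.com/data.json')
    opener = urllib2.build_opener()
    f = opener.open(req)
    # Pass the file-like response straight to simplejson; no f.read() first
    data = simplejson.load(f)
    print data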
I'd use xmltodict to make a Python dictionary out of the XML data structure and pass this dictionary to the template inside the context:

    import urllib2
    import xmltodict
    from django.shortcuts import render_to_response  # render_to_response comes from Django's shortcuts

    def homepage(request):
        file = urllib2.urlopen('https://www.goodreads.com/review/list/20990068.xml?key=nGvCqaQ6tn9w4HNpW8kquw&v=2&shelf=toread')
        data = file.read()
        file.close()
        data = xmltodict.parse(data)
        return render_to_response('my_template.html', {'data': data})
I'll preface this by saying I haven't handled logins this way for a while, so I could be missing some of the more 'accepted' ways to do it. I'm not sure if this is what you're after, but without a library like mechanize or a more robust framework like selenium, in the basic …
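Since the excerpt is cut off, here is a minimal sketch of the basic standard-library approach it alludes to; the login URL and form field names are assumptions for illustration only:

    import urllib
    import urllib2
    import cookielib

    # Keep cookies across requests so the session survives the login
    jar = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))

    # Hypothetical login form endpoint and field names
    login_data = urllib.urlencode({'username': 'me', 'password': 'secret'})
    opener.open('http://example.com/login', login_data)

    # Subsequent requests through the same opener carry the session cookie
    page = opener.open('http://example.com/members-only').read()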
    u = urllib2.urlopen('http://myserver/inout-tracker', data)
    h.request('POST', '/inout-tracker/index.php', data, headers)

Using the path /inout-tracker without a trailing / doesn't fetch index.php. Instead the server will issue a 302 redirect to the version with the trailing /, and a 302 will typically cause clients to convert a POST to a GET request.
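A minimal sketch of the workaround, posting straight to the full script path so no redirect (and no POST-to-GET downgrade) occurs; the host and form field are placeholders:

    import urllib
    import urllib2

    data = urllib.urlencode({'status': 'in'})  # hypothetical form field
    # Hitting index.php directly avoids the 302 on the missing trailing slash
    u = urllib2.urlopen('http://myserver/inout-tracker/index.php', data)
    print u.read()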
I would use a retry decorator. There are other ones out there, but this one works pretty well. Here's how you can use it:

    @retry(urllib2.URLError, tries=4, delay=3, backoff=2)
    def urlopen_with_retry():
        return urllib2.urlopen("http://example.com")

This will retry the function if URLError is raised. Check the link above for documentation on the parameters, but basically it will retry …
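The decorator itself isn't shown in the excerpt; a minimal sketch of one matching that signature (exponential backoff between attempts, letting the last failure propagate) might look like this:

    import time
    import functools

    def retry(exc, tries=4, delay=3, backoff=2):
        """Retry the wrapped function on `exc`, sleeping `delay` seconds
        between attempts and multiplying the delay by `backoff` each time."""
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                wait = delay
                for _ in range(tries - 1):
                    try:
                        return func(*args, **kwargs)
                    except exc:
                        time.sleep(wait)
                        wait *= backoff
                return func(*args, **kwargs)  # final attempt propagates the error
            return wrapper
        return decorator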
Here's how you'd write the accepted answer's example using python-requests:

    import requests

    post_data = {'name': 'Gladys'}
    response = requests.post('http://example.com', data=post_data)
    content = response.content

Much more intuitive. See the Quickstart for more simple examples.
You need to use urllib2, which supersedes urllib in the Python standard library, in order to change the user agent. Straight from the examples:

    import urllib2

    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    infile = opener.open('http://en.wikipedia.org/w/index.php?title=Albert_Einstein&printable=yes')
    page = infile.read()
You need to seek to the beginning of compressedFile after writing to it but before passing it to gzip.GzipFile(). Otherwise it will be read from the end by the gzip module and will appear as an empty file to it. See below:

    #! /usr/bin/env python
    import urllib2
    import StringIO
    import gzip

    baseURL = "https://www.kernel.org/pub/linux/docs/man-pages/"
    filename = …
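The excerpt cuts off at the filename; a self-contained sketch of the seek-before-read pattern it describes (using an in-memory gzip payload as a stand-in for the downloaded file, not the original script) could be:

    import StringIO
    import gzip

    # Stand-in for the downloaded .gz content: build one in memory
    compressedFile = StringIO.StringIO()
    writer = gzip.GzipFile(fileobj=compressedFile, mode='wb')
    writer.write('example data')
    writer.close()

    # The crucial step: rewind before handing the buffer to GzipFile
    compressedFile.seek(0)
    decompressed = gzip.GzipFile(fileobj=compressedFile, mode='rb').read()
    print decompressed  # 'example data'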
Is urllib.urlencode() not enough?

    >>> import urllib
    >>> urllib.urlencode({'foo': 'bar', 'bla': 'blah'})
    'foo=bar&bla=blah'

EDIT: You can also update the existing url:

    >>> import urlparse, urllib
    >>> url_dict = urlparse.parse_qs('a=b&c=d')
    >>> url_dict
    {'a': ['b'], 'c': ['d']}
    >>> url_dict['a'].append('x')
    >>> url_dict
    {'a': ['b', 'x'], 'c': ['d']}
    >>> urllib.urlencode(url_dict, True)
    'a=b&a=x&c=d'

Note that the parse_qs function was in cgi …
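To put the updated query string back into a complete URL, a short sketch using urlparse (the example URL here is a placeholder) might look like:

    import urlparse
    import urllib

    url = 'http://example.com/path?a=b&c=d'
    parts = urlparse.urlparse(url)
    query = urlparse.parse_qs(parts.query)
    query['a'].append('x')
    # doseq=True expands list values into repeated parameters
    new_query = urllib.urlencode(query, True)
    print urlparse.urlunparse(parts._replace(query=new_query))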
Use the urlgrabber library. This includes an HTTP handler for urllib2 that supports HTTP 1.1 and keepalive:

    >>> import urllib2
    >>> from urlgrabber.keepalive import HTTPHandler
    >>> keepalive_handler = HTTPHandler()
    >>> opener = urllib2.build_opener(keepalive_handler)
    >>> urllib2.install_opener(opener)
    >>>
    >>> fo = urllib2.urlopen('http://www.python.org')

Note: you should use urlgrabber version 3.9.0 or earlier, as the keepalive module has been …