Python urllib2 with keep alive

Use the urlgrabber library. This includes an HTTP handler for urllib2 that supports HTTP 1.1 and keepalive:

>>> import urllib2
>>> from urlgrabber.keepalive import HTTPHandler
>>> keepalive_handler = HTTPHandler()
>>> opener = urllib2.build_opener(keepalive_handler)
>>> urllib2.install_opener(opener)
>>>
>>> fo = urllib2.urlopen('http://www.python.org')

Note: you should use urlgrabber version 3.9.0 or earlier, as the keepalive module has been …

urllib.quote() throws KeyError

You are trying to quote Unicode data, so you need to decide how to turn that into URL-safe bytes. Encode the string to bytes first. UTF-8 is often used:

>>> import urllib
>>> urllib.quote(u'sch\xe9nefeld')
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py:1268: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  return ''.join(map(quoter, s))
…
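
To make the "encode first" advice concrete, here is a minimal sketch. The helper name quote_unicode is hypothetical, and the import fallback is only there so the same lines run on Python 3, where quote lives in urllib.parse.

```python
try:
    from urllib import quote            # Python 2
except ImportError:
    from urllib.parse import quote      # Python 3


def quote_unicode(text, encoding="utf-8"):
    """Encode a Unicode string to bytes, then percent-encode those bytes.

    quote_unicode is a hypothetical helper name, not part of urllib.
    """
    return quote(text.encode(encoding))


quoted = quote_unicode(u"sch\xe9nefeld")
print(quoted)  # sch%C3%A9nefeld
```

The receiving side has to decode with the same encoding, so UTF-8 is only a safe choice if the server expects it.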

urllib2 read to Unicode

After the operations you performed, you'll see:

>>> req.headers['content-type']
'text/html; charset=windows-1251'

and so:

>>> encoding = req.headers['content-type'].split('charset=')[-1]
>>> ucontent = unicode(content, encoding)

ucontent is now a Unicode string (of 140655 characters), so for example, to display part of it when your terminal is UTF-8:

>>> print ucontent[76:110].encode('utf-8')
<title>Lenta.ru: Главное: </title>

and you can search, etc, …
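
Splitting on 'charset=' works for the header shown, but it returns the whole header when no charset is present and picks up trailing text when other parameters follow. A slightly more defensive sketch is below. The function name charset_from_content_type and the UTF-8 default are assumptions for illustration, and the sample bytes stand in for a real response body.

```python
def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset parameter of a Content-Type value, or a default.

    Hypothetical helper: scans the ';'-separated parameters instead of
    splitting the whole header on 'charset='.
    """
    for param in content_type.split(";")[1:]:
        name, _, value = param.strip().partition("=")
        if name.lower() == "charset" and value:
            return value.strip("\"'")
    return default


# Stand-in for the bytes returned by response.read()
raw = u"\u0413\u043b\u0430\u0432\u043d\u043e\u0435".encode("windows-1251")

encoding = charset_from_content_type("text/html; charset=windows-1251")
text = raw.decode(encoding)  # now a Unicode string
```

If the header carries no charset at all, the real encoding may still be declared inside the HTML, so the default is only a guess.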

How do I prevent Python’s urllib(2) from following a redirect

You could do a couple of things:

1. Build your own HTTPRedirectHandler that intercepts each redirect.
2. Create an instance of HTTPCookieProcessor and install that opener so that you have access to the cookiejar.

This is a quick little thing that shows both:

    import urllib2

    #redirect_handler = urllib2.HTTPRedirectHandler()

    class MyHTTPRedirectHandler(urllib2.HTTPRedirectHandler):
        def http_error_302(self, req, fp, code, msg, headers):
            …
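
Since the excerpt above is cut off, here is a separate minimal sketch of the first option. It is not the answer's own code: the class name NoRedirectHandler is hypothetical, and it overrides redirect_request rather than http_error_302. Returning None from redirect_request tells the handler not to follow the redirect.

```python
try:
    import urllib2                        # Python 2
except ImportError:
    import urllib.request as urllib2      # Python 3


class NoRedirectHandler(urllib2.HTTPRedirectHandler):
    """Hypothetical handler that refuses to follow any redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means "do not build a follow-up request".
        return None


# Passing the subclass replaces the default redirect handler in this opener.
opener = urllib2.build_opener(NoRedirectHandler)
```

With this opener, a 3xx response should surface as an HTTPError from opener.open(), and the Location header can be read from the error's headers.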

How can I use a SOCKS 4/5 proxy with urllib2?

You can use the SocksiPy module. Simply copy the file "socks.py" to your Python's lib/site-packages directory, and you're ready to go. (Alternatively, try pip install PySocks.)

You must set up socks before using urllib2. For example:

    import socks
    import socket
    socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 8080)
    socket.socket = socks.socksocket
    import urllib2
    print urllib2.urlopen('http://www.google.com').read()

You can also try pycurl lib and …

How to make HTTP DELETE method using urllib2?

You can do it with httplib:

    import httplib
    conn = httplib.HTTPConnection('www.foo.com')
    conn.request('PUT', '/myurl', body)
    resp = conn.getresponse()
    content = resp.read()

Also, check out this question. The accepted answer shows a way to add other methods to urllib2:

    import urllib2
    opener = urllib2.build_opener(urllib2.HTTPHandler)
    request = urllib2.Request('http://example.org', data="your_put_data")
    request.add_header('Content-Type', 'your/contenttype')
    request.get_method = lambda: 'PUT'
    url = opener.open(request)

Both snippets use PUT. For DELETE, pass 'DELETE' as the method name instead.
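
As a sketch of the same get_method trick applied to DELETE, the override can live in a small Request subclass instead of a lambda. The class name MethodRequest and the example.org URL are placeholders, not part of urllib2.

```python
try:
    import urllib2                        # Python 2
except ImportError:
    import urllib.request as urllib2      # Python 3


class MethodRequest(urllib2.Request):
    """Hypothetical Request subclass that reports a caller-chosen HTTP verb."""

    def __init__(self, url, http_method, **kwargs):
        urllib2.Request.__init__(self, url, **kwargs)
        self._http_method = http_method

    def get_method(self):
        # urllib2 asks the request object which verb to send.
        return self._http_method


request = MethodRequest("http://example.org/resource", "DELETE")
print(request.get_method())  # DELETE
```

Passing this request to opener.open() would send the DELETE. On Python 3, Request also accepts a method argument directly, which makes the subclass unnecessary there.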

Using an HTTP PROXY – Python [duplicate]

You can do it even without the HTTP_PROXY environment variable. Try this sample:

    import urllib2
    proxy_support = urllib2.ProxyHandler({"http": "http://61.233.25.166:80"})
    opener = urllib2.build_opener(proxy_support)
    urllib2.install_opener(opener)
    html = urllib2.urlopen("http://www.google.com").read()
    print html

In your case it really seems that the proxy server is refusing the connection. Something more to try:

    import urllib2
    #proxy = "61.233.25.166:80"
    proxy = "YOUR_PROXY_GOES_HERE"
    proxies = …
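
If you would rather not change the global opener with install_opener, a per-opener variant is sketched below. The helper name build_proxy_opener and the 127.0.0.1:3128 address are placeholders. Nothing here touches the network until opener.open() is called.

```python
try:
    import urllib2                        # Python 2
except ImportError:
    import urllib.request as urllib2      # Python 3


def build_proxy_opener(proxy_url):
    """Return (opener, handler) that route plain HTTP through proxy_url.

    Hypothetical helper: only this opener uses the proxy, and the
    module-level urlopen() is left untouched.
    """
    handler = urllib2.ProxyHandler({"http": proxy_url})
    return urllib2.build_opener(handler), handler


# Placeholder proxy address; substitute your own.
opener, proxy_handler = build_proxy_opener("http://127.0.0.1:3128")
```

Calling opener.open("http://www.google.com").read() would then go through the proxy, and a refused connection would point at the proxy itself rather than at your code.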

Using MultipartPostHandler to POST form-data with Python

It seems that the easiest and most compatible way to get around this problem is to use the 'poster' module.

    # test_client.py
    from poster.encode import multipart_encode
    from poster.streaminghttp import register_openers
    import urllib2

    # Register the streaming http handlers with urllib2
    register_openers()

    # Start the multipart/form-data encoding of the file "DSC0001.jpg"
    # "image1" is the name …

Python: urllib/urllib2/httplib confusion

Focus on urllib2 for this; it works quite well. Don't mess with httplib, since it's not the top-level API. What you're noting is that urllib2 doesn't follow the redirect. You need to fold in an instance of HTTPRedirectHandler that will catch and follow the redirects. Further, you may want to subclass the default HTTPRedirectHandler to capture …
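
The excerpt is cut off before it says what to capture. One plausible reading, offered only as an assumption, is recording each redirect hop while still following it. The class name RecordingRedirectHandler and its history attribute are hypothetical, and the final lines call redirect_request directly to simulate a hop without any network access.

```python
try:
    import urllib2                        # Python 2
except ImportError:
    import urllib.request as urllib2      # Python 3


class RecordingRedirectHandler(urllib2.HTTPRedirectHandler):
    """Hypothetical handler that logs each redirect and then follows it."""

    def __init__(self):
        self.history = []  # list of (status code, new URL) pairs

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        self.history.append((code, newurl))
        # Delegate to the default behaviour so the redirect is still followed.
        return urllib2.HTTPRedirectHandler.redirect_request(
            self, req, fp, code, msg, headers, newurl)


handler = RecordingRedirectHandler()
opener = urllib2.build_opener(handler)

# Simulate one redirect hop by hand (no request is sent).
original = urllib2.Request("http://example.org/old")
followed = handler.redirect_request(
    original, None, 302, "Found", {}, "http://example.org/new")
```

After a real opener.open() call, handler.history would list the hops in the order they were followed.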
