How do you read a file inside a zip file as text, not bytes?

I just noticed that Lennart’s answer didn’t work with Python 3.1, but it does work with Python 3.2. They’ve enhanced zipfile.ZipExtFile in Python 3.2 (see release notes). These changes appear to make zipfile.ZipExtFile work nicely with io.TextIOWrapper.

Incidentally, it works in Python 3.1, if you uncomment the hacky lines below to monkey-patch zipfile.ZipExtFile, not that I would recommend this sort of hackery. I include it only to illustrate the essence of what was done in Python 3.2 to make things work nicely.

$ cat test_zip_file_py3k.py 
import csv, io, sys, zipfile

zip_file    = zipfile.ZipFile(sys.argv[1])
# 'rU' (universal newlines) is only accepted before Python 3.6; use 'r' on later versions.
items_file  = zip_file.open('items.csv', 'rU')
# items_file.readable = lambda: True
# items_file.writable = lambda: False
# items_file.seekable = lambda: False
# items_file.read1 = items_file.read
items_file  = io.TextIOWrapper(items_file)
    
for idx, row in enumerate(csv.DictReader(items_file)):
    print('Processing row {0} -- row = {1}'.format(idx, row))

If I had to support py3k < 3.2, then I would go with the solution in my other answer.

Update for 3.6+

Starting with 3.6, support for mode="U" in ZipFile.open() was removed. From the documentation:

Changed in version 3.6: Removed support of mode="U". Use io.TextIOWrapper for reading compressed text files in universal newlines mode.
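
As a rough sketch of what that looks like on 3.6+, the member can be opened in the default binary mode and wrapped by hand. The function name read_csv_rows, the default member name, and the UTF-8 encoding are assumptions made for illustration, not part of the original script:

```python
import csv
import io
import zipfile


def read_csv_rows(zip_path, member="items.csv", encoding="utf-8"):
    """Return the rows of a CSV member of a zip archive as a list of dicts.

    The helper name, default member and default encoding are illustrative.
    """
    with zipfile.ZipFile(zip_path) as zf:
        # ZipFile.open() returns a binary file object; mode "r" is the default.
        with zf.open(member) as raw:
            # newline="" leaves line-ending handling to the csv module.
            with io.TextIOWrapper(raw, encoding=encoding, newline="") as text:
                return list(csv.DictReader(text))
```

Calling read_csv_rows(sys.argv[1]) would then give the same rows the loop above prints, without relying on mode="U".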

Starting with 3.8, zipfile.Path was added, which gives us an open() method that we can call like the built-in open() function (passing newline="" in the case of our CSV). In text mode (supported, and the default, since 3.9) it returns an io.TextIOWrapper object that the csv readers accept. See Yuri’s answer.
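
A minimal sketch of the zipfile.Path approach, assuming Python 3.9 or later so that open() supports text mode. The function name read_csv_rows_via_path, the default member name, and the UTF-8 encoding are hypothetical choices for this example:

```python
import csv
import zipfile


def read_csv_rows_via_path(zip_path, member="items.csv", encoding="utf-8"):
    """Read a CSV member through zipfile.Path (text mode assumes Python 3.9+).

    The helper name, default member and default encoding are illustrative.
    """
    with zipfile.ZipFile(zip_path) as zf:
        member_path = zipfile.Path(zf, at=member)
        # In text mode, extra keyword arguments are passed to io.TextIOWrapper.
        with member_path.open("r", encoding=encoding, newline="") as text:
            return list(csv.DictReader(text))
```

The csv module then sees an ordinary text stream, so no manual wrapping is needed.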
