reading and parsing a TSV file, then manipulating it for saving as CSV (*efficiently*)

You should use the csv module to read the tab-separated value file. Do not read it into memory in one go. Each row you read has all the information you need to write rows to the output CSV file, after all. Keep the output file open throughout.

import csv

with open('sample.txt', newline="") as tsvin, open('new.csv', 'w', newline="") as csvout:
    tsvin = csv.reader(tsvin, delimiter="\t")
    csvout = csv.writer(csvout)

    for row in tsvin:
        count = int(row[4])
        if count > 0:
            csvout.writerows([row[2:4] for _ in range(count)])

or, using the itertools module to do the repeating with itertools.repeat():

from itertools import repeat
import csv

with open('sample.txt', newline="") as tsvin, open('new.csv', 'w', newline="") as csvout:
    tsvin = csv.reader(tsvin, delimiter="\t")
    csvout = csv.writer(csvout)

    for row in tsvin:
        count = int(row[4])
        if count > 0:
            csvout.writerows(repeat(row[2:4], count))

Leave a Comment

Hata!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)