KeyError when selecting pandas columns

Question

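A KeyError here most likely means the header names were read with stray spaces (for example ' customer_id' instead of 'customer_id'). Use sep=r'\s*,\s*' so that any spaces before or after the comma delimiter are treated as part of the separator: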

import pandas as pd

# A regex separator requires the Python parsing engine.
transactions = pd.read_csv('transactions.csv', sep=r'\s*,\s*',
                           header=0, encoding='ascii', engine='python')

Verify that the column names no longer carry extra spaces:

print(transactions.columns)

Output:

Index(['product_id', 'customer_id', 'store_id', 'promotion_id', 'month_of_year', 'quarter', 'the_year', 'store_sales', 'store_cost', 'unit_sales', 'fact_count'], dtype='object')
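
To see the effect without the original file, here is a minimal sketch that reads a small in-memory sample twice. The sample data and the names raw, plain and clean are made up for illustration; only the sep and engine arguments come from the answer above.

```python
import io
import pandas as pd

# Hypothetical sample: spaces surround the commas, as in the problem file.
raw = "product_id , customer_id , store_sales\n1 , 10 , 2.5\n2 , 20 , 3.0\n"

# Default parsing splits on the bare comma, so the padding stays in the names.
plain = pd.read_csv(io.StringIO(raw))
print(list(plain.columns))
print('customer_id' in plain.columns)

# The regex separator swallows the whitespace on both sides of each comma.
clean = pd.read_csv(io.StringIO(raw), sep=r'\s*,\s*', engine='python')
print(list(clean.columns))
print(clean['customer_id'].tolist())
```

With the default separator, selecting plain['customer_id'] would raise the KeyError, because the stored name still includes the surrounding spaces.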

Alternatively, remove the unquoted spaces around the delimiters in the CSV file itself, and keep your original read_csv command unchanged.
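
One possible way to do that cleanup is sketched below. It assumes the file has no quoted fields, so every comma is a delimiter. A file with quoted values would need a real CSV parser instead. The helper name strip_delimiter_spaces and the sample text are hypothetical.

```python
import io
import re
import pandas as pd

def strip_delimiter_spaces(text):
    """Remove spaces and tabs directly before or after each comma.

    Assumption: no quoted fields, so every comma is a delimiter.
    Spaces at the very start or end of a line are left alone.
    """
    return re.sub(r'[ \t]*,[ \t]*', ',', text)

# Hypothetical sample standing in for the contents of the CSV file.
raw = "product_id , customer_id\n1 , 10\n"
cleaned = strip_delimiter_spaces(raw)

# The cleaned text parses with the default separator.
df = pd.read_csv(io.StringIO(cleaned), header=0)
print(list(df.columns))
print(df['customer_id'].tolist())
```

In practice the cleaned text would be written back to disk once, after which the plain read_csv call works as is.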
