KeyError when selecting pandas columns

Use sep=r'\s*,\s*' to parse a file whose columns may have any number of spaces before or after the delimiter (e.g. " , "):

    transactions = pd.read_csv('transactions.csv', sep=r'\s*,\s*', header=0, encoding='ascii', engine='python')

Proof:

    print(transactions.columns)

Output:

    Index(['product_id', 'customer_id', 'store_id', 'promotion_id', 'month_of_year', 'quarter', 'the_year', 'store_sales', 'store_cost', 'unit_sales', 'fact_count'], dtype='object')

Alternatively, remove unquoted spaces in the CSV file, and use …
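To see why the regex separator matters, here is a minimal sketch that parses the same text twice. The in-memory sample data and the names raw, plain and clean are made up for illustration and are not from the original question; only the sep and engine arguments come from the answer above.

```python
import io

import pandas as pd

# Hypothetical sample: every comma is surrounded by spaces.
raw = "product_id , customer_id , store_sales\n1 , 10 , 2.5\n2 , 20 , 4.0\n"

# Default parsing splits on the bare comma, so the padding stays in the
# header names and plain["customer_id"] would raise KeyError.
plain = pd.read_csv(io.StringIO(raw))
print(list(plain.columns))

# A regex separator swallows the padding; regex separators need the
# python engine.
clean = pd.read_csv(io.StringIO(raw), sep=r"\s*,\s*", engine="python")
print(list(clean.columns))
print(clean["customer_id"].tolist())
```

Printing list(df.columns) is a quick way to spot padded names, since the quotes make stray spaces visible.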

Python dictionary key error when assigning – how do I get around this?

The KeyError occurs because accessing myDict[2000] tries to read a non-existent key. As an alternative, you could use defaultdict:

    >>> from collections import defaultdict
    >>> myDict = defaultdict(dict)
    >>> myDict[2000]['hello'] = 50
    >>> myDict[2000]
    {'hello': 50}

defaultdict(dict) means that if myDict encounters an unknown key, it will return a default …
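The difference can be checked side by side. This is a small sketch, not the original poster's code: the names plain, nested and missing are invented for illustration, and dict.setdefault is shown as another standard-library option that the answer above does not mention.

```python
from collections import defaultdict

# A plain dict: the inner lookup plain[2000] fails before the
# assignment to ['hello'] can happen.
plain = {}
missing = None
try:
    plain[2000]["hello"] = 50
except KeyError as exc:
    missing = exc.args[0]  # the key that was not found

# defaultdict(dict) creates and stores an empty dict on first access.
nested = defaultdict(dict)
nested[2000]["hello"] = 50

# With a plain dict, setdefault inserts the inner dict only if absent.
plain.setdefault(2000, {})["hello"] = 50

print(missing, dict(nested), plain)
```

One thing to keep in mind: merely reading a missing key from a defaultdict also inserts it, so membership tests are safer with the in operator than with indexing.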

Why do I get a KeyError when using pandas apply?

As answered by EdChum in the comments: the issue is that apply works column-wise by default (see the docs). Each call therefore receives a single column, which is indexed by row labels rather than column names, so looking up a column name on it raises KeyError. To apply the function to each row instead, pass axis=1:

    test.apply(lambda x: find_max(x, test, 'document_id', 'confidence_level', 'category_id'), axis=1)
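The effect of axis can be reproduced on a toy frame. The DataFrame contents and the score function below are hypothetical stand-ins, since the original test frame and find_max are not shown in the excerpt.

```python
import pandas as pd

# Hypothetical data; integer columns keep each row's dtype as int.
df = pd.DataFrame({
    "document_id": [1, 1, 2],
    "confidence_level": [20, 90, 50],
})

def score(row):
    # Expects a row, i.e. a Series indexed by column name.
    return row["document_id"] * 100 + row["confidence_level"]

# Default axis=0: score receives each column, indexed 0..2, so the
# lookup of "document_id" fails.
raised = False
try:
    df.apply(score)
except KeyError:
    raised = True

# axis=1: score receives each row, indexed by column name.
result = df.apply(score, axis=1)
print(raised, result.tolist())
```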