How can I iterate over rows in a Pandas DataFrame?

DataFrame.iterrows is a generator which yields both the index and row (as a Series):

    import pandas as pd

    df = pd.DataFrame({'c1': [10, 11, 12], 'c2': [100, 110, 120]})
    df = df.reset_index()  # make sure indexes pair with number of rows

    for index, row in df.iterrows():
        print(row['c1'], row['c2'])

    10 100
    11 110
    12 120

Obligatory disclaimer …
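To make concrete what iterrows hands back on each step, here is a minimal sketch. The frame and its values are hypothetical and not taken from the question. It illustrates one documented consequence of receiving each row as a Series: the row holds a single dtype, so an integer column can come back as float when it sits next to a float column.

```python
import pandas as pd

# Hypothetical frame with one integer column and one float column.
df = pd.DataFrame({"c1": [10, 11], "c2": [1.5, 2.5]})

for index, row in df.iterrows():
    # Each row is a Series with one dtype, so the integer in c1 is
    # upcast to float to sit alongside the float in c2.
    print(index, type(row).__name__, row["c1"], row.dtype)

# Pull out the first (index, row) pair for inspection.
first_index, first_row = next(df.iterrows())
```

If the original dtypes matter, this is worth checking before relying on values read from the row.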

Must have equal len keys and value when setting with an iterable

You can use apply to index into leader and exchange values with DatasetLabel, although it’s not very pretty. One issue is that Pandas won’t let us index with NaN. Converting to str provides a workaround. But that creates a second issue, namely, column 9 is of type float (because NaN is float), so 5 becomes …
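The dtype point can be checked in isolation. The sketch below does not reproduce the original frame, leader, or DatasetLabel, none of which are shown here. It uses a hypothetical Series to show that mixing integers with NaN produces a float column, and what the string conversion of such a value looks like.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a column that mixes integers and NaN.
labels = pd.Series([5, np.nan, 7])

# NaN is a float, so the whole column is stored as float64.
print(labels.dtype)

# Converting to str operates on the float values, not the original integers.
as_text = labels.astype(str)
first_text = as_text.iloc[0]
print(first_text)
```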

How to make two rows in a pandas dataframe into column headers

If using pandas.read_csv() or pandas.read_table(), you can provide a list of row indices for the header argument to specify the rows you want to use for column headers. pandas will generate the pandas.MultiIndex for you in df.columns:

    df = pandas.read_csv('DollarUnitSales.csv', header=[0, 1])

You can also use more than two rows, or non-consecutive rows, to specify the column …
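A self-contained sketch of the same idea, assuming a small in-memory CSV in place of DollarUnitSales.csv, whose contents are not shown. The column names and values below are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV text: the first two rows together form the column headers.
csv_text = (
    "region,East,East,West\n"
    "metric,dollars,units,dollars\n"
    "Jan,10,1,20\n"
    "Feb,30,3,40\n"
)

# header=[0, 1] tells pandas to build a two-level MultiIndex from rows 0 and 1.
df = pd.read_csv(io.StringIO(csv_text), header=[0, 1])

print(df.columns)

# A single column is addressed with a tuple, one entry per header level.
east_units = df[("East", "units")].tolist()
print(east_units)
```

Selecting with only the top-level label, as in df["East"], returns every sub-column under that label.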

How to convert pandas dataframe to nested dictionary

I think you were very close. Use groupby and to_dict:

    df = (df.groupby('Name')[['Chain', 'Food', 'Healthy']]
            .apply(lambda x: x.set_index('Chain').to_dict(orient='index'))
            .to_dict())

    print(df)

    {'George': {'KFC': {'Healthy': False, 'Food': 'chicken'},
                'McDonalds': {'Healthy': False, 'Food': 'burger'}},
     'John': {'McDonalds': {'Healthy': True, 'Food': 'salad'},
              'Wendys': {'Healthy': False, 'Food': 'burger'}}}
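For readers who want to run this end to end, here is a sketch with sample data reconstructed from the output shown above. The input frame is an assumption, since the question's data is not included. It swaps the apply call for a dict comprehension over the groups, which is an alternative formulation rather than the answer's own code, and should give the same nested structure.

```python
import pandas as pd

# Sample data inferred from the printed result; the original input is not shown.
df = pd.DataFrame({
    "Name": ["George", "George", "John", "John"],
    "Chain": ["McDonalds", "KFC", "Wendys", "McDonalds"],
    "Food": ["burger", "chicken", "burger", "salad"],
    "Healthy": [False, False, False, True],
})

# One outer key per Name; within each group, Chain becomes the inner key
# and the remaining columns become the innermost dict.
nested = {
    name: group.set_index("Chain")[["Food", "Healthy"]].to_dict(orient="index")
    for name, group in df.groupby("Name")
}

print(nested)
```

Binding the result to a new name such as nested, rather than reassigning df, keeps the original frame available afterwards.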

How to check if a pandas dataframe contains only numeric values column-wise?

You can check that using to_numeric and coercing errors:

    pd.to_numeric(df['column'], errors='coerce').notnull().all()

For all columns, you can iterate through the columns or just use apply:

    df.apply(lambda s: pd.to_numeric(s, errors='coerce').notnull().all())

E.g.

    df = pd.DataFrame({'col': [1, 2, 10, np.nan, 'a'],
                       'col2': ['a', 10, 30, 40, 50],
                       'col3': [1, 2, 3, 4, 5.0]})

Outputs

    col     False
    col2    False
    col3     True
    dtype: bool
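One thing to be aware of with this check: a value that was already NaN in the input also fails notnull(), so a column of numbers with gaps is reported as non-numeric. The sketch below wraps the same to_numeric approach in a hypothetical helper, numeric_columns, with an allow_missing flag that is my own addition. The sample frame is also made up.

```python
import numpy as np
import pandas as pd


def numeric_columns(frame, allow_missing=False):
    """Return a boolean Series: True where every value in a column parses as a number.

    Hypothetical helper. With allow_missing=True, values that were already
    missing in the input are not counted as failures.
    """
    def is_numeric(col):
        converted = pd.to_numeric(col, errors="coerce")
        ok = converted.notna()
        if allow_missing:
            # Accept entries that were missing before conversion.
            ok = ok | col.isna()
        return bool(ok.all())

    return frame.apply(is_numeric)


df = pd.DataFrame({
    "a": [1, 2, np.nan],    # numeric with a gap
    "b": ["1", "2", "x"],   # contains a non-numeric string
    "c": [1, 2, 3.0],       # fully numeric
})

strict = numeric_columns(df)
lenient = numeric_columns(df, allow_missing=True)
print(strict)
print(lenient)
```

Which behavior is right depends on whether missing values should count against a column.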