Get a list from Pandas DataFrame column headers
You can get the values as a list with:
    list(my_dataframe.columns.values)
You can also simply use (as shown in Ed Chum’s answer):
    list(my_dataframe)
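As a quick check of the two forms above, here is a minimal sketch. The frame contents and the column names `name` and `score` are made up for illustration and are not from the original answer.

```python
import pandas as pd

# Hypothetical example frame; the column names are assumptions for illustration.
my_dataframe = pd.DataFrame({"name": ["a", "b"], "score": [1, 2]})

# Convert the underlying array of column labels to a list.
headers_from_values = list(my_dataframe.columns.values)

# Iterating over a DataFrame yields its column labels, so list() works directly.
headers_from_frame = list(my_dataframe)

print(headers_from_values)
print(headers_from_frame)
```

Both calls should give the same labels in the same order.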
You have four main options for converting types in pandas:
to_numeric() – safely converts non-numeric types (e.g. strings) to a suitable numeric type. (See also to_datetime() and to_timedelta().)
astype() – converts (almost) any type to (almost) any other type (even if it’s not necessarily sensible to do so). Also allows you to …
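The two options named in the excerpt can be sketched as follows. The sample values are invented, and the use of `errors="coerce"` is one choice among several that `to_numeric()` accepts, picked here only to show the "safe" conversion.

```python
import pandas as pd

# Hypothetical string data; "oops" cannot be parsed as a number.
raw = pd.Series(["1", "2.5", "oops"])

# to_numeric with errors="coerce" turns unparseable entries into NaN
# instead of raising an exception.
numeric = pd.to_numeric(raw, errors="coerce")

# astype converts directly and raises if any value cannot be converted,
# so it is used here only on clean input.
ints = pd.Series(["1", "2", "3"]).astype(int)

print(numeric)
print(ints)
```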
One easy way would be to reassign the dataframe with a list of the columns, rearranged as needed. This is what you have now:
    In [6]: df
    Out[6]:
              0         1         2         3         4      mean
    0  0.445598  0.173835  0.343415  0.682252  0.582616  0.445543
    1  0.881592  0.696942  0.702232  0.696724  0.373551  0.670208
    2  0.662527  0.955193  0.131016  0.609548  0.804694  0.632596
    …
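A minimal sketch of reassigning with a rearranged column list. The excerpt is cut off before it shows the target order, so moving `mean` to the front is an assumption, as are the small example values.

```python
import pandas as pd

# Hypothetical frame with integer column labels, plus a row-wise mean column.
df = pd.DataFrame({0: [1.0, 2.0], 1: [3.0, 4.0]})
df["mean"] = df.mean(axis=1)

# Build the new column order: last column first, then the rest (assumed goal).
cols = df.columns.tolist()
cols = cols[-1:] + cols[:-1]

# Indexing with the list returns the columns in the requested order.
reordered = df[cols]
print(reordered)
```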
The column names (which are strings) cannot be sliced in the manner you tried. Here you have a couple of options. If you know from context which variables you want to slice out, you can select only those columns by passing a list into the __getitem__ syntax (the [] brackets). df1 = …
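Passing a list of labels to `[]` might look like the sketch below. The column names `a`, `b`, `c` are hypothetical, and the label-based `.loc` slice is shown as an additional option that the truncated excerpt may or may not go on to cover.

```python
import pandas as pd

# Hypothetical frame; the column names are assumptions for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# A list inside [] selects exactly those columns, in that order.
subset = df[["a", "b"]]

# .loc supports slicing by label; unlike positional slices, both ends are included.
sliced = df.loc[:, "a":"b"]

print(subset)
print(sliced)
```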
For a dataframe df, one can use any of the following:
    len(df.index)
    df.shape[0]
    df[df.columns[0]].count()  (== number of non-NaN values in first column)
Code to reproduce the plot:
    import numpy as np
    import pandas as pd
    import perfplot
    perfplot.save(
        "out.png",
        setup=lambda n: pd.DataFrame(np.arange(n * 3).reshape(n, 3)),
        n_range=[2**k for k in range(25)],
        kernels=[
            lambda df: len(df.index),
            lambda …
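To see how the three expressions differ, here is a small sketch on an invented frame whose first column contains one missing value. The column names `x` and `y` are assumptions.

```python
import pandas as pd

# Hypothetical frame: the first column has one missing value.
df = pd.DataFrame({"x": [1.0, None, 3.0], "y": [4, 5, 6]})

n_index = len(df.index)                 # counts every row
n_shape = df.shape[0]                   # first element of (rows, columns)
n_non_nan = df[df.columns[0]].count()   # skips NaN in the first column

print(n_index, n_shape, n_non_nan)
```

The first two always agree. The third matches them only when the first column has no missing values.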
The best way to delete a column in pandas is to use drop:
    df = df.drop('column_name', axis=1)
where 1 is the axis number (0 for rows and 1 for columns). To delete the column without having to reassign df you can do:
    df.drop('column_name', axis=1, inplace=True)
Finally, to drop by column number instead of by column label, …
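A minimal sketch of `drop` on an invented three-column frame. The excerpt is truncated before it shows how to drop by column number, so the positional variant below is only one possible approach and not necessarily the one the original answer gives.

```python
import pandas as pd

# Hypothetical frame; the column names are assumptions for illustration.
df = pd.DataFrame({"a": [1], "b": [2], "c": [3]})

# Drop by label along the column axis; the original frame is left unchanged.
without_b = df.drop("b", axis=1)

# Assumed positional variant: look up the label at position 0, then drop it.
without_first = df.drop(df.columns[[0]], axis=1)

print(without_b)
print(without_first)
```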
RENAME SPECIFIC COLUMNS
Use the df.rename() function and specify the columns to be renamed. Not all the columns have to be renamed:
    df = df.rename(columns={'oldName1': 'newName1', 'oldName2': 'newName2'})
    # Or rename the existing DataFrame (rather than creating a copy)
    df.rename(columns={'oldName1': 'newName1', 'oldName2': 'newName2'}, inplace=True)
Minimal Code Example
    df = pd.DataFrame('x', index=range(3), columns=list('abcde'))
    df
    a b …
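A runnable sketch of renaming a subset of columns, built on the same example frame as the excerpt. The new names `X` and `Y` are made up for illustration.

```python
import pandas as pd

# Same construction as the excerpt: every cell holds the string 'x'.
df = pd.DataFrame("x", index=range(3), columns=list("abcde"))

# Only the columns listed in the mapping are renamed; the rest keep their names.
# "X" and "Y" are hypothetical new names.
renamed = df.rename(columns={"a": "X", "b": "Y"})

print(list(renamed.columns))
print(list(df.columns))  # the original frame is unchanged without inplace=True
```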
To select rows whose column value equals a scalar, some_value, use ==:
    df.loc[df['column_name'] == some_value]
To select rows whose column value is in an iterable, some_values, use isin:
    df.loc[df['column_name'].isin(some_values)]
Combine multiple conditions with &:
    df.loc[(df['column_name'] >= A) & (df['column_name'] <= B)]
Note the parentheses. Due to Python’s operator precedence rules, & binds more tightly than …
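The three selection patterns can be tried on a small invented frame. The values and the second column `other` are assumptions, and the literals 5, 10, 1 and 15 stand in for `some_value`, `A`, `B` and `some_values`.

```python
import pandas as pd

# Hypothetical data; "other" exists only to make the selected rows easy to identify.
df = pd.DataFrame({"column_name": [1, 5, 10, 15], "other": list("wxyz")})

# Rows where the column equals a scalar.
equal = df.loc[df["column_name"] == 5]

# Rows where the column value is one of several candidates.
member = df.loc[df["column_name"].isin([1, 15])]

# Rows satisfying two conditions; each comparison is wrapped in parentheses.
between = df.loc[(df["column_name"] >= 5) & (df["column_name"] <= 10)]

print(equal)
print(member)
print(between)
```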
DataFrame.iterrows is a generator which yields both the index and row (as a Series):
    import pandas as pd
    df = pd.DataFrame({'c1': [10, 11, 12], 'c2': [100, 110, 120]})
    df = df.reset_index()  # make sure indexes pair with number of rows
    for index, row in df.iterrows():
        print(row['c1'], row['c2'])
Output:
    10 100
    11 110
    12 120