Rename a single pandas DataFrame column without knowing column name

Should work:

    drugInfo.rename(columns={list(drugInfo)[1]: 'col_1_new_name'}, inplace=True)

Example:

    In [18]: df = pd.DataFrame({'a': randn(5), 'b': randn(5), 'c': randn(5)})
             df
    Out[18]:
              a         b         c
    0 -1.429509 -0.652116  0.515545
    1  0.563148 -0.536554 -1.316155
    2  1.310768 -3.041681 -0.704776
    3 -1.403204  1.083727 -0.117787
    4 -0.040952  0.108155 -0.092292

    In [19]: df.rename(columns={list(df)[1]: 'col1_new_name'}, inplace=True)
             df
    Out[19]:
              a  col1_new_name         c
    0 -1.429509      -0.652116  0.515545
    … Read more
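
The same idea can be written without `inplace=True`. This is a minimal sketch, not the answer's own code: the frame, its values, and the new column name are made up for illustration. It looks up the label at position 1 through `df.columns` and returns a renamed copy.

```python
import numpy as np
import pandas as pd

# Illustrative frame; the column names and values are made up.
df = pd.DataFrame(np.arange(6).reshape(2, 3), columns=["a", "b", "c"])

# Look up the label sitting at position 1, then rename by that label.
old_name = df.columns[1]
renamed = df.rename(columns={old_name: "col1_new_name"})

print(list(renamed.columns))
print(list(df.columns))  # the original frame is left untouched
```

One caveat: `rename` matches on labels, so if the frame has duplicate column labels, every column carrying that label is renamed, not only the one at position 1.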

PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance

Aware that this might be a reply that some will find highly controversial, I’m still posting my opinion here… Proposed answer: Ignore the warning. If the user thinks/observes that the code suffers from poor performance, it’s the user’s responsibility to fix it, not the module’s responsibility to propose code refactoring steps. Rationale for this harsh … Read more
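
If ignoring the warning is the chosen route, one way to do it explicitly is to filter it out around the code that triggers it. This is a sketch under assumptions: the loop and column names are hypothetical, and it relies on the warning being raised as `pd.errors.PerformanceWarning`.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})

# Silence only pandas' PerformanceWarning, and only inside this block.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=pd.errors.PerformanceWarning)
    # Hypothetical workload: adding many columns one at a time.
    for i in range(150):
        df[f"col_{i}"] = i

print(df.shape)
```

Scoping the filter with `catch_warnings()` keeps other warnings, and the same warning elsewhere in the program, visible.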

Create multiindex from existing dataframe

You could simply use groupby in this case, which will create the multi-index automatically when it sums the sales along the requested columns.

    df.groupby(['user_id', 'account_num', 'dates']).sales.sum().to_frame()

You should also be able to simply do this:

    df.set_index(['user_id', 'account_num', 'dates'])

Although you probably want to avoid any duplicates (e.g. two or more rows with identical user_id, account_num … Read more
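
The difference between the two approaches shows up when the key columns contain duplicates. A small sketch with made-up data (only the column names come from the excerpt above):

```python
import pandas as pd

# Illustrative data: the first two rows share the same key combination.
df = pd.DataFrame({
    "user_id": [1, 1, 2],
    "account_num": [10, 10, 20],
    "dates": ["2020-01-01", "2020-01-01", "2020-01-02"],
    "sales": [5.0, 7.0, 3.0],
})
keys = ["user_id", "account_num", "dates"]

# groupby collapses duplicate keys by summing their sales.
summed = df.groupby(keys)["sales"].sum().to_frame()

# set_index keeps every row, so duplicate keys stay in the index.
indexed = df.set_index(keys)

print(summed)
print(indexed.index.is_unique)
```

Here `summed` has one row per distinct key, while `indexed` still has all three rows and a non-unique index.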

How to surface plot/3d plot from dataframe

.plot_surface() takes 2D arrays as inputs, not 1D DataFrame columns. This has been explained quite well here, along with the below code that illustrates how one could arrive at the required format using DataFrame input. Reproduced below with minor modifications like additional comments. Alternatively, however, there is .plot_trisurf() which uses 1D inputs. I’ve added an … Read more
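
As a rough illustration of the 1D-to-2D step (this is not the code from the linked answer, which is not reproduced here), long-format columns can be pivoted into a grid. The sketch assumes made-up columns `x`, `y`, `z` and that every (x, y) pair occurs exactly once, so the grid has no gaps.

```python
import numpy as np
import pandas as pd

# Illustrative long-format data on a complete 3 x 2 grid: z = x + 10 * y.
df = pd.DataFrame({
    "x": [0, 1, 2, 0, 1, 2],
    "y": [0, 0, 0, 1, 1, 1],
})
df["z"] = df["x"] + 10 * df["y"]

# Rows become y values, columns become x values, cells hold z.
grid = df.pivot(index="y", columns="x", values="z")

# 2D coordinate arrays with the same shape as the z grid.
X, Y = np.meshgrid(grid.columns.to_numpy(), grid.index.to_numpy())
Z = grid.to_numpy()

print(X.shape, Y.shape, Z.shape)
```

The resulting `X`, `Y`, `Z` are the kind of same-shaped 2D arrays that `.plot_surface()` expects. For scattered points that do not form a full grid, the 1D columns can go to `.plot_trisurf()` directly.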

Pandas groupby and aggregation output should include all the original columns (including the ones not aggregated on)

agg with a dict of functions

Create a dict of functions and pass it to agg. You'll also need as_index=False to prevent the group columns from becoming the index in your output.

    f = {'NET_AMT': 'sum', 'QTY_SOLD': 'sum', 'UPC_DSC': 'first'}
    df.groupby(['month', 'UPC_ID'], as_index=False).agg(f)

         month  UPC_ID UPC_DSC  NET_AMT  QTY_SOLD
    0  2017.02     111   desc1       10         2
    1 … Read more
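
A self-contained version of the pattern, for experimenting. The input rows are invented for illustration; only the column names and the dict of functions come from the excerpt.

```python
import pandas as pd

# Illustrative input; values are made up.
df = pd.DataFrame({
    "month": [2017.02, 2017.02, 2017.03],
    "UPC_ID": [111, 111, 222],
    "UPC_DSC": ["desc1", "desc1", "desc2"],
    "NET_AMT": [4, 6, 5],
    "QTY_SOLD": [1, 1, 3],
})

# Sum the numeric columns; carry the description along with 'first'.
f = {"NET_AMT": "sum", "QTY_SOLD": "sum", "UPC_DSC": "first"}
out = df.groupby(["month", "UPC_ID"], as_index=False).agg(f)

print(out)
```

Using `'first'` for `UPC_DSC` assumes the description is constant within each group. The column order of the result may differ between pandas versions.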

Aggregation over Partition in pandas

You can use pandas transform() method for within group aggregations like "OVER(partition by …)" in SQL:

    import pandas as pd
    import numpy as np

    # create dataframe with sample data
    df = pd.DataFrame({'group': ['A', 'A', 'A', 'B', 'B', 'B'], 'value': [1, 2, 3, 4, 5, 6]})

    # calculate AVG(value) OVER (PARTITION BY group)
    df['mean_value'] = df.groupby('group').value.transform(np.mean)

df:

    group  value  mean_value
    A      1      2
    A      2      2
    A      3      2
    B … Read more
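
To see why `transform` behaves like a window function, it helps to compare it with `agg` on the same sample data. This is a sketch: the `share` column is a hypothetical extra, and the string alias `"mean"` is used here in place of `np.mean`.

```python
import pandas as pd

df = pd.DataFrame({"group": list("AAABBB"), "value": [1, 2, 3, 4, 5, 6]})

# agg collapses each group to one row, like a plain GROUP BY.
per_group = df.groupby("group")["value"].agg("mean")

# transform broadcasts the group result back onto every original row.
df["mean_value"] = df.groupby("group")["value"].transform("mean")

# Hypothetical extra: value / SUM(value) OVER (PARTITION BY group).
df["share"] = df["value"] / df.groupby("group")["value"].transform("sum")

print(per_group)
print(df)
```

Because the transformed result is aligned with the original index, it can be assigned straight back as a new column.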
