Changing multiple column names but not all of them – Pandas Python

Say you have a dictionary that maps each old column name to the new name that should replace it:

    df.rename(columns={'old_col': 'new_col', 'old_col_2': 'new_col_2'}, inplace=True)

But if you don't have that, and you only have the column indices, you can do this:

    column_indices = [1, 4, 5, 6]
    new_names = ['a', 'b', 'c', 'd']
    old_names = df.columns[column_indices]
    df.rename(columns=dict(zip(old_names, new_names)), inplace=True)
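The index-based approach can be tried end to end on a small frame. This is a minimal sketch; the seven single-letter column names are made up for illustration and do not come from the original answer.

```python
import pandas as pd

# Hypothetical frame with seven columns named p..v (illustrative only).
df = pd.DataFrame([[1, 2, 3, 4, 5, 6, 7]], columns=list("pqrstuv"))

column_indices = [1, 4, 5, 6]
new_names = ["a", "b", "c", "d"]

# Look up the current names at those positions, then map old -> new.
old_names = df.columns[column_indices]
renamed = df.rename(columns=dict(zip(old_names, new_names)))

print(list(renamed.columns))
```

Because the mapping is keyed by name, a frame with repeated column names would have every matching column renamed, not only the one at the given position.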

How can I “unpivot” specific columns from a pandas DataFrame?

This can be done with pd.melt():

    # value_name is 'value' by default, but setting it here to make it clear
    pd.melt(x, id_vars=['farm', 'fruit'], var_name='year', value_name='value')

Result:

      farm  fruit  year  value
    0    A  apple  2014     10
    1    B  apple  2014     12
    2    A   pear  2014      6
    3    B   pear  2014      8
    4    A  apple  2015     11

… Read more
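The excerpt does not show the input frame x, so the sketch below reconstructs one that is consistent with the visible result rows. The 2015 values other than 11 are invented, and the year columns are assumed to be strings.

```python
import pandas as pd

# Assumed wide-format input; only the 2014 values and the first 2015
# value (11) appear in the excerpt, the rest are placeholders.
x = pd.DataFrame({
    "farm": ["A", "B", "A", "B"],
    "fruit": ["apple", "apple", "pear", "pear"],
    "2014": [10, 12, 6, 8],
    "2015": [11, 13, 7, 10],
})

# Keep farm and fruit as identifiers; every other column becomes a
# (year, value) pair, one row per original cell.
long_form = pd.melt(x, id_vars=["farm", "fruit"], var_name="year", value_name="value")

print(long_form)
```

Columns not listed in id_vars are unpivoted by default; passing value_vars would restrict the unpivot to specific columns.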

In Pandas, does .iloc method give a copy or view?

You are starting with a DataFrame that has two columns with two different dtypes:

    df.dtypes
    Out:
    age      int64
    name    object
    dtype: object

Since different dtypes are stored in different numpy arrays under the hood, you have two different blocks for them:

    df.blocks
    Out:
    {'int64':           age
    student1   21
    student2   24,
    'object':            name
    student1  Marry
    student2   John}

… Read more
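A small sketch of the setup described above, rebuilt from the values visible in the excerpt. It only illustrates the mixed-dtype observation: a row that spans an integer column and a text column has to be returned as a single object-dtype Series. Whether a given .iloc selection is a view or a copy depends on the pandas version and its internal layout, so the sketch takes an explicit copy before modifying anything.

```python
import pandas as pd

# Frame rebuilt from the excerpt's output (two rows, two dtypes).
df = pd.DataFrame(
    {"age": [21, 24], "name": ["Marry", "John"]},
    index=["student1", "student2"],
)

print(df.dtypes)

# A single row mixes an integer and a string, so it comes back
# as one Series with a common (object) dtype.
row = df.iloc[0]
print(row.dtype)

# An explicit copy makes the intent unambiguous: changing it
# leaves the original frame untouched.
row_copy = df.iloc[0].copy()
row_copy["age"] = 99
print(df.loc["student1", "age"])
```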

Pandas rolling apply using multiple columns

How about this:

    def masscenter(ser):
        print(df.loc[ser.index])
        return 0

    rol = df.price.rolling(window=2)
    rol.apply(masscenter, raw=False)

It uses the rolling logic to get subsets from an arbitrary column. The raw=False option provides you with index values for those subsets (which are given to you as Series), then you use those index values to get multi-column slices from your … Read more
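To make the idea concrete, the placeholder return value can be swapped for a real multi-column computation. The sketch below assumes a second column named volume (hypothetical, not in the original answer) and computes a volume-weighted price per window.

```python
import pandas as pd

# Illustrative data; "volume" is an assumed second column.
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0, 40.0],
    "volume": [1.0, 3.0, 1.0, 1.0],
})

def masscenter(ser):
    # ser is the rolling window over "price" as a Series (raw=False),
    # so its index can be used to pull the same rows from every column.
    window = df.loc[ser.index]
    return (window["price"] * window["volume"]).sum() / window["volume"].sum()

result = df["price"].rolling(window=2).apply(masscenter, raw=False)
print(result)
```

The first entry is NaN because the window of size 2 is not yet full there. Looking up df inside the function on every window is convenient but slow on large frames.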

Grouping by multiple columns to find duplicate rows pandas

Use duplicated with the subset parameter to specify which columns to check and keep=False to mark all duplicates, then filter with the resulting mask by boolean indexing:

    df = df[df.duplicated(subset=['val1', 'val2'], keep=False)]
    print(df)

       id  val1  val2
    0   1   1.1   2.2
    1   1   1.1   2.2
    3   3   8.8   6.2
    4   4   1.1   2.2
    5   5   8.8   6.2

Detail:

    print(df.duplicated(subset=['val1', 'val2'], … Read more
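A runnable sketch of the mask-and-filter step. The input frame is reconstructed from the filtered output in the excerpt; the row at index 2 was dropped there, so its values here are invented.

```python
import pandas as pd

# Rows 0, 1, 3, 4, 5 follow the excerpt's output; row 2 is a made-up
# non-duplicate used only to show that it gets filtered out.
df = pd.DataFrame({
    "id": [1, 1, 2, 3, 4, 5],
    "val1": [1.1, 1.1, 2.1, 8.8, 1.1, 8.8],
    "val2": [2.2, 2.2, 6.2, 6.2, 2.2, 6.2],
})

# keep=False flags every member of a duplicated group, not just the
# second and later occurrences.
mask = df.duplicated(subset=["val1", "val2"], keep=False)
dupes = df[mask]

print(mask)
print(dupes)
```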

What is the fastest and most efficient way to append rows to a DataFrame?

As Mohit Motwani suggested, the fastest way is to collect the data into a dictionary and then load it all into a data frame. Below are some speed measurement examples:

    import pandas as pd
    import numpy as np
    import time
    import random

    end_value = 10000

Measurement for creating a list of dictionaries and loading them all into a data frame at the end:

    start_time … Read more
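The excerpt is cut off before the measured code, so the following is only a sketch of the list-of-dictionaries pattern it names, without any timing claims. The column names row_id and random_value are placeholders.

```python
import random

import pandas as pd

end_value = 10000

# Accumulate plain Python dicts first; appending to a list is cheap.
rows = []
for i in range(end_value):
    rows.append({"row_id": i, "random_value": random.random()})

# Build the DataFrame once, at the end, from the collected rows.
df = pd.DataFrame(rows)

print(df.shape)
```

Building the frame once avoids repeatedly reallocating it, which is what happens when rows are added to an existing DataFrame one at a time.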
