pandas-groupby
How to do a conditional count after groupby on a Pandas Dataframe?
I think you need to add the condition first:

    # if you also need category c, which has no values of 'one'
    df11 = df.groupby('key1')['key2'].apply(lambda x: (x == 'one').sum()).reset_index(name='count')
    print(df11)
      key1  count
    0    a      2
    1    b      1
    2    c      0

Or make key1 categorical, so that the missing value is added by size:

    df['key1'] = df['key1'].astype('category')
    df1 = df[df['key2'] == 'one'].groupby(['key1']).size().reset_index(name='count')
    print(df1)
    …
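As a minimal sketch of the "condition first" idea, the comparison can be built as a boolean Series and summed per group. The sample data here is hypothetical (the question's DataFrame is not shown in the excerpt) and was picked so that group c has no 'one' rows.

```python
import pandas as pd

# Hypothetical data; the question's DataFrame is not part of the excerpt.
df = pd.DataFrame({
    "key1": ["a", "a", "b", "b", "a", "c"],
    "key2": ["one", "two", "one", "two", "one", "two"],
})

# Evaluate the condition first, then sum the booleans per group.
# True counts as 1, so a group without any match gets 0 instead of vanishing.
counts = (
    df["key2"].eq("one")
    .groupby(df["key1"])
    .sum()
    .reset_index(name="count")
)
print(counts)
```

Because the filter is not applied before grouping, every key1 value stays in the result without needing a categorical dtype.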
What is the difference between pandas agg and apply function?
apply applies the function to each group as a whole (here, each Species). Your function returns 1, so you end up with one value for each of the 3 groups. agg aggregates each column (feature) within each group, so you end up with one value per column per group. Do read the groupby docs; they’re quite helpful. There are also …
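A small sketch of the difference, assuming an iris-like table. The column names and values below are made up for illustration; only Species is taken from the answer.

```python
import pandas as pd

# Hypothetical stand-in for the iris-style data the answer refers to.
df = pd.DataFrame({
    "Species": ["setosa", "setosa", "virginica", "virginica", "versicolor"],
    "sepal_length": [5.1, 4.9, 6.3, 5.8, 7.0],
    "petal_length": [1.4, 1.4, 6.0, 5.1, 4.7],
})
grouped = df.groupby("Species")[["sepal_length", "petal_length"]]

# apply: the function sees each group's sub-DataFrame once,
# so a scalar return value gives one value per group.
per_group = grouped.apply(lambda g: len(g))

# agg: each column of each group is reduced separately,
# so the result has one value per column per group.
per_column = grouped.agg("mean")

print(per_group)
print(per_column)
```

Here per_group is a Series with three entries, while per_column is a DataFrame with three rows and two columns.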
Pandas – dataframe groupby – how to get sum of multiple columns
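By using apply (note the list inside the brackets; selecting several columns with a bare tuple no longer works in recent pandas):

    df.groupby(['col1', 'col2'])[['col3', 'col4']].apply(lambda x: x.astype(int).sum())
    Out[1257]:
               col3  col4
    col1 col2
    a    c        2     4
         d        1     2
    b    d        1     2
         e        2     4

If you want to use agg:

    df.groupby(['col1', 'col2']).agg({'col3': 'sum', 'col4': 'sum'})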
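When the columns are already numeric, a plain sum on the selected columns should give the same result without apply. This is a sketch on hypothetical rows chosen to reproduce the totals shown above.

```python
import pandas as pd

# Hypothetical rows chosen so the group totals match the output shown above.
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b", "b"],
    "col2": ["c", "c", "d", "d", "e", "e"],
    "col3": [1, 1, 1, 1, 1, 1],
    "col4": [2, 2, 2, 2, 2, 2],
})

# Select the columns with a list, then sum within each (col1, col2) group.
totals = df.groupby(["col1", "col2"])[["col3", "col4"]].sum()
print(totals)
```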
How to move pandas data from index to column after multiple groupby
Method #1: reset_index()

    >>> g
                  uses  books
                   sum    sum
    token   year
    xanthos 1830     3      3
            1840     3      3
            1868     2      2
            1875     1      1

    [4 rows x 2 columns]
    >>> g = g.reset_index()
    >>> g
         token  year  uses  books
                       sum    sum
    0  xanthos  1830     3      3
    1  xanthos  1840     3      3
    2  xanthos  1868     2 …
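A runnable sketch of the same step, using hypothetical input rows (the original data behind g is not shown) aggregated so that the index has the token and year levels.

```python
import pandas as pd

# Hypothetical raw rows; the data behind `g` is not part of the excerpt.
df = pd.DataFrame({
    "token": ["xanthos"] * 4,
    "year": [1830, 1830, 1840, 1868],
    "uses": [1, 2, 3, 2],
    "books": [1, 2, 3, 2],
})

# Grouping by two keys puts token and year into a two-level row index.
g = df.groupby(["token", "year"]).agg({"uses": ["sum"], "books": ["sum"]})

# reset_index() moves both index levels back into ordinary columns.
flat = g.reset_index()
print(flat)
```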
Use pandas.shift() within a group
Pandas’ grouped objects have a groupby.DataFrameGroupBy.shift method, which shifts a specified column within each group by n periods, just like the regular dataframe’s shift method:

    df['prev_value'] = df.groupby('object')['value'].shift()

For the following example dataframe:

    print(df)
       object  period  value
    0       1       1     24
    1       1       2     67
    2       1       4     89
    3       2       4      5
    4       2 …
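A short sketch of what the shifted column looks like. The first four rows follow the excerpt; the last row is invented for illustration because the example is cut off.

```python
import pandas as pd

df = pd.DataFrame({
    "object": [1, 1, 1, 2, 2],
    "period": [1, 2, 4, 4, 5],   # last value is a made-up placeholder
    "value": [24, 67, 89, 5, 10],  # last value is a made-up placeholder
})

# Shift within each object: the first row of every group has no
# predecessor, so it becomes NaN rather than borrowing from another group.
df["prev_value"] = df.groupby("object")["value"].shift()
print(df)
```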
Sample each group after pandas groupby
Apply a lambda and call sample with the frac parameter:

    In [2]: df = pd.DataFrame({'a': [1,2,3,4,5,6,7], 'b': [1,1,1,0,0,0,0]})
            grouped = df.groupby('b')
            grouped.apply(lambda x: x.sample(frac=0.3))
    Out[2]:
         a  b
    b
    0 6  7  0
    1 2  3  1
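Newer pandas versions (1.1 and later) also provide a sample method directly on the grouped object, which avoids the lambda. A sketch with the same data; random_state is added here only to make the draw repeatable.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4, 5, 6, 7], "b": [1, 1, 1, 0, 0, 0, 0]})

# Draw 30% of each group; the original row index is kept as-is.
# random_state is an addition for reproducibility, not part of the answer.
sampled = df.groupby("b").sample(frac=0.3, random_state=0)
print(sampled)
```

Unlike the apply version, the result keeps the flat original index instead of adding b as an extra index level.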
What is the equivalent of SQL “GROUP BY HAVING” on Pandas?
As mentioned in unutbu’s comment, groupby’s filter is the equivalent of SQL’s HAVING:

    In [11]: df = pd.DataFrame([[1, 2], [1, 3], [5, 6]], columns=['A', 'B'])

    In [12]: df
    Out[12]:
       A  B
    0  1  2
    1  1  3
    2  5  6

    In [13]: g = df.groupby('A')  # GROUP BY A

    In [14]: g.filter(lambda x: len(x) > …
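A sketch of how such a filter behaves on the DataFrame above. The threshold of 1 is an assumption made for illustration, since the excerpt's condition is cut off.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [1, 3], [5, 6]], columns=["A", "B"])

# Roughly: SELECT * FROM df GROUP BY A HAVING COUNT(*) > 1
# (the threshold 1 is an illustrative choice, not taken from the answer)
kept = df.groupby("A").filter(lambda x: len(x) > 1)

# The same selection as a boolean mask built from the group sizes.
mask = df.groupby("A")["B"].transform("size") > 1
print(kept)
```

filter returns the original rows of the groups that pass, not one aggregated row per group.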
Python Pandas Group by date using datetime data
You can group by the dates of the Date_Time column using dt.date:

    df = df.groupby([df['Date_Time'].dt.date]).mean()

Sample:

    df = pd.DataFrame({'Date_Time': pd.date_range('10/1/2001 10:00:00', periods=3, freq='10H'),
                       'B': [4, 5, 6]})
    print(df)
       B           Date_Time
    0  4 2001-10-01 10:00:00
    1  5 2001-10-01 20:00:00
    2  6 2001-10-02 06:00:00

    print(df['Date_Time'].dt.date)
    0    2001-10-01
    1    2001-10-01
    2    2001-10-02
    Name: Date_Time, dtype: object

    df = df.groupby([df['Date_Time'].dt.date])['B'].mean()
    print(df)
    …
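A self-contained sketch of the daily mean, with the three timestamps from the sample written out explicitly. The dt.normalize variant is an addition of mine, not part of the answer; it may be handy when the result index should stay datetime64 instead of holding Python date objects.

```python
import pandas as pd

df = pd.DataFrame({
    "Date_Time": pd.to_datetime([
        "2001-10-01 10:00:00",
        "2001-10-01 20:00:00",
        "2001-10-02 06:00:00",
    ]),
    "B": [4, 5, 6],
})

# Group on the calendar date of each timestamp (index holds date objects).
daily = df.groupby(df["Date_Time"].dt.date)["B"].mean()

# Variant (not from the answer): truncate to midnight, keeping datetime64.
daily_ts = df.groupby(df["Date_Time"].dt.normalize())["B"].mean()
print(daily)
```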
Renaming Column Names in Pandas Groupby function [duplicate]
For the first question, I think the answer would be:

    <your DataFrame>.rename(columns={'count': 'Total_Numbers'})

or

    <your DataFrame>.columns = ['ID', 'Region', 'Total_Numbers']

As for the second one, I’d say the answer is no. It’s possible to use it like ‘df.ID’ because of the Python data model:

    Attribute references are translated to lookups in this dictionary, e.g., m.x is equivalent to m.__dict__["x"]
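A sketch tying the two points together, on a hypothetical DataFrame with ID and Region columns (the question's data is not shown). It also illustrates one reason to rename: a column called count is hidden behind the DataFrame.count method when accessed as an attribute.

```python
import pandas as pd

# Hypothetical data; only the column names come from the answer.
df = pd.DataFrame({"ID": [1, 1, 2], "Region": ["N", "N", "S"]})

counts = df.groupby(["ID", "Region"]).size().reset_index(name="count")

# Rename the aggregated column after the groupby.
renamed = counts.rename(columns={"count": "Total_Numbers"})

# Attribute access works for column names that are valid identifiers
# and do not collide with existing DataFrame attributes...
same = renamed.ID.equals(renamed["ID"])

# ...but `counts.count` resolves to the DataFrame.count method, not the column.
is_method = callable(counts.count)
print(renamed)
```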