pandas-groupby – Page 3

What is the pandas equivalent of dplyr summarize/aggregate by multiple functions?

April 8, 2023 by Tarik

How to do a conditional count after groupby on a Pandas Dataframe?

March 29, 2023 by Tarik

I think you need to add the condition first:

    # if you also need category c, which has no values of 'one'
    df11 = df.groupby('key1')['key2'].apply(lambda x: (x == 'one').sum()).reset_index(name='count')
    print (df11)
      key1  count
    0    a      2
    1    b      1
    2    c      0

Or convert key1 to a categorical, so that size adds the missing category:

    df['key1'] = df['key1'].astype('category')
    df1 = df[df['key2'] == 'one'].groupby(['key1']).size().reset_index(name='count')
    print (df1)

… Read more
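A minimal, self-contained sketch of the same idea, assuming a small made-up frame (the original question's data is not shown in this excerpt). It is a variant that compares first and then sums the resulting booleans per group, so no lambda is needed:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the excerpt.
df = pd.DataFrame({
    "key1": ["a", "a", "b", "b", "a", "c"],
    "key2": ["one", "two", "one", "two", "one", "two"],
})

# Build the boolean condition first, then sum it within each key1 group.
# Groups without any match (here "c") stay in the result with a count of 0.
counts = (
    df["key2"].eq("one")
    .groupby(df["key1"])
    .sum()
    .reset_index(name="count")
)
print(counts)
```

Because the condition is evaluated before grouping, every value of key1 survives, which is the point the answer makes about category c.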

What is the difference between pandas agg and apply function?

March 20, 2023 by Tarik

apply applies the function to each group (your Species). Your function returns 1, so you end up with one value for each of the 3 groups. agg aggregates each column (feature) for each group, so you end up with one value per column per group. Do read the groupby docs; they’re quite helpful. There are also … Read more
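The difference can be seen in a small sketch. The data and the helper n_values are made up for illustration; the helper reports how many values it was handed, which shows what each method passes to the function:

```python
import pandas as pd

# Hypothetical stand-in for the question's data: 3 species, 2 feature columns.
df = pd.DataFrame({
    "Species": ["setosa", "setosa", "virginica", "virginica", "versicolor"],
    "sepal_length": [5.1, 4.9, 6.3, 5.8, 7.0],
    "petal_length": [1.4, 1.4, 6.0, 5.1, 4.7],
})

grouped = df.groupby("Species")[["sepal_length", "petal_length"]]

def n_values(x):
    """Return the number of values in whatever object was passed in."""
    return x.size

# apply: the function receives each group's sub-frame (all selected columns),
# so there is one result per group.
per_group = grouped.apply(n_values)

# agg: the function receives one column of one group at a time,
# so there is one result per column per group.
per_column = grouped.agg(n_values)

print(per_group)
print(per_column)
```

Here per_group is a Series with one entry per species, while per_column is a frame with one row per species and one column per feature.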

Pandas – dataframe groupby – how to get sum of multiple columns

March 12, 2023 by Tarik

By using apply (note the double brackets: the columns must be selected with a list):

    df.groupby(['col1', 'col2'])[['col3', 'col4']].apply(lambda x: x.astype(int).sum())
    Out[1257]:
               col3  col4
    col1 col2
    a    c        2     4
         d        1     2
    b    d        1     2
         e        2     4

If you want to use agg:

    df.groupby(['col1', 'col2']).agg({'col3': 'sum', 'col4': 'sum'})
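As a runnable sketch, assuming hypothetical data in which col3 and col4 hold numeric strings (one situation where the astype(int) step would be needed), the conversion can be done once up front and followed by a plain sum:

```python
import pandas as pd

# Hypothetical data; only the column names come from the answer above.
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b", "b"],
    "col2": ["c", "c", "d", "d", "e", "e"],
    "col3": ["1", "1", "1", "1", "1", "1"],
    "col4": ["2", "2", "2", "2", "2", "2"],
})

# Convert the string columns to integers, group by both keys,
# select the value columns with a list, and sum them.
totals = (
    df.astype({"col3": int, "col4": int})
      .groupby(["col1", "col2"])[["col3", "col4"]]
      .sum()
)
print(totals)
```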

How to move pandas data from index to column after multiple groupby

March 7, 2023 by Tarik

Method #1: reset_index()

    >>> g
                  uses  books
                   sum    sum
    token   year
    xanthos 1830     3      3
            1840     3      3
            1868     2      2
            1875     1      1

    [4 rows x 2 columns]
    >>> g = g.reset_index()
    >>> g
         token  year  uses  books
                       sum    sum
    0  xanthos  1830     3      3
    1  xanthos  1840     3      3
    2  xanthos  1868     2

… Read more
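A minimal sketch of the reset_index() approach, assuming made-up rows that reuse the column names from the output above. For simplicity it aggregates with a dict, so the result has flat column names rather than the two-level header shown in the excerpt:

```python
import pandas as pd

# Hypothetical raw rows; only the column names follow the excerpt.
df = pd.DataFrame({
    "token": ["xanthos"] * 4,
    "year": [1830, 1830, 1840, 1868],
    "uses": [1, 2, 3, 2],
    "books": [1, 2, 3, 2],
})

# Grouping by two keys puts token and year into a MultiIndex.
g = df.groupby(["token", "year"]).agg({"uses": "sum", "books": "sum"})

# reset_index() moves both index levels back into ordinary columns.
flat = g.reset_index()
print(flat)
```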

Use pandas.shift() within a group

March 5, 2023 by Tarik

Pandas' grouped objects have a groupby.DataFrameGroupBy.shift method, which shifts a specified column in each group by n periods, just like the regular DataFrame's shift method:

    df['prev_value'] = df.groupby('object')['value'].shift()

For the following example dataframe:

    print(df)
       object  period  value
    0       1       1     24
    1       1       2     67
    2       1       4     89
    3       2       4      5
    4       2

… Read more
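A small runnable sketch of the grouped shift, assuming made-up values (the excerpt's own table is cut off, so this data is hypothetical and only the column names are reused):

```python
import pandas as pd

# Hypothetical data: two objects observed over several periods.
df = pd.DataFrame({
    "object": [1, 1, 1, 2, 2],
    "period": [1, 2, 3, 1, 2],
    "value": [10, 20, 30, 40, 50],
})

# Shift 'value' down by one row within each object.
# The first row of every group has no predecessor, so it becomes NaN.
df["prev_value"] = df.groupby("object")["value"].shift()
print(df)
```

The shift never crosses a group boundary: the first row of object 2 gets NaN rather than the last value of object 1.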

What is the equivalent of SQL “GROUP BY HAVING” on Pandas?

February 18, 2023 by Tarik

As mentioned in unutbu’s comment, groupby’s filter is the equivalent of SQL’s HAVING:

    In [11]: df = pd.DataFrame([[1, 2], [1, 3], [5, 6]], columns=['A', 'B'])

    In [12]: df
    Out[12]:
       A  B
    0  1  2
    1  1  3
    2  5  6

    In [13]: g = df.groupby('A')  # GROUP BY A

    In [14]: g.filter(lambda x: len(x) >

… Read more
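A self-contained sketch of filter used in the HAVING style, built on the same three-row frame. The thresholds are assumptions chosen for illustration, since the excerpt's own condition is cut off:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [1, 3], [5, 6]], columns=["A", "B"])

# Like HAVING COUNT(*) > 1: keep the rows of groups with more than one row.
# (The threshold of 1 is an assumption for this sketch.)
multi_row = df.groupby("A").filter(lambda x: len(x) > 1)

# Like HAVING SUM(B) > 5: keep the rows of groups whose B values sum above 5.
# (Again an illustrative threshold.)
large_sum = df.groupby("A").filter(lambda x: x["B"].sum() > 5)

print(multi_row)
print(large_sum)
```

Unlike an aggregation, filter returns the original rows of the qualifying groups, not one row per group.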

Python Pandas Group by date using datetime data

February 13, 2023 by Tarik

You can group by the dates of column Date_Time using dt.date:

    df = df.groupby([df['Date_Time'].dt.date]).mean()

Sample:

    df = pd.DataFrame({'Date_Time': pd.date_range('10/1/2001 10:00:00', periods=3, freq='10H'),
                       'B': [4, 5, 6]})

    print (df)
       B           Date_Time
    0  4 2001-10-01 10:00:00
    1  5 2001-10-01 20:00:00
    2  6 2001-10-02 06:00:00

    print (df['Date_Time'].dt.date)
    0    2001-10-01
    1    2001-10-01
    2    2001-10-02
    Name: Date_Time, dtype: object

    df = df.groupby([df['Date_Time'].dt.date])['B'].mean()
    print(df)

… Read more