group-by

Group List of Objects based on Property using Linq?

April 9, 2024 by Tarik

It sounds like you want something like: // No need to sort sites first var grouped = sites.OrderBy(x => x.Type) .GroupBy(x => x.Type); Then just serialize grouped. However, I don’t know quite what an IGrouping will look like in JSON… and the type will be present in each case. You may want something like: var … Read more

Categories c# Tags .net, c++, group-by, linq, list

Get group id back into pandas dataframe

April 9, 2024 by Tarik

A lot of handy things are stored in the DataFrameGroupBy.grouper object. For example: >>> df = pd.DataFrame({'Name': ['foo', 'bar'] * 3, 'Rank': np.random.randint(0,3,6), 'Val': np.random.rand(6)}) >>> grouped = df.groupby(["Name", "Rank"]) >>> grouped.grouper. grouped.grouper.agg_series grouped.grouper.indices grouped.grouper.aggregate grouped.grouper.labels grouped.grouper.apply grouped.grouper.levels grouped.grouper.axis grouped.grouper.names grouped.grouper.compressed grouped.grouper.ngroups grouped.grouper.get_group_levels grouped.grouper.nkeys grouped.grouper.get_iterator grouped.grouper.result_index grouped.grouper.group_info grouped.grouper.shape grouped.grouper.group_keys grouped.grouper.size grouped.grouper.groupings grouped.grouper.sort grouped.grouper.groups and so: … Read more
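The excerpt is cut off before it shows how the group id is recovered, so the following is only a sketch of one way to do it in current pandas, not necessarily the approach the full answer takes. It uses `GroupBy.ngroup()`; the sample values are hypothetical and fixed (instead of random) so the result is reproducible.

```python
import pandas as pd

# Hypothetical sample data with fixed values (the excerpt uses random ones).
df = pd.DataFrame({
    "Name": ["foo", "bar", "foo", "bar", "foo", "bar"],
    "Rank": [0, 1, 0, 2, 1, 1],
    "Val": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
})

grouped = df.groupby(["Name", "Rank"])

# ngroup() numbers the groups (in sorted key order by default) and returns
# a Series aligned with the original rows, so it can be assigned directly.
df["group_id"] = grouped.ngroup()

print(df)
```

Rows that share the same (Name, Rank) pair receive the same integer in `group_id`.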

Categories python Tags group-by, pandas, python

Is it possible to use Aggregate function in a Select statement without using Group By clause?

April 8, 2024 by Tarik

All columns in the SELECT clause that do not have an aggregate need to be in the GROUP BY. Good: SELECT col1, col2, col3, MAX(col4) … GROUP BY col1, col2, col3 Also good: SELECT col1, col2, col3, MAX(col4) … GROUP BY col1, col2, col3, col5, col6 No other columns = no GROUP BY needed: SELECT … Read more
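A small sketch of the two cases described above, run from Python against an in-memory SQLite database. SQLite is only a stand-in here (the post is tagged sql-server), and the table `t` and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col4 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 1), ("a", 5), ("b", 3)],
)

# Only an aggregate in the SELECT list: no GROUP BY is needed.
overall_max = conn.execute("SELECT MAX(col4) FROM t").fetchone()[0]

# A non-aggregated column next to an aggregate: it goes in the GROUP BY.
per_group = conn.execute(
    "SELECT col1, MAX(col4) FROM t GROUP BY col1 ORDER BY col1"
).fetchall()

conn.close()
print(overall_max, per_group)
```

Note that SQLite is more lenient than SQL Server about non-aggregated columns missing from the GROUP BY, so this sketch demonstrates the valid forms only, not the error case.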

Categories sql Tags aggregate-functions, group-by, sql, sql-server, sql-server-2005

Python pandas unique value ignoring NaN

April 6, 2024 by Tarik

Define a function:

    def unique_non_null(s):
        return s.dropna().unique()

Then use it in the aggregation:

    df.groupby('b').agg({
        'a': ['min', 'max', unique_non_null],
        'c': ['first', 'last', unique_non_null]
    })
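A self-contained variant of the snippet above, with hypothetical data. One deliberate difference, which is an assumption of this sketch and not part of the original answer: the helper returns a tuple rather than a NumPy array, because some pandas versions reject array-valued results from an aggregation on numeric columns.

```python
import numpy as np
import pandas as pd

def unique_non_null(s):
    # Drop NaN first, then collect the distinct values in order of appearance.
    # Returning a tuple (not an ndarray) is this sketch's own choice.
    return tuple(s.dropna().unique())

# Hypothetical data with missing values in both aggregated columns.
df = pd.DataFrame({
    "a": [1.0, np.nan, 2.0, 2.0],
    "b": ["x", "x", "x", "y"],
    "c": [10.0, np.nan, 30.0, 40.0],
})

result = df.groupby("b").agg({
    "a": ["min", "max", unique_non_null],
    "c": ["first", "last", unique_non_null],
})

print(result)
```

The resulting columns form a two-level index, for example `("a", "unique_non_null")`.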

Categories python Tags group-by, null, pandas, python, unique

Looking for pandas “ungroup by” operation opposite to .groupby in the following string aggregation?

February 19, 2024 by Tarik

The rough equivalent is .reset_index(), but it may not be helpful to think of it as the “opposite” of groupby(). You are splitting a string into pieces, and maintaining each piece’s association with 'family'. This old answer of mine does the job. Just set 'family' as the index column first, refer to the link … Read more
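The linked answer is not visible in this excerpt, so the following is only a sketch of the idea described: set 'family' as the index, split the string, and bring the index back with `.reset_index()`. The `names` column, the comma separator, and the data are hypothetical, and `explode()` is this sketch's choice, not necessarily what the linked answer uses.

```python
import pandas as pd

# Hypothetical input: one row per family, members packed into one string.
df = pd.DataFrame({
    "family": ["Smith", "Jones"],
    "names": ["Ann,Bob", "Cid"],
})

exploded = (
    df.set_index("family")["names"]  # keep 'family' attached as the index
      .str.split(",")                # each string becomes a list of pieces
      .explode()                     # one row per piece, index repeated
      .reset_index()                 # turn 'family' back into a column
)

print(exploded)
```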

Categories python Tags group-by, pandas, python

How to get groupby sum of multiple columns

February 17, 2024 by Tarik

By using apply:

    df.groupby(['col1', 'col2'])[['col3', 'col4']].apply(lambda x: x.astype(int).sum())

    Out[1257]:
               col3  col4
    col1 col2
    a    c        2     4
         d        1     2
    b    d        1     2
         e        2     4

If you want to use agg:

    df.groupby(['col1', 'col2']).agg({'col3': 'sum', 'col4': 'sum'})
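For reference, a runnable sketch with hypothetical input chosen so that the sums match the output shown above. It selects the columns with a list (double brackets) and calls `.sum()` directly, then checks that the `agg` form gives the same result.

```python
import pandas as pd

# Hypothetical input chosen to reproduce the sums shown in the post.
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b", "b"],
    "col2": ["c", "c", "d", "d", "e", "e"],
    "col3": [1, 1, 1, 1, 1, 1],
    "col4": [2, 2, 2, 2, 2, 2],
})

# Select several columns with a list, then sum within each group.
sums = df.groupby(["col1", "col2"])[["col3", "col4"]].sum()

# The dict form of agg names the aggregation per column.
same = df.groupby(["col1", "col2"]).agg({"col3": "sum", "col4": "sum"})

print(sums)
```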

Categories python Tags dataframe, group-by, pandas, python

Pandas equivalent of GROUP BY HAVING in SQL

February 16, 2024 by Tarik

As mentioned in unutbu’s comment, groupby’s filter is the equivalent of SQL’s HAVING: In [11]: df = pd.DataFrame([[1, 2], [1, 3], [5, 6]], columns=['A', 'B']) In [12]: df Out[12]: A B 0 1 2 1 1 3 2 5 6 In [13]: g = df.groupby('A') # GROUP BY A In [14]: g.filter(lambda x: len(x) > … Read more
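The excerpt is truncated before the filter condition, so the thresholds below are this sketch's own choices for illustration. It reuses the DataFrame from the post and shows two HAVING-style conditions with `GroupBy.filter`.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [1, 3], [5, 6]], columns=["A", "B"])

# Roughly "GROUP BY A HAVING COUNT(*) > 1": keep rows of groups with
# more than one member. The threshold 1 is an illustrative choice.
kept = df.groupby("A").filter(lambda g: len(g) > 1)

# Roughly "GROUP BY A HAVING SUM(B) > 5" (threshold again illustrative).
big = df.groupby("A").filter(lambda g: g["B"].sum() > 5)

print(kept)
print(big)
```

Unlike SQL's HAVING, `filter` returns the original rows of the surviving groups, not one aggregated row per group.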

Categories python Tags filtering, group-by, pandas, python, sql

How to use rolling functions for GroupBy objects

February 15, 2024 by Tarik

For the Googlers who come upon this old question: Regarding @kekert’s comment on @Garrett’s answer to use the new df.groupby('id')['x'].rolling(2).mean() rather than the now-deprecated df.groupby('id')['x'].apply(pd.rolling_mean, 2, min_periods=1) curiously, it seems that the new .rolling().mean() approach returns a multi-indexed series, indexed by the group_by column first and then the index. Whereas, the old approach would simply … Read more
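A short sketch of the behavior described above, with hypothetical data. It shows the two-level index that `.rolling().mean()` produces on a grouped column and one way to drop the group level so the result lines up with the original rows; the rest of the answer is truncated, so this may differ from what it recommends.

```python
import pandas as pd

# Hypothetical data: two groups of two rows each.
df = pd.DataFrame({
    "id": ["a", "a", "b", "b"],
    "x": [1.0, 3.0, 10.0, 20.0],
})

# The result is indexed by (id, original row index).
rolled = df.groupby("id")["x"].rolling(2, min_periods=1).mean()

# Drop the group level so the values align with df's own index.
df["x_roll"] = rolled.reset_index(level=0, drop=True)

print(rolled)
print(df)
```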

Categories python Tags group-by, pandas, python, rolling-computation, rolling-sum

Converting a Pandas GroupBy multiindex output from Series back to DataFrame

January 10, 2024 by Tarik

g1 here is a DataFrame. It has a hierarchical index, though: In [19]: type(g1) Out[19]: pandas.core.frame.DataFrame In [20]: g1.index Out[20]: MultiIndex([('Alice', 'Seattle'), ('Bob', 'Seattle'), ('Mallory', 'Portland'), ('Mallory', 'Seattle')], dtype=object) Perhaps you want something like this? In [21]: g1.add_suffix('_Count').reset_index() Out[21]: Name City City_Count Name_Count 0 Alice Seattle 1 1 1 Bob Seattle 2 2 2 Mallory … Read more
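A minimal sketch of flattening a grouped result back into ordinary columns. The rows are hypothetical (only the names and cities are borrowed from the excerpt, the counts are illustrative), and it uses `size()` with `reset_index(name=...)` as one option in current pandas, which is not the exact call shown in the post.

```python
import pandas as pd

# Hypothetical rows; the counts they produce are illustrative only.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Bob", "Mallory", "Mallory", "Mallory"],
    "City": ["Seattle", "Seattle", "Seattle", "Portland", "Seattle", "Seattle"],
})

# size() gives a Series indexed by a (Name, City) MultiIndex.
counts = df.groupby(["Name", "City"]).size()

# reset_index moves the index levels into columns; `name` labels the values.
flat = counts.reset_index(name="Count")

print(flat)
```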

Categories python Tags dataframe, group-by, multi-index, pandas, python

Attaching a calculated column to an existing dataframe raises TypeError: incompatible index of inserted column with frame index

January 7, 2024 by Tarik

The problem is, as the error message says, that the index of the calculated column you want to insert is incompatible with the index of df. The index of df is a simple index: In [8]: df.index Out[8]: Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8], dtype='int64') while the index of the calculated column … Read more
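The fix proposed in the full answer is not visible in this excerpt. As a hedged sketch of one common way to avoid the index mismatch, `GroupBy.transform` returns a result indexed like the original frame, so it can be assigned as a new column directly. The column names and data are hypothetical.

```python
import pandas as pd

# Hypothetical frame with a simple integer index.
df = pd.DataFrame({
    "key": ["a", "a", "b", "b", "b"],
    "val": [1, 2, 3, 4, 5],
})

# transform keeps df's index, so the assignment aligns row by row.
df["running_total"] = df.groupby("key")["val"].transform("cumsum")

# The same pattern broadcasts a per-group aggregate back to every row.
df["group_total"] = df.groupby("key")["val"].transform("sum")

print(df)
```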

Categories python Tags dataframe, group-by, pandas, python, typeerror
