Get group id back into pandas dataframe

A lot of handy things are stored in the DataFrameGroupBy.grouper object. For example:

>>> df = pd.DataFrame({'Name': ['foo', 'bar'] * 3, 'Rank': np.random.randint(0,3,6), 'Val': np.random.rand(6)})
>>> grouped = df.groupby(["Name", "Rank"])
>>> grouped.grouper.
grouped.grouper.agg_series        grouped.grouper.indices
grouped.grouper.aggregate         grouped.grouper.labels
grouped.grouper.apply             grouped.grouper.levels
grouped.grouper.axis              grouped.grouper.names
grouped.grouper.compressed        grouped.grouper.ngroups
grouped.grouper.get_group_levels  grouped.grouper.nkeys
grouped.grouper.get_iterator      grouped.grouper.result_index
grouped.grouper.group_info        grouped.grouper.shape
grouped.grouper.group_keys        grouped.grouper.size
grouped.grouper.groupings         grouped.grouper.sort
grouped.grouper.groups

and so: … Read more
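The excerpt above is cut off before it shows how the group id is recovered, so the following is only a minimal sketch of one way to do it, not necessarily the answer's own method. It uses the public GroupBy.ngroup() method instead of the grouper attributes, which are internals and have changed between pandas versions. The Rank and Val values are fixed here (an assumption, replacing the random data) so the result is reproducible.

```python
import pandas as pd

# Same shape as the frame in the excerpt, but with fixed values
# (assumed for illustration) instead of random ones.
df = pd.DataFrame({
    "Name": ["foo", "bar"] * 3,
    "Rank": [0, 1, 0, 1, 2, 1],
    "Val": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
})

grouped = df.groupby(["Name", "Rank"])

# ngroup() numbers each group (in sorted key order by default) and
# returns a Series aligned with df's index, so it can be assigned directly.
df["group_id"] = grouped.ngroup()

print(df)
print("number of groups:", grouped.ngroups)
```

Rows that share the same (Name, Rank) pair receive the same integer in group_id.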

Is it possible to use Aggregate function in a Select statement without using Group By clause?

All columns in the SELECT clause that do not have an aggregate need to be in the GROUP BY.

Good:
SELECT col1, col2, col3, MAX(col4) … GROUP BY col1, col2, col3

Also good:
SELECT col1, col2, col3, MAX(col4) … GROUP BY col1, col2, col3, col5, col6

No other columns = no GROUP BY needed:
SELECT … Read more
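To make the rule concrete, here is a small sketch that runs both kinds of query against an in-memory SQLite database through Python's sqlite3 module. The table name t and its rows are made up for illustration; only the column names col1 and col4 echo the excerpt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col4 INTEGER)")  # hypothetical table
conn.executemany(
    "INSERT INTO t (col1, col4) VALUES (?, ?)",
    [("a", 1), ("a", 5), ("b", 3)],
)

# Only aggregates in the SELECT list: no GROUP BY is needed,
# and the whole table collapses into a single row.
overall = conn.execute("SELECT MAX(col4), COUNT(*) FROM t").fetchall()

# A non-aggregated column (col1) appears in the SELECT list,
# so it is also listed in GROUP BY: one row per distinct col1.
per_group = conn.execute(
    "SELECT col1, MAX(col4) FROM t GROUP BY col1 ORDER BY col1"
).fetchall()

print(overall)
print(per_group)
conn.close()
```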

Looking for pandas “ungroup by” operation opposite to .groupby in the following string aggregation?

The rough equivalent is .reset_index(), but it may not be helpful to think of it as the “opposite” of groupby(). You are splitting a string into pieces while maintaining each piece's association with 'family'. This old answer of mine does the job. Just set 'family' as the index column first, refer to the link … Read more
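The linked answer is not part of this excerpt, so the sketch below is not that answer. It is one possible way to split a delimited string into pieces while keeping each piece tied to 'family', using Series.str.split and Series.explode (available in pandas 0.25 and later). The names column and the sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical aggregated frame: one row per family, members joined in a string.
df = pd.DataFrame({
    "family": ["Smith", "Jones"],
    "names": ["Ann, Bob", "Cy"],
})

long_df = (
    df.set_index("family")["names"]  # make 'family' the index first
      .str.split(", ")               # each cell becomes a list of pieces
      .explode()                     # one row per piece, index value repeated
      .reset_index()                 # move 'family' back into a column
)

print(long_df)
```

Setting 'family' as the index before splitting is what carries the association through; reset_index() then turns it back into an ordinary column.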

How to use rolling functions for GroupBy objects

For the Googlers who come upon this old question: @kekert's comment on @Garrett's answer suggests using the new

df.groupby('id')['x'].rolling(2).mean()

rather than the now-deprecated

df.groupby('id')['x'].apply(pd.rolling_mean, 2, min_periods=1)

Curiously, it seems that the new .rolling().mean() approach returns a multi-indexed series, indexed by the group_by column first and then by the original index, whereas the old approach would simply … Read more
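The multi-index behaviour described above can be seen in a small sketch. The id and x column names come from the excerpt; the sample values are made up, and dropping the group level with reset_index(level=0, drop=True) is one common way to get back to the original index, not necessarily what the truncated answer goes on to recommend.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["a", "a", "b", "b", "a"],
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# Rolling mean within each group; the result is indexed by
# (id, original row label), i.e. a two-level MultiIndex.
rolled = df.groupby("id")["x"].rolling(2, min_periods=1).mean()
print(rolled.index.nlevels)

# Drop the group level so the result lines up with df's own index,
# then assign it back as a new column.
df["x_roll"] = rolled.reset_index(level=0, drop=True)
print(df)
```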

Converting a Pandas GroupBy multiindex output from Series back to DataFrame

g1 here is a DataFrame. It has a hierarchical index, though:

In [19]: type(g1)
Out[19]: pandas.core.frame.DataFrame

In [20]: g1.index
Out[20]:
MultiIndex([('Alice', 'Seattle'), ('Bob', 'Seattle'), ('Mallory', 'Portland'),
       ('Mallory', 'Seattle')], dtype=object)

Perhaps you want something like this?

In [21]: g1.add_suffix('_Count').reset_index()
Out[21]:
      Name     City  City_Count  Name_Count
0    Alice  Seattle           1           1
1      Bob  Seattle           2           2
2  Mallory … Read more
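As a self-contained illustration of flattening a hierarchical groupby result, here is a sketch that counts rows per (Name, City) pair with size() and then calls reset_index(). This is a variation on the add_suffix approach shown above, not a reproduction of it, and the input rows are made up (the original data is not in the excerpt).

```python
import pandas as pd

# Hypothetical input: one row per (Name, City) observation.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Bob", "Mallory", "Mallory"],
    "City": ["Seattle", "Seattle", "Seattle", "Portland", "Seattle"],
})

# size() returns a Series whose index is a MultiIndex of (Name, City).
counts = df.groupby(["Name", "City"]).size()
print(type(counts.index).__name__)

# reset_index() moves both index levels back into ordinary columns;
# name= labels the column that holds the counts.
flat = counts.reset_index(name="Count")
print(flat)
```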

Attaching a calculated column to an existing dataframe raises TypeError: incompatible index of inserted column with frame index

The problem, as the error message says, is that the index of the calculated column you want to insert is incompatible with the index of df. The index of df is a simple index:

In [8]: df.index
Out[8]: Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8], dtype="int64")

while the index of the calculated column … Read more
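The excerpt stops before it shows the calculated column's index, so the sketch below only illustrates a typical version of the situation under assumed data: a per-group calculation whose result carries a two-level index, next to an equivalent calculation via transform() whose result keeps the frame's own index and can be attached directly. The key and val column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "key": ["a", "a", "b", "b", "a"],
    "val": [1, 2, 3, 4, 5],
})

# A grouped expanding sum is indexed by (key, original row label):
# a MultiIndex that does not match df's simple index.
mismatched = df.groupby("key")["val"].expanding().sum()
print(mismatched.index.nlevels, df.index.nlevels)

# transform() returns a result with the same index as df,
# so it can be inserted as a new column without realignment.
aligned = df.groupby("key")["val"].transform("cumsum")
print(aligned.index.equals(df.index))

df["val_cumsum"] = aligned
print(df)
```

Comparing the two indexes before assigning, as done here with nlevels and equals(), is a quick way to confirm whether a mismatch of this kind is the cause.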
