You can access this via the .groups attribute on the groupby object. It returns a dict, and the keys of the dict give you the groups:
In [40]:
df = pd.DataFrame({'group':[0,1,1,1,2,2,3,3,3], 'val':np.arange(9)})
gp = df.groupby('group')
gp.groups.keys()
Out[40]:
dict_keys([0, 1, 2, 3])
Here is the output from groups:
In [41]:
gp.groups
Out[41]:
{0: Int64Index([0], dtype="int64"),
1: Int64Index([1, 2, 3], dtype="int64"),
2: Int64Index([4, 5], dtype="int64"),
3: Int64Index([6, 7, 8], dtype="int64")}
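As a minimal sketch of working with that mapping, using the same example frame: the variable names below (group_labels, rows_in_group_1, n_groups) are made up for illustration, and the exact index type shown in the output above may differ between pandas versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'group': [0, 1, 1, 1, 2, 2, 3, 3, 3], 'val': np.arange(9)})
gp = df.groupby('group')

# The keys of the .groups mapping are the group labels
group_labels = list(gp.groups)

# Each value holds the row labels that belong to that group
rows_in_group_1 = list(gp.groups[1])

# ngroups gives the number of distinct groups without building a list
n_groups = gp.ngroups

print(group_labels, rows_in_group_1, n_groups)
```

If you only need the count, ngroups avoids materialising the keys at all.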
Update
It looks like, because groups is a dict, the group order isn’t maintained when you call keys:
In [65]:
df = pd.DataFrame({'group':list('bgaaabxeb'), 'val':np.arange(9)})
gp = df.groupby('group')
gp.groups.keys()
Out[65]:
dict_keys(['b', 'e', 'g', 'a', 'x'])
If you display groups itself, you can see the order is maintained:
In [79]:
gp.groups
Out[79]:
{'a': Int64Index([2, 3, 4], dtype="int64"),
'b': Int64Index([0, 5, 8], dtype="int64"),
'e': Int64Index([7], dtype="int64"),
'g': Int64Index([1], dtype="int64"),
'x': Int64Index([6], dtype="int64")}
Here the key order is maintained. A hack to get the keys in that order is to access the .name attribute of each group:
In [78]:
gp.apply(lambda x: x.name)
Out[78]:
group
a    a
b    b
e    e
g    g
x    x
dtype: object
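Another way to get the labels in group order, sketched here under the assumption that the default sort=True is in effect, is to iterate over the groupby object, which yields (label, sub-frame) pairs. The name labels is just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'group': list('bgaaabxeb'), 'val': np.arange(9)})
gp = df.groupby('group')

# Iterating a groupby object yields (label, sub-frame) pairs;
# with the default sort=True the labels come out in sorted order
labels = [name for name, _ in gp]

print(labels)
```

This is still a Python-level loop rather than a vectorised operation, so it shares that drawback with the apply approach.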
This isn’t great, as it isn’t vectorised. However, if you already have an aggregated object, you can just get the index values:
In [81]:
agg = gp.sum()
agg
Out[81]:
       val
group
a        9
b       13
e        7
g        1
x        6
In [83]:
agg.index.get_level_values(0)
Out[83]:
Index(['a', 'b', 'e', 'g', 'x'], dtype="object", name="group")
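A short sketch of the same idea, assuming you want a plain Python list: the index of the aggregated frame can be converted with tolist. If you would rather have the labels in order of first appearance instead of sorted order, groupby accepts sort=False. The variable names are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'group': list('bgaaabxeb'), 'val': np.arange(9)})

# Default behaviour: group labels are sorted in the aggregated index
agg = df.groupby('group').sum()
sorted_labels = agg.index.tolist()

# sort=False keeps the labels in order of first appearance in the data
agg_unsorted = df.groupby('group', sort=False).sum()
first_seen_labels = agg_unsorted.index.tolist()

print(sorted_labels)
print(first_seen_labels)
```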