pandas: where is the documentation for TimeGrouper?

pd.TimeGrouper() was formally deprecated in pandas v0.21.0 in favor of pd.Grouper().
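
As a rough sketch of the migration, the old call can usually be swapped for pd.Grouper() with the same freq argument. The frame `ts` and the 7-day frequency below are illustrative assumptions, not part of the original question, and the old spelling appears only in a comment because it no longer exists in current pandas:

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): 40 daily observations with values 0..39.
ts = pd.DataFrame({"b": np.arange(40, dtype=float)},
                  index=pd.date_range("2010-01-01", periods=40))

# Old spelling, deprecated since v0.21.0 (kept as a comment only):
#     ts.groupby(pd.TimeGrouper(freq="7D")).sum()
# Replacement: pass the same freq to pd.Grouper.
weekly = ts.groupby(pd.Grouper(freq="7D")).sum()
print(weekly)
```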

The best use of pd.Grouper() is within groupby() when you’re also grouping on non-datetime columns. If you just need to group on a frequency, use resample().

For example, say you have:

>>> import pandas as pd
>>> import numpy as np
>>> np.random.seed(444)

>>> df = pd.DataFrame({'a': np.random.choice(['x', 'y'], size=50),
                       'b': np.random.rand(50)},
                      index=pd.date_range('2010', periods=50))
>>> df.head()
            a         b
2010-01-01  y  0.959568
2010-01-02  x  0.784837
2010-01-03  y  0.745148
2010-01-04  x  0.965686
2010-01-05  y  0.654552

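You could do the following. (On pandas 2.2 and later the month-end alias is spelled 'ME'; 'M' is deprecated there.)

>>> # `a` is dropped because it is non-numeric; on pandas 2.0+ pass
>>> # numeric_only=True to sum() to get the same behavior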
>>> df.groupby(pd.Grouper(freq='M')).sum()
                  b
2010-01-31  18.5123
2010-02-28   7.7670

But the above is a little unnecessary because you’re only grouping on the index. Instead you could do:

>>> df.resample('M').sum()
                  b
2010-01-31  18.5123
2010-02-28   7.7670

to produce the same result.
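
If you want to check the equivalence yourself, here is a minimal sketch. It rebuilds the same `df`, restricts it to the numeric column `b` so the result does not depend on how your pandas version treats the string column, and picks the month-end alias from the installed version (the version check is my own addition, assuming 'ME' replaced 'M' in pandas 2.2):

```python
import numpy as np
import pandas as pd

np.random.seed(444)
df = pd.DataFrame({'a': np.random.choice(['x', 'y'], size=50),
                   'b': np.random.rand(50)},
                  index=pd.date_range('2010', periods=50))

# Month-end alias: 'ME' on pandas 2.2+, 'M' on earlier releases.
major, minor = (int(p) for p in pd.__version__.split('.')[:2])
freq = 'ME' if (major, minor) >= (2, 2) else 'M'

# Same monthly sum, computed two ways on the numeric column only.
via_grouper = df[['b']].groupby(pd.Grouper(freq=freq)).sum()
via_resample = df[['b']].resample(freq).sum()

# Both approaches should give the same month-end labels and the same sums.
assert via_grouper.index.equals(via_resample.index)
assert np.allclose(via_grouper['b'], via_resample['b'])
```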

Conversely, here’s a case where Grouper() would be useful:

>>> df.groupby([pd.Grouper(freq='M'), 'a']).sum()
                   b
           a        
2010-01-31 x  8.9452
           y  9.5671
2010-02-28 x  4.2522
           y  3.5148
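
The same pattern should also work when the timestamps sit in a regular column instead of the index, by naming that column with Grouper's key argument. A small sketch under that assumption; the frame `events` and its column names are hypothetical, and 'MS' (month start) is used so the alias is the same across pandas versions:

```python
import pandas as pd

# Hypothetical frame: timestamps are in a column named 'when', not the index.
events = pd.DataFrame({
    'when': pd.date_range('2010-01-01', periods=6, freq='10D'),
    'a': ['x', 'y', 'x', 'y', 'x', 'y'],
    'b': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# key= names the datetime column; 'MS' labels each bin by its month start.
out = events.groupby([pd.Grouper(key='when', freq='MS'), 'a'])['b'].sum()
print(out)
```

Here `out` is a Series indexed by (month start, `a`), much like the two-level result above.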

For some more detail, take a look at Chapter 7 of Ted Petrou’s Pandas Cookbook.
