You have a couple of options here:
pd.infer_freq
pd.tseries.frequencies.to_offset
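To show roughly what each of these two does on its own, here is a minimal sketch. The sample dates are only illustrative (a Thursday, a Friday, and the following Monday), and the variable names are my own.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Illustrative dates: Thursday, Friday, then the following Monday.
idx = pd.to_datetime(['2003-01-02', '2003-01-03', '2003-01-06'])

# An index built from a list of dates carries no frequency.
print(idx.freq)

# infer_freq returns a frequency string, or None if it finds no regular pattern.
inferred = pd.infer_freq(idx)
print(inferred)

# to_offset converts a frequency string into a DateOffset object,
# which is the kind of object stored in idx.freq.
offset = to_offset(inferred)
print(type(offset).__name__)
```

Note that infer_freq needs at least three dates to work with; with fewer it raises a ValueError instead of returning None.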
"I suspect that errors down the road are caused by the missing freq."
You are absolutely right. Here's what I often use:
import pandas as pd


def add_freq(idx, freq=None):
    """Add a frequency attribute to idx, through inference or directly.

    Returns a copy.  If `freq` is None, it is inferred.
    """
    idx = idx.copy()
    if freq is None:
        if idx.freq is None:
            # No frequency given and none set: try to infer one.
            freq = pd.infer_freq(idx)
        else:
            # A frequency is already set; nothing to do.
            return idx
    idx.freq = pd.tseries.frequencies.to_offset(freq)
    if idx.freq is None:
        raise AttributeError('no discernible frequency found for `idx`.  Specify'
                             ' a frequency string with `freq`.')
    return idx
An example:
idx = pd.to_datetime(['2003-01-02', '2003-01-03', '2003-01-06'])  # freq=None
print(add_freq(idx)) # inferred
DatetimeIndex(['2003-01-02', '2003-01-03', '2003-01-06'], dtype="datetime64[ns]", freq='B')
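print(add_freq(idx, freq='D'))  # explicit
DatetimeIndex(['2003-01-02', '2003-01-03', '2003-01-06'], dtype="datetime64[ns]", freq='D')
Note that this last output comes from an older pandas. Recent versions validate the frequency when it is assigned and raise a ValueError here, because these dates skip a weekend and so do not conform to 'D'.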
Using asfreq will actually reindex (fill in) missing dates, so be careful if that's not what you're looking for.
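To make that caveat concrete, here is a small sketch on a made-up Series using the same three dates as above (the values are arbitrary). With a daily frequency, asfreq adds rows for the weekend and fills them with NaN; with a business-day frequency the dates already conform, so nothing is added.

```python
import pandas as pd

# Hypothetical data: three business days spanning a weekend.
idx = pd.to_datetime(['2003-01-02', '2003-01-03', '2003-01-06'])
s = pd.Series([1.0, 2.0, 3.0], index=idx)

# Daily frequency: Saturday and Sunday are inserted as new rows holding NaN.
daily = s.asfreq('D')
print(len(s), len(daily))
print(daily)

# Business-day frequency: the existing dates already fit, so no rows are added,
# but the resulting index now carries a frequency.
business = s.asfreq('B')
print(len(business), business.index.freq)
```

So asfreq is a reasonable way to attach a frequency only when you are fine with new rows appearing, or when you know the index already conforms to the frequency you pass.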
The primary function for changing frequencies is the asfreq function. For a DatetimeIndex, this is basically just a thin, but convenient wrapper around reindex which generates a date_range and calls reindex.
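As a rough illustration of that description, the sketch below builds the date_range by hand and calls reindex, then compares the result with asfreq. It only approximates what asfreq does in the simple case (no method, fill_value, or normalize arguments), and the Series is made up for the example.

```python
import pandas as pd

# Hypothetical data: three business days spanning a weekend.
idx = pd.to_datetime(['2003-01-02', '2003-01-03', '2003-01-06'])
s = pd.Series([1.0, 2.0, 3.0], index=idx)

# The convenient way.
via_asfreq = s.asfreq('D')

# The manual way: generate the full range of dates, then reindex onto it.
full_range = pd.date_range(start=s.index.min(), end=s.index.max(), freq='D')
via_reindex = s.reindex(full_range)

# Both results cover the same dates and have NaN in the same places.
print(list(via_asfreq.index) == list(via_reindex.index))
print(via_asfreq.isna().tolist() == via_reindex.isna().tolist())
```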