You have a couple options here:
- pd.infer_freq
- pd.tseries.frequencies.to_offset
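As a rough sketch of what each of the two gives you (the sample dates are just illustrative, not from the question):

```python
import pandas as pd

# Three consecutive business days (Thu, Fri, Mon); illustrative data only.
idx = pd.to_datetime(['2003-01-02', '2003-01-03', '2003-01-06'])

# An index built by to_datetime carries no freq.
print(idx.freq)

# infer_freq returns a frequency string, or None if no regular pattern fits.
inferred = pd.infer_freq(idx)

# to_offset turns a frequency string into a DateOffset object.
offset = pd.tseries.frequencies.to_offset(inferred)
print(inferred, offset)
```

So infer_freq is the "guess it from the data" route, and to_offset is the "I already know the string" route.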
I suspect that errors down the road are caused by the missing freq.
You are absolutely right. Here’s what I use often:
import pandas as pd


def add_freq(idx, freq=None):
    """Add a frequency attribute to idx, through inference or directly.

    Returns a copy.  If `freq` is None, it is inferred.
    """
    idx = idx.copy()
    if freq is None:
        if idx.freq is None:
            # No freq given and none set: try to infer one from the values.
            freq = pd.infer_freq(idx)
        else:
            # Already has a freq; nothing to do.
            return idx
    idx.freq = pd.tseries.frequencies.to_offset(freq)
    if idx.freq is None:
        raise AttributeError('no discernible frequency found to `idx`.  Specify'
                             ' a frequency string with `freq`.')
    return idx
An example:
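idx = pd.to_datetime(['2003-01-02', '2003-01-03', '2003-01-06'])  # freq=None
print(add_freq(idx))  # inferred
DatetimeIndex(['2003-01-02', '2003-01-03', '2003-01-06'], dtype='datetime64[ns]', freq='B')
print(add_freq(idx, freq='D'))  # explicit
DatetimeIndex(['2003-01-02', '2003-01-03', '2003-01-06'], dtype='datetime64[ns]', freq='D')
Note that recent pandas versions validate an assigned freq against the index values, so the explicit 'D' call raises a ValueError there (these dates skip a weekend); the 'D' output shown is presumably from an older release.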
Using asfreq will actually reindex (fill) missing dates, so be careful of that if that’s not what you’re looking for.
The primary function for changing frequencies is the asfreq function. For a DatetimeIndex, this is basically just a thin, but convenient wrapper around reindex which generates a date_range and calls reindex.
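To make the reindexing behaviour concrete, here is a minimal sketch. The series values are made up for illustration, and the manual reindex is my reading of the description above rather than pandas' actual internals:

```python
import pandas as pd

# Illustrative data: Thu, Fri, Mon (the weekend is missing).
idx = pd.to_datetime(['2003-01-02', '2003-01-03', '2003-01-06'])
s = pd.Series([1.0, 2.0, 3.0], index=idx)

# Daily frequency: the two weekend dates are added as NaN rows.
daily = s.asfreq('D')

# Roughly the same thing by hand: build a date_range and reindex onto it.
manual = s.reindex(pd.date_range(s.index.min(), s.index.max(), freq='D'))

# Business-day frequency already matches the data, so no rows are added;
# the result just carries a freq on its index.
business = s.asfreq('B')

print(daily)
print(business.index.freq)
```

If you only want the freq attribute set and no new rows, pick a frequency the existing dates already conform to (here 'B'), or set it on the index directly as in the add_freq helper above.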