How to convert string to datetime with nulls – python, pandas?

Just use to_datetime and set errors="coerce" to handle duff data:

In [321]:

df['Date'] = pd.to_datetime(df['Date'], errors="coerce")
df
Out[321]:
                 Date
0 2014-10-20 10:44:31
1 2014-10-23 09:33:46
2                 NaT
3 2014-10-01 09:38:45

In [322]:

df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 4 entries, 0 to 3
Data columns (total 1 columns):
Date    3 non-null datetime64[ns]
dtypes: datetime64[ns](1)
memory usage: 64.0 bytes

the problem with calling strptime is that it will raise an error if the string, or dtype is incorrect.

If you did this then it would work:

In [324]:

def func(x):
    try:
        return dt.datetime.strptime(x, '%Y-%m-%d %H:%M:%S')
    except:
        return pd.NaT

df['Date'].apply(func)
Out[324]:
0   2014-10-20 10:44:31
1   2014-10-23 09:33:46
2                   NaT
3   2014-10-01 09:38:45
Name: Date, dtype: datetime64[ns]

but it will be faster to use the inbuilt to_datetime rather than call apply which essentially just loops over your series.

timings

In [326]:

%timeit pd.to_datetime(df['Date'], errors="coerce")
%timeit df['Date'].apply(func)
10000 loops, best of 3: 65.8 µs per loop
10000 loops, best of 3: 186 µs per loop

We see here that using to_datetime is 3X faster.

Leave a Comment

Hata!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)