How to set dtypes by column in pandas DataFrame

I just ran into this, and the pandas issue is still open, so I’m posting my workaround. Assuming df is my DataFrame and dtype is a dict mapping column names to types:

for k, v in dtype.items():
    df[k] = df[k].astype(v)

(Note: in Python 2, use dtype.iteritems() instead.)
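As a minimal sketch of the loop above on concrete data: the sample DataFrame, its column names, and the chosen target types are all made up for illustration. The last line uses the dict form of DataFrame.astype, which, as far as I know, newer pandas versions (0.19 and later) accept and which converts the same columns in one call.

```python
import pandas as pd

# Hypothetical sample data: every column starts out as strings.
df = pd.DataFrame({
    "id": ["1", "2", "3"],
    "price": ["1.5", "2.0", "3.25"],
    "kind": ["a", "b", "a"],
})

# Hypothetical dict mapping column names to target types.
dtype = {"id": "int64", "price": "float64", "kind": "category"}

# The workaround: convert one column at a time (on a copy, to keep df intact).
converted = df.copy()
for k, v in dtype.items():
    converted[k] = converted[k].astype(v)

# Alternative: pass the whole dict to DataFrame.astype, which returns a new frame.
converted_at_once = df.astype(dtype)

print(converted.dtypes)
```

Both approaches leave columns that are not listed in the dict unchanged.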

For reference:

  • The list of allowed data types (NumPy dtypes): https://docs.scipy.org/doc/numpy-1.12.0/reference/arrays.dtypes.html
  • Pandas also supports some other types. E.g., category: http://pandas.pydata.org/pandas-docs/stable/categorical.html
  • The relevant GitHub issue: https://github.com/pandas-dev/pandas/issues/9287
