How to convert currency column with $ and , to numbers

  • pandas has three .replace methods:
    1. pandas.Series.replace for a column
    2. pandas.Series.str.replace for a column
    3. pandas.DataFrame.replace for multiple columns, and removes the need to use .apply
    • regex=False is the default setting, so set regex=True
  • df[df.columns[1:]] selects the last three columns.
  • Tested in python 3.11.4, pandas 2.1.0
# replace values only in selected columns
df[df.columns[1:]] = df[df.columns[1:]].replace('[\$,]', '', regex=True).astype(float)

# replace values in all columns
df = df.replace('[\$,]', '', regex=True).astype(float)
  • Additional patterns for cleaning monetary columns:
    • '[^.0-9]': remove all non-digits except decimals points
    • '[^.0-9\-]': remove all non-digits except decimals points, and negative signs
    • '\D': remove all non-digits, including the decimal point, and negative sign, so it’s good for positive int only columns

Leave a Comment

tech