How to remove accents from values in columns?

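The pandas way is to chain the vectorised str.normalize, str.encode and str.decode methods. NFKD normalization splits each accented letter into a base letter plus a combining mark, encoding to ASCII with errors="ignore" drops the marks, and decoding turns the bytes back into strings: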

In [60]:
df['Country'].str.normalize('NFKD').str.encode('ascii', errors="ignore").str.decode('utf-8')

Out[60]:
0    Aland Islands
1    Aland Islands
2          Albania
3          Albania
4          Albania
Name: Country, dtype: object
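
To see what each step of the chain does, here is a minimal sketch on a small hand-made Series. The sample values and the intermediate variable names (decomposed, as_bytes, result) are illustrative assumptions, not the data from the question. It also shows a limitation: a character with no Unicode decomposition, such as "ø", is dropped entirely rather than mapped to "o".

```python
import pandas as pd

# Hypothetical sample values, chosen only for illustration
s = pd.Series(["Åland Islands", "Durrës", "Tromsø"])

# Step 1: NFKD splits 'Å' into 'A' + a combining ring, 'ë' into 'e' + a diaeresis
decomposed = s.str.normalize("NFKD")

# Step 2: encode to ASCII bytes, silently dropping anything non-ASCII
as_bytes = decomposed.str.encode("ascii", errors="ignore")

# Step 3: decode the bytes back into ordinary strings
result = as_bytes.str.decode("utf-8")

print(result.tolist())  # ['Aland Islands', 'Durres', 'Troms']
```

If losing characters like "ø" or "ß" matters, they would need an explicit replacement (for example with str.replace) before this chain runs.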

To do this for every string (object dtype) column:

In [64]:
cols = df.select_dtypes(include=['object']).columns
df[cols] = df[cols].apply(lambda x: x.str.normalize('NFKD').str.encode('ascii', errors="ignore").str.decode('utf-8'))
df

Out[64]:
   Table Code        Country    Year       City      Value
0         240  Aland Islands  2014.0  MARIEHAMN  11437.0 1
1         240  Aland Islands  2010.0  MARIEHAMN  5829.5  1
2         240        Albania  2011.0     Durres   113249.0
3         240        Albania  2011.0     TIRANA   418495.0
4         240        Albania  2011.0     Durres    56511.0
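
The same idea can be wrapped in a small helper so the chain is written once. This is a sketch under stated assumptions: the function name strip_accents and the sample frame are hypothetical, and the text columns are assumed to have object dtype (they are created that way explicitly here). Missing values pass through as missing, and numeric columns are left untouched.

```python
import pandas as pd

def strip_accents(series):
    """NFKD-normalize, drop non-ASCII bytes, then decode back to str."""
    return (
        series.str.normalize("NFKD")
        .str.encode("ascii", errors="ignore")
        .str.decode("utf-8")
    )

# Hypothetical sample frame; text columns are forced to object dtype
df = pd.DataFrame({
    "Country": pd.Series(["Åland Islands", "Albania", None], dtype=object),
    "City": pd.Series(["MARIEHAMN", "Durrës", "TIRANA"], dtype=object),
    "Year": [2014.0, 2011.0, 2011.0],
})

# Select only the object-dtype columns and clean each one
cols = df.select_dtypes(include=["object"]).columns
df[cols] = df[cols].apply(strip_accents)
print(df)
```

One caveat: an object column that mixes strings with other Python objects (numbers, for instance) would have its non-string entries turned into missing values by the .str methods, so it is worth checking such columns before applying this.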
