Truncating a Unicode string so it fits a maximum size when encoded for wire transfer

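def unicode_truncate(s, length, encoding='utf-8'):
    # Encode first, then cut at the byte limit (not the character count).
    encoded = s.encode(encoding)[:length]
    # 'ignore' drops a trailing character that the cut split in half.
    return encoded.decode(encoding, 'ignore')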

Here is an example with a Unicode string in which each character takes 2 bytes in UTF-8. A 5-byte limit cuts the third character in half, so decoding would raise a UnicodeDecodeError if the incomplete trailing sequence were not ignored:

>>> unicode_truncate(u'абвгд', 5)
u'\u0430\u0431'
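
The session above is Python 2. As a rough sketch, assuming Python 3 (where every str is already Unicode and the u prefix is optional), the same function can be checked like this. The emoji input is an illustrative assumption and does not come from the original example:

```python
def unicode_truncate(s, length, encoding='utf-8'):
    # Encode first, then cut at the byte limit (not the character count).
    encoded = s.encode(encoding)[:length]
    # 'ignore' drops a trailing character that the cut split in half.
    return encoded.decode(encoding, 'ignore')


truncated = unicode_truncate('абвгд', 5)
print(truncated)                        # аб
print(len(truncated.encode('utf-8')))   # 4, within the 5-byte limit

# A 4-byte character (hypothetical input) is dropped whole when it does not fit.
print(unicode_truncate('a😀', 3))       # a
```

Because the bytes come from encoding a valid string, the only invalid sequence the decoder can meet is the one created by the cut at the end, so 'ignore' should not discard anything else.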
