>>> c="\xe7\x9a\x84".decode('utf8')
>>> c
u'\u7684'
>>> print c
的
Though the code point U+7684 fits in 16 bits (a single UTF-16 code unit), UTF-8 encodes it as 3 bytes.
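To make the size difference concrete, here is a minimal sketch that encodes the same character both ways and rebuilds the three UTF-8 bytes by hand. The variable names (`raw`, `cp`, `manual`) are only illustrative, and the `b"..."` literal is used so the snippet should run on Python 2.6+ as well as Python 3.

```python
raw = b"\xe7\x9a\x84"         # the three UTF-8 bytes from the session above
c = raw.decode("utf-8")       # one character, code point U+7684
cp = ord(c)

utf8_bytes = c.encode("utf-8")
utf16_bytes = c.encode("utf-16-be")   # big-endian, no byte order mark

# Three-byte UTF-8 layout: 1110xxxx 10xxxxxx 10xxxxxx,
# carrying the 16 bits of the code point as 4 + 6 + 6.
manual = bytearray([
    0xE0 | (cp >> 12),            # top 4 bits
    0x80 | ((cp >> 6) & 0x3F),    # middle 6 bits
    0x80 | (cp & 0x3F),           # low 6 bits
])

print(hex(cp))            # 0x7684
print(len(utf16_bytes))   # 2
print(len(utf8_bytes))    # 3
print(bytes(manual) == raw)   # True
```

The extra byte in UTF-8 comes from the marker bits (`1110`, `10`, `10`) that prefix each byte, which leave only 16 payload bits across three bytes.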