How to set character_set_database and collation_database to utf8 in my.ini?

This isn’t actually a setting in my.cnf (or my.ini in this case). MySQL takes this value from the database’s own collation, which was set when the database was created. In order to bring it in line with the utf8 encoding you want, run: ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_general_ci; then restart MySQL (can’t remember … Read more

utf8 garbled when importing into mysql

I think it might have something to do with collation as well, but I’m not sure. In my case it certainly did, since I had to support Cyrillic. Try this, it worked for me: (1) set the initial collation to utf8_unicode_ci when creating the target database; (2) add SET NAMES 'utf8' COLLATE 'utf8_unicode_ci'; to the top of your sql … Read more

How to correctly parse UTF-8 encoded HTML to Unicode strings with BeautifulSoup? [duplicate]

As justhalf points out above, my question here is essentially a duplicate of this question. The HTML content reported itself as UTF-8 encoded and, for the most part, it was, except for one or two rogue invalid UTF-8 byte sequences. This apparently confuses BeautifulSoup about which encoding is in use, and when trying to first decode … Read more
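One way to handle mostly-valid UTF-8 input can be sketched in Python as follows. This is only an illustration under assumptions, not necessarily what the linked answer did: the sample bytes and the helper name decode_lenient are made up for the example. The idea is to decode the bytes yourself, replacing invalid sequences with U+FFFD, so the parser receives a Unicode string.

```python
# Hypothetical sample: valid UTF-8 HTML with one stray invalid byte (0xFF).
raw_html = "<p>caf\u00e9</p>".encode("utf-8") + b"\xff" + b"<p>ok</p>"


def decode_lenient(data: bytes) -> str:
    """Decode as UTF-8, replacing invalid byte sequences with U+FFFD."""
    return data.decode("utf-8", errors="replace")


# A strict decode fails on the rogue byte.
try:
    raw_html.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

text = decode_lenient(raw_html)
print(strict_ok)
print(text)
```

The resulting str can then be passed to BeautifulSoup (for example BeautifulSoup(text, "html.parser")); since the input is already Unicode, the parser should have no encoding left to guess.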

Dangers of sys.setdefaultencoding('utf-8')

The original poster asked for code which demonstrates that the switch is harmful, except that it “hides” bugs unrelated to the switch. Updates: [2020-11-01]: pip install setdefaultencoding eliminates the need to reload(sys) (from Thomas Grainger). [2019]: Personal experience with Python 3: no Unicode en/decoding problems. Reasons: got used to writing .encode('utf-8') .decode('utf-8') a (felt) 100 times a … Read more
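The explicit .encode('utf-8') / .decode('utf-8') habit mentioned above looks roughly like this in Python 3. This is a minimal sketch with a made-up sample string; it relies on the fact that Python 3 keeps str and bytes separate and refuses to mix them implicitly.

```python
text = "Привет"                    # str: a sequence of Unicode code points
data = text.encode("utf-8")        # bytes: the UTF-8 encoded form
roundtrip = data.decode("utf-8")   # back to str

# Python 3 does not convert between str and bytes implicitly.
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(len(text), len(data), roundtrip == text, mixed_ok)
```

Each Cyrillic letter here takes two bytes in UTF-8, so the byte string is twice as long as the text.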

Convert between std::u8string and std::string

UTF-8 “support” in C++20 seems to be a bad joke. The only UTF functionality in the Standard Library is support for strings and string_views (std::u8string, std::u8string_view, std::u16string, …). That is all. There is no Standard Library support for UTF coding in regular expressions, formatting, file I/O and so on. In C++17 you can, at least, easily treat … Read more

Why is the return value of String.addingPercentEncoding() optional?

I filed a bug report with Apple about this, and heard back — with a very helpful response, no less! Turns out (much to my surprise) that it’s possible to successfully create Swift strings that contain invalid Unicode in the form of unpaired UTF-16 surrogate chars. Such a string can cause UTF-8 encoding to fail. … Read more

What is ?

The characters you are reading on your screen now each have a numerical value. In the ASCII format, for example, the letter ‘A’ is 65, ‘B’ is 66, and so on. If you look at a table of characters available in ASCII you will see that it isn’t much use for someone who wishes to … Read more
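The numerical values mentioned above can be inspected directly. The following is a small Python sketch; the helper is_ascii is a name invented for this example. It uses the fact that ASCII only defines code points 0 to 127, so any character outside that range cannot be encoded as ASCII.

```python
print(ord("A"))   # numeric value of the letter A
print(chr(66))    # character whose numeric value is 66


def is_ascii(s: str) -> bool:
    """Return True if every character in s fits in ASCII (0 to 127)."""
    try:
        s.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(is_ascii("hello"), is_ascii("\u00e9"))
```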
