utf-8 – Page 8 – Tarik Billa

Convert escaped Unicode character back to actual character

September 18, 2023 by Tarik

Try str = org.apache.commons.lang3.StringEscapeUtils.unescapeJava(str); from Apache Commons Lang.
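For comparison, the same idea can be sketched in Python. This is not the Apache Commons implementation: the function name unescape_unicode is hypothetical, and the sketch only handles literal \uXXXX sequences (no \n, \t, octal escapes, or surrogate-pair merging, which unescapeJava deals with on its own terms).

```python
import re

# Matches a literal backslash, the letter u, and exactly four hex digits.
_UNICODE_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def unescape_unicode(text: str) -> str:
    # Replace every matched escape with the character whose code point
    # is given by the four hex digits. Everything else is left untouched.
    return _UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


escaped = r"caf\u00e9"            # nine characters, including the backslash
print(unescape_unicode(escaped))  # prints: café
```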

How does UTF-8 encoding identify single byte and double byte characters?

September 16, 2023 by Tarik

The question says: “Aݔ” is stored as “410754”. That’s not how UTF-8 works. Characters U+0000 through U+007F (i.e. ASCII) are stored as single bytes. They are the only characters whose code points numerically match their UTF-8 representation. For example, U+0041 becomes 0x41, which is 01000001 in binary. All other characters are represented with multiple bytes. U+0080 through …
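A small Python sketch can make this concrete. It assumes the second character in the question is U+0754 (as the “0754” in the question suggests), and the helper name utf8_sequence_length is made up for illustration. The length of each sequence is read from the high bits of its first byte, which is how a decoder tells single-byte from multi-byte characters.

```python
def utf8_sequence_length(lead_byte: int) -> int:
    # The high bits of the first byte announce how long the sequence is.
    if lead_byte < 0x80:            # 0xxxxxxx: ASCII, one byte
        return 1
    if lead_byte >> 5 == 0b110:     # 110xxxxx: two bytes
        return 2
    if lead_byte >> 4 == 0b1110:    # 1110xxxx: three bytes
        return 3
    if lead_byte >> 3 == 0b11110:   # 11110xxx: four bytes
        return 4
    # 10xxxxxx is a continuation byte and can never start a sequence.
    raise ValueError(f"not a valid lead byte: {lead_byte:#04x}")


encoded = "A\u0754".encode("utf-8")
print(encoded.hex())  # prints: 41dd94 (not 410754)

# Walk the byte string, jumping from one lead byte to the next.
lengths = []
i = 0
while i < len(encoded):
    n = utf8_sequence_length(encoded[i])
    lengths.append(n)
    i += n
print(lengths)  # prints: [1, 2]
```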

DomDocument and special characters

September 15, 2023 by Tarik

strtolower() for unicode/multibyte strings

September 15, 2023 by Tarik

File was loaded in the wrong encoding: 'UTF-8' in IntelliJ IDEA

September 14, 2023 by Tarik

As Tarik points out, click “Reload in another encoding”, and if you want UTF-8 encoding, then click “more” -> UTF-8.

What is a unicode string? [closed]

September 11, 2023 by Tarik

Update: Python 3. In Python 3, Unicode strings are the default. The type str is a sequence of Unicode code points, and the type bytes is used for representing sequences of 8-bit integers (often interpreted as ASCII characters). Here is the code from the question, updated for Python 3: >>> my_str = "A unicode \u018e string \xf1" …
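The distinction between str and bytes can be illustrated with a short sketch. It reuses the string from the excerpt but is not the rest of the original answer's code. The choice of UTF-8 as the encoding is an assumption made for the example.

```python
text = "A unicode \u018e string \xf1"   # str: a sequence of code points
data = text.encode("utf-8")             # bytes: a sequence of 8-bit integers

print(type(text).__name__, len(text))   # prints: str 20
print(type(data).__name__, len(data))   # prints: bytes 22

# U+018E and U+00F1 each take two bytes in UTF-8, hence 20 characters
# but 22 bytes. Decoding with the same encoding restores the original.
round_tripped = data.decode("utf-8")
print(round_tripped == text)            # prints: True
```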

What encoding does std::string.c_str() use?

September 11, 2023 by Tarik

std::string per se uses no encoding: it will return the bytes you put in it. For example, those bytes might be using ISO-8859-1 encoding, or any other, really. The information about the encoding is just not there; you have to know where the bytes came from!
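The same point holds for any raw byte container, so it can be shown with Python's bytes type instead of std::string. This is only an analogy, not C++ behaviour as such. The two bytes below are an arbitrary choice for the example.

```python
raw = b"\xc3\xa9"   # two bytes, with no encoding attached to them

# The very same bytes give different text depending on the encoding
# the reader assumes. Nothing in the bytes says which one is right.
as_utf8 = raw.decode("utf-8")         # one character: 'é'
as_latin1 = raw.decode("iso-8859-1")  # two characters: 'Ã©'

print(len(as_utf8), len(as_latin1))   # prints: 1 2
```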

MySQL throws Incorrect string value error

September 8, 2023 by Tarik

It’s the character at the end of the tweet that’s causing the problem. It looks like an “emoji” character (a.k.a. a Japanese smiley face), but it’s not displaying for me in either Chrome or Safari. There are known issues storing 4-byte UTF-8 characters in some versions of MySQL. Apparently you must use utf8mb4 to represent 4 …
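To check whether a given string contains such characters before it reaches the database, something like the following Python sketch might help. The function name needs_four_byte_utf8 and the sample tweet text are hypothetical. The check relies only on the fact that code points above U+FFFF take four bytes in UTF-8.

```python
def needs_four_byte_utf8(text: str) -> bool:
    # Code points above U+FFFF (outside the Basic Multilingual Plane)
    # are exactly the ones that UTF-8 encodes with four bytes.
    return any(ord(ch) > 0xFFFF for ch in text)


tweet = "nice \U0001F600"   # ends with U+1F600, an emoji
print(needs_four_byte_utf8(tweet))         # prints: True
print(needs_four_byte_utf8("caf\u00e9"))   # prints: False

emoji_bytes = "\U0001F600".encode("utf-8")
print(len(emoji_bytes), emoji_bytes.hex())  # prints: 4 f09f9880
```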

URL Decoding in PHP

September 6, 2023 by Tarik

How to avoid echoing character 65279 in php?

September 5, 2023 by Tarik