UTF-8 Continuation bytes

A continuation byte in UTF-8 is any byte whose top two bits are 10. Continuation bytes are the second and subsequent bytes of a multi-byte sequence. The following table may help:

    Unicode code points   Encoding   Binary value
    -------------------   --------   ------------
    U+000000-U+00007f     0xxxxxxx   0xxxxxxx
    U+000080-U+0007ff     110yyyxx   00000yyy xxxxxxxx
                          10xxxxxx
    U+000800-U+00ffff     1110yyyy   yyyyyyyy xxxxxxxx
                          10yyyyxx
                          10xxxxxx
    U+010000-U+10ffff     11110zzz   000zzzzz yyyyyyyy …
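The "top two bits are 10" test can be sketched in JavaScript (Node.js) as a single mask-and-compare. The function names `isContinuationByte` and `countCodePoints` are illustrative only, not part of any library, and the counting helper assumes the input is well-formed UTF-8.

```javascript
// A byte is a continuation byte when its top two bits are 10,
// i.e. (byte & 0b11000000) === 0b10000000.
function isContinuationByte(byte) {
  return (byte & 0xc0) === 0x80;
}

// Assumption: buf holds well-formed UTF-8. Every code point then has exactly
// one non-continuation (lead) byte, so counting those counts code points.
function countCodePoints(buf) {
  let count = 0;
  for (const byte of buf) {
    if (!isContinuationByte(byte)) count++;
  }
  return count;
}

// U+00E9 encodes as c3 a9, U+20AC encodes as e2 82 ac.
const bytes = Buffer.from("\u00e9\u20ac", "utf8");
console.log([...bytes].map(isContinuationByte)); // [ false, true, false, true, true ]
console.log(countCodePoints(bytes)); // 2
```

The same mask is what lets a decoder resynchronise after landing in the middle of a sequence: it can skip bytes until `isContinuationByte` returns false.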

Remove diacritics using Go

You can use the libraries described in Text normalization in Go. Here’s an application of those libraries:

    // Example derived from: http://blog.golang.org/normalization
    package main

    import (
        "fmt"
        "unicode"

        "golang.org/x/text/transform"
        "golang.org/x/text/unicode/norm"
    )

    func isMn(r rune) bool {
        return unicode.Is(unicode.Mn, r) // Mn: nonspacing marks
    }

    func main() {
        t := transform.Chain(norm.NFD, transform.RemoveFunc(isMn), norm.NFC)
        result, _, _ …

How to get UTF-8 in Node.js?

Hook into your response generator or create a middleware that does the following:

    res.setHeader("Content-Type", "application/json; charset=utf-8");

Otherwise the browser displays the content in its favorite encoding. If this doesn’t help, your DB is probably in the wrong encoding. For older node.js versions use:

    res.header("Content-Type", "application/json; charset=utf-8");
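A minimal sketch of such a middleware, assuming the Connect/Express-style `(req, res, next)` signature. The name `jsonUtf8` and the stand-in response object are hypothetical and exist only for illustration; `res.setHeader` is the standard Node.js `http.ServerResponse` method.

```javascript
// Connect/Express-style middleware (assumed signature: req, res, next).
// It declares the charset explicitly so the client does not have to guess.
function jsonUtf8(req, res, next) {
  res.setHeader("Content-Type", "application/json; charset=utf-8");
  next();
}

// Minimal stand-in for a response object, used only to demonstrate the call.
const headers = {};
const fakeRes = {
  setHeader: (name, value) => {
    headers[name] = value;
  },
};
jsonUtf8({}, fakeRes, () => {});
console.log(headers["Content-Type"]); // application/json; charset=utf-8
```

In an Express application this would typically be registered with `app.use(jsonUtf8)` before the routes that send JSON.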

Django dumpdata UTF-8 (Unicode)

After struggling with similar issues, I’ve just found that the XML serializer handles UTF-8 properly:

    manage.py dumpdata --format=xml > output.xml

I had to transfer data from Django 0.96 to Django 1.3. After numerous tries with dump/load data, I finally succeeded using XML. No side effects so far. Hope this will help someone, as I’ve landed at …
