PHP DOMDocument loadHTML not encoding UTF-8 correctly
DOMDocument::loadHTML treats your string as ISO-8859-1 (the HTTP/1.1 default character set) unless you tell it otherwise, so UTF-8 strings are interpreted incorrectly. If your string doesn't contain an XML encoding declaration, you can prepend one to cause the string to be treated as UTF-8:

    $profile = "<p>イリノイ州シカゴにて、アイルランド系の家庭に、9</p>";
    $dom = new DOMDocument();
    $dom->loadHTML('<?xml …
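The snippet above is cut off in the middle of the loadHTML call. As a minimal sketch of the prepend technique it describes, the example below assumes the declaration is `<?xml encoding="UTF-8">`, a commonly used form; the original text is truncated before showing the exact string, so treat that part as an assumption. The variable $text is introduced here for illustration only.

```php
<?php
// Sample UTF-8 input taken from the text above.
$profile = "<p>イリノイ州シカゴにて、アイルランド系の家庭に、9</p>";

$dom = new DOMDocument();

// Assumption: the prepended declaration is '<?xml encoding="UTF-8">'.
// It hints to the underlying parser that the bytes that follow are UTF-8
// rather than ISO-8859-1.
$dom->loadHTML('<?xml encoding="UTF-8">' . $profile);

// Read the paragraph back; DOM string properties are returned as UTF-8.
$text = $dom->getElementsByTagName('p')->item(0)->textContent;

echo $text, PHP_EOL;
```

If the text printed matches the input, the string was parsed as UTF-8. Reading textContent, rather than serializing the whole document, keeps the check independent of how saveHTML chooses to escape non-ASCII characters.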