What’s the difference between an “encoding,” a “character set,” and a “code page”?
A ‘character set’ is just what it says: a properly-specified list of distinct characters. An ‘encoding’ is a mapping between a character set (typically Unicode today) and a (usually byte-based) technical representation of the characters. UTF-8 is an encoding, but not a character set. It is an encoding of the Unicode character set(*). The confusion … Read more