UTF-8 encoding table and Unicode characters
UTF-8 encoding table and Unicode characters by Tomas Schild. This site is intended as a reference for the UTF-8 encoding of Unicode characters. The international standard ISO 10646 defines the Universal Character Set (UCS). UCS contains the characters required to represent practically all known languages. In the late 1980s, there were two independent attempts to create a single unified character set. One was the ISO 10646 project of the International Organization for Standardization (ISO); the other was the Unicode Project, organized by a consortium of (initially mostly US) manufacturers of multilingual software. Fortunately, the participants of both projects realized around 1991 that two different unified character sets are not exactly what the world needs. They joined their efforts and worked together on creating a single code table. UCS and Unicode are, first of all, just code tables that assign integer numbers to characters. There are several alternatives for how a sequence of such characters, or their respective integer values, can be represented as a sequence of bytes.
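As a rough illustration of the last two points, the following Python sketch shows the integer values (code points) assigned to three characters and how those same values come out as different byte sequences under UTF-8, UTF-16 and UTF-32. The sample string and variable names are chosen here for illustration and are not taken from the referenced site.

```python
# Illustrative sketch: one string, its code points, and three byte encodings.
# The sample characters are an arbitrary choice: "A", "e with acute", euro sign.
text = "A\u00e9\u20ac"

# The code table assigns each character an integer (its code point).
code_points = [ord(ch) for ch in text]

# The same code points serialized as bytes under three different encodings.
utf8 = text.encode("utf-8")       # 1 to 4 bytes per character
utf16 = text.encode("utf-16-be")  # big-endian, no byte order mark
utf32 = text.encode("utf-32-be")  # big-endian, no byte order mark

print([f"U+{cp:04X}" for cp in code_points])  # ['U+0041', 'U+00E9', 'U+20AC']
print(utf8.hex(" "))   # 41 c3 a9 e2 82 ac
print(utf16.hex(" "))  # 00 41 00 e9 20 ac
print(utf32.hex(" "))  # 00 00 00 41 00 00 00 e9 00 00 20 ac
```

Note that UTF-8 uses one, two and three bytes respectively for these three characters, while UTF-32 always uses four.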
Disclaimer: this link points to content provided by other sites.