Unicode and Character Sets
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!), by Joel Spolsky.

Did you ever get an email from your friends in Bulgaria with the subject line "???? ?????? ??? ????"? Almost every stupid "my website looks like gibberish" or "she can't read my emails when I use accents" problem comes down to one naive programmer who didn't understand the mysterious world of character sets, encodings, Unicode, all that stuff. In this article I'll fill you in on exactly what every working programmer should know.

Please remember one extremely important fact. It does not make sense to have a string without knowing what encoding it uses. You can no longer stick your head in the sand and pretend that "plain" text is ASCII. There Ain't No Such Thing As Plain Text.

If you have a string, in memory, in a file, or in an email message, you have to know what encoding it is in: UTF-8 or ASCII or ISO 8859-1 (Latin 1) or Windows 1252 (Western European), or you cannot interpret it or display it to users correctly. There are over a hundred encodings, and above code point 127 all bets are off.
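To make the point about encodings concrete, here is a minimal Python sketch. It is not taken from the article; the sample strings and the helper name `decode_as` are assumptions chosen for illustration. It decodes the same bytes under several of the encodings named above, then shows one way a Cyrillic subject line can collapse into question marks.

```python
# Illustrative sketch only: the sample text and the helper name are
# hypothetical, not taken from the article.

def decode_as(data: bytes, encoding: str) -> str:
    """Decode bytes under a given encoding, substituting U+FFFD for bad bytes."""
    return data.decode(encoding, errors="replace")

# The same four characters, stored as UTF-8: the "é" takes two bytes (C3 A9).
raw = "café".encode("utf-8")

as_utf8 = decode_as(raw, "utf-8")         # correct guess about the encoding
as_latin1 = decode_as(raw, "iso-8859-1")  # wrong guess: each byte becomes one character
as_cp1252 = decode_as(raw, "cp1252")      # wrong guess: same result for these two bytes
as_ascii = decode_as(raw, "ascii")        # bytes above 127 are not valid ASCII at all

print(repr(as_utf8))
print(repr(as_latin1))
print(repr(as_cp1252))
print(repr(as_ascii))

# One way a Cyrillic subject line turns into "????": some step forces the
# text into an encoding that cannot represent it and substitutes "?".
subject = "Привет"
mangled = subject.encode("ascii", errors="replace").decode("ascii")
print(mangled)
```

The bytes never change in the first half of the sketch; only the assumed encoding does, and only the assumption that matches how the bytes were produced gives back the original text.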
Disclaimer: this link points to content provided by other sites.