Last updated 2004-06-28 by Roedy Green ©1996-2004 Canadian Mind Products
| Unicode character (binary) | UTF-8 encoding | Bytes required to represent the character |
|---|---|---|
| 00000000 0xxxxxxx | 0xxxxxxx | 1 |
| 00000yyy yyxxxxxx | 110yyyyy 10xxxxxx | 2 |
| zzzzyyyy yyxxxxxx | 1110zzzz 10yyyyyy 10xxxxxx | 3 |
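
The bit patterns in the table can be turned into two small lookups. What follows is only a sketch: the function names `utf8_encoded_length` and `utf8_sequence_length` are made up for illustration, and the four-byte case (lead byte `11110xxx`, for code points above U+FFFF) is an addition beyond the three rows shown.

```c
/* Hypothetical helper: how many UTF-8 bytes a code point needs.
   Returns 0 if cp lies beyond the Unicode range (above 0x10FFFF). */
int utf8_encoded_length(unsigned long cp) {
    if (cp < 0x80)     return 1;   /* 0xxxxxxx */
    if (cp < 0x800)    return 2;   /* 110yyyyy 10xxxxxx */
    if (cp < 0x10000)  return 3;   /* 1110zzzz 10yyyyyy 10xxxxxx */
    if (cp < 0x110000) return 4;   /* assumed four-byte form, not in the table */
    return 0;
}

/* Hypothetical helper: sequence length announced by a lead byte.
   Returns 0 for a continuation byte (10xxxxxx) or an invalid lead byte. */
int utf8_sequence_length(unsigned char lead) {
    if ((lead & 0x80) == 0x00) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}
```

For example, the euro sign U+20AC falls in the third row, so `utf8_encoded_length(0x20AC)` is 3, and its lead byte 0xE2 starts with `1110`, so `utf8_sequence_length(0xE2)` is also 3.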
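    #include <stdio.h>

    // Encode one Unicode code point as UTF-8, writing each byte with putchar.
    // c must be a whole code point; combine UTF-16 surrogate pairs before calling.
    void put_utf8( unsigned long c ) {
        if ( c < 0x80 ) {
            // 1 byte: 0xxxxxxx
            putchar( c );
        } else if ( c < 0x800 ) {
            // 2 bytes: 110yyyyy 10xxxxxx
            putchar( 0xC0 | c >> 6 );
            putchar( 0x80 | c & 0x3F );
        } else if ( c < 0x10000 ) {
            // 3 bytes: 1110zzzz 10yyyyyy 10xxxxxx
            putchar( 0xE0 | c >> 12 );
            putchar( 0x80 | c >> 6 & 0x3F );
            putchar( 0x80 | c & 0x3F );
        } else if ( c < 0x110000 ) {
            // 4 bytes, for code points beyond 16 bits; Unicode stops at 0x10FFFF
            putchar( 0xF0 | c >> 18 );
            putchar( 0x80 | c >> 12 & 0x3F );
            putchar( 0x80 | c >> 6 & 0x3F );
            putchar( 0x80 | c & 0x3F );
        }
    }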
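
Going the other way, from UTF-8 bytes back to a character, means reading the lead byte to learn the length and then collecting six bits from each continuation byte. Here is a minimal sketch of that idea. The name `utf8_decode` and its signature are assumptions for this example, not part of any standard library, and the rejection of overlong forms and surrogates follows my reading of the current UTF-8 rules.

```c
#include <stddef.h>

/* Hypothetical decoder: reads one UTF-8 sequence from s (n bytes available).
   Stores the code point in *cp and returns the number of bytes consumed,
   or 0 if the input is truncated or malformed. */
size_t utf8_decode(const unsigned char *s, size_t n, unsigned long *cp) {
    size_t len, i;
    unsigned long c, min;

    if (n == 0) return 0;

    /* The lead byte gives the length and the top payload bits.
       min is the smallest code point allowed at that length. */
    if (s[0] < 0x80)                { len = 1; c = s[0];        min = 0;       }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; min = 0x80;    }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; min = 0x800;   }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; min = 0x10000; }
    else return 0;                  /* stray continuation or invalid lead byte */

    if (n < len) return 0;          /* sequence is cut short */

    /* Each continuation byte must look like 10xxxxxx and adds six bits. */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        c = (c << 6) | (s[i] & 0x3F);
    }

    /* Reject overlong encodings, values past U+10FFFF, and surrogates. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return 0;

    *cp = c;
    return len;
}
```

Fed the three bytes E2 82 AC, this returns 3 and stores 0x20AC, the euro sign. Fed C0 80, an overlong spelling of zero, it returns 0.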
UTF-7 is encoded like this, I kid you not:
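
Roughly, as I understand RFC 2152: plain ASCII letters, digits and a little punctuation pass through unchanged. Anything else is taken as big-endian 16-bit Unicode, base64-encoded without padding, and bracketed by `+` and `-`. A literal `+` is written `+-`. The sketch below assumes that reading. The names `utf7_encode` and `is_direct` are invented for this example, and the set of characters passed through directly is a deliberately conservative choice (letters, digits, space and `'(),-./:?`).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Assumed "direct" set: characters copied to the output unchanged. */
static int is_direct(unsigned c) {
    return c != 0 && c < 0x80 && c != '+' &&
           (isalnum((int)c) || c == ' ' || strchr("'(),-./:?", (int)c) != NULL);
}

/* Hypothetical encoder: in holds n UTF-16 code units, out must have room
   for 5 * n + 1 chars. Returns the length of the NUL-terminated result. */
size_t utf7_encode(const unsigned short *in, size_t n, char *out) {
    size_t i = 0, o = 0;
    while (i < n) {
        unsigned c = in[i];
        if (c == '+') {                 /* literal plus is written "+-" */
            out[o++] = '+';
            out[o++] = '-';
            i++;
        } else if (is_direct(c)) {
            out[o++] = (char)c;
            i++;
        } else {
            unsigned long bits = 0;     /* pending bits not yet emitted */
            int nbits = 0;
            out[o++] = '+';             /* open a base64 run */
            while (i < n && !is_direct(in[i])) {
                bits = (bits << 16) | in[i++];
                nbits += 16;
                while (nbits >= 6) {    /* emit every complete 6-bit group */
                    nbits -= 6;
                    out[o++] = B64[(bits >> nbits) & 0x3F];
                }
                bits &= (1UL << nbits) - 1;
            }
            if (nbits > 0)              /* zero-fill the last partial group */
                out[o++] = B64[(bits << (6 - nbits)) & 0x3F];
            out[o++] = '-';             /* close the run */
        }
    }
    out[o] = '\0';
    return o;
}
```

With this sketch the four characters A, U+2262, U+0391 and a full stop come out as `A+ImIDkQ-.`, and the three characters U+65E5 U+672C U+8A9E come out as `+ZeVnLIqe-`.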
Please send errors, omissions and suggestions to improve this page to Roedy Green.
You can get a fresh copy of this page from http://mindprod.com/jgloss/utf.html or possibly from your local J: drive mirror: J:\mindprod\jgloss\utf.html