
1.) 7-bit ASCII (128 characters)
    International standard.

2.) 8-bit extended ASCII (256 characters)
    No single standard: the upper 128 values differ between code pages.
    Code pages
    Code pages vs. fonts
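
    To make the code page problem concrete, here is a minimal sketch that
    decodes one byte value under three code pages. The three codecs
    (cp1252, cp437, iso8859_5) are picked only for illustration; they are
    not named in these notes.

```python
# One byte value, interpreted under three different 8-bit code pages.
raw = bytes([0xE9])

for codec in ("cp1252", "cp437", "iso8859_5"):
    ch = raw.decode(codec)
    print(f"{codec:10} 0xE9 -> U+{ord(ch):04X} {ch}")

# Bytes below 0x80 are plain ASCII and decode the same way in all three.
assert b"A".decode("cp1252") == b"A".decode("cp437") == b"A".decode("iso8859_5") == "A"
```

    The byte is the same in every case; only the code page chosen by the
    reader decides which character it stands for.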

3.) 16-bit Unicode (~65,000 characters)
    International standard (but outdated).
    UCS-2 (obsolete; fixed 2 bytes per character)
    UTF-16 (2 or 4 bytes per character, via surrogate pairs)
    UTF-16LE
    UTF-16BE
    BOM (byte order mark, U+FEFF)
    Need to worry about endianness (byte order).
    Used by Java and Windows.
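
    A short sketch of what byte order and the BOM look like in practice,
    using Python's built-in codecs. The sample characters are arbitrary
    choices for illustration.

```python
import codecs

text = "A"  # U+0041

le = text.encode("utf-16-le")   # least significant byte first
be = text.encode("utf-16-be")   # most significant byte first
print(le.hex(), be.hex())       # 4100 0041

# A BOM (U+FEFF) at the start tells the decoder which order was used.
with_bom_le = codecs.BOM_UTF16_LE + le
with_bom_be = codecs.BOM_UTF16_BE + be
assert with_bom_le.decode("utf-16") == with_bom_be.decode("utf-16") == "A"

# Characters above U+FFFF need two 16-bit units (a surrogate pair).
emoji = "\U0001F600"
print(emoji.encode("utf-16-be").hex())  # d83dde00
```

    Without a BOM or an explicit -LE/-BE label, a reader has to guess the
    byte order.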

4.) 32-bit Unicode (~4 billion possible values, most of them unused)
    UTF-32
    UTF-32LE
    UTF-32BE
    Need to worry about endianness (byte order).
    Rarely used for storage or transmission: every character takes 4 bytes.
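
    A rough illustration of the space cost, assuming a short ASCII-only
    sample string (chosen here only as an example):

```python
sample = "Hello, world"  # 12 ASCII characters

for codec in ("utf-8", "utf-16-le", "utf-32-le"):
    print(f"{codec:10} {len(sample.encode(codec)):3} bytes")

# Byte order matters for UTF-32 just as it does for UTF-16.
print("A".encode("utf-32-le").hex())  # 41000000
print("A".encode("utf-32-be").hex())  # 00000041
```

    For mostly-ASCII text, UTF-32 takes four times the space of UTF-8.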

5.) 8-bit Unicode
    UTF-1 (obsolete)
    UTF-8
    What the Internet uses.
    Variable length encoding of Unicode's 21-bit address space.
    Uses 1 to 4 bytes per character:
       0xxxxxxx
       110xxxxx  10xxxxxx
       1110xxxx  10xxxxxx  10xxxxxx
       11110xxx  10xxxxxx  10xxxxxx  10xxxxxx
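
    The bit layouts above can be turned into a small hand-written encoder.
    This is only a sketch for study; utf8_encode is a hypothetical helper
    name, and real code should simply call str.encode("utf-8").

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point by hand, following the four bit layouts."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp < 0x80:        # up to 7 bits:  0xxxxxxx
        return bytes([cp])
    if cp < 0x800:       # up to 11 bits: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # up to 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # up to 21 bits: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


# Cross-check against Python's built-in encoder, one sample per length.
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    assert utf8_encode(ord(ch)) == ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {utf8_encode(ord(ch)).hex()}")
```

    The x bits of the four layouts add up to 7, 11, 16 and 21 bits, which
    is why 4 bytes are enough to cover the whole 21-bit space.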

6.) 21-bit Unicode (~1.1 million code points, U+0000 to U+10FFFF)
    Unicode's "internal" address space.
    There is no such thing as UTF-21.
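
    A quick check of where the 21 bits come from, assuming Python 3.3 or
    later (where sys.maxunicode is always 0x10FFFF). MAX_CODE_POINT is a
    name introduced here for illustration.

```python
import sys

MAX_CODE_POINT = 0x10FFFF  # highest code point Unicode defines

print(MAX_CODE_POINT.bit_length())   # 21
print(MAX_CODE_POINT + 1)            # 1114112 code points in total
assert sys.maxunicode == MAX_CODE_POINT

# Python refuses anything above that limit.
try:
    chr(MAX_CODE_POINT + 1)
except ValueError as err:
    print("rejected:", err)
```

    So 21 bits are needed to hold the largest code point, but only about
    half of the 2,097,152 values that 21 bits can express are in range.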
