A character is an abstract symbol. A character has a name. Examples:
PLUS SIGN, CYRILLIC SMALL LETTER TSE, BLACK CHESS KNIGHT, MUSICAL SYMBOL FERMATA BELOW.
Do not confuse a character with a glyph, which is a picture of a character. Two or more characters can share the same glyph (e.g. LATIN CAPITAL LETTER A and GREEK CAPITAL LETTER ALPHA), and one character can have many glyphs (think fonts).
A character set has two parts: (1) a repertoire, which is a set of characters, and (2) a code position mapping, which is a function mapping non-negative integers to characters in the repertoire. When an integer i maps to a character c we say i is the codepoint of c.
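One way to poke at a code position mapping is with Python's built-in `ord` and `chr` plus the standard `unicodedata` module. This is only an illustrative sketch: it relies on Python strings using Unicode as their character set, and the helper name `describe` is made up for this example.

```python
import unicodedata

def describe(ch: str) -> tuple:
    """Return (code point, character name) for a one-character string."""
    return ord(ch), unicodedata.name(ch)

# The mapping can be followed in both directions:
# integer -> character with chr, character -> integer with ord.
assert chr(0x2B) == "+"
assert describe("+") == (0x2B, "PLUS SIGN")
assert describe("\u0446") == (0x446, "CYRILLIC SMALL LETTER TSE")

# These two usually share a glyph, yet they are distinct characters
# with distinct code points and distinct names.
latin_a, greek_alpha = "\u0041", "\u0391"
assert latin_a != greek_alpha
print(describe(latin_a))
print(describe(greek_alpha))
```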

An example of a character set is the Universal Character Set which happens to be identical to another character set called Unicode. Here is part of UCS:
| Code point (hex) | Character name |
|---|---|
| 25 | PERCENT SIGN |
| 2B | PLUS SIGN |
| 54 | LATIN CAPITAL LETTER T |
| 5D | RIGHT SQUARE BRACKET |
| B0 | DEGREE SIGN |
| C9 | LATIN CAPITAL LETTER E WITH ACUTE |
| 2AD | LATIN LETTER BIDENTAL PERCUSSIVE |
| 39B | GREEK CAPITAL LETTER LAMDA |
| 446 | CYRILLIC SMALL LETTER TSE |
| 543 | ARMENIAN CAPITAL LETTER CHEH |
| 5E6 | HEBREW LETTER TSADI |
| 635 | ARABIC LETTER SAD |
| 784 | THAANA LETTER BAA |
| 94A | DEVANAGARI VOWEL SIGN SHORT O |
| 9D7 | BENGALI AU LENGTH MARK |
| BEF | TAMIL DIGIT NINE |
| D93 | SINHALA LETTER AIYANNA |
| F0A | TIBETAN MARK BKA- SHOG YIG MGO |
| 11C7 | HANGUL JONGSEONG NIEUN-SIOS |
| 1293 | ETHIOPIC SYLLABLE NAA |
| 13CB | CHEROKEE LETTER QUV |
| 2023 | TRIANGULAR BULLET |
| 20A4 | LIRA SIGN |
| 2105 | CARE OF |
| 213A | ROTATED CAPITAL Q |
| 21B7 | CLOCKWISE TOP SEMICIRCLE ARROW |
| 2226 | NOT PARALLEL TO |
| 2234 | THEREFORE |
| 265E | BLACK CHESS KNIGHT |
| 1D111 | MUSICAL SYMBOL FERMATA BELOW |
| 1D122 | MUSICAL SYMBOL F CLEF |
| 1F08E | DOMINO TILE VERTICAL-06-01 |
Unicode code points are traditionally written with U+ followed by four to six hex digits (e.g. U+00C9, U+1D122).
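The U+ notation is easy to produce and parse mechanically. Here is a small sketch; the function names `u_plus` and `parse_u_plus` are invented for illustration and are not part of any standard library.

```python
def u_plus(code_point: int) -> str:
    """Format a code point as U+ followed by at least four uppercase hex digits."""
    return f"U+{code_point:04X}"

def parse_u_plus(text: str) -> int:
    """Parse a string such as 'U+00C9' back into an integer code point."""
    if not text.upper().startswith("U+"):
        raise ValueError(f"not in U+ notation: {text!r}")
    return int(text[2:], 16)

print(u_plus(0xC9))      # U+00C9
print(u_plus(0x1D122))   # U+1D122
```

Small code points are zero-padded to four digits, while larger ones simply use as many digits as they need.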
The entire character set is described in the two files http://www.unicode.org/Public/UNIDATA/UnicodeData.txt and http://www.unicode.org/Public/UNIDATA/Unihan.txt. The codepoints are not assigned haphazardly: see http://www.unicode.org/Public/UNIDATA/Blocks.txt.
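To get a feel for how blocks organize the codepoints, here is a sketch that parses a handful of lines in the `start..end; name` format used by Blocks.txt and looks up the block a codepoint falls in. The five sample lines are copied into the code so the example runs offline; the names `SAMPLE_BLOCKS`, `parse_blocks`, and `block_of` are made up for this example.

```python
from typing import Optional

# A few lines in the Blocks.txt format; the real file has many more,
# plus comment lines that start with '#'.
SAMPLE_BLOCKS = """\
# sample excerpt
0000..007F; Basic Latin
0080..00FF; Latin-1 Supplement
0400..04FF; Cyrillic
13A0..13FF; Cherokee
1D100..1D1FF; Musical Symbols
"""

def parse_blocks(text: str) -> list:
    """Parse Blocks.txt-style lines into (first, last, name) triples."""
    blocks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        code_range, name = line.split(";", 1)
        first, last = code_range.split("..")
        blocks.append((int(first, 16), int(last, 16), name.strip()))
    return blocks

def block_of(code_point: int, blocks: list) -> Optional[str]:
    """Return the name of the block containing code_point, or None if not listed."""
    for first, last, name in blocks:
        if first <= code_point <= last:
            return name
    return None

BLOCKS = parse_blocks(SAMPLE_BLOCKS)
print(block_of(0x446, BLOCKS))     # Cyrillic
print(block_of(0x1D122, BLOCKS))   # Musical Symbols
```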
ISO8859-1 is a character set that is exactly equivalent to the first 256 mappings of Unicode. Obviously it doesn't have enough characters.
The ISO8859 family contains 15 charsets, each with a 256-character repertoire. They all share the same characters in the first 128 positions, but differ in the next 128. See http://www.unicode.org/Public/MAPPINGS/ISO8859/.
Windows-1252, also known as CP1252, is a character set with a repertoire of 256 characters. It can be found at http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT. It is very close to ISO8859-1. Be careful with this set! Users of Windows systems often unknowingly produce documents with this character set, then forget to specify it when making these documents available on the web or transporting them via other protocols which tend to default to Unicode. The end result is annoying. It's best to avoid this.
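The difference between CP1252 and ISO8859-1 is easy to see by decoding the same bytes both ways. The sketch below uses Python's built-in `cp1252` and `latin-1` codecs; the sample bytes (curly quotes around "hi") are just an illustrative choice.

```python
# Bytes as a Windows program might write them: curly quotes around "hi".
data = b"\x93hi\x94"

as_cp1252 = data.decode("cp1252")    # the charset the text was written in
as_latin1 = data.decode("latin-1")   # what a reader assuming ISO8859-1 gets

print([f"U+{ord(ch):04X}" for ch in as_cp1252])  # ['U+201C', 'U+0068', 'U+0069', 'U+201D']
print([f"U+{ord(ch):04X}" for ch in as_latin1])  # ['U+0093', 'U+0068', 'U+0069', 'U+0094']

# Outside the byte range 0x80..0x9F the two charsets agree:
# byte C9 is LATIN CAPITAL LETTER E WITH ACUTE in both.
assert b"\xc9".decode("cp1252") == b"\xc9".decode("latin-1") == "\u00c9"

# A reader that assumes UTF-8 cannot make sense of these bytes at all.
try:
    data.decode("utf-8")
except UnicodeDecodeError as err:
    print("not valid UTF-8:", err.reason)
```

Under ISO8859-1 the quote bytes turn into invisible control characters, which is exactly the kind of annoyance described above.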
ASCII is a character set that is exactly equivalent to the first 128 mappings of Unicode. Obviously it doesn't have enough characters. However it is commonly used! It's a good "lowest common denominator" and many Internet protocols require it!
A character encoding specifies how a character (or character string) is encoded in a bit string. There are many, many encodings of Unicode. The most important are UTF-32, UTF-16 and UTF-8.

UTF-32 is the simplest. Just encode each character in 32 bits. The encoding of a character is simply its code point! Couldn't be more straightforward. Of course, just try to convince people to actually use four bytes per character.
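As a sketch of how little there is to UTF-32, the function below packs each code point into one 32-bit word and compares the result with Python's built-in codec. Big-endian byte order is an assumption made here for readability, and `utf32_be` is a made-up name.

```python
import struct

def utf32_be(text: str) -> bytes:
    """Encode each character as its code point in one big-endian 32-bit word."""
    return b"".join(struct.pack(">I", ord(ch)) for ch in text)

encoded = utf32_be("\u00c9")          # LATIN CAPITAL LETTER E WITH ACUTE
print(encoded.hex(" "))               # 00 00 00 c9
assert encoded == "\u00c9".encode("utf-32-be")
```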
In UTF-16 some characters are encoded in 16 bits and some in 32 bits.
| Character Range | Bit Encoding |
|---|---|
| U+0000 ... U+FFFF | xxxxxxxx xxxxxxxx |
| U+10000 ... U+10FFFF | let y = X - 0x10000 in 110110yy yyyyyyyy 110111yy yyyyyyyy |
UTF-16 simply cannot encode codepoints beyond U+10FFFF. So far this is not a problem. Note also that the existence of UTF-16, and its blessing by the Unicode Consortium, means that U+D800 through U+DFFF cannot be legal characters. Hack!?
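The two rows of the UTF-16 table translate almost directly into code. This is a minimal sketch; the function name `utf16_units` and the choice to raise `ValueError` for the unencodable cases are assumptions of this example, not something the encoding prescribes.

```python
def utf16_units(code_point: int) -> list:
    """Return the 16-bit code units that encode one code point in UTF-16."""
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("U+D800 through U+DFFF are reserved for surrogates")
    if code_point <= 0xFFFF:
        return [code_point]                 # one unit: xxxxxxxx xxxxxxxx
    if code_point > 0x10FFFF:
        raise ValueError("UTF-16 cannot encode beyond U+10FFFF")
    y = code_point - 0x10000                # a 20-bit value
    high = 0xD800 | (y >> 10)               # 110110yy yyyyyyyy (top 10 bits)
    low = 0xDC00 | (y & 0x3FF)              # 110111yy yyyyyyyy (bottom 10 bits)
    return [high, low]

print([f"{u:04X}" for u in utf16_units(0x1D122)])   # ['D834', 'DD22']
```

The 20 bits of y are why the ceiling sits at U+10FFFF: the largest y is 0xFFFFF, and 0xFFFFF + 0x10000 = 0x10FFFF.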
UTF-8 is another variable-length encoding.
| Character Range | Bit Encoding | (Bits) |
|---|---|---|
| U+0000 ... U+007F | 0xxxxxxx | 7 |
| U+0080 ... U+07FF | 110xxxxx 10xxxxxx | 11 |
| U+0800 ... U+FFFF | 1110xxxx 10xxxxxx 10xxxxxx | 16 |
| U+10000 ... U+1FFFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | 21 |
| U+200000 ... U+3FFFFFF | 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx | 26 |
| U+4000000 ... U+7FFFFFFF | 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx | 31 |
UTF-8 rocks. The number of advantages it has is stunning. For example:
To stream out the UTF-8 bytes from an integer, use:

    if (c <= 0x7F) {
        emitByte(c);
    } else if (c <= 0x7FF) {
        emitByte(0xC0 | (c >> 6));
        emitByte(0x80 | (c & 0x3F));
    } else if (c <= 0xFFFF) {
        emitByte(0xE0 | (c >> 12));
        emitByte(0x80 | ((c >> 6) & 0x3F));
        emitByte(0x80 | (c & 0x3F));
    } else if (c <= 0x1FFFFF) {
        emitByte(0xF0 | (c >> 18));
        emitByte(0x80 | ((c >> 12) & 0x3F));
        emitByte(0x80 | ((c >> 6) & 0x3F));
        emitByte(0x80 | (c & 0x3F));
    } else if (c <= 0x3FFFFFF) {
        emitByte(0xF8 | (c >> 24));
        emitByte(0x80 | ((c >> 18) & 0x3F));
        emitByte(0x80 | ((c >> 12) & 0x3F));
        emitByte(0x80 | ((c >> 6) & 0x3F));
        emitByte(0x80 | (c & 0x3F));
    } else if (c <= 0x7FFFFFFF) {
        emitByte(0xFC | (c >> 30));
        emitByte(0x80 | ((c >> 24) & 0x3F));
        emitByte(0x80 | ((c >> 18) & 0x3F));
        emitByte(0x80 | ((c >> 12) & 0x3F));
        emitByte(0x80 | ((c >> 6) & 0x3F));
        emitByte(0x80 | (c & 0x3F));
    }
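For experimenting, here is a runnable Python sketch of the same logic, written table-driven rather than as a chain of branches. It returns the bytes instead of calling `emitByte`; the name `utf8_bytes` and the `ValueError` for out-of-range input are assumptions of this example. Like the snippet above, it follows the full six-row table, so it does not reject surrogates or values above U+10FFFF.

```python
# (number of bytes, largest value that fits, prefix bits of the first byte)
_UTF8_ROWS = (
    (2, 0x7FF, 0xC0),
    (3, 0xFFFF, 0xE0),
    (4, 0x1FFFFF, 0xF0),
    (5, 0x3FFFFFF, 0xF8),
    (6, 0x7FFFFFFF, 0xFC),
)

def utf8_bytes(c: int) -> bytes:
    """Encode one non-negative integer using the bit layouts in the UTF-8 table."""
    if c < 0 or c > 0x7FFFFFFF:
        raise ValueError("value does not fit in 31 bits")
    if c <= 0x7F:
        return bytes([c])                        # 0xxxxxxx
    for length, limit, prefix in _UTF8_ROWS:
        if c <= limit:
            break
    # Shift amounts for each byte, e.g. 12, 6, 0 for a three-byte sequence.
    first, *rest = range(6 * (length - 1), -1, -6)
    out = [prefix | (c >> first)]                              # lead byte
    out += [0x80 | ((c >> shift) & 0x3F) for shift in rest]    # 10xxxxxx bytes
    return bytes(out)

print(utf8_bytes(0x13CB).hex(" "))    # e1 8f 8b
assert utf8_bytes(0x1D122) == chr(0x1D122).encode("utf-8")
```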
I won't describe any others here, but UTF-7 is worth mentioning. If you like the stuff on this page see the IANA Charsets Page. You may also want to check out the UTF page at czybrra.com, which is very complete and well-written (and from which I borrowed the list of UTF-8 advantages).
| Unicode Character | UTF-32 Encoding | UTF-16 Encoding | UTF-8 Encoding |
|---|---|---|---|
| RIGHT SQUARE BRACKET (U+005D) | 00 00 00 5D | 00 5D | 5D |
| LATIN CAPITAL LETTER E WITH ACUTE (U+00C9) | 00 00 00 C9 | 00 C9 | C3 89 |
| CHEROKEE LETTER QUV (U+13CB) | 00 00 13 CB | 13 CB | E1 8F 8B |
| MUSICAL SYMBOL F CLEF (U+1D122) | 00 01 D1 22 | D8 34 DD 22 | F0 9D 84 A2 |
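The rows of this table can be reproduced with Python's built-in codecs. The sketch below assumes big-endian byte order for UTF-32 and UTF-16, which is how the bytes in the table are written; `encodings_row` is a made-up helper name.

```python
import unicodedata

def encodings_row(ch: str) -> tuple:
    """Return (name, UTF-32, UTF-16, UTF-8) with the bytes shown as spaced hex."""
    def hexed(codec: str) -> str:
        return ch.encode(codec).hex(" ").upper()
    return (unicodedata.name(ch), hexed("utf-32-be"), hexed("utf-16-be"), hexed("utf-8"))

for cp in (0x5D, 0xC9, 0x13CB, 0x1D122):
    print(f"U+{cp:04X}", encodings_row(chr(cp)))
```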