
Goals:
The best cryptosystems assume that Eve and Mallory know E, D, and c. Most cryptosystems do not rely on their algorithms being kept secret, because
The study of cryptology includes the design of various ciphers, cryptanalysis methods (attacks), key exchange, key authentication, cryptographic hashing, digital signing, and social issues (legal, political, etc.). See Wikipedia's topics in cryptography page.
Here are some useful categories of ciphers. Note that a particular cipher may belong to more than one of these categories.
NOTE: In the character-based examples below, we'll assume (without any loss of generality) a 26 symbol alphabet ('A'..'Z').
Secret key (a.k.a. symmetric key) ciphers are much faster than public key ciphers, but key management can be a huge problem.
A completely pathetic and insecure cipher by modern standards. The encryption key ke is a small integer and kd = ke. To encrypt, add ke to each plaintext character; to decrypt, subtract.
Trivial to crack: just guess ke.
Instead of simply adding a fixed offset to each character, you can precompute a "substitution table" by generating a random permutation of your alphabet. For example:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
MQHPSVJYCURFTBILAKWNGZDOEX
Now "ATTACKATDAWN" is now "MNNMHRMNPMDB".
You don't crack this by guessing the key (there are n! possible keys), but frequency analysis can crack any monoalphabetic substitution cipher, provided the message is long enough.
For techniques whose "key" is a permutation, one way to make the key easier to remember is to pick a phrase, lay out its unique letters, then fill in missing letters in order. For example, "PREMATURE OPTIMIZATION IS THE ROOT OF ALL EVIL" yields this substitution mapping:
PREMATUOIZNSHFLVBCDGJKQWXY
Each plaintext letter maps to one or more symbols in the ciphertext. The number of targets should be proportional to its frequency (to defeat frequency analysis). Example:
A 12 15 36 50 56 70 81 95
B 51 84
C 16 44 65
D 04 06 48 82
E 01 17 19 34 47 49 58 60 67 77 85 90
F 13 27
G 09 28
H 26 42 53 59 68 71
I 35 73 76 86 91 96
J 18
K 07
L 29 40 54 87
M 25 30
N 21 61 62 69 74 94
O 02 03 08 10 57 75 93
P 41 98
Q 97
R 32 38 43 45 80 83
S 14 22 39 79 89 99
T 00 20 23 33 46 52 72 78 88
U 11 64 66
V 37
W 63 92
X 31
Y 24 55
Z 05
To encrypt, choose randomly among possibilities. Example, one possible encryption of "ATTACKATDAWN" is
56 78 20 95 65 07 12 72 06 50 92 61
Polyalphabetic substitution ciphers use multiple substitution alphabets. There are quite a few ways to do this.
The cipher known as the simple shift Vigenère cipher was not invented by Vigenère at all... it seems to have been first described by Giovan Battista Bellaso. The key is a string that you add to the plaintext with modular addition, like in this example (A=0, B=1, C=2, ..., Z=25):
Plaintext: TAKEACOPYOFYOURPOLICYTONORMAWILCOXONTHETHIRDFLOOR
Key: QUARKQUARKQUARKQUARKQUARKQUARKQUARKQUARKQUARKQUAR
Ciphertext: JUKVKSIPPYVSOLBFILZMONOEYHGANSBWOOYDNHVDXCRUPBIOI
To generate ciphertext by hand you can use a code wheel or a tabula recta.
This scheme isn't secure since the key repeats. If the key length can be determined, the cryptanalyst can do multiple frequency analyses (one for each shift value in the key). Methods for determining key length are the Kaisiski Method and the Friedman test.
For "binary data" (i.e., a sequence of bits) modular addition base-2 is just a simple xor. Example:
Plaintext: 0110000101010000111101001010101010010000001111101
Key: 0000011100000111000001110000011100000111000001110
Ciphertext: 0110011001010111111100111010110110010111001110011
Vigenère actually created an autokey cipher which is stronger because the key never repeats. Instead the "key" is made up of the keyphrase followed by the plaintext, like this:
Plaintext: TAKEACOPYOFYOURPOLICYTONORMAWILCOXONTHETHIRDFLOOR
Key: QUARKTAKEACOPYOFYOURPOLICYTONORMAWILCOXONTHETHIRD
Ciphertext: JUKVKVOZCOHMDSFUMZCTNHZVQPFOJWCOOTWYVVBHUBYHYSWFU
That one used the plaintext as part of the key. You could also use the ciphertext. See how?
You can still crack autokey Vigenère ciphers by linguistic analysis, because the key contains text and is thus likely to have high-frequency letters. Modern auto-key ciphers generate the shift values with a random number generator. The key seeds the generator.
If the key
Then you have a provably secure cipher called the one time pad. Your actual algorithm can use polyalphabetic substitution or even simple xoring the message with the key, as long as you meet the three criteria above.
The one-time pad can never be cracked. It is a perfect encryption scheme, from a mathematical perspective, anyway.
This is an example of a polygraphic substitution cipher. It replaces pairs of characters. The key is a permutation of {A..I,K..Z}, for example:
Z C B M L
G D A Q E
T U O K H
F S X V N
P I Y R W
To encrypt, write out the plaintext (without spaces or punctuation), sticking in an X between double letters and at the end if necessary to make the text have even length. Then for each pair of letters:
Example: "THEN ATTACK FROM THE EAST" ==> "TH EN AT XT AC KF RO MT HE XE AS TX" ==> "UT HW GO FO DB TV YK ZK NH NA DX OF".
Decryption runs the rules in reverse. The Playfair cipher is pretty insecure.
Encrypts digraphs like playfair, but slightly stronger because it allows for double letters and doesn't yield reversed ciphertext digraphs for reversed plaintext digraphs. Example
a b c d e G I V E M
f g h i k L B R T Y
l m n o p O D A H C
q r s t u F K N P Q
v w x y z S U W X Z
P R E M A a b c d e
T U O I Z f g h i k
N S H F L l m n o p
V B C D G q r s t u
K Q W X Y v w x y z
For which "THEN ATTACK FROM THE EAST" ==> "TH EN AT TA CK FR OM TH EE AS TX" ==> "NI VL EV FM MO BV DF NI MA VV NX".
Okay, so slightly stronger than Playfair but so what! Computers can crack these things in seconds, or perhaps minutes (given enough ciphertext).
The simplest transposition cipher breaks up the message into blocks of size n, then scrambles each block according to a permutation of (1..n).
Write out the message row by row in a grid, then read it out in columns. Totally insecure. The key is just the number of rows. Guess it.
The rail fence is no better than the last one, just funkier. The key is the number of "rails" on which you write the plaintext in an up and down fashion, generating the ciphertext by reading one rail at a time.
Example: To encode "fill out and file a WS2475 form" on 4 rails:
f t l 4 m
i u a i e 2 7 r
l o n f a s 5 o
l d w f
you then read out the ciphertext "ftl4miuaie27rlonfas5oldwf". This is trivial to crack. Just guess k.
Transposition alone is very weak; substitution is weak; combining them is better. You can mix a lot of the classic substitution ciphers with various transpositions, or use some special combination ciphers like bifid. Also, most of the famous rotor machines and modern ciphers use this combination; in fact they apply these transformations many times.
This one substitutes letters with their coordinates in a grid and does a columnar transposition on the coordinates. Example:
Z C B M L
G D A Q E
T U O K H
F S X V N
P I Y R W
Write the (row, column) coordinates under each letter of the plaintext (e.g., "A" is at row 1, column 2; "T" is at row 2, column 0, etc.):
ATTACKATDAWN
122102121143
200213201244
Then read out in rows, group by twos and look up the ciphertext letters
122102121143200213201244
A U B A D R T B Q T A W
Like Bifid, but on a cube. Example:
Z C B M L F V N P
G D A Q E X I R W
T U O K H S Y . J
To encrypt, first write the coordinates
ATTACKATDAWN
000001000022
122102121110
200210201221
000001000022122102121110200210201221
Z C Z O S F H Q V I N .
The Enigma was the famous German rotor machine from World War II (actually a family of machines). Most versions implemented a polyalphabetic substitution cipher with a period of 16900 plus a plugboard for scrambling (transposition). The "key" consisted of the order of the rotors, the starting position of each roter, the ring settings, and the plugboard settings (about 1.6 × 1020 possibilities). There was a new key each day (more or less) prepublished in codebooks.
The Allies were able to crack it thanks to some weaknesses in its design...
...but more importantly, many weakness in the way it was used...
...and by obtaining codebooks from captured vessels.
You can read about how the Enigma was broken from the NSA, and from Wikipedia.
Now that we have Shannon's information theory, very powerful computers, and centuries of theory and practice behind us, most modern techniques
In addition, it's nice if the cipher is
Most ciphers are either block ciphers or stream ciphers. Block ciphers require padding and can operate in different modes (See Schnier's book or the Wikipedia article:
AES is the new standard, replacing DES. It was the winner of the competition (in 2001), where is was submitted under the name Rijndael, beating out RC6, Serpent, MARS, and Twofish.
Public key ciphers solve the key management nightmare of secret key ciphers, at the cost of speed. In a group of n people one needs only n public keys and n private keys.
Diffie and Hellman showed it was possible for two people to exchange a secret key without having to actually meet in secret:
This is secure (provided n is very large, (n-1)/2 is also prime, because Eve knows g, n, gamod n and gbmod n but there's no known efficient way to get a or b from these. That's the discrete logarithm problem, remember?
Example with small n:
DIffie-Hellman doesn't do encryption; it just exchanges a key. RSA can encrypt and decrypt. Here's how. Each person
Now check this out:
A hash, a.k.a. fingerprint, checksum, message digest is a bit pattern (usually around 160 bits or so), generated from a message by a cryptographic hash function. For the hash to be "secure" or "cryptographic" it must
Usually the change of just a single bit in the message will cause the digest to look completely and totally different.
$ cat will This is my will. I leave 1000 dollars to Alice and everything else to Bob. Signed, Eve. $ md5sum will c18feb890752c9e680c99d1e909fd761 will $ sed "s/1/9/g" will > Will $ cat Will This is my will. I leave 9000 dollars to Alice and everything else to Bob. Signed, Eve. $ md5sum Will 85570bc2d0458b1f98f484261dee7d4d Will
A secure hash provides a way to determine whether a message was tampered with.
See Steve Friedl's Illustrated Guide to Cryptpgraphic Hashes.
How can Bob be sure the message came from Alice and not someone else? By Alice signing it; that's how. In practice, one usually signs a hash, not the whole message.
For Alice to send a message to Bob,
m = A(B'(B(A'(m)))