Section 11.1 What is Cryptography?

Cryptography is not just the science of making (and breaking) codes, as a dictionary might have it. It is the mathematical analysis of the tools of secrecy, from both the perspective of someone keeping a secret and that of the person trying to figure it out. Sometimes it is also called cryptology, while sometimes that term is reserved for a wider meaning.

There are two kinds of codes.

There are codes which disguise information and are intended to remain secret! (Especially for those needing private communication.)
There are codes encapsulating information in a convenient format, not needing secrecy. (Especially to allow for error checking.)

Mathematicians use the word code to indicate that information is being stored, reserving the term cipher for a way to protect that information. In this chapter we will do some of each, though we will mostly discuss ciphers.

Subsection 11.1.1 Encoding and decoding

There are many ways to encode a message. The easiest one for us (though not used in practice in exactly this way) will be to simply represent each letter of the English alphabet by an integer from 1 to 26. It is just as easy to represent both upper- and lowercase letters by integers from 1 to 52.

We'll use the following embedded cell to turn messages into numbers and vice versa. You encode a plaintext message (no spaces, in quotes, for our examples) and decode a positive integer.
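A minimal sketch of what the encoding half of such a cell might look like, written in plain Python rather than Sage. It is not necessarily the original definition: the name encode matches the Sage note later in this section, but the body is an assumption, built on the A=1 convention and on the base 26 weighting this section describes for blocks of letters.

```python
def encode(message):
    """Turn a string of letters into one positive integer (A=1, ..., Z=26)."""
    total = 0
    for position, letter in enumerate(message.upper()):
        if not "A" <= letter <= "Z":
            raise ValueError("only the letters A-Z are supported")
        # The letter at index i (counting from 0) is weighted by 26**i,
        # so a single letter is simply its position in the alphabet.
        total += (ord(letter) - ord("A") + 1) * 26**position
    return total


print(encode("a"), encode("Z"), encode("ab"))
```

Upper- and lowercase letters are deliberately treated the same here, and anything other than a letter raises an error, which is why the examples use messages without spaces.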

Let's try to encode the letter “q”.
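Since a single letter forms a block all by itself, encoding “q” amounts to finding its position in the alphabet. A self-contained check in plain Python, assuming the A=1 convention and ignoring case:

```python
letter = "q"
# Position in the alphabet, ignoring case: A=1, B=2, ..., Z=26.
number = ord(letter.upper()) - ord("A") + 1
print(number)  # prints 17
```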

Sage note 11.1.1 Always evaluate your definitions

If the previous cell doesn't work, then you may need to evaluate the first one in this section again. If anything in this chapter ever gives a NameError about a global name encode, you probably need to reevaluate some previous cell. Most likely, the one with def encode!

The process of decoding (or to decode) is similar.
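One way decoding could work, again as a hedged sketch in plain Python rather than the original cell. The name decode and its body are assumptions; it undoes a base 26 encoding in which A=1, ..., Z=26 and the first letter carries the smallest weight.

```python
def decode(number):
    """Turn a positive integer back into letters (A=1, ..., Z=26)."""
    if number <= 0:
        raise ValueError("decode expects a positive integer")
    letters = []
    while number > 0:
        digit = number % 26
        if digit == 0:
            digit = 26  # a remainder of 0 stands for Z, since no letter is 0
        letters.append(chr(ord("A") + digit - 1))
        # Remove the letter just read and shift down to the next one.
        number = (number - digit) // 26
    return "".join(letters)


print(decode(17))   # prints Q
print(decode(228))  # prints TH
```

The special case for a remainder of 0 is needed because the letters run from 1 to 26 rather than from 0 to 25.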

This should be straightforward. Too straightforward, perhaps. What are some issues here?

First, notice that I didn't bother separating lower and uppercase letters.
Also, no matter how complicated you get, with just a one-to-one correspondence, there are only a few possibilities for each letter. So if you know the human language in question, you can just start guessing which encrypted number stands for its most common letter.
Can you think of other drawbacks?
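The guessing strategy in the second point can be sketched in a few lines of plain Python. The sample message below is hypothetical, chosen only for illustration; the idea is to count how often each number occurs and guess that the most frequent one stands for E, the most common letter in typical English text.

```python
from collections import Counter

# Hypothetical sample message, encoded letter by letter (A=1, ..., Z=26).
message = "MEETMEATTHEGREENTREE"
numbers = [ord(letter) - ord("A") + 1 for letter in message]

counts = Counter(numbers)
most_common_number, occurrences = counts.most_common(1)[0]
# Guess: the most frequent number stands for E.
print(most_common_number, occurrences)  # prints 5 8
```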

That means that, in practice, we need to do a few other things. One thing that is commonly done is to make longer blocks of letters, and then turn those into numbers. After all, there are presumably so many possible three-letter (or longer) blocks of letters in English that guessing what each one stands for is no longer so easy. (Can you think of exceptions, though?)

For pairs, we will represent the first letter as a number from 1 to 26 and the second letter as 26 times its letter number, and then add the two (think of it as base 26). Remember that A=1, B=2, etc. For example, “TH” becomes \(20+26\cdot 8=228\).

Now compare the following two encodings of “The best day of the year” and see which one might be easier to decipher.
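A sketch of how the two encodings could be produced in plain Python. The helper name encode_block is an assumption made for this illustration; it uses A=1, ..., Z=26 with the first letter of a block weighted by 1 and the second by 26. Spaces are dropped from the message, and the leftover final letter forms a block by itself.

```python
def encode_block(block):
    """Encode a block of letters as one integer, A=1, ..., Z=26, base 26."""
    return sum((ord(ch) - ord("A") + 1) * 26**i for i, ch in enumerate(block))


message = "THEBESTDAYOFTHEYEAR"

# First encoding: one number per letter.
singles = [encode_block(ch) for ch in message]
# Second encoding: one number per pair of letters.
pairs = [encode_block(message[i:i + 2]) for i in range(0, len(message), 2)]

print(singles)
print(pairs)
print(singles.count(5))  # how often the number for E appears; prints 4
```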

Whereas there are many 5s in the first encoding, which you could guess were Es, the second one has only one repeat (though knowing English, one might guess it was ‘Th’). Even so, it's important to point out that we haven't made anything secret yet; we've just encoded.

With three-letter blocks, there are already \(26^3=17576\) possibilities.

One could use this to encode a famous phrase as INT HEB EGI NNI GWA STH EWO RDX. In this case, an extra X fills out the final block.
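Splitting a message into blocks and padding the last one can be sketched as follows in plain Python. The function names and the choice of X as the default filler are assumptions made for this illustration, and the message is the one used earlier in this section, with spaces removed.

```python
def split_into_blocks(message, size=3, filler="X"):
    """Split message into blocks of `size` letters, padding the last block."""
    shortfall = -len(message) % size  # letters missing from the last block
    padded = message + filler * shortfall
    return [padded[i:i + size] for i in range(0, len(padded), size)]


def encode_block(block):
    """Encode a block of letters as one integer, A=1, ..., Z=26, base 26."""
    return sum((ord(ch) - ord("A") + 1) * 26**i for i, ch in enumerate(block))


blocks = split_into_blocks("THEBESTDAYOFTHEYEAR")
print(blocks)
print([encode_block(block) for block in blocks])
```

Here the 19 letters need two filler letters, so the last block comes out as RXX.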

To be fair, when filler of this type is used, it would more often be used in the middle to confuse things. In addition, one might recombine the message in various ways. We will, however, usually keep our whole message together as one item, since we are most interested in understanding the mathematical aspects, rather than real cryptography.