Section 11.1 What is Cryptography?
Cryptography is not just the science of making (and breaking) codes, as a dictionary might have it. It is the mathematical analysis of the tools of secrecy, from both the perspective of someone keeping a secret and that of the person trying to figure it out. Sometimes it is also called cryptology, while sometimes that term is reserved for a wider meaning.
There are two kinds of codes.
There are codes which disguise information and are intended to remain secret! (Especially for those needing private communication.)
There are codes encapsulating information in a convenient format, not needing secrecy. (Especially to allow for error checking.)
Mathematicians use the word code to indicate that information is being stored, reserving the term cipher for a way to protect that information. In this chapter we will do some of each, though mostly ciphers.
Subsection 11.1.1 Encoding and decoding
There are many ways to encode a message. The easiest one for us (though not used in practice in exactly this way) will be to simply represent each letter of the English alphabet by an integer from 1 to 26. It is also easy to represent both upper- and lowercase letters from 1 to 52.
We'll use the following embedded cell to turn messages into numbers and vice versa. You encode a plaintext message (no spaces, in quotes, for our examples) and decode a positive integer.
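As a rough idea of what such a pair of functions can look like, here is a minimal sketch in plain Python. It is not necessarily the cell used in this text. It assumes A=1, ..., Z=26 with case ignored, and it weights the letter in position i (counting from 0) by 26 to the power i; the names encode and decode simply follow the usage in this section.

```python
def encode(message):
    """Turn a string of letters into a positive integer.

    Assumptions of this sketch: A=1, ..., Z=26 (case is ignored), and the
    letter in position i (counting from 0) is weighted by 26**i.
    """
    message = message.upper()
    if not (message.isascii() and message.isalpha()):
        raise ValueError("use only the letters A-Z, with no spaces")
    return sum((ord(ch) - 64) * 26**i for i, ch in enumerate(message))


def decode(number):
    """Recover the (uppercase) string of letters from a positive integer."""
    letters = []
    while number > 0:
        value = number % 26
        if value == 0:
            # Letters run from 1 to 26, so a remainder of 0 stands for Z.
            value = 26
        letters.append(chr(value + 64))
        number = (number - value) // 26
    return "".join(letters)
```

Because case is ignored when encoding, this sketch always decodes to uppercase letters.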
Sage note 11.1.1. Definitions.
This cell should not have any output. The code def followed by a function name and input variable name (and colon) just tells Sage to define a new (computer, not necessarily mathematical) function. Then the commands after the first line of each definition say what to do, including what to send back to the user, the return statement. As long as nothing goes wrong, no output is required – you told Sage to do something, and it did it.
This is a very handy way to make new mathematical functions too. Even something as basic as def f(x): return x^2 could be useful, though in this simple case Sage gives you many more tools if you use the syntax f(x) = x^2 instead. Try to watch the Sage code throughout, especially in the final few chapters like Section 23.3, for usage of the def statement to make new functions.
Let's try to encode the letter “q”.
Sage note 11.1.2. Always evaluate your definitions.
If the previous cell doesn't work, then you may need to evaluate the first one in this section again. If anything in this chapter ever gives a NameError about a global name encode, you probably need to reevaluate some previous cell. Most likely, the one with def encode!
The process of decoding (or to decode) is similar.
This should be straightforward. Too straightforward, perhaps. What are some issues here?
First, notice that I didn't bother separating lower and uppercase letters.
Also, no matter how complicated you get, with just a one-to-one correspondence, there are only a few possibilities for each letter. So if you know the human language in question, you can just start guessing which encrypted number stands for its most common letter.
Can you think of other drawbacks? (See Exercise 11.8.14.)
That means that, in practice, we need to do a few other things. One thing that is commonly done is to make longer blocks of letters, and then turn those into numbers. After all, there are presumably so many possible three-letter (or longer) blocks of letters in English that guessing them by frequency is no longer so easy. (Can you think of exceptions, though?)
For pairs, we will represent the first letter as a number from 1 to 26 and the second letter as 26 times its letter number, and then add the two (think of it as base 26). Remember that A=1, B=2, etc.
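As a worked instance of this rule, take the pair TH, assuming the two contributions are added as in base 26. Since T is letter 20 and H is letter 8, the pair becomes
\[
20 + 26\cdot 8 = 20 + 208 = 228.
\]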
Now compare the following two encodings of “The best day of the year” and see which one might be easier to decipher.
Whereas there are many 5s in the first encoding, which you could guess were Es, the second one has only one repeat (though knowing English, one might guess it was ‘Th’). Even so, it's important to point out that we haven't made anything secret yet; we've just encoded.
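The comparison just described can be reproduced with a short Python sketch. It assumes the same conventions as above (A=1, ..., Z=26, case ignored, a pair encoded as the first letter plus 26 times the second); the helper names encode_block and repeated are hypothetical and chosen only for this illustration.

```python
from collections import Counter


def encode_block(block):
    """Encode a short block of letters, weighting position i by 26**i."""
    return sum((ord(ch) - 64) * 26**i for i, ch in enumerate(block.upper()))


def repeated(values):
    """Return only the numbers that occur more than once, with their counts."""
    return {v: n for v, n in Counter(values).items() if n > 1}


phrase = "Thebestdayoftheyear"  # the message with spaces removed

# Encoding one letter at a time versus two letters at a time.
singles = [encode_block(ch) for ch in phrase]
pairs = [encode_block(phrase[i:i + 2]) for i in range(0, len(phrase), 2)]

print(singles)
print(repeated(singles))
print(pairs)
print(repeated(pairs))
```

In the letter-by-letter list the number 5 appears four times, while in the list of pairs the only repeated number is 228, the encoding of the pair TH.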
With three-letter blocks, there are then already \(26^3=17576\) possibilities.
One could use this to encode the phrase INT HEB EGI NNI GWA STH EWO RDX. Here the text comes from a famous quote, and an extra X fills out the final block; much more sophisticated filler can be used in real cryptography.
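Splitting a message into three-letter blocks and padding the last block can be sketched in Python as follows. This is only an illustration under the conventions used above (A=1, ..., Z=26, position i weighted by 26 to the power i, X as filler); the function names to_blocks and encode_block are hypothetical, and the sample message is the one from the earlier comparison.

```python
def to_blocks(message, size=3, filler="X"):
    """Drop non-letters, pad with the filler, and cut into equal blocks."""
    letters = [ch for ch in message.upper() if ch.isalpha()]
    while len(letters) % size != 0:
        letters.append(filler)
    text = "".join(letters)
    return [text[i:i + size] for i in range(0, len(text), size)]


def encode_block(block):
    """Encode a block of letters, weighting position i by 26**i."""
    return sum((ord(ch) - 64) * 26**i for i, ch in enumerate(block))


blocks = to_blocks("The best day of the year")
numbers = [encode_block(b) for b in blocks]

print(blocks)
print(numbers)
```

With these conventions the smallest three-letter value is that of AAA and the largest is that of ZZZ, and the number of integers from one to the other is exactly \(26^3\).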
To be fair, when filler of this type is used, it would more often be used in the middle to confuse things. In addition, one might recombine the message in various ways. We will, however, usually keep our whole message together as one item, since we want to understand the mathematical aspects most, rather than real cryptography.