Cryptography and RSA

Introduction

Cryptography is used to encode and decode information as a means of security. It’s used to hide messages from non-intended recipients by making the messages unintelligible to them. In Greek, cryptography means “hidden secret.” In the past, cryptography was utilized mainly by the military and political sectors of intelligence, but presently it is commonly used in the commercial sectors for electronic commerce, computer passwords, ATM cards, and other applications. Over the years there have been many different methods of cryptography, but today the most commonly used algorithm for encryption is RSA.

Cryptography Jargon

First we shall start off with some commonly used terms used when discussing cryptography.

Word Definition
cipher pair of algorithms to encode and decode
plaintext information to be coded (ie your message)
ciphertext coded information, (ie your encrypted message)
encryption process of converting plaintext to ciphertext
decryption process of converting cipher text back to plaintext
key secret parameter which is used in encoding and decoding*

*Note that the key used for encryption is not required to be the same as the one used for decryption, however they are related.

Now that we have the important vocab under our belts, we can start looking into some encryption schemes.

Classic Cipher Types (NOT discussed in class, but still interesting)

In history, the sole purpose of cryptography was for keeping messages confidential. The main types of ciphers were transposition ciphers and substitution ciphers.

Transposition ciphers are those in which the letters or characters in a message (the plaintext) are transposed, or jumbled, to form the encoded message (ciphertext). One interesting example of a transposition cipher is the Rail Fence Cipher, also refered to as the Zig Zag Cipher. It is best to see how this works in terms of an example. Let's say your plaintext is as follows:

message: CTU HAS BEEN COMPROMISED

To encode this using the Rail Fence Cipher, we first must decide on how many "rails" we wish to use; we choose 3 rails. Now to start we write our message one letter at a time in a zig zag pattern. Starting with a downward zig we write the first 3 letters (because we chose arbitrarily to use 3 rails). Then we zag back up 2 letters, zig down 2 letters, zag up 2, and so on until the message looks as follows.

C - - - A - - - E - - - M - - - M - - - D
- T - H - S - E - N - O - P - O - I - E -
- - U - - - B - - - C - - - R - - - S - -

Then wee look across the rows to create our ciphertext, so we get

ciphertext: CAEMMD THSENOPOIE UBCRS

To make it more difficult to determine the method of encryption, we can space the cipher text at different intervals in order to make the pattern less recognizable. For instance, we could space our message as CAEMM DTHSE NOPIE UBCRS.

One problem with this method of encryption is that there are only so many combinations of letters from the cipher text which form meaningful words. Any non-intended recipient could use brute force to try and decrypt the message without too much dificulty. By brute force, we simply mean that the interceptor could try all possible cases until he finds the correct order.

Another issue with this form of encryption is that recipient (the decorder) must know the number of rails used in the encryption process before decrypting. Thus, the encoder and the decoder must meet up ahead of time to agree on this convention, which is not always possible or convenient -it might be dangerous!

Substitution Ciphers involves substituting, as you may have guessed, letters in the plain text with other letters in the alphabet. A simple example of this is replacing a letter in your plaintext with the letter immediately preceding it in the alphabet. So, Z replaces A, A replaces B, B replaces C, and so on. Using our previous example, after encoding our plaintext with the substitution cipher our ciphertext is

ciphertext: BST GZR ADDM BNLOQNLHRDC

Compared to the transposition cipher, it is less clear from the ciphertext what the original message was because they are entirely new letters. However,this method still requires the encoder and decoder to agree upon the substitution convention, and if one knows how to encode a message, he has all the information required to decode it as well, which makes it vulnerable to attacks. Also, it is not that hard to brute force these type of messages, since there are only 26 letters of the alphabet, we only need to try at most 25 substitution shifts which is not that hard to do.

RSA

RSA is an algorithm for public key cryptography, and was developed in 1977 by Ronald Rivest, Adi Shamir, and Len Adelman. Taking the first letter from each of their last names they got RSA.

Public Key Encryption

  • Each member within the system, consisting of n people, can encode and send a message to any other member, but only the intended recipient can decode the message
  • Everyone has their own encryption key: $e_1, e_2, e_3,...,e_n$ and everyone has their own decryption key: $d_1, d_2, d_3,...,d_n$

Suppose Chloe wants to send a message to Jack, She would encrypt the message using Jack's public key $e_j$ and Jack would decrypt it using his private decryption key $d_j$

Now, if Jack wanted to reply to Chloe, he must use Chloe's encryption key $e_c$. Then only Chloe could decrypt it with her private decryption key $d_c$.

The beauty of this form of cryptography is that no previous meeting is required to agree upon a key. Person A can simply publish a public key to everyone, and this will allow anyone to send person A a secret message. Since Person A is the only one with the decryption key, he/she will be the only person able to decrypt the message, making it safe.

To discuss this in matters relevant to us, RSA allows you to sit at your computer and confidently type your credit card and other personal information into cites like paypal, amazon, and the like. RSA encodes your sensitive information into ciphertext and sends this across the internet. Then if a 3rd party, say a hacker, intercepts the ciphered information he will not have the required tools to make the message understandable. Only amazon, paypal or whoever you intended to receive your message has this ability. And this is why we love RSA, and why it is such a practical form of encryption.

Now lets take a look at the process of RSA

Creating a Public and Private Key

  • Randomly choose two LARGE distinct primes $p$ and $q$.
  • Let $m=pq$
  • Calculate $\phi (m)=(p-1)(q-1)$
  • Choose an integer $e$ such that ($e,\phi(m)$)=1
  • Find an integer $d$ which satisfies $ed \equiv 1 \mod \phi (m)$

Your public key will be the ordered pair ($e,m$),
while your private key is the ordered pair ($d,m$).

In order to avoid confusion with the Greatest Common Divisor function we will explicitly state when (#,#) is a public/private key. It is important that the only information made public is your public key. This means $p$, $q$, $\phi (m)$ and $d$ must be kept secret because with any of this information, one could break the code.

Encryption

Set Up

To begin, we must associate each letter of the alphabet with a unique number. This will allow us to convert our message into a series of numbers which we can then perform operations on. Let us use the following table for this,

Letter Number Letter Number
A 00 N 13
B 01 O 14
C 02 P 15
D 03 Q 16
E 04 R 17
F 05 S 18
G 06 T 19
H 07 U 20
I 08 V 21
J 09 W 22
K 10 X 23
L 11 Y 24
M 12 Z 25
"__" 26

Note that instead of letting A=0, we set it equal to 00. This is because once we get up to K we start using double digits. If we have a mix of single digits and double digits it would be impossible to convert back to our original message. Also, it is useful to denote spaces in between words with a number. We will use an underscore between words instead of space to make it clearer.

Lets look at an example, say our plaintext is

message: NUMBER_THEORY

N U M B E R _ T H E O R Y
13 20 12 01 04 17 26 19 07 04 14 17 24

So we get,

plaintext: 13201201041726190704141724

The Encryption Algorithm

  • Convert plaintext into its numerical equivalent, as shown previously
  • Recall the public key you are using is ($e,m$)
  • Separate these numbers into blocks, which are viewed as a single number, and label them $P_1$, P_2, ..., P_k$, making sure that for each $1 \leq i \leq k$ that $P_i < m$.
  • For each $1 \leq i \leq k$, compute $(P_i)^e\mod m$ and label the result $C_i$
  • The resulting blocks $C_1, C_2, ..., C_k$ are your new cipher text.

You can now transmit $C_1, C_2, ..., C_k$

Example of the Encryption Algorithm

Lets say our old friend Chloe chose $p=31$, and $q=37$ for her prime numbers, which gives $m=(31)(37)=1147$. So her $\phi (m)=\phi(1147)=(31-1)(37-1)=1080$. Then she must find an integer e which is relatively prime to 1080. She randomly selects $e=17$, and note that $(17,1080)=1$. So Chloe publishes her public key of (17,1147).

Say Jack wants to send Chloe an encrypted message using RSA. All he knows is Chloe's public key (17,1147), so he uses this in the encryption algorithm.

Jack's secret message to Chloe states,

message: WE_LOVE_MATH

After converting to numericals using the table above Jack has,

plaintext : 220426111421042612001907

He breaks the numerical form of the message into blocks of 3 making sure each block is less than m.

plaintext in blocks: 220 426 111 421 042 612 001 907

So,

(1)
\begin{align} \begin{split} 220=P_1\\ 426=P_2\\ 111=P_3\\ 421=P_4\\ 042=P_5\\ 612=P_5\\ 001=P_7\\ 907=P_8 \end{split} \end{align}

He then computes for each $P_{i}$, from $i=1$ to $i=8$, $P_{i}^{e} \mod m$, with e=17, and m=1147 taken from Chloe's public key. And the result is the ciphertext

(2)
\begin{align} \begin{split} (220)^{17} \mod 1147&\equiv 611 \\ (426)^{17} \mod 1147&\equiv 1145 \\ (111)^{17} \mod 1147&\equiv 851\\ (421)^{17} \mod 1147&\equiv 510\\ (042)^{17} \mod 1147&\equiv 96 \\ (612 )^{17} \mod 1147&\equiv 246 \\ (001)^{17} \mod 1147&\equiv 1 \\ (907)^{17} \mod 1147&\equiv 405 \end{split} \end{align}

So his ciphertext becomes,

ciphertext: 611 1145 851 510 96 246 1 405

And Jack sends "611 1145 851 510 96 246 1 405" to Chloe.

The Decryption Algorithm

  • Let $C_1, C_2, ..., C_k$ be your ciphertext blocks, and ($d,m$) be your private key
  • For each $1 \leq i \leq k$, compute $(C_i)^d \mod m$ and the result will be $P_i$ where $0 \leq P_i < m$
  • Then, $P_1, P_2, ..., P_k$ is your plaintext in numerical form
  • Finally, convert the pairs of two-digit numbers back to its alphabetical equivalent using our table.

Your message should now be comprehensible.

Example of the Decryption Algorithm

Again, returning to our Jack and Chloe example, Jack was sending a message to Chloe using her public key. Now for Chloe to decrypt a message sent to her, she must use her private key. We will re-use Chloe's key information in this example.

  • Chloe (and ONLY Chloe) knows m=1147=(31)(37), so $\phi(1147)=1080$
  • Recall she uses e=17
  • The next step is solving for $d$, which is the multiplicative inverse of $e \mod 1080$

Then She wants to solve for d in the linear congruence $17d \equiv 1 \mod 1080$. This is easily done using the Euclidean Algorithm. This is pretty easy, but lets go through the work for a refresher.

(3)
\begin{align} \begin{split} 1080&=(63)17+9\\ 17&=(1)9+8\\ 9&=(1)8+1 \end{split} \end{align}

Working backwards and substituting, She has

(4)
\begin{align} \begin{split} 1&=9-(1)8\\ &=9-(1)[17-(1)9]\\ &=(2)9-(1)17\\ &=(2)[1080-(63)17]-(1)17\\ &=(2)1080+(-127)17\\ &=(2)1080+(-127)17 \end{split} \end{align}

Next, Chloe mods the last equation by 1080 to get

(5)
\begin{align} 1 \equiv (-127)17 \mod 1080 \implies (953)17 \equiv 1 \mod 1080 \end{align}

To get a positive value for d, she just adds 1080 to -127 to get $(953)17 \equiv 1 \mod 1080$. So her d=953. Not to bad right?

So Chloe knows her private key is (953, 1147).

Now lets see how Chloe uses this info to decrypt a message. Say for example, Jack uses Chloe's public key and sends her another message using a block size of 3. This is what Chloe recieves,

ciphertext: 1 41 203 744 472 947 423 968 718

Chloe has all the information she needs to go through the decryption process to see what Jack sent her. Taking each ciphertext block $C_{1}=1, C_{2}=41, ..., C_{9}=560$, and knowing d=953, and m=1147, she computes $C_{i}^{953} \mod 1147$. The resulting integers are the plaintext $P_{1}, P_{2}, ..., P_{9}$. She gets,

(6)
\begin{align} \begin{split} 1^{953} &\mod 1147 \equiv 1=P_{1}\\ 41^{953} &\mod 1147 \equiv 324=P_{2}\\ 203^{953} &\mod 1147 \equiv 261=P_{3}\\ 744^{953} &\mod 1147 \equiv 620=P_{4}\\ 472^{953} &\mod 1147 \equiv 41=P_{5}\\ 947^{953} &\mod 1147 \equiv 819=P_{6}\\ 423^{953} &\mod 1147 \equiv 81=P_{7}\\ 968^{953} &\mod 1147 \equiv 413=P_{8}\\ 718^{953} &\mod 1147 \equiv 180=P_{9} \end{split} \end{align}

resulting #'s: 1 324 261 620 41 819 81 413 180

So the resulting numbers should be the plaintext in number form. Since Chloe knows that Jack used a block size of 3, she knows to add extra zeros to $P_{1}=1$, $P_{5}=41$, and $P_7=81$. So $P_1$ becomes 001, $P_{5}$ becomes 041, and $P_7$ becomes 081. So Chloe has the plaintext:

plaintext: 001 324 261 620 041 819 081 413 180

Finally she regroups into pairs of two so she can covert each pair back into its alphabetical equivalent.

00 13 24 26 16 20 04 18 19 08 14 13 18 0
A N Y _ Q U E S T I O N S

Why is there a loner "0" at the end of out message, and how do we convert it back to alphabetical form? Since Chloe knows there is no letter of the alphabet paired up with the single digit "0" in their conversion table, she concludes this extra "'0" must be a filler space. Jack simply added a meaningless bit of information to his plaintext in order to divide the plaintext evenly into groups of three. It is important to see here that the original message did not divide evenly into groups of three. If we try and do this, we will be left with the last block of "18", which clearly is not 3 digits as specified. So a filler space is needed, and "0" was randomly chosen to fill in. So finally, we have the message Jack sent to Chloe is,

message: ANY_QUESTIONS

Theory Behind RSA

Why the Decryption Algorithm Works

The decryption algorithm works because of the way we choose the value of $d$. Recall that,$e$ is an integer such that ($e,\phi(m)$)=1

From page 53 in our text we have the following corollary

Corollary 2.8: the linear congruence in one variable $ex \equiv 1 \mod m$ has a solution iff $(e,m)=1$; if $(e,m)=1$, then $ex \equiv 1 \mod m$ has exactly one incongruent solution modulo $m$

So the corollary provides that there exists exactly one integer $d$ satisfying $ed \equiv 1 \mod \phi (m)$. In the encryption algorithm we find for $0\leq C_i < m$

(7)
\begin{align} C_i \equiv (P_i)^e \mod m \end{align}

During decryption we compute $(C_i)^d \mod m$. We wish to show that $(C_i)^d \mod m \equiv P_i$. Consider $(C_i)^d \mod m$. Using what we know from (7), we simplify

(8)
\begin{align} (C_i)^d &\equiv ((P_i)^e)^d &\equiv (P_i)^{ed} \mod m \end{align}

So it will be equivalent to show that $(P_i)^{ed} \equiv P_i \mod m$

To show $(P_i)^{ed} \equiv P_i \mod m$

CASE 1:$(P_i, m)=1$

Because $ed \equiv 1 \mod \phi (m)$ we have equivalently that $ed=\phi(m)*k+1$, where $k$ is some integer. We substitute

(9)
\begin{align} (P_i)^{ed} &\equiv (P_i)^{\phi(m)*k+1} &\equiv (P_i)^{\phi(m)*k}(P_i)^1 &\equiv ((P_i)^{\phi(m)})^k(P_i) \end{align}

Since $(P_i, m)=1$, we can apply Euler's theorem, which provides that $(P_i)^{\phi(m)}\equiv 1 \mod m$, thus

(10)
\begin{align} ((P_i)^{\phi(m)})^k(P_i) &\equiv (1)^k(P_i) &\equiv P_i \mod m \end{align}

This proves that $(P_i)^{ed} \equiv P_i \mod m$ when $(P_i, m)=1$.

CASE 2:$(P_i, m) \neq 1$

Since $m=pq$, where $p,q$ are primes, then if $(P_i, m) \neq 1$ either (a) only $p|P_i$, (b) only $q|P_i$, or (c) both $p$ and $q$ divide $P_i$.

(a)
Now consider when only $p|P_i$. We will examine the following congruences: (I) $P_i^{ed} \mod p$, and (II) $P_i^{ed} \mod q$. If both happen to be congruent to $P_i$, we can apply the CRT to glue them back to show that $(P_i)^{ed} \equiv P_i \mod pq$, where $pq=m$.

So, looking first at (I) we know only $p|P_i$, thus $P_i \equiv 0 \mod p$, and any power of $P_i$, will also be equivalent to 0 modulo $p$. So, $P_i^{ed} \equiv 0 \mod p$. Therefore, $(P_i)^{ed} \equiv P_i \mod p$ as desired.

Switching to equation (II) we must use the fact that $ed \equiv 1 \mod \phi (m)$. By definition this says that $\phi(m)|ed-1$. And since $\phi(m)=\phi(p)\phi(q) \implies \phi(q)|\phi(m)$, then we know that $\phi(q)|ed-1 \implies \exists t \in \mathbb{Z}$ such that $ed=1+t\phi(q)$

(11)
\begin{align} P_i^{ed} \equiv P_i^{1+t\phi(q)} \equiv P_i^{1}P_i^{t\phi(q)} \equiv P_i^{1}(P_i^{\phi(q)})^{t}\equiv P_i^{1}(P_i^{(q-1)})^{t}\mod q \end{align}

Finally, because only $p|P_i$, then $(q ,P_i)=1$ and we can apply FLT because $q$ is pime. So FLT states that $(P_i^{(q-1)}\equiv 1 \mod q)$. Which gives, $P_i^{ed} \equiv P_{i}(1)^{t} \equiv P_i \mod q$. So $(P_i)^{ed} \equiv P_i \mod q$

Now we can apply the CRT to the congruences (I)$(P_i)^{ed} \equiv P_i \mod p$ and (II) $(P_i)^{ed} \equiv P_i \mod q$ to show $(P_i)^{ed} \equiv P_i \mod pq$, where $pq=m$. Which is what we wanted.

(b)
The case where only $q|P_i$ is proven similarly to case (a). Simply switch every $p$ with a $q$, and every $q$ with a $p$.

(c)
If $p$ and $q$ divide $P_i$, then $pq=m|P_i$. Then clearly, $P_i \equiv 0 \mod m$, and any power of $P_i$, will also be equivalent to 0 modulo $m$. Thus, $P_i^{ed} \equiv 0 \mod m$. So, we showed in this case that $(P_i)^{ed} \equiv P_i \mod m$

So in all possible cases, $(P_i)^{ed} \equiv P_i \mod m$ as desired.

Cracking RSA

The main idea that allows RSA to be an effective form of cryptography is that large numbers are very difficult to factor. In order to decrypt messages encoded by RSA one must know $d$, which is the multiplicative inverse of $e$ modulo $\phi(m)$. Since a public key only provides the values of $e$ and $m$, we must find $\phi (m)=(p-1)(q-1)$, in order to determine $d$. So the problem becomes solving for $\phi(m)$.

The only way to find $\phi(m)$ is to solve for $p$ and $q$ by factoring $m$. Since $m$ is very large it will be difficult to factor (ie time consuming), meaning it will be extremely difficult to determine what $\phi(m)$ is. So the security of RSA relies on the difficulty of computing $\phi(m)$ which is related to the difficulty of factoring large numbers. Thus, if someone publishes p and/or q a "hacker" could easly solve $\phi(m)$ and compute d.

It is obvious that one would never want to publish $\phi(m)$ because with the number from the public key (e,m), a 3rd party could easily solve for d in the linear congruence $ed \equiv 1 \mod \phi(m)$. It is interesting to note though, that from knowing $\phi(m)$ and $m$ without any other information, you could easily solve for p and q, where m=pq and p,q are prime. You may be asked to do this in the home work.

Making RSA Computations Easier

As mentioned in class, performing the encryption and decryption computations can be very tedious and time consuming. In the real world, one would be able to rely on computers for these computations. The way our group computed the values shown in the examples was using Mathematica.

Here are the functions we may need for the home work:

PowerMod[a,b,m]

This computes $a^b \mod m$, which will come in handy for computing $P_i^e \mod m$ during encryption and $C_i^d \mod m$ during decryption.

For example to solve $(220)^{17} \mod 1147$, you type

PowerMod[220,17,1147]

And the result should be 611.

PowerMod[a,-1,m]

This is really the same function as the previous one with setting b=-1. This function will be used to solve for the multiplicative inverse of a modulo m. For example, to solve for $d$ in $17d\equiv 1 \mod 1147$, you type

PowerMod[17,-1,1147]

And the result should be 953, as in our example.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License