RSA is one of the first practicable public-key cryptosystems and is widely used for secure data transmission. In such a cryptosystem, the encryption key is public and differs from the decryption key which is kept secret. In RSA, this asymmetry is based on the practical difficulty of factoring the product of two large prime numbers, the factoring problem. RSA stands for Ron Rivest, Adi Shamir and Leonard Adleman, who first publicly described the algorithm in 1977. Clifford Cocks, an English mathematician, had developed an equivalent system in 1973, but it wasn’t declassified until 1997.[1]
A user of RSA creates and then publishes a public key based on the two large prime numbers, along with an auxiliary value. The prime numbers must be kept secret. Anyone can use the public key to encrypt a message, but with currently published methods, if the public key is large enough, only someone with knowledge of the prime factors can feasibly decode the message. Breaking RSA encryption is known as the RSA problem. It is an open question whether it is as hard as the factoring problem.
The RSA algorithm involves three steps: key generation, encryption and decryption.
Key generation
RSA involves a public key and a private key. The public key can be known by everyone and is used for encrypting messages. Messages encrypted with the public key can only be decrypted in a reasonable amount of time using the private key. The keys for the RSA algorithm are generated in the following way:
- Choose two distinct prime numbers p and q.
- For security purposes, the integers p and q should be chosen at random, and should be of similar bit-length. Prime integers can be efficiently found using a primality test.
- Compute n = pq.
- n is used as the modulus for both the public and private keys. Its length, usually expressed in bits, is the key length.
- Compute φ(n) = φ(p)φ(q) = (p − 1)(q − 1) = n − (p + q − 1), where φ is Euler’s totient function.
- Choose an integer e such that 1 < e < φ(n) and gcd(e, φ(n)) = 1; i.e., e and φ(n) are coprime.
- e is released as the public key exponent.
- Choosing e with a short bit-length and small Hamming weight results in more efficient encryption; the most common choice is 2^16 + 1 = 65,537. However, much smaller values of e (such as 3) have been shown to be less secure in some settings.[5]
- Determine d as d ≡ e^(−1) (mod φ(n)); i.e., d is the multiplicative inverse of e (modulo φ(n)).
- This is more clearly stated as: solve for d given d⋅e ≡ 1 (mod φ(n))
- This is often computed using the extended Euclidean algorithm. Using the pseudocode in the Modular integers section, inputs a and n correspond to e and φ(n), respectively.
- d is kept as the private key exponent.
The public key consists of the modulus n and the public (or encryption) exponent e. The private key consists of the modulus n and the private (or decryption) exponent d, which must be kept secret. p, q, and φ(n) must also be kept secret because they can be used to calculate d.
- An alternative, used by PKCS#1, is to choose d matching de ≡ 1 (mod λ) with λ = lcm(p − 1, q − 1), where lcm is the least common multiple. Using λ instead of φ(n) allows more choices for d. λ can also be defined using the Carmichael function, λ(n).
- The ANSI X9.31 standard prescribes, IEEE 1363 describes, and PKCS#1 allows, that p and q match additional requirements: being strong primes, and being different enough that Fermat factorization fails.
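The key-generation steps above can be sketched in Python as follows. This is a minimal illustration, not a production implementation: the function names (`extended_gcd`, `mod_inverse`, `generate_keypair`) and the `use_carmichael` flag are hypothetical, the primes are supplied by the caller rather than chosen at random, and none of the additional requirements on p and q are checked.

```python
from math import gcd


def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y


def mod_inverse(a, modulus):
    """Return the inverse of a modulo modulus, or raise if none exists."""
    g, x, _ = extended_gcd(a, modulus)
    if g != 1:
        raise ValueError("a and modulus are not coprime")
    return x % modulus


def generate_keypair(p, q, e=65537, use_carmichael=False):
    """Build ((n, e), (n, d)) from two distinct primes p and q.

    Illustrative only: p and q are assumed to be prime and are not tested.
    """
    if p == q:
        raise ValueError("p and q must be distinct")
    n = p * q
    phi = (p - 1) * (q - 1)
    # lambda = lcm(p - 1, q - 1), the alternative modulus for d
    lam = phi // gcd(p - 1, q - 1)
    modulus = lam if use_carmichael else phi
    if gcd(e, modulus) != 1:
        raise ValueError("e must be coprime to the chosen modulus")
    d = mod_inverse(e, modulus)
    return (n, e), (n, d)


public_key, private_key = generate_keypair(61, 53, e=17)
print(public_key, private_key)  # (3233, 17) (3233, 2753)
```

With `use_carmichael=True` the same primes give the smaller exponent d = 413, which also satisfies de ≡ 1 (mod λ) and decrypts correctly.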
Encryption
Alice transmits her public key (n, e) to Bob and keeps the private key d secret. Bob then wishes to send message M to Alice.
He first turns M into an integer m, such that 0 ≤ m < n by using an agreed-upon reversible protocol known as a padding scheme. He then computes the ciphertext c corresponding to
c ≡ m^e (mod n)
This can be done quickly using the method of exponentiation by squaring. Bob then transmits c to Alice.
Note that at least nine values of m will yield a ciphertext c equal to m, but this is very unlikely to occur in practice.
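Exponentiation by squaring, as mentioned above, can be sketched as follows. The function name `mod_exp` is an assumption made for illustration; Python's built-in three-argument `pow` does the same job. The toy key (n = 3233, e = 17) is the one from the worked example in this article and is used here to list the messages that encrypt to themselves.

```python
def mod_exp(base, exponent, modulus):
    """Compute base**exponent % modulus by right-to-left square-and-multiply."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                      # current bit set: multiply
            result = (result * base) % modulus
        base = (base * base) % modulus        # square for the next bit
        exponent >>= 1
    return result


n, e = 3233, 17                               # toy public key
ciphertext = mod_exp(65, e, n)

# Messages m with m**e % n == m (ciphertext equal to plaintext).
fixed_points = [m for m in range(n) if mod_exp(m, e, n) == m]
print(ciphertext, len(fixed_points))
```

The loop performs one squaring per bit of the exponent and one extra multiplication per set bit, which is why exponents with a small Hamming weight such as 65,537 are cheap.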
Decryption
Alice can recover m from c by using her private key exponent d via computing
m ≡ c^d (mod n)
Given m, she can recover the original message M by reversing the padding scheme.
(In practice, there are more efficient methods of calculating c^d using the precomputed values below.)
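A round trip through encryption and decryption might look like the sketch below. Several things here are assumptions for illustration only: the two Mersenne primes stand in for randomly chosen primes (they are publicly known and of very different lengths, so this key offers no security), the message is converted to an integer directly instead of through a padding scheme, and the helper names `encrypt` and `decrypt` are hypothetical. The modular inverse via `pow(E, -1, ...)` needs Python 3.8 or later.

```python
# Demo-only key: well-known Mersenne primes, NOT suitable for real use.
P = 2**127 - 1
Q = 2**521 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent


def encrypt(message: bytes) -> int:
    """Textbook RSA encryption without padding (insecure, for illustration)."""
    m = int.from_bytes(message, "big")
    if not 0 <= m < N:
        raise ValueError("message too long for this modulus")
    return pow(m, E, N)


def decrypt(ciphertext: int, length: int) -> bytes:
    """Recover the original bytes; length restores any leading zero bytes."""
    m = pow(ciphertext, D, N)
    return m.to_bytes(length, "big")


message = b"hello, Alice"
ciphertext = encrypt(message)
recovered = decrypt(ciphertext, len(message))
print(recovered == message)
```

A real padding scheme would randomize m before exponentiation; the direct byte-to-integer conversion used here is deterministic and only shows the arithmetic.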
A worked example
Here is an example of RSA encryption and decryption. The parameters used here are artificially small, but one can also use OpenSSL to generate and examine a real keypair.
- Choose two distinct prime numbers, such as p = 61 and q = 53.
- Compute n = pq giving n = 61 × 53 = 3233.
- Compute the totient of the product as φ(n) = (p − 1)(q − 1) giving φ(3233) = (61 − 1)(53 − 1) = 3120.
- Choose any number 1 < e < 3120 that is coprime to 3120. Choosing a prime number for e leaves us only to check that e is not a divisor of 3120.
- Let e = 17.
- Compute d, the modular multiplicative inverse of e (mod φ(n)) yielding d = 2753.
- Worked example for the modular multiplicative inverse: d × e mod φ(n) = 1, and indeed 2753 × 17 = 46801 = 15 × 3120 + 1.
The public key is (n = 3233, e = 17). For a padded plaintext message m, the encryption function is
c(m) = m^17 mod 3233
The private key is (n = 3233, d = 2753). For an encrypted ciphertext c, the decryption function is
m(c) = c^2753 mod 3233
For instance, in order to encrypt m = 65, we calculate
c = 65^17 mod 3233 = 2790
To decrypt c = 2790, we calculate
m = 2790^2753 mod 3233 = 65
Both of these calculations can be computed efficiently using the square-and-multiply algorithm for modular exponentiation. In real-life situations the primes selected would be much larger; in our example it would be trivial to factor n, 3233 (obtained from the freely available public key) back to the primes p and q. Given e, also from the public key, we could then compute d and so acquire the private key.
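The observation that a small modulus is trivial to break can be made concrete with a short sketch. The helper `factor_semiprime` is a hypothetical name for plain trial division, which is only feasible for toy moduli like this one; the rest recomputes the private exponent from the public key alone.

```python
def factor_semiprime(n):
    """Return (p, q) with p <= q and p * q == n, found by trial division."""
    candidate = 2
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate, n // candidate
        candidate += 1
    raise ValueError("n has no non-trivial factor")


n, e = 3233, 17                      # public key from the example
p, q = factor_semiprime(n)           # trial division returns the smaller factor first
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)                  # Python 3.8+: modular inverse

c = pow(65, e, n)                    # encrypt m = 65
m = pow(c, d, n)                     # decrypt with the recovered exponent
print(p, q, d, c, m)                 # 53 61 2753 2790 65
```

For moduli of realistic size the loop in `factor_semiprime` would never finish in practice, which is the asymmetry the cryptosystem relies on.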
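Practical implementations use the Chinese remainder theorem to speed up the calculation by working modulo the factors p and q instead of modulo pq.
The values d_p, d_q and q_inv, which are part of the private key, are computed as follows (with p = 61 and q = 53, the factors of n = 3233):
d_p = d mod (p − 1) = 2753 mod 60 = 53
d_q = d mod (q − 1) = 2753 mod 52 = 49
q_inv = q^(−1) mod p = 53^(−1) mod 61 = 38, since 38 × 53 = 2014 = 33 × 61 + 1
Here is how d_p, d_q and q_inv are used for efficient decryption. (Encryption is efficient by choice of public exponent e.)
m_1 = c^(d_p) mod p = 2790^53 mod 61 = 4
m_2 = c^(d_q) mod q = 2790^49 mod 53 = 12
h = (q_inv × (m_1 − m_2)) mod p = (38 × −8) mod 61 = 1
m = m_2 + h × q = 12 + 1 × 53 = 65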
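The Chinese remainder theorem shortcut can be sketched as below, using the toy key from this example with p = 61 and q = 53. The variable and function names (`d_p`, `d_q`, `q_inv`, `decrypt_crt`) are chosen for illustration, and `pow(q, -1, p)` assumes Python 3.8 or later.

```python
p, q = 61, 53
n = p * q                            # 3233
d = 2753                             # private exponent

# Precomputed once and stored with the private key.
d_p = d % (p - 1)                    # 53
d_q = d % (q - 1)                    # 49
q_inv = pow(q, -1, p)                # 38


def decrypt_crt(c):
    """Decrypt c with two small exponentiations instead of one large one."""
    m_1 = pow(c, d_p, p)
    m_2 = pow(c, d_q, q)
    h = (q_inv * (m_1 - m_2)) % p    # Python's % keeps h non-negative
    return m_2 + h * q


print(decrypt_crt(2790))             # 65
```

Each of the two exponentiations works with numbers about half the length of n and an exponent about half the length of d, which is where the speed-up comes from.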
Signing messages
Suppose Alice uses Bob’s public key to send him an encrypted message. In the message, she can claim to be Alice but Bob has no way of verifying that the message was actually from Alice since anyone can use Bob’s public key to send him encrypted messages. In order to verify the origin of a message, RSA can also be used to sign a message.
Suppose Alice wishes to send a signed message to Bob. She can use her own private key to do so. She produces a hash value of the message, raises it to the power of d (modulo n) (as she does when decrypting a message), and attaches it as a “signature” to the message. When Bob receives the signed message, he uses the same hash algorithm in conjunction with Alice’s public key. He raises the signature to the power of e (modulo n) (as he does when encrypting a message), and compares the resulting hash value with the message’s actual hash value. If the two agree, he knows that the author of the message was in possession of Alice’s private key, and that the message has not been tampered with since.
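The hash-then-sign procedure described above might be sketched like this. The choice of SHA-256, the helper names (`hash_to_int`, `sign`, `verify`) and the demo key built from two well-known Mersenne primes are all assumptions for illustration; the key is insecure, and practical signature schemes also pad the hash before exponentiation, which this sketch omits.

```python
import hashlib

# Demo-only key for Alice: publicly known primes, NOT suitable for real use.
P = 2**127 - 1
Q = 2**521 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))    # Python 3.8+


def hash_to_int(message: bytes) -> int:
    """Hash the message and read the digest as an integer below N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(message: bytes) -> int:
    """Alice raises the hash to the power d (modulo n)."""
    return pow(hash_to_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Bob raises the signature to the power e and compares the hashes."""
    return pow(signature, E, N) == hash_to_int(message)


signature = sign(b"pay Bob 10 coins")
print(verify(b"pay Bob 10 coins", signature))    # True
print(verify(b"pay Bob 99 coins", signature))    # False
```

The 256-bit digest is always smaller than this modulus, so no reduction modulo n is needed before signing.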
Proof using Fermat’s little theorem
The proof of the correctness of RSA is based on Fermat’s little theorem. This theorem states that if p is prime and p does not divide an integer a then
a^(p − 1) ≡ 1 (mod p)
We want to show that (m^e)^d ≡ m (mod pq) for every integer m when p and q are distinct prime numbers and e and d are positive integers satisfying
ed ≡ 1 (mod (p − 1)(q − 1))
We can write
ed − 1 = h(p − 1)(q − 1)
for some nonnegative integer h.
To check two numbers, like m^(ed) and m, are congruent mod pq it suffices (and in fact is equivalent) to check they are congruent mod p and mod q separately. (This is part of the Chinese remainder theorem, although it is not the significant part of that theorem.) To show m^(ed) ≡ m (mod p), we consider two cases: m ≡ 0 (mod p) and m ≢ 0 (mod p).
In the first case m^(ed) is a multiple of p, so m^(ed) ≡ 0 ≡ m (mod p). In the second case
m^(ed) = m^(ed − 1) m = m^(h(p − 1)(q − 1)) m = (m^(p − 1))^(h(q − 1)) m ≡ 1^(h(q − 1)) m ≡ m (mod p)
where we used Fermat’s little theorem to replace m^(p − 1) mod p with 1.
The verification that m^(ed) ≡ m (mod q) proceeds in a similar way, treating separately the cases m ≡ 0 (mod q) and m ≢ 0 (mod q), using Fermat’s little theorem for modulus q in the second case.
This completes the proof that, for any integer m,
(m^e)^d ≡ m (mod pq)
Proof using Euler’s theorem
Although the original paper of Rivest, Shamir, and Adleman used Fermat’s little theorem to explain why RSA works, it is common to find proofs that rely instead on Euler’s theorem.
We want to show that m^(ed) ≡ m (mod n), where n = pq is a product of two different prime numbers and e and d are positive integers satisfying ed ≡ 1 (mod φ(n)). Since e and d are positive, we can write ed = 1 + hφ(n) for some non-negative integer h. Assuming that m is relatively prime to n, we have
m^(ed) = m^(1 + hφ(n)) = m (m^(φ(n)))^h ≡ m (1)^h ≡ m (mod n)
where the second-last congruence follows from Euler’s theorem.
When m is not relatively prime to n, the argument just given is invalid. This is highly improbable (only a proportion of 1/p + 1/q − 1/(pq) numbers have this property), but even in this case the desired congruence is still true. Either m ≡ 0 (mod p) or m ≡ 0 (mod q), and these cases can be treated using the previous proof.
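Both claims in this section can be checked exhaustively for a toy modulus. The sketch below assumes the small key n = 61 × 53 = 3233, e = 17, d = 2753 from the worked example; it is a numerical sanity check, not a substitute for the proofs.

```python
from fractions import Fraction
from math import gcd

p, q = 61, 53
n = p * q
e, d = 17, 2753
assert (e * d) % ((p - 1) * (q - 1)) == 1

# The congruence m^(ed) = m (mod n) holds for every residue m,
# including those that share a factor with n.
holds_everywhere = all(pow(m, e * d, n) == m for m in range(n))

# Residues that are not relatively prime to n (multiples of p or of q).
not_coprime = [m for m in range(n) if gcd(m, n) != 1]
proportion = Fraction(len(not_coprime), n)
expected = Fraction(1, p) + Fraction(1, q) - Fraction(1, p * q)

print(holds_everywhere, len(not_coprime), proportion == expected)
```

For primes of realistic size the proportion 1/p + 1/q − 1/(pq) is negligible, but as the check shows, decryption still succeeds for those residues.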