Integrity

Asymmetric cryptography, key distribution, hybrid cryptosystems, MACs, digital signatures, and certificates

Paul Krzyzanowski

January 3, 2025

Public key cryptography

Public key algorithms, also known as asymmetric ciphers, use one key for encryption and a separate, but related, key for decryption. One of these keys is kept private (known only to the creator) and is known as the private key. The corresponding key is generally made visible to others and is known as the public key.

Content encrypted with the private key can only be decrypted with the corresponding public key. This will be the basis for digital signatures. Anything that is encrypted with a public key can be decrypted only with the corresponding private key. This is the basis for authentication and covert communication.

Although public and private keys are related, because public keys are expected to be shared, it is crucial for security that there is no way to compute a private key from a public key. Public key cryptography is based on trapdoor functions. These are one-way functions: there is no known way to compute the inverse unless you have extra data: the other key.

One example of a trapdoor function is the difficulty of factoring. Suppose you are given the number 10085134437017 and are told that it is the product of two prime numbers. How would you find them? We don’t know of any efficient algorithm that would yield an answer; you would have to perform an exhaustive search. However, if you are told that one of the factors is 3906467, then it becomes trivial to perform a division to find the other factor. For cryptographic security, these numbers would be hundreds of decimal digits long.
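
To make the asymmetry concrete, here is a small Python sketch (using the numbers from this example) that contrasts the search needed to factor n with the single division needed once one factor is known. The code and variable names are illustrative only.

```python
# Factoring as a trapdoor: finding the factors of n from scratch requires a
# search, but dividing is trivial once one factor is known.
# (Toy illustration using the numbers from the example above.)

n = 10085134437017          # stated to be the product of two primes
known_factor = 3906467      # the "extra data" that opens the trapdoor

# Hard direction: exhaustive trial division. Not called here; for real key
# sizes, where the factors are hundreds of decimal digits long, this is infeasible.
def factor_by_search(n):
    i = 3
    while i * i <= n:
        if n % i == 0:
            return i, n // i
        i += 2
    return None

# Easy direction: with one factor in hand, a single division reveals the other.
other_factor = n // known_factor
print(other_factor)                       # 2581651
print(known_factor * other_factor == n)   # True
```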

RSA public key cryptography

The RSA algorithm was the first public key encryption algorithm. Its security is based on the difficulty of finding the factors of the product of two large prime numbers. Unlike symmetric ciphers, RSA encryption is a matter of performing arithmetic on large numbers. Like symmetric ciphers, it works on a block of data at a time and treats that block as a number. Plaintext is converted to ciphertext by the formula:

\(c = m^e \bmod n\)

Where m is a block of plaintext, e is the encryption key, and n is an agreed-upon modulus that is the product of two primes. To decrypt the ciphertext, you need the decryption key, d:

\(m = c^d \bmod n\)

Given the ciphertext c, e, and n, there is no efficient way to compute the inverse to obtain m. Should an attacker find a way to factor n into its two prime factors, however, the attacker would be able to reconstruct the encryption and decryption keys, e and d.

The math behind RSA - an example

To demystify RSA, here’s the math behind it. We’ll do an example with very small numbers. The results will be completely insecure cryptographically but will more concisely illustrate how the math works.

(1) Select two distinct prime numbers \(p\) and \(q\). Typical numbers are over 300 decimal digits long for a 2048-bit key. Each number is chosen by starting at a random point and searching for the first prime number. The algorithm will typically test to see if the number is divisible by the first few hundred primes and then apply several iterations of the Rabin-Miller Primality Test to get a high confidence that the number really is prime.

  • Let’s choose \(p = 3\) and \(q = 11\).

(2) We multiply the two primes together to get a value \(n = pq\), which will be used as the modulus for both the public and private keys.

  • So, \(n = 3 \times 11 = 33\).

(3) Calculate a value called the totient function \(\phi(n)\), which is \((p-1)(q-1)\)

  • Here, \(\phi(33) = (3-1) \times (11-1) = 2 \times 10 = 20\).

(4) Choose an integer value for the public exponent, \(e\), such that \(1 < e < \phi(n)\), and \(e\) and \(\phi(n)\) are relatively prime, meaning they share no factors other than 1 (i.e., the greatest common divisor of e and phi is one).

  • We choose \(e = 7\).

(5) The last step is to find the secret exponent. This requires applying the extended Euclidean algorithm to find a value \(d\) such that \(de \equiv 1 \mod \phi(n)\). In other words, we need \(d\) to be the modular multiplicative inverse of \(e\) modulo \(\phi(n)\).

  • We find a \(d\) such that \(7d \equiv 1 \mod 20\).
  • The calculated value of \(d\) is 3.

(6) Now we have a public-private key pair. We can discard \(\phi(n)\), \(p\), and \(q\). Each key is the exponent and the shared modulus.

  • Public Key: \((e=7, n=33)\)
  • Private Key: \((d=3, n=33)\)

Because our modulus is tiny (33 takes up barely more than 5 bits), we can only show examples of encrypting and decrypting tiny values. Here are a few examples:

Encrypt 18 with the public key: \(c = m^e \bmod n = 18^7 \bmod 33 = 6\)

Decrypt 6 with the private key: \(m = c^d \bmod n = 6^3 \bmod 33 = 18\)

If we encrypt something with the private key, we need to decrypt it with the public key (this is the basis for digital signatures):

Encrypt 29 with the private key: \(c = m^d \bmod n = 29^3 \bmod 33 = 2\)

Decrypt 2 with the public key: \(m = c^e \bmod n = 2^7 \bmod 33 = 29\)
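
Here is a minimal Python sketch that reproduces the toy example above. It is cryptographically useless (real implementations use enormous primes and padding schemes such as OAEP), but it shows that RSA encryption and decryption are just modular exponentiation.

```python
# Toy RSA with the numbers from the example above (completely insecure;
# real keys use primes hundreds of digits long and padding such as OAEP).

p, q = 3, 11
n = p * q                 # 33, the shared modulus
phi = (p - 1) * (q - 1)   # 20
e = 7                     # public exponent, coprime with phi
d = pow(e, -1, phi)       # modular inverse (Python 3.8+): d = 3, since 7*3 = 21 ≡ 1 (mod 20)

def rsa(value, exponent):
    return pow(value, exponent, n)   # value^exponent mod n

print(rsa(18, e))          # encrypt 18 with the public key            -> 6
print(rsa(6, d))           # decrypt 6 with the private key            -> 18
print(rsa(29, d))          # "encrypt" (sign) 29 with the private key  -> 2
print(rsa(2, e))           # decrypt (verify) 2 with the public key    -> 29
```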

Elliptic curve cryptography (ECC)

About ten years after RSA was created, mathematicians discovered another trapdoor function that could be used for public key cryptography. About 20 years after that, in 2004, elliptic curve cryptography came into widespread use. Elliptic curve cryptography (ECC) is an alternative to RSA. It is based on finding points along a prescribed elliptic curve. Elliptic curves are a family of equations that take the form:

\(y^2 = x^3 + ax + b \pmod{p}\)

Contrary to their name, elliptic curves have nothing to do with ellipses or conic sections; their graphs look like bumpy lines. With elliptic curves, multiplying a point on a given elliptic curve by a number will produce another point on the curve.

All parties need to agree on a prime number that will serve as the modulus for the arithmetic, a specific curve equation, and a base point on the curve (called the generator). For the cryptography to be secure, the curve and generator must be carefully chosen. The U.S. NIST recommends fifteen specific elliptic curves of varying security levels (see FIPS 186–5).

The algorithm then selects a random private key, k, which can be any integer within a specific range. It then computes the public key by performing a point multiplication of the private key and the generator: P = k * G.

However, given the resulting public point P and the generator G, it is computationally infeasible to determine which private key k was used.

The security of ECC rests not on our inability to factor numbers, as in RSA, but on our inability to compute discrete logarithms in a finite field (the elliptic curve discrete logarithm problem).

The RSA algorithm is still widely used but ECC has significant advantages:

  • ECC can use far shorter keys for the same degree of security. Security comparable to 256-bit AES encryption requires a 512-bit ECC key but a 15,360-bit RSA key

  • ECC requires less CPU consumption and uses less memory than RSA. It is significantly faster for encryption and decryption than RSA.

  • Generating ECC keys is much faster than RSA. Computing the public key still requires performing the point multiplication and is still slower than symmetric algorithms, where you need only a single key, which is just a random number.

On the downside, ECC came into use roughly 27 years after RSA, so it took time for people to become comfortable that it had been well vetted and that a proper set of cryptographically hard curves had been selected. It also took time to gain confidence that well-tested libraries were available and that adoption across platforms was broad enough to enable interoperability.

As a standard, ECC was also tainted because the NSA inserted weaknesses into an elliptic-curve-based random number generator (Dual_EC_DRBG) that effectively created a backdoor for decrypting content. That generator has since been withdrawn from the standard. There was also mistrust of the government and of NIST’s choice of the 15 approved elliptic curves and generators. After years of study, ECC is considered the preferred choice over RSA for virtually all applications.

Quantum computing

Quantum computers are markedly different from conventional computers. Conventional computers store and process information that is represented in bits, with each bit having a distinct value of 0 or 1. Quantum computers use the principles of quantum mechanics, which include superposition and entanglement. Instead of working with bits, quantum computers operate on qubits, which can hold values of “0” and “1” simultaneously via superposition. The superpositions of qubits can be entangled with other objects so that their final outcomes will be mathematically related. A single operation can be carried out on \(2^n\) values simultaneously, where \(n\) is the number of qubits in the computer.

Some problems can be solved exponentially faster with quantum computers than with conventional computers. Most cannot. In 1994, Peter Shor, a mathematician at MIT, studied the kinds of mathematical problems that could be attacked with a hypothetical quantum computer. He assumed that the quantum state could be maintained for as long as needed, something current quantum computers are still too noisy to do, and analyzed what would then be possible. Unfortunately for the cryptography community, a quantum computer would be able to factor huge numbers exponentially faster than current computers. Shor’s algorithm shows that a suitable quantum computer will be able to find the prime factors of large integers and compute discrete logarithms far more efficiently than is currently possible. Hence, Shor’s algorithm will be able to crack public-key-based systems such as RSA, Elliptic Curve Cryptography, and Diffie-Hellman key exchange.

In December 2024, Google announced an advance in error reduction when scaling to 7x7 grids of qubits, allowing their system to maintain quantum states for 100 microseconds, five times longer than previous versions. More interestingly, they found that, in their system, adding more qubits reduces errors, which is the opposite of the way previous quantum computers behave. They plan to build a system with one million qubits by the end of the decade. [Also, see the article in Nature]. Note that despite the media attention this received in late 2024, this is still an early prototype. Google researchers don’t consider this to be a true fault-tolerant qubit.

At around the same time, Google demonstrated performance advances in the Random Circuit Sampling (RCS) benchmark using their 105-qubit system. In 2019, they were able to solve an RCS problem in three minutes that would take one of the fastest supercomputers around 10,000 years to solve. In 2024, they were able to run an RCS problem in under five minutes that would take a supercomputer \(10^{25}\) years. While impressive, the value of this claim of quantum supremacy has been questioned by some. The researchers targeted the same problem that they did five years earlier on a 50-qubit computer, and a group later claimed that they were able to perform the same calculation on a conventional computer in a comparable amount of time.

So far, quantum computers are very much in their infancy, and it is not clear when – or if – large-scale quantum computers that are capable of solving useful problems will be built. The goal is to achieve quantum supremacy: creating a sufficiently powerful quantum computer that can solve problems that current computers cannot solve in any appreciable amount of time. Baidu, Tencent, Huawei, IBM, Google, Amazon, Microsoft, Intel, D-Wave Systems, and many others are all working on trying to create useful quantum computers. So are government spy agencies.

It is unlikely that useful quantum computers capable of attacking cryptography will be built in the next several years, but we expect that there’s a decent chance they will be built eventually. Creating one is also a type of arms race between China and the U.S., as intelligence agencies can capture encrypted network traffic, such as diplomatic communication, today and decrypt it in the future when the technology becomes available. This technique is referred to as an HNDL attack: harvest now, decrypt later.

In 2016, the NSA called for a migration to “post-quantum cryptographic algorithms”. The goal was to find useful trapdoor functions that do not rely on multiplying large primes, computing exponents, or any other mechanisms that can be attacked by quantum computation.

In August 2024, the National Institute of Standards and Technology (NIST) announced its first set of quantum-resistant encryption algorithms. NIST will continue to evaluate other post-quantum algorithms to serve as backup standards. The algorithms are:

FIPS 203: Module-Lattice-Based Key-Encapsulation Mechanism Standard
This standard defines a cryptographic method for securely sharing keys using lattice-based math. It works by encapsulating a shared secret key using a public key.
FIPS 204: Module-Lattice-Based Digital Signature Standard
This standard provides a quantum-resistant way to create and verify digital signatures using lattice-based cryptographic techniques. Digital signatures generated by this method ensure the authenticity and integrity of messages.
FIPS 205: Stateless Hash-Based Digital Signature Standard
This standard outlines a secure method for digital signatures using cryptographic hash functions. It is particularly useful for signing a limited number of messages, as it avoids the need for state tracking, making it simpler and robust against certain classes of attacks.

Two of these algorithms are based on lattices. Lattices are mathematical structures consisting of a regular grid of points in multidimensional space. Each point is a linear combination of a set of basis vectors. Lattice-based cryptography leverages the difficulty of solving certain problems within these structures, such as the Shortest Vector Problem (SVP) or the Learning with Errors (LWE) problem, which remain computationally hard even for quantum computers. Key encapsulation (FIPS 203) uses lattices to securely exchange keys by embedding errors in lattice points, making it infeasible for attackers to recover the original key. Digital signatures (FIPS 204) rely on finding lattice points to generate unique, secure signatures that are computationally infeasible to forge.

Symmetric cryptosystems, such as AES, are not particularly vulnerable to quantum computing since they rely on flipping and permuting bits rather than applying mathematical functions to the data. The best potential attacks come via Grover’s algorithm, which yields only a quadratic (\(\sqrt{n}\) vs. \(n\) operations) rather than an exponential speedup in key searches. This halves the effective key length. For instance, a 128-bit key will have the strength of a 64-bit key on a conventional computer. It is easy enough to use a sufficiently long key (256-bit AES keys are currently recommended) so that quantum computing poses no serious threat to symmetric algorithms.

Secure communication

Symmetric cryptography

Communicating securely with symmetric cryptography is easy. All communicating parties must share the same secret key. Plaintext is encrypted with the secret key to create ciphertext and then transmitted or stored. It can be decrypted by anyone who has the secret key.

Asymmetric cryptography

Communicating securely with asymmetric cryptography is a bit different. Anything encrypted with one key can be decrypted only by the other related key. For Alice to encrypt a message for Bob, she encrypts it with Bob’s public key. Only Bob has the corresponding key that can decrypt the message: Bob’s private key.

Key distribution and hybrid cryptography

The biggest problem in secure communication for thousands of years has been key distribution. How do you provide all trusted parties with keys so that they can encrypt their communications? In earlier times, this could only be solved by sharing keys, codebooks, or cipher devices in person. In later eras, keys would be shared in person or via some trusted channel, such as a telephone call (assuming that provided sufficient security and wiretapping wasn’t a concern). With computer networking, the problem became even more challenging: how can you communicate securely with somebody you have never met and perhaps do not even know?

For Alice and Bob to communicate, they need to share a secret key that no adversaries can get. However, Alice cannot send the key to Bob since it would be visible to adversaries. She cannot encrypt it because Alice and Bob do not share a key yet.

Asymmetric (public key) cryptography alleviates the problem of sending a key over a non-secure communication channel. However, public key cryptography is never used to encrypt large amounts of data for several reasons:

  1. Asymmetric algorithms, especially RSA, are considerably slower than symmetric cryptography. AES, for example, is approximately 1,500 times faster for decryption than RSA and 40 times faster for encryption. AES is also much faster than ECC. In one case, you’re flipping and shifting bits; in the other, you’re taking exponents of enormously large numbers.

  2. Key generation is also considerably slower with RSA or ECC than it is with symmetric algorithms, where there is just a single key and that key is just a random number rather than a set of carefully chosen numbers with specific properties. ECC key generation is much faster than RSA since the private key can be an arbitrary random number within a specific range, but it still requires computing the public key.

  3. Because public key cryptography is based on arithmetic properties, certain arithmetic relationships between plaintext may also be present between the resulting ciphertext, which can give an adversary insight into the content of messages.

  4. For a message to be sent securely, it has to be encrypted with the recipient’s public key. Only the recipient will be able to decrypt it with the corresponding private key. If attackers expect some predictable content, they can encrypt all the content they expect and then compare it with messages from the sender.

Because of these factors, asymmetric algorithms, such as RSA and ECC, are not used to encrypt content. Instead, we use a technique called hybrid cryptography, where a public key algorithm is used to encrypt a randomly generated key that will encrypt the message with a symmetric algorithm. This randomly generated key is called a session key since it is generally used for one communication session and then discarded.

For Alice to send a key to Bob:

  1. Alice generates a random session key.
  2. She encrypts it with Bob’s public key & sends it to Bob.
  3. Bob decrypts the message using his private key and now has the session key.

Bob is the only one who has Bob’s private key and is thus the only one who can decrypt that message and extract the session key.
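
As a rough illustration, here is a hybrid-encryption sketch using the pyca/cryptography package (not part of the original text): a random AES-GCM session key encrypts the message, and RSA-OAEP wraps the session key for Bob. The key sizes, algorithms, and message are illustrative assumptions.

```python
# Hybrid cryptography sketch: a random AES session key encrypts the message;
# RSA-OAEP encrypts (wraps) the session key for the recipient.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Bob's long-term key pair (in practice, Alice gets the public key from a certificate).
bob_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
bob_public = bob_private.public_key()

# 1. Alice generates a random session key.
session_key = AESGCM.generate_key(bit_length=256)

# 2. She encrypts the message with the session key (fast, symmetric).
nonce = os.urandom(12)
ciphertext = AESGCM(session_key).encrypt(nonce, b"the actual message", None)

# 3. She encrypts the session key with Bob's public key and sends both to Bob.
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
wrapped_key = bob_public.encrypt(session_key, oaep)

# 4. Bob recovers the session key with his private key, then decrypts the message.
recovered_key = bob_private.decrypt(wrapped_key, oaep)
plaintext = AESGCM(recovered_key).decrypt(nonce, ciphertext, None)
assert plaintext == b"the actual message"
```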

Diffie-Hellman key exchange

The Diffie-Hellman key exchange algorithm allows two parties to establish a common key without disclosing any information that would allow any other party to compute the same key. Each party generates a private key and a public key. Despite their name, these are not encryption keys; they are just numbers. Diffie-Hellman is a form of a public key algorithm but does not implement encryption. It enables Alice to compute a common key using her private key and Bob’s public key. Similarly, Bob can compute the same common key by using his private key and Alice’s public key.

Diffie-Hellman uses the one-way function \(a^b \bmod c\). Its one-wayness is due to our inability to compute the inverse: a discrete logarithm. Anyone may see Alice and Bob’s public keys but will be unable to compute their common key. Although Diffie-Hellman is not a public key encryption algorithm, it behaves like one in the sense that it allows us to exchange keys without having to use a trusted third party.

Here’s how it works:

Suppose Alice and Bob want to communicate:

  1. All arithmetic is performed in a field of integers modulo some large prime number. Alice and Bob agree on a large prime number p and some number α < p.
  2. Alice generates a private key a, which is simply a random number, and a public key, which is derived from her private key: A = α^a mod p
  3. Bob generates a private key b, which is also a random number, and a public key, which is derived from his private key: B = α^b mod p
  4. They each send their public key to each other.
  5. To compute a common key, one simply takes the remote user’s public key to the power of one’s private key mod p:
    • Alice computes the common key as C = B^a mod p (Bob’s public key to the power of Alice’s private key)
    • Bob computes the common key as C = A^b mod p (Alice’s public key to the power of Bob’s private key)

The magic of the algorithm is that there is no feasible way for an intruder who only sees the public keys, A and B, to be able to compute the common key C.
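
A toy Python version of the exchange, with deliberately tiny numbers (a real exchange would use a prime of 2048 bits or more and randomly chosen private keys), shows why both sides end up with the same value:

```python
# Diffie-Hellman with toy numbers (real deployments use far larger primes).
p, alpha = 23, 5            # agreed-upon prime modulus and base

a = 6                       # Alice's private key (a random number)
A = pow(alpha, a, p)        # Alice's public key: alpha^a mod p = 8

b = 15                      # Bob's private key
B = pow(alpha, b, p)        # Bob's public key: alpha^b mod p = 19

# Each side combines its own private key with the other's public key.
alice_common = pow(B, a, p)     # B^a mod p
bob_common = pow(A, b, p)       # A^b mod p
assert alice_common == bob_common == pow(alpha, a * b, p)   # both equal alpha^(ab) mod p
print(alice_common)             # 2
```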

Forward Secrecy

If an attacker compromises Bob’s private key, they might be able to decrypt old communications—assuming each session began with a session key encrypted using Bob’s public key. Forward secrecy, also called perfect forward secrecy, ensures that the compromise of a long-term key (e.g., someone’s private key) does not allow an attacker to decrypt past session keys. This means that even if an adversary steals a long-term private key, they cannot retroactively decrypt previous communications.

Forward secrecy is critical for securing communication sessions but is not relevant for stored encrypted documents. The key difference is that while past communication sessions should remain confidential even if a long-term key is compromised, users must still be able to decrypt their own stored documents, which necessitates long-term key usage.

Session Keys vs. Ephemeral Keys

Achieving forward secrecy requires the use of ephemeral keys, which are single-use key pairs generated for each session. These ephemeral keys are used to derive or encrypt a session key, a symmetric key that encrypts the actual communication. The key distinction is:

  • Ephemeral key: A temporary asymmetric (public/private) key pair used for key exchange in a single session.

  • Session key: A symmetric key, derived from the key exchange, used to encrypt the communication itself.

The ephemeral key is thrown away as soon as the session key is established since it is no longer needed. The session key is discarded after the communication session is over since a new one will be created for the next session.

For forward secrecy, every session must derive a fresh session key from newly generated ephemeral keys. This prevents the compromise of any long-term key (such as an RSA private key) from affecting past communications.

Diffie-Hellman for Forward Secrecy

The Diffie-Hellman key exchange is a widely used method for achieving forward secrecy. It allows two parties to establish a shared secret over an insecure channel without relying on long-term keys.

Each time Alice and Bob communicate:

  1. They generate a new key pair (a public and private key).
  2. They exchange their Diffie-Hellman public keys.
  3. They compute a shared secret that only they can derive using their private key and the other side’s public key.
  4. This shared secret is used to generate a session key for encryption. As we’ll later see when we look at TLS (Transport Layer Security), in many cases, the shared secret is passed through a key derivation function to generate a secure session key for encryption.

Because these ephemeral key pairs are discarded after the session, past communications remain secure even if an attacker later gains access to a long-term private key.

Elliptic Curve Diffie-Hellman (ECDH)

A modern variant of Diffie-Hellman, Elliptic Curve Diffie-Hellman (ECDH), is widely used because it offers the same level of security as traditional Diffie-Hellman but with significantly smaller key sizes, making it more efficient in both computation and bandwidth. The mechanism is similar to classic Diffie-Hellman, but it relies on elliptic curve mathematics instead of modular exponentiation. Its security is based on the difficulty of solving the Elliptic Curve Discrete Logarithm Problem.

The process works as follows:

  1. Alice and Bob agree on an elliptic curve and a generator point G.
  2. Alice generates a random private key a and computes her public key A = a * G.
  3. Bob generates a random private key b and computes his public key B = b * G.
  4. Alice sends her public key A to Bob, and Bob sends his public key B to Alice.
  5. Alice computes the shared secret: C = a * B.
  6. Bob computes the shared secret: C = b * A.

Since (a * G) * b = (b * G) * a, both Alice and Bob arrive at the same shared secret a * b * G. An attacker who only sees A and B cannot compute this shared secret, ensuring security.
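
Here is a sketch of an ephemeral ECDH exchange using the X25519 curve from the pyca/cryptography package; the choice of curve, the HKDF parameters, and the labels are assumptions for illustration. As mentioned earlier, the shared secret is then run through a key derivation function to produce the session key.

```python
# Ephemeral ECDH sketch using Curve25519 (X25519).
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Each side generates a fresh (ephemeral) key pair for this session.
alice_private = X25519PrivateKey.generate()
bob_private = X25519PrivateKey.generate()

# They exchange public keys, and each computes the same shared secret.
alice_secret = alice_private.exchange(bob_private.public_key())
bob_secret = bob_private.exchange(alice_private.public_key())
assert alice_secret == bob_secret

# The shared secret is passed through a key derivation function (as in TLS)
# to produce the symmetric session key.
session_key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                   info=b"session key").derive(alice_secret)
```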

Efficiency of Key Generation

Diffie-Hellman is well-suited for forward secrecy because generating new key pairs is computationally efficient, especially compared to RSA.

RSA key generation is computationally expensive, making it impractical for ephemeral key exchange. ECC key generation is efficient, but Elliptic Curve Diffie-Hellman (ECDH) based key exchanges are even faster, making them the preferred choice for ephemeral keys in forward secrecy.

Summary

  • Forward secrecy ensures that past communications remain secure even if a private key is compromised.
  • Ephemeral key pairs are used to establish fresh session keys for each session, preventing long-term key compromise from affecting past messages.
  • Diffie-Hellman (DH) and Elliptic Curve Diffie-Hellman (ECDH) are the preferred key exchange methods for achieving forward secrecy due to their efficiency in generating new key pairs.
  • ECDH provides security equivalent to traditional Diffie-Hellman with much smaller keys and less computation, making it the preferred choice in modern cryptographic protocols.

Message Integrity

One-way functions

A one-way function is a function that is computationally easy to evaluate in one direction but infeasible to reverse—meaning there is no known efficient method to compute its inverse.

A trapdoor function is a specialized form of a one-way function, which allows inversion only if certain secret information (such as a private key) is known. Trapdoor functions are essential in asymmetric cryptography, where they enable secure key exchange and encryption by ensuring that only authorized parties can reverse the computation.

Cryptographic hash functions

An important application of one-way functions is in cryptographic hash functions. A cryptographic hash function maps an arbitrary-length input to a fixed-size output, often 224, 256, 384, or 512 bits. Unlike standard hash functions used in data structures like hash tables, cryptographic hash functions must satisfy additional security properties. These properties make cryptographic hash functions essential for digital signatures, password hashing, and integrity verification in secure communication protocols.

A strong cryptographic hash function (e.g., SHA-2, SHA-3) must satisfy several key properties:

  1. Fixed-Length Output
    Like all hash functions, cryptographic hash functions take an input of arbitrary length and produce a fixed-length output (e.g., 256-bit or 512-bit hash values).

  2. Deterministic
    They always produce the same output for the same input. If two computations of a hash function with the same input produced different results, it would undermine their reliability in cryptographic applications.

  3. Preimage Resistance (Hiding Property)
    Given a hash value H, it should be infeasible to find an input M such that H = hash(M). This ensures that hash values do not reveal information about their original inputs.

  4. Avalanche Effect (Output Unpredictability)
    A small change in the input should produce a completely different and unpredictable hash value. This means that even changing a single bit in the input should result in a drastically different hash, making it impossible to infer any information about the original message from the hash.

  5. Collision Resistance
    While hash functions inevitably have collisions (since the output space is finite while the input space is infinite), a good cryptographic hash function makes it computationally infeasible to find two distinct inputs that produce the same hash. This prevents attackers from forging messages with the same hash.

  6. Efficiency
    The function must be computationally efficient, allowing fast hash computations even for large inputs. This ensures practical usability in applications like digital signatures, checksums, and message authentication.
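
These properties are easy to observe with a standard hash library. The short Python sketch below (an illustration, not part of the original text) uses SHA-256 from hashlib to show the fixed-length, deterministic output and the avalanche effect:

```python
# SHA-256 via Python's hashlib: fixed-length, deterministic output, and the
# avalanche effect -- a one-character change produces a completely different hash.
import hashlib

h1 = hashlib.sha256(b"The quick brown fox jumps over the lazy dog").hexdigest()
h2 = hashlib.sha256(b"The quick brown fox jumps over the lazy cog").hexdigest()

print(h1)           # 64 hex digits = 256 bits, same every time for the same input
print(h2)           # unrelated-looking value despite a one-letter change
print(h1 == h2)     # False
```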

Collisions, the Pigeonhole Principle, and the Birthday Paradox

The Inevitability of Collisions

A cryptographic hash function compresses an arbitrarily large input space into a fixed-size output (e.g., a 256-bit hash). Due to the pigeonhole principle, there must exist multiple distinct inputs that hash to the same output—these are called collisions. However, a well-designed hash function ensures that finding such collisions is computationally infeasible.

For example, with a 256-bit hash, there are \(2^{256}\) possible hash values, making the probability of an accidental collision extremely small. To put it in perspective, the odds of randomly finding another input that produces the same 256-bit hash are 1 in \(2^{256}\), or about:

0.00000000000000000000000000000000000000000000000000000000000000000000000000086%

which means a 99.99999999999999999999999999999999999999999999999999999999999999999% chance that no collision will occur.

Collision Attacks: Targeted vs. Generic

For an attacker, the ultimate goal is to construct a meaningful message that produces the same hash as another message. This would allow them to replace an original message (e.g., a financial transaction request) with a fraudulent one. However, finding such a targeted collision—called a second-preimage attack—is much harder than simply finding any two inputs that hash to the same value.

There are two main types of attacks that exploit hash collisions:

Second-Preimage Attack (Targeted Collision): Given a specific message M1, an attacker attempts to find a different message M2 such that:

hash(M1) = hash(M2)

This is very difficult because the attacker is constrained to finding a collision for a particular pre-existing message.

Birthday Attack (Generic Collision): Instead of searching for a collision with a specific message (a preimage), the attacker tries to find any two different messages M1 and M2 that hash to the same value: hash(M1) = hash(M2)

This is significantly easier due to the birthday paradox, which shows why finding any two inputs with the same hash (a generic collision) is easier than expected. The number of hashes needed to find such a collision is not \(2^n\) but roughly \(\sqrt{2^n} = 2^{n/2}\), where \(n\) is the hash length.

This means that a 256-bit hash function provides only 128-bit security against brute-force collision attacks. Thus, while 128-bit security is still infeasible for any foreseeable computation, it highlights why cryptographers continually push for stronger hash functions as computational power advances.
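
The effect is easy to demonstrate by shrinking the hash. The sketch below (illustrative only) truncates SHA-256 to 32 bits; a collision typically appears after tens of thousands of random inputs, roughly \(2^{16}\), rather than the \(2^{32}\) a brute-force second-preimage search would need:

```python
# Birthday-attack illustration: finding a collision in a hash truncated to 32 bits
# takes on the order of 2^16 random tries, not 2^32.
import hashlib, os

def tiny_hash(data):
    # Truncate SHA-256 to 32 bits (for demonstration only).
    return hashlib.sha256(data).digest()[:4]

seen = {}
tries = 0
while True:
    m = os.urandom(8)
    h = tiny_hash(m)
    tries += 1
    if h in seen and seen[h] != m:
        print(f"collision after {tries} tries: {seen[h].hex()} and {m.hex()}")
        break
    seen[h] = m
```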

Because of these properties, cryptographic hash functions provide extremely high assurance that even the smallest modification to a message will result in a completely different hash. This makes them invaluable for ensuring data integrity, digital signatures, and password hashing in cryptographic applications.

Popular hash functions include SHA-1 (160 bits), SHA-2 (commonly 256 and 512 bits), and SHA-3 (256 and 512 bits). Google demonstrated the ability to find a collision in SHA-1 in February 2017 but had to run over 9 quintillion SHA-1 computations to perform the attack. At that point, SHA-1 was already being phased out since weaknesses had been discovered. Although there is no advantage to using it over SHA-2, SHA-1 is still, for almost any practical purpose, a secure checksum and is still (as of 2024) used in Git and in BitTorrent.

Message Authentication Codes (MACs)

A cryptographic hash helps us ensure message integrity: it serves as a checksum that allows us to determine if a message has been modified. If the message is modified, it no longer hashes to the same value as before. However, if an attacker modifies a message, she may be able to modify the hash value as well. To prevent this, we need a hash that relies on a key for validation. This is a message authentication code, or MAC. A MAC is also called a keyed hash. MACs can be hash-based or block cipher-based:

Hash-based MAC (HMAC)

A hash-based MAC is a specific method, defined in RFC 2104, for converting regular hash functions into MACs by using a cryptographic hash function, such as SHA-256 (the 256-bit version of a SHA-2 hash), to hash the message and the key. Anyone who does not know the key will not be able to recreate the hash. It is computed as:

HMAC(m, k) = H((opad ⊕ k) || H((ipad ⊕ k) || m))

where
H = cryptographic hash function
opad = outer padding 0x5c5c5c5c … (01011100…)
ipad = inner padding 0x36363636… (00110110…)
k = secret key
m = message
⊕ = XOR
|| = concatenation

Note the extra hash. The simple form of an HMAC would simply be hash(m, k). The HMAC standard devised this extra hashing and padding to strengthen the HMAC against weaker hash functions and the possibility of an attacker being able to exploit a hash collision:

  1. The message is appended to a key that is XORed with a constant (the inner pad). This modified key, together with the message, is then hashed.
  2. The result from step 1 is appended to the key that is XORed with a different constant (the outer pad). This set of data is now hashed to form the HMAC.

Logically, however, you can think of an HMAC as simply the hash of the key and message.
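
The construction can be written directly from the formula above. The following Python sketch builds HMAC-SHA256 by hand and checks it against the standard hmac module (the key and message are arbitrary examples):

```python
# HMAC-SHA256 built from the definition above, checked against Python's hmac module.
import hashlib, hmac

def my_hmac_sha256(key, message):
    block_size = 64                                  # SHA-256 block size in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()           # long keys are hashed first
    key = key.ljust(block_size, b"\x00")             # then padded to the block size
    ipad = bytes(b ^ 0x36 for b in key)              # key XOR inner pad
    opad = bytes(b ^ 0x5c for b in key)              # key XOR outer pad
    inner = hashlib.sha256(ipad + message).digest()  # H((ipad ⊕ k) || m)
    return hashlib.sha256(opad + inner).digest()     # H((opad ⊕ k) || inner)

key, msg = b"secret key", b"message to authenticate"
assert my_hmac_sha256(key, msg) == hmac.new(key, msg, hashlib.sha256).digest()
```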

Block cipher-based MAC (CBC-MAC)

Cipher Block Chaining (CBC) ensures that each encrypted block depends on all preceding blocks. CBC-MAC leverages this property by initializing encryption with a zero initialization vector (IV) and processing the message using a block cipher in CBC mode. However, instead of retaining all output blocks, it discards everything except the final block, which serves as the Message Authentication Code (MAC). Any alteration to the message propagates through the encryption process, changing the final block. Since encryption requires the secret key, an attacker cannot forge a valid MAC without it.

CBC-MAC produces a fixed-length output equal to the cipher’s block size. Due to the confusion and diffusion properties of block ciphers, it satisfies the essential characteristics of a cryptographic hash function, ensuring integrity and authenticity.
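
A rough sketch of CBC-MAC using AES in CBC mode from the pyca/cryptography package is shown below. The zero-byte padding and the use of plain CBC-MAC are simplifications for illustration; plain CBC-MAC is only secure for fixed-length messages, which is why standardized variants such as CMAC exist.

```python
# CBC-MAC sketch: encrypt the message with AES-CBC using a zero IV and keep
# only the final ciphertext block as the MAC.
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def cbc_mac(key, message):
    block = 16
    # Pad with zero bytes to a multiple of the block size (illustration only).
    padded = message + b"\x00" * (-len(message) % block)
    encryptor = Cipher(algorithms.AES(key), modes.CBC(b"\x00" * block)).encryptor()
    ciphertext = encryptor.update(padded) + encryptor.finalize()
    return ciphertext[-block:]          # the MAC is the last block

key = os.urandom(32)
print(cbc_mac(key, b"transfer $100 to account 12345").hex())
```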

Digital signatures

Message authentication codes rely on a shared key. Anybody who possesses the key can modify and re-sign a message. There is no assurance that the action was done by the author of the message. Digital signatures have stronger properties than MACs:

  1. Only you can sign a message but anybody should be able to validate it.
  2. You cannot copy the signature from one message and have it be valid on another message.
  3. An adversary cannot forge a signature, even after inspecting an arbitrary number of signed messages.

Digital signatures require three operations:

  1. Key generation: {private_key, verification_key } := gen_keys(keysize)
  2. Signing: signature := sign(message, private_key)
  3. Validation: isvalid := verify(message, signature, verification_key)
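
The three operations map directly onto a signature library. Here is a sketch using Ed25519 from the pyca/cryptography package (the message is an arbitrary example):

```python
# The three signature operations, sketched with Ed25519.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# 1. Key generation
private_key = Ed25519PrivateKey.generate()
verification_key = private_key.public_key()

# 2. Signing (only the holder of the private key can do this)
message = b"pay Charles $100"
signature = private_key.sign(message)

# 3. Validation (anyone with the public key can do this)
try:
    verification_key.verify(signature, message)              # passes
    verification_key.verify(signature, b"pay Charles $999")  # raises InvalidSignature
except InvalidSignature:
    print("signature does not match the message")
```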

Since we trust hashes to be collision-free, it makes sense to apply the signature to the hash of a message instead of the message itself. This ensures that the signature will be a small, fixed size, will be easy to embed in structures that require integrity checking, and will create minimal transmission or storage overhead for verification. Signing a hash also means the message itself is neither encrypted nor altered, and it avoids the weaknesses of using public key algorithms to encrypt large amounts of content.

We saw how public key cryptography can be used to provide confidentiality: Alice encrypts a message using Bob’s public key to ensure that only Bob can decrypt it with his private key.

We can also use the key pair in the other direction: Alice can encrypt a message using her private key. Anyone can decrypt the message using her public key but, in doing so, would know that the message was encrypted by Alice.

A digital signature is generated by applying a cryptographic signature algorithm to a message hash using the signer’s private key. This ensures authenticity, integrity, and non-repudiation, as only the corresponding public key can verify the signature.

There are several commonly used digital signature algorithms:

DSA, the Digital Signature Algorithm
The current NIST standard generates key pairs that are secure because of the difficulty of computing discrete logarithms.
ECDSA, Elliptic Curve Digital Signature Algorithm
A variant of DSA that uses elliptic curve cryptography
EdDSA: Edwards-curve Digital Signature Algorithm
Another variant that uses twisted Edwards Curves, which are another family of elliptic curves.
Public key cryptographic algorithms
RSA or Elliptic Curve Cryptography applied to message hashes.

Modern digital signature algorithms, such as DSA, ECDSA, and EdDSA, rely on cryptographic hashing coupled with mathematical trapdoor functions rather than encrypting a hash. They are not data encryption algorithms and are designed solely for digital signatures.

The signer generates the signature using their private key, and anyone with the signer’s public key can verify it by checking that the signature corresponds to the expected hash of the message. Without access to the private key, no other party can generate a valid signature.

Even though the implementation is different, logically, you can think of a digital signature as having the effect of encrypting a hash of a message with your private key – something nobody else can do.

Note that, with a MAC, the recipient or anyone in possession of the shared key can create the same MAC. With a digital signature, the signature can only be created by the owner of the private key. Unlike MACs, digital signatures provide non-repudiation – proof of identity. Alice cannot claim that she did not create a signature because nobody but Alice has her private key. Also unlike MACs, anyone can validate a signature since public keys are generally freely distributed. As with MACs, digital signatures also provide proof of integrity: assurance that the original message has not been modified.

Covert and authenticated messaging

We ignored the encryption of a message in the preceding discussion; our interest was assuring integrity. However, there are times when we may want to keep the message secret and also validate that it has not been modified. Doing this involves sending a signature of the message along with the encrypted message.

Alice can send a signed and encrypted message to Bob by using hybrid cryptography. She will:

  1. Create a signature of the message. This can be thought of as the hash of the message encrypted with her private key.

  2. Create a session key for encrypting the message. This is a throw-away key that will not be needed beyond the communication session.

  3. Encrypt the message using the session key. She will use a fast symmetric algorithm to encrypt this message.

  4. Package up the session key for Bob: she encrypts it with Bob’s public key. Since only Bob has the corresponding private key, only Bob will be able to decrypt the session key.

  5. She sends Bob the encrypted message, encrypted session key, and signature.
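
The five steps above can be sketched by combining the primitives from the earlier examples: an Ed25519 signature for Alice, an AES-GCM session key for the message, and RSA-OAEP to wrap the session key for Bob. The algorithm choices and message are illustrative assumptions, not a prescribed protocol.

```python
# Signed and encrypted messaging sketch, reusing the earlier building blocks.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

alice_signing_key = Ed25519PrivateKey.generate()        # Alice's long-term identity key
bob_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

message = b"meet at noon"

signature = alice_signing_key.sign(message)             # 1. sign the message
session_key = AESGCM.generate_key(bit_length=256)       # 2. create a throw-away session key
nonce = os.urandom(12)
ciphertext = AESGCM(session_key).encrypt(nonce, message, None)      # 3. encrypt the message
wrapped_key = bob_private.public_key().encrypt(session_key, oaep)   # 4. wrap the key for Bob

# 5. Alice sends (ciphertext, nonce, wrapped_key, signature) to Bob, who unwraps,
#    decrypts, and then checks Alice's signature on the recovered plaintext.
recovered = AESGCM(bob_private.decrypt(wrapped_key, oaep)).decrypt(nonce, ciphertext, None)
alice_signing_key.public_key().verify(signature, recovered)   # raises if tampered with
```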

We will later see how, in communication frameworks such as Transport Layer Security (TLS), we will use separate keys for authenticating both sides, encrypting the content, and authenticating the content. Moreover, if a secure communication channel has been established, there is no compelling reason to use digital signatures since MACs can be computed more efficiently and a temporary shared key can be easily established for the session and used for the MAC.

Identities

A signature verification key (e.g., a public key) can be treated as an identity. You possess the corresponding private key and therefore only you can create valid signatures that can be verified with the public key. This identity is anonymous; it is just a bunch of bits. There is nothing that identifies you as the holder of the key. You can simply assert your identity by being the sole person who can generate valid signatures and decrypt content encrypted with the corresponding public key.

Since you can generate an arbitrary number of key pairs, you can create a new identity at any time and create as many different identities as you want. When you no longer need an identity, you can discard your private key for that corresponding public key. However, as long as you have that private key, you can validate that you are the originator of any messages signed with the key and you are the only one who can decrypt content encrypted with that public key.

Identity binding: digital certificates

While public keys provide a mechanism for asserting integrity via digital signatures, they are themselves anonymous. Alice can obtain Bob’s public key but has no confidence that the key really belongs to Bob and was not planted by an adversary. Some form of identity binding of the public key is needed. How does Alice really know that she has Bob’s public key?

X.509 digital certificates provide a way to do this. A certificate is a data structure that contains user information (called a distinguished name) and the user’s public key. This data structure also contains a signature of the certification authority. The signature is created by taking a hash of the rest of the data in the structure and encrypting it with the private key of the certification authority. The certification authority (CA) is responsible for setting policies on how they validate the identity of the person who presents the public key for encapsulation in a certificate.

To validate a certificate, you would hash all the certificate data except for the signature. Then you would decrypt the signature using the public key of the issuer. If the two values match, then you know that the certificate data has not been modified since it was signed. The challenge is how to get the public key of the issuer. Public keys are stored in certificates, so the issuer would have a certificate containing its public key. This certificate can be signed by yet another issuer. This process is called certificate chaining. For example, Alice can have a certificate issued by the Rutgers CS Department. The Rutgers CS Department’s certificate may be issued by Rutgers University. Rutgers University’s certificate could be issued by the State of New Jersey Certification Authority, and so on. At the very top level, we will have a certificate that is not signed by any higher-level certification authority. A certification authority that is not underneath any other CA is called a root CA. In practice, this type of chaining is rarely used. More commonly, there are hundreds of autonomous certification authorities acting as root CAs that issue certificates to companies, users, and services. The certificates for many of the trusted root CAs are preloaded into operating systems or, in some cases, browsers. Microsoft and Apple, for example, each publish the lists of root certificates that their platforms trust.
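
The validation step described above can be sketched with the pyca/cryptography package. This example assumes RSA-signed certificates stored in PEM files with hypothetical names; a real application would also check validity dates, extensions, revocation status, and the rest of the chain.

```python
# Verifying one link in a certificate chain: check the issuer's signature over
# the "to-be-signed" portion of the certificate. (Assumes RSA-signed PEM
# certificates; the file names are hypothetical.)
from cryptography import x509
from cryptography.hazmat.primitives.asymmetric import padding

with open("alice.pem", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())
with open("issuer.pem", "rb") as f:
    issuer = x509.load_pem_x509_certificate(f.read())

issuer.public_key().verify(
    cert.signature,                  # the CA's signature stored in the certificate
    cert.tbs_certificate_bytes,      # the signed portion: names, public key, validity
    padding.PKCS1v15(),
    cert.signature_hash_algorithm,   # the hash algorithm named in the certificate
)
print("signature verified for", cert.subject.rfc4514_string())
```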

Every certificate has an expiration time (often a year or more in the future). This provides some assurance that even if there is a concerted attack to find a corresponding private key to the public key in the certificate, such a key will not be found until long after the certificate expires. It is also a commitment from the CA that they will be responsible for tracking the validity of the certificate for that interval of time.

A certificate may become invalid because the private key was leaked, the certificate owner may no longer be considered trustworthy (for example, an employee leaves a company), or the certificate was improperly issued.

In this case, a certificate can be revoked. Each CA publishes a certificate revocation list, or CRL, containing lists of certificates that they have previously issued that should no longer be considered valid. To prevent spoofing the CRL, the list is, of course, signed by the CA. Each certificate contains information on where to obtain revocation information.

For example, in July 2024, DigiCert, a well-known CA, announced that it would revoke over 83,000 certificates because of errors in how it validated the owner of the domain within the certificate.

The challenge with CRLs is timing: not everyone may check the certificate revocation list in a timely manner and some systems may accept a certificate not knowing that it was revoked. Some systems, particularly embedded systems, may not even be configured to handle CRLs.

Code Signing

Code signing is a security practice that applies digital signatures to software, enabling the verification of the software’s authenticity and integrity. It ensures that the software has not been altered or compromised since it was signed.

The primary goals of code signing are to:

  1. Ensure Integrity: Code signing confirms that the software has not been altered or corrupted since it was signed. This is crucial in protecting users from downloading malware-infected or unauthorized modifications of software.

  2. Verify Authenticity: It allows users and systems to verify the identity of the software publisher. This is important for building trust, especially in environments where software is downloaded from the internet.

  3. Non-Repudiation: By signing code, developers and organizations can be uniquely linked to their software, making it difficult for them to deny publishing it. This accountability is vital for both security and legal reasons.

Mechanisms of Code Signing

Code signing uses the following mechanisms:

  1. Key Pair Generation: The software publisher generates a public/private key pair. The private key is kept secure and used for signing the software, while the public key is distributed in the form of a digital certificate for verification.

  2. Hashing: The software to be signed is first hashed to create a small, fixed-length message that is unique to the software.

  3. Signing: The hash is then encrypted with the publisher’s private key, creating the digital signature. This process also often involves the addition of a timestamp to verify when the software was signed.

  4. Verification: When a user downloads the software, their system uses the publisher’s public key to decrypt the signature, re-computes the software’s hash, and compares it to the decrypted hash. If they match, it verifies that the software hasn’t changed since it was signed and confirms the publisher’s identity.

  5. Certificate Authorities (CAs): To establish trust, code signing uses public keys packaged in certificates that are signed by approved CAs. These trusted entities issue digital certificates that validate the publisher’s identity and link them to their public key. When verifying signatures, the presence of a certificate from a recognized CA enables the recipient to validate the origin and authenticity of the public key before using it to validate the signature on the software.

Implementation Examples

Platforms like Microsoft Windows, Apple macOS, Android, and iOS all implement code signing. This is a very high-level overview (you don’t need to know the distinctions since the general mechanisms for signing and verifying are essentially the same).

Microsoft Windows

Microsoft Windows uses a framework named Authenticode to sign and verify operating system drivers and applications.

  • Certificate acquisition: Obtain a code signing certificate from a trusted Certificate Authority (CA) or use a self-signed certificate. Trusted certificates must be used for operating system drivers.
  • Signing software: Use tools like Microsoft SignTool to sign software binaries (.exe, .dll, etc.) with your private key.
  • Verification: Windows checks the digital signature to ensure software integrity during installation.

Apple macOS

Apple code signing requires developers to sign apps with certificates issued through the Apple Developer Program for their operating system platforms, with an additional notarization process for macOS apps distributed outside the Mac App Store.

  • Certificate acquisition: Obtain a Developer ID certificate by enrolling in the Apple Developer Program. Apple validates the developer’s real-world identity before issuing a digital certificate. The certificate allows developers to sign apps and submit them to the App Store for distribution.
  • Signing software: Developers sign applications using Xcode or codesign to attach a digital signature.
  • Verification: Apple reviews and verifies submitted apps prior to distribution in their App Stores. macOS’s Gatekeeper framework verifies the signature to ensure the app comes from an identified developer and has not been altered.

Android

  • Certificates: Android apps do not require a code signing certificate from a public CA. In fact, Android requires signing certificates with a validity period of at least 25 years, which public Certification Authorities do not offer, so apps are signed with self-signed certificates instead. Developers generate a unique key using keytool and sign the Android application package (APK) with it.
  • Signing software: Sign the APK using jarsigner or Android Studio. The developer can opt into Play App Signing. In this case, developers upload their app signing private key to Google and Google performs the app signing before making the app available.
  • Verification: Android verifies the app’s signature for integrity during installation.

Linux, of course, also supports code signing. Due to the varieties of distributions, different mechanisms exist. Debian distributions, including Ubuntu, use GPG to sign software packages. The APT package manager automatically verifies these signatures when installing packages. Red Hat-based distributions use RPM package manager, which also supports package signing. Similar to APT, YUM and DNF automatically verify the signatures of RPM packages before installation. Tools for signing include gpg, signify, minisign, and openssl.

Runtime Integrity checking

In addition to validating the signature of an entire application, some operating systems support the addition of hashes for individual memory pages of the code. This allows performing runtime integrity checks when a page of code is loaded into memory via the operating system’s demand paging mechanism. As far as I can tell, simple hashes instead of signatures are used for this to ensure high performance. Both Windows and macOS support per-page hashes.

This approach is critical for environments requiring high security, ensuring that even minor alterations are detected and prevented from execution.

Last modified February 19, 2025.