Cryptographic Algorithms
This page lists commonly used cryptographic algorithms and methods and explains the basic underlying concepts. It also tries to give references to implementations and textbooks. Where available, comments are also made about the usefulness or other aspects of the algorithms.
Public key cryptosystems were invented in the 1970's, with some help from the development of complexity theory around that time. It was observed that, starting from a problem so difficult that it would need thousands of years to solve, a cryptosystem could be developed with two keys, a private key and a public key. With the public key one could encrypt messages, and decrypt them with the private key. Thus the owner of the private key would be the only one who could decrypt the messages, but anyone knowing the public key could send them in privacy.
Another idea that was observed was that of key exchange: in a two-party communication it would be useful to generate a common secret key for bulk encryption using a secret key cryptosystem (e.g. some block cipher).
Indeed, Whitfield Diffie and Martin Hellman used ideas from
number theory to construct a key exchange protocol that started the
era of public key cryptosystems. Shortly after that Ron Rivest, Adi
Shamir and Leonard Adleman developed a cryptosystem that was the
first real public key cryptosystem capable of encryption and digital
signatures.
Later, several public key cryptosystems followed, using many different underlying ideas (e.g. knapsack problems, different groups over finite fields, and lattices). Many of them were soon proven to be insecure. However, the Diffie-Hellman protocol and RSA appear to have remained two of the strongest up to now.
Terminology
The basic ingredient in any public key cryptosystem is a difficult computational problem. The security of the cryptosystem is based on the fact that the private key can be computed from the public key only by solving this difficult problem. We now introduce some relevant terminology used in public key cryptography.
- Algorithm. An algorithm is an explicit description of how a particular computation should be performed (or a problem solved). The efficiency of an algorithm can be measured as the number of elementary steps it takes to solve the problem. So if we claim that the algorithm takes time O(n), we mean that it takes on the order of n elementary steps, but we do not specify how long one step takes.
- Computational complexity. A problem is polynomial time or in P if it can be solved by an algorithm which takes less than O(n^t) steps, where t is some finite number and the variable n measures the size of the problem instance.
If a guessed solution to a problem can be verified in polynomial time, then the problem is said to be in NP (non-deterministic polynomial time). The set of problems that lie in NP is very large; it includes the problem of integer factorization.
A problem is NP-hard if it is at least as hard as every problem in NP, that is, if a polynomial time algorithm for it would yield polynomial time algorithms for all of NP. There is no known polynomial time algorithm for any NP-hard problem, and it is believed that such algorithms in fact do not exist.
In public key cryptography the attacker is interested in
solving particular instances of a problem (factoring some given
number), rather than providing a general solution (an algorithm to
factor any possible number efficiently). This causes some concern
for cryptographers, as some instances of a problem that is
NP-hard in general may be easily solvable.
- Primes. A prime number is a number that has no divisors except for itself and 1. Thus the integers 2, 3, 5, 7, 11, ... and so on are primes. There are infinitely many primes, and (one of) the biggest prime numbers currently known is 2^6,972,593 - 1.
- Factoring. Every integer can be represented uniquely as
a product of prime numbers. For example, 10 = 2 * 5 (the
notation * is common for multiplication in computer
science) and it is unique (except for the order of the factors
2 and 5). The art of factorization is almost as old
as mathematics itself. However, the study of fast algorithms for
factoring is only a few decades old.
One possible algorithm for factoring an integer is to divide the input by all small prime numbers iteratively until the remaining number is prime. This is efficient only for integers that are, say, of size less than 10^16, as this already requires trying all primes up to 10^8. In public key cryptosystems based on the problem of factoring, numbers are of size 10^300, and this would require trying all primes up to 10^150; there are about 10^147 such prime numbers according to the prime number theorem. This far exceeds the number of atoms in the universe, and such a search is unlikely to be carried out by any effort.
The easy instances of factoring are those where the given integer has only small prime factors. For example, 759375 is easy to factor, as we can write it as 3^5 * 5^5. In cryptography we want to use only integers that have large prime factors. Preferably we select an integer with two large prime factors, as is done in the RSA cryptosystem. (A small trial-division sketch appears at the end of this list.)
Currently one of the best factoring algorithms is the number
field sieve algorithm (NFS) that consists of a sieving phase and a
matrix step. The sieving phase can be distributed (and has been
several times) among a large number of participants, but the
matrix step needs to be performed on large supercomputers. The
effectiveness of the NFS algorithm becomes apparent for very large integers: it can factor any integer of size 10^150 in a few months' time. The NFS algorithm runs in sub-exponential time (which is still far from efficient).
There is no known proof that integer factorization is an NP-hard problem, nor that it is not polynomial time solvable. If any NP-hard problem were polynomial time solvable, then factoring would be as well (factoring lies in NP), but there is very little hope that this is the case. It is plausible under current knowledge that factoring is not polynomial time solvable.
- Discrete logarithms. Another important class of problems is the problem of finding n given only some y such that y = g^n. The problem is easy for integers, but when we are working in a slightly different setting it becomes very hard.
To obscure the nature of n in g^n, we divide the infinite set of integers into a finite set of remainder classes. Intuitively, we take the string of integers and wrap it around a circle (of circumference m).
The numbers 0, m, 2m, 3m, ... all cover the same point on the circle, and therefore are said to be in the same equivalence class (we also write "0 = m = 2m = ... (mod m)"). Each equivalence class has a least representative in 0 .. m-1, so any integer n can be written as n = t + km for some integer k, where 0 <= t < m. It is a convention to write n = t (mod m) in this case. Here m is said to be the modulus.
It can be shown that you can add, subtract and multiply with
these classes of integers (modulo some m).
This structure, when m = p with p a prime number,
is often called a prime field or a Galois field, GF(p).
When m is prime, the division of nonzero integer classes is
well defined. According to the proper mathematical terminology it
is a finite field of characteristic p, where
p is the modulus. If m is not a prime number then
the structure is called a (finite) ring. More information about groups, fields and rings can be found in any good elementary text on algebra.
The discrete logarithm problem in the finite field GF(p) is then stated as follows: given two positive non-zero integers a, g (both less than p), compute n such that a = g^n (mod p). We can choose g so that a solution for n exists for any non-zero a. To make this problem cryptographically hard, p should be a large prime number (about 10^300) and n, in general, of the same magnitude.
This problem is currently considered as hard as factoring. The best method known at this time is the number field sieve for discrete logarithms (which uses ideas similar to the NFS for factoring). The discrete logarithm problem may appear more complicated than integer factoring, but in many ways they are similar, and many of the ideas that work for factoring can also be applied in the setting of discrete logarithms. There is little hope of finding a polynomial time algorithm for computing discrete logarithms in GF(p); if one were found, it is likely that factoring could also be solved efficiently.
The discrete logarithm problem can be posed in many other settings, such as elliptic curves. The discrete logarithm problem over elliptic curves is commonly used in public key cryptography. One reason for this is that the number field sieve algorithm does not work there. There are other methods for computing discrete logarithms over elliptic curves, but it appears even harder to solve the discrete logarithm over elliptic curves than over GF(p). As a result, there are key size benefits to using elliptic curve based public key cryptosystems as opposed to factoring based cryptosystems.
- Knapsacks. Given a small set of integers, the knapsack problem consists of determining a subset of these integers such that their sum equals a given integer. For example, given (2, 3, 5, 7) and 10, we can find the solution 2 + 3 + 5 = 10 by brute force search, and have thus solved this knapsack instance.
The general knapsack problem is provably NP-hard, and thus appears superior to the factorization and discrete logarithm problems used in public key cryptosystems. Unfortunately, all cryptosystems that have used this underlying idea have been broken, as the instances of the problem actually used have not been really NP-hard.
- Lattices. Take a set of basis vectors w_i = <w_i1, ..., w_in> for i = 1, ..., m, and consider the lattice that is generated by this basis. That is, elements of the lattice are of the form t_1*w_1 + t_2*w_2 + ... + t_m*w_m, where the t_i are integers.
The problem of finding the shortest vector in a lattice (using the usual Euclidean distance) is an NP-hard problem (for lattices of sufficiently large dimension).
However, the celebrated LLL-algorithm by Lenstra, Lenstra and
Lovasz computes an approximate solution in polynomial time. The
effectiveness of the LLL-algorithm comes from the fact that in
many cases approximate solutions are good enough, and that
surprisingly often the LLL-algorithm actually finds the shortest vector. Indeed, this algorithm has often been used to break cryptosystems based on lattice problems or knapsacks, and it has also been applied in attacks against RSA and DSS.
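To make the factoring and discrete logarithm discussions above concrete, here is a small Python sketch of both problems at toy sizes: factoring by trial division and a brute-force discrete logarithm search. The numbers are illustrative only; real cryptographic instances are hundreds of digits long and far beyond these methods.

    # Toy illustrations of the hard problems above. Real cryptographic
    # instances are far too large for either of these approaches.

    def trial_division(n):
        """Factor n by dividing out small factors -- feasible only for small n."""
        factors = []
        d = 2
        while d * d <= n:
            while n % d == 0:
                factors.append(d)
                n //= d
            d += 1
        if n > 1:
            factors.append(n)
        return factors

    def discrete_log(g, a, p):
        """Find n with g**n == a (mod p) by exhaustive search."""
        x = 1
        for n in range(p):
            if x == a:
                return n
            x = (x * g) % p
        return None

    print(trial_division(759375))   # [3, 3, 3, 3, 3, 5, 5, 5, 5, 5], i.e. 3^5 * 5^5
    print(discrete_log(3, 11, 17))  # 7, since 3^7 = 11 (mod 17)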
Practical cryptosystems
The wide interest in public key cryptography has produced several practically important cryptosystems. In the following they are listed grouped by the underlying problem.
As a basic guideline, a public key cryptosystem is built from a difficult problem as follows: take a difficult problem (for example, an NP-hard one) for which you can find instances that can be solved in polynomial time. To encrypt a message, convert the message into such an easy instance of the difficult problem, then use the public key to convert the easy problem into a difficult one. The result is then sent to the recipient through an insecure channel. To decrypt, use the private key to convert the difficult problem back into the easy one and solve it to regain the message. All public key systems use this principle, although they differ significantly in the details (such as the underlying problem or the structure of public and private keys).
For a good survey on appropriate key lengths, see Lenstra and Verheul's Selecting Cryptographic Key Sizes (which appeared in Public Key Cryptography 2000). They present a complete analysis of key sizes for almost all cryptosystems.
Below, along with each cryptosystem you will find the current recommendations for key sizes where appropriate. These recommendations are not always equal to Lenstra and Verheul's.
Factorization: RSA, Rabin
- RSA (Rivest-Shamir-Adleman) is the most
commonly used public key algorithm. It can be used both for
encryption and for digital signatures. The security of RSA is
generally considered equivalent to factoring, although this has
not been proved.
RSA computation takes place with integers modulo n = p * q, for two large secret primes p, q. To encrypt a message m, it is exponentiated with a small public exponent e. For decryption, the recipient of the ciphertext c = m^e (mod n) computes the multiplicative inverse d = e^-1 (mod (p-1)*(q-1)) (we require that e is selected suitably for it to exist) and obtains c^d = m^(e*d) = m (mod n). The private key consists of n, p, q, e, d (where p and q can be forgotten); the public key consists only of n, e. The problem for the attacker is that computing the inverse d of e is assumed to be no easier than factoring n. (A toy numeric sketch of this computation appears after this list.) More details are available in many sources, such as in the Handbook of Applied Cryptography.
The key size (the size of the modulus) should be greater than 1024 bits (i.e. it should be of magnitude 10^300) for a reasonable margin of security. Keys of size, say, 2048 bits should give security for decades.
Dramatic advances in factoring large integers would make RSA vulnerable, but other attacks against specific variants are also known. Good implementations use redundancy (or padding with specific structure) in order to avoid attacks using the multiplicative structure of the ciphertext. RSA is vulnerable to chosen plaintext attacks, and to hardware and fault attacks. Important attacks against very small exponents also exist, as well as against partially revealed factorization of the modulus.
The proper implementation of the RSA algorithm with redundancy is well explained in the PKCS standards (see RFCs 2314, 2315 and 2437). They give detailed explanations of how to implement encryption and digital signatures, as well as formats for storing the keys. The plain RSA algorithm should not be used in any application. It is recommended that implementations follow the standard, as this also has the additional benefit of inter-operability with most major protocols.
RSA is currently the most important public key algorithm. It
was patented in the United States (the patent expired in the year
2000).
- The Rabin cryptosystem may be seen as a relative of RSA, although it has a quite different decryption process. What makes it interesting is that breaking Rabin is provably equivalent to factoring.
Rabin uses the exponent 2 (or any even integer) instead
of odd integers like RSA. This has two consequences. First, the
Rabin cryptosystem can be proven to be equivalent to factoring;
second, the decryption becomes more difficult - at least in some
sense. The latter is due to problems in deciding which of the
possible outcomes of the decryption process is correct.
As it is equivalent to factoring the modulus, the size of the
modulus is the most important security parameter. Moduli of more
than 1024 bits are assumed to be secure.
There are currently no standards for the Rabin algorithm but it
is explained in several books. The IEEE P1363
project might propose a standard and thus make it more widely
used.
The equivalence to factoring means only that being able to
decrypt any message encrypted by the Rabin cryptosystem enables
one to factor the modulus. Thus it is no guarantee of security in
the strong sense.
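To illustrate the RSA arithmetic described above, here is a toy computation in Python. The primes and exponents are textbook-sized stand-ins; a real implementation uses primes of hundreds of digits plus the padding described in the PKCS standards.

    # Toy RSA -- for illustrating the arithmetic only, never for real use.
    p, q = 61, 53
    n = p * q                   # public modulus
    phi = (p - 1) * (q - 1)
    e = 17                      # public exponent, gcd(e, phi) == 1
    d = pow(e, -1, phi)         # private exponent: inverse of e mod phi (Python 3.8+)

    m = 65                      # message, encoded as an integer < n
    c = pow(m, e, n)            # encryption: c = m^e (mod n)
    assert pow(c, d, n) == m    # decryption: c^d = m (mod n)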
Discrete logs: Diffie-Hellman, ElGamal, DSS
- Diffie-Hellman is a commonly
used protocol for key exchange.
In many cryptographical protocols two parties wish to begin
communicating. However, assume they do not initially possess any
common secret and thus cannot use secret key cryptosystems. The
key exchange by Diffie-Hellman protocol remedies this situation by
allowing the construction of a common secret key over an insecure
communication channel. It is based on a problem related to discrete logarithms, namely the Diffie-Hellman problem. This problem is considered hard, and in some instances it is as hard as the discrete logarithm problem. (A toy sketch of the exchange appears after this list.)
The Diffie-Hellman protocol is generally considered to be
secure when an appropriate mathematical group is used. In
particular, the generator element used in the exponentiations
should have a large period (i.e. order).
Discrete logarithm algorithms can be used to attack
Diffie-Hellman, and with passive attacks that is the best that is
currently possible - assuming correctly chosen parameters. If
Diffie-Hellman is applied with usual arithmetic modulo a prime
number, then it suffices to select a large enough prime and to
take some care in selecting the generator element. Subtle problems
may be caused by bad choices of the generator. Many papers have been written on the problems that may occur; one of the better-known references is van Oorschot and Wiener's On Diffie-Hellman key agreement with short exponents (Eurocrypt '96).
Attacks against Diffie-Hellman also include the man-in-the-middle attack. This attack requires adaptive intervention, but is in practice very easy if the protocol doesn't use countermeasures such as digital signatures.
Usually Diffie-Hellman is not implemented in hardware, and thus hardware attacks are not an important threat. This may change in the future, as mobile communication becomes more widespread.
- DSS (Digital Signature Standard). A
signature-only mechanism endorsed by the United States Government.
The underlying algorithm DSA (Digital Signature Algorithm) is
similar to the one used by ElGamal or by the Schnorr signature
algorithm. Also it is fairly efficient, although not as efficient
as RSA for signature verification. The standard defines DSS to use
the SHA-1 hash function exclusively to compute message digests.
The main problem with DSS is the fixed subgroup size (the order of the generator element), which limits the security to around only 80 bits. Hardware attacks can be a concern for some implementations of DSS. However, it is widely used and accepted as a good algorithm.
- The ElGamal public key cipher. This
is a straightforward extension of Diffie/Hellman's original idea
on shared secret generation. Essentially, it generates a shared
secret and uses it as a one-time pad to encrypt one block of data.
ElGamal is the predecessor of DSS and is perfectly usable today,
although no widely known standard has been created for it.
- Elliptic curve cryptosystems are just another way of implementing discrete logarithm methods. An elliptic curve is basically a set of points that satisfy the equation y^2 = x^3 + ax + b, considered in a finite field of characteristic p (where p must be larger than 3). A slightly different equation is needed for the cases of small characteristic, p = 2 and p = 3.
The points on elliptic curves can be added together and they
form a structure called a group (in fact an abelian
group). This is just a way of saying that you can do
arithmetic with them as you can do with integers when using just
addition and subtraction.
With regard to cryptography, elliptic curves have many theoretical benefits, but they are also very practical. There is no known sub-exponential algorithm for computing discrete logarithms of points of elliptic curves, unlike for discrete logarithms in (the multiplicative group of) a finite field, in hyperelliptic curves (of large genus) or in many other groups. One practical benefit of the non-existence of a fast discrete logarithm computation for elliptic curves is that the key size, as well as the produced digital signatures and encrypted messages, are small. Indeed, a very simple way of computing a security limit for the key size is to take the key size of a secret key cryptosystem in bits and just multiply it by 2. This gives a rough estimate that is good at the moment for a generic instance of elliptic curves.
Elliptic curves can be implemented very efficiently in hardware
and software, and they compete well in speed with cryptosystems
such as RSA and DSS. There are several standardization attempts
for elliptic curve cryptosystems (for example, ECDSA by ANSI). At
the moment elliptic curves are widely known, but not very widely
used in practice.
The security of elliptic curve cryptosystems has been rather
stable for years, although significant advances have been achieved in attacks against special instances. Nevertheless, these advances were anticipated by leading researchers several years ago, and no great surprises have yet emerged.
The algorithm XTR recently introduced by Lenstra and Verheul
might become a good competitor for elliptic curves. However,
elliptic curves appear to be slightly better in performance, and
definitely scale better in the key size.
- LUC is a public key cryptosystem that
uses a special group based on Lucas sequences (related to
Fibonacci series) as its basic building block. It can implement
all the common discrete logarithm based algorithms, and in a sense
LUC is a class of public key algorithms.
It is possible to view the underlying structure of the
algorithm as a certain multiplicative group of a finite field of
characteristic p with degree 2 (written shortly as F_p^2) - and this can be used to prove
that there exists a sub-exponential algorithm for computing
discrete logarithms and thus attacking the LUC algorithm. Thus it
might seem that LUC is of little interest, as it is basically just
another way of implementing discrete logarithm based algorithms on
finite fields. However, LUC uses the specific arithmetic
operations derived from Lucas sequences that might be slightly
faster (by a constant factor) than what would be a more direct
approach.
The different public key algorithms based on LUC arithmetic are
called LUCDIF (LUC Diffie-Hellman), LUCELG (LUC ElGamal), and
LUCDSA (LUC Digital Signature Algorithm). Several of these are
patented.
The fact that values used in the LUC algorithms can be represented as a pair of values gives some additional advantage over just using integers modulo p: the computations only involve numbers needing half the bits that would be required in the latter case. As the LUC group operation is easy to compute, this makes LUC algorithms competitive with RSA and DSS.
However, at present there appears to be little reason to use
LUC cryptosystems, as they offer little benefit over elliptic
curves or XTR.
- XTR is a public key cryptosystem developed by Arjen Lenstra and Eric Verheul. XTR is similar to LUC in that it uses a specific multiplicative group of a particular finite field (in fact F_p^6) as its underlying group.
However, XTR has novel features such as needing only
approximately 1/3 of the bits for signatures and encrypted
messages. This is achieved using a specific trace-representation
of the elements of this group, and performing all computations
using this representation.
All discrete logarithm based public key algorithms can be
implemented with XTR ideas. So in a way XTR is a generic name for
a class of public key algorithms, similarly to LUC.
Perhaps surprisingly, the algorithm is efficient, and according to Lenstra and Verheul it might be a good substitute for elliptic curves, DSS and even RSA. It has the advantage over elliptic curves that it is based on essentially the same discrete log problem as, say, DSS, which may help cryptographers and others to accept it more quickly as a strong algorithm.
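As a companion to the Diffie-Hellman description above, here is a minimal sketch of the exchange in Python. The prime and generator are small hypothetical stand-ins; real parameters are 1024 bits or more and must be chosen with the care discussed above.

    # Toy Diffie-Hellman over the integers modulo a prime.
    import secrets

    p = 0xFFFFFFFB              # a small prime (2^32 - 5); real p is far larger
    g = 5                       # assumed generator element, for illustration only

    a = secrets.randbelow(p - 2) + 1   # Alice's secret exponent
    b = secrets.randbelow(p - 2) + 1   # Bob's secret exponent

    A = pow(g, a, p)            # Alice sends A = g^a (mod p) in the clear
    B = pow(g, b, p)            # Bob sends B = g^b (mod p) in the clear

    # Both sides compute g^(ab) without ever transmitting it.
    assert pow(B, a, p) == pow(A, b, p)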
Knapsacks
There are only a few interesting knapsack public key cryptosystems, none of which are of practical importance.
- The Chor-Rivest cryptosystem is based on a particular variant of the knapsack problem. It is one of the knapsack cryptosystems that has best resisted attacks.
- Merkle-Hellman. This was the first knapsack cryptosystem. It was based on the simple idea of hiding an easy super-increasing knapsack problem by masking. However, it was broken already in the early 1980s.
- References:
- M. E. Hellman and R. C. Merkle: Public Key Cryptographic
Apparatus and Method. US Patent 4,218,582, 1980.
- B. Chor and R. L. Rivest: A knapsack type public key cryptosystem based on arithmetic in finite fields, Crypto '84.
Lattices
In recent years much interest has been directed towards lattice based cryptosystems. One of the reasons is that certain classes of lattice problems are NP-hard; several efficient cryptosystems have been proposed and appear strong.
Secret key cryptosystems
Secret key algorithms use the same key for both encryption and decryption (or one is easily derivable from the other). This is the more straightforward approach to data encryption: it is mathematically less complicated than public key cryptography, and has been used for many centuries.
The following terminology is often used when symmetric ciphers
are examined.
- S-boxes: lookup tables that map n bits to
m bits (where n and m often are equal).
There are several ways of constructing good S-boxes for
ciphers, as well as several ways of measuring them. Some designers
use a rigorous mathematical approach by using bent
functions (or related) to create S-boxes which can be proven
to be resistant against some particular attacks. Other designers
use heuristic approaches, which may lead to S-boxes that are more
difficult to handle in mathematical proofs, but can have
additional benefits (such as several different implementation
options).
The S-box may even be the only non-linear part of the cipher. This is the case in the block cipher DES, where the S-boxes may thus be regarded as the single most important part of the algorithm. In fact, many consider DES's S-boxes so good that they use them in their own designs (for example, Serpent).
- Feistel networks: a Feistel network is a general
device of constructing block ciphers from simple functions. The
original idea was used in the block cipher Lucifer, invented by
Horst Feistel. Several variations have been devised from the
original version.
Put simply, the standard Feistel network takes a function from
n bits to n bits and produces an invertible function
from 2n bits to 2n bits. The basic function upon
which the structure is based is often called the round function.
The essential property of Feistel networks that makes them so useful in cipher design is that the round function need not be invertible, but the resulting function always is (a minimal sketch appears after this list).
If the round function depends on, say, k bits of a key, then the Feistel cipher requires rk bits of key material, where r is the number of rounds used. The security of the Feistel structure is not obvious, but analysis of DES has shown that it is a good way to construct ciphers. It is essential that a Feistel cipher have enough rounds, but just adding more rounds does not always guarantee security.
- The operation of taking the user key and expanding it into
rk bits for the Feistel rounds is called key
scheduling. This is often a non-linear operation, so that
finding out any of the rk bits of the round keys does not
directly provide any information about the actual user key. There
are many ciphers that have this basic structure; Lucifer, DES, and
Twofish, just to name a few.
- Expansion, Permutation: these are common tools in
mixing bits in a round function. They are linear operations, and
thus not sufficient to guarantee security. However, when used with
good non-linear S-boxes (as in DES) they are vital for the
security because they propagate the non-linearity uniformly over
all bits.
- bit slice operations (bitwise logic operations XOR, AND, OR, NOT and bit permutations): The idea of bitslice implementations of block ciphers is due to Eli Biham. The technique is common practice on vector machines as a way of achieving parallel operation, but Biham applied it on serial machines by using the large registers available in modern computers. The term "bitslice" is due to Matthew Kwan.
Basically all block ciphers can be written in bitslice manner, but operations such as addition and multiplication may become very slow. On the other hand, permutations are almost free, as they just require renaming the registers, and this can be done at the coding level. Thus, for example, in exhaustive key search of DES using bitslice techniques, one can increment the current key in a fraction of the time usually needed for key scheduling.
The AES finalist Serpent
is designed to be implemented using bitslice operations only. This
makes it particularly efficient on modern architectures with many
registers.
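Here is a minimal Python sketch of the Feistel construction discussed above. The round function and round keys are made-up stand-ins, not a real cipher; the point is only the structural property that the network is invertible even though the round function is not.

    # Feistel network sketch on pairs of 32-bit halves.

    def F(half, round_key):
        # Toy round function; it need not be invertible.
        return (half * 0x9E3779B1 + round_key) & 0xFFFFFFFF

    def encrypt(left, right, round_keys):
        for k in round_keys:
            left, right = right, left ^ F(right, k)
        return left, right

    def decrypt(left, right, round_keys):
        for k in reversed(round_keys):
            left, right = right ^ F(left, k), left
        return left, right

    keys = [0x11111111, 0x22222222, 0x33333333, 0x44444444]
    l, r = encrypt(0xDEADBEEF, 0x01234567, keys)
    assert decrypt(l, r, keys) == (0xDEADBEEF, 0x01234567)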
The One-Time Pad
The one-time pad (OTP) is the only cipher that has been proven to be unconditionally secure, i.e. unbreakable regardless of the attacker's computing power. It has also been proven that any unbreakable, unconditionally secure cipher must in principle be a one-time pad.
The Vernam cipher (invented by G. Vernam in 1917) is a famous
instance of an OTP. This cipher is very simple: take a stream of
bits that contains the plaintext message, and a secret random
bit-stream of the same length as the plaintext which is considered
the key. To encrypt the plaintext with the key, sequentially
exclusive-or each pair of key bit and plaintext bit to obtain the
ciphertext bit. If the key is truly random, it can be proven that
the attacker has no means of deciding whether some guessed plaintext
is more likely than any other when having only the ciphertext and no
knowledge of the plaintext.
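A minimal sketch of this encryption in Python (with the operating system's random source standing in for a truly random key stream):

    # Vernam/OTP sketch: XOR the plaintext with an equally long, use-once key.
    import os

    plaintext = b"ATTACK AT DAWN"
    key = os.urandom(len(plaintext))    # stand-in for a truly random key

    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    recovered  = bytes(c ^ k for c, k in zip(ciphertext, key))
    assert recovered == plaintext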
The practical problem is that the key does not have a small constant length, but the same length as the message, and no part of a key should ever be used twice (or the cipher can be broken). So we have just traded the problem of exchanging secret data for the problem of exchanging secret random keys of the same length.
However, this cipher has allegedly been in widespread use since its
invention, and even more since the security proof by C. Shannon in
1949. Although admittedly the security of this cipher had been
conjectured earlier, it was Shannon who actually found a formal
proof for it.
More information can be found, for example, in D. Stinson's book Cryptography: Theory and Practice.
DES
The Data Encryption Standard (DES) is an algorithm developed in the mid-1970s. It was turned into a standard by the US National Institute of Standards and Technology (NIST), and was also adopted by several other governments worldwide. It was and still is widely used, especially in the financial industry.
DES is a block cipher with a 64-bit block size. It uses 56-bit keys. This makes it susceptible to exhaustive key search with modern computers and special-purpose hardware. DES is still strong enough to keep most random hackers and individuals out, but it is easily breakable with special hardware by governments, criminal organizations, or major corporations. DES is getting too weak, and should not be used in new applications.
A variant of DES, Triple-DES (also 3DES), is based on using DES three times (normally in an encrypt-decrypt-encrypt sequence with three different, unrelated keys). Triple-DES is arguably much stronger than (single) DES; however, it is rather slow compared to some newer block ciphers.
Nevertheless, even though DES seems to be of little interest for today's applications, there are many reasons for still considering it important. It was the first block cipher to be widely deployed in the public sector, and thus it played an important role in making strong cryptography available to the public.
Also, the design was exceptionally good for a cipher that was meant to be used only a few years. DES proved to be a very strong cipher, and it took over a decade for any interesting cryptanalytical attacks against it to develop (not to underestimate the pioneering efforts that led to this breakthrough). The development of differential cryptanalysis and linear cryptanalysis opened ways to really understand the design of block ciphers.
Although the design philosophy of DES was held secret at the time of its introduction, this did not discourage its analysis - quite the contrary. Some information has been published about its design, and one of the original designers, Don Coppersmith, has commented that the team discovered ideas similar to differential cryptanalysis already while designing DES in 1974. However, it was just a matter of time before these fundamental ideas were rediscovered.
Even today, when DES is no longer considered a practical solution, it is often used to describe new cryptanalytical techniques. It is remarkable that there are still no cryptanalytical techniques that would completely break DES in a structural way; indeed, the only real weakness known is the short key size (and perhaps the small block size).
AES
In response to the growing feasibility of attacks against DES, NIST launched a call for proposals for an official successor that meets 21st century security needs. This successor is to be called the Advanced Encryption Standard (AES). Five algorithms made it into the second round, from which Rijndael was selected to be the final standard. We will now have a quick look at each of them and their cryptographic peculiarities. All the ciphers have a 128 bit block size and they support 128, 192 and 256 bit keys. The rather large key sizes are probably required to give means for the construction of efficient hash functions.
- The AES, Rijndael, is the design of two Belgian cryptographers, Joan Daemen and Vincent Rijmen.
Rijndael follows the tradition of square ciphers (it is based
on ideas similar to the Square cipher). NIST gave as its reasons
for selecting Rijndael that it performs very well in hardware and
software across a wide range of environments in all possible
modes. It has excellent key setup time and has low memory
requirements, in addition its operations are easy to defend
against power and timing attacks.
NIST stated that all five finalists had adequate security and
that there was nothing wrong with the other four ciphers. After
all analysis and received comments were considered, NIST
considered Rijndael the best choice for the AES. The other four
finalists are mentioned below.
- MARS by Zunic et al., IBM.
This is an interesting new design (using a special type of a
Feistel network), which depends heavily on the instruction sets
available on modern 32-bit processors. This has the benefit that
on these target machines it is efficient, but it may lead to
implementation difficulties in cheaper architectures like smart
cards.
- RC6
by Rivest, Robshaw and Yin, RSA Laboratories.
RC6 follows the ideas of RC5 - but with many improvements. For
example, it attempts to avoid some of the differential attacks
against RC5's data dependent rotations. However, there are some
attacks that get quite far, and it is unclear whether RC6 is well
enough analyzed yet.
- Serpent
by Anderson, Biham and Knudsen.
Serpent has a basically conservative but in many ways
innovative design. It may be implemented by bitslice (or vector)
gate logic throughout. This makes it rather complicated to
implement from scratch, and writing it in a non-bitslice way
involves an efficiency penalty.
The 32 rounds lead to probably the highest security margin of all AES candidates, while it is still fast enough for all purposes.
- Twofish by Schneier et al., Counterpane Security.
Twofish is a block cipher designed by Counterpane (whose CEO is Bruce Schneier). The design is highly delicate, with many alternative ways of implementation. It has been cryptanalyzed in much detail by the very authoritative "extended Twofish team". It is basically a Feistel cipher, but utilizes many different ideas. This cipher has key dependent S-boxes like Blowfish (another cipher by Bruce Schneier).
Blowfish
Blowfish was designed by Bruce Schneier. It is a block cipher with a 64-bit block size and variable length keys (up to 448 bits). It has gained a fair amount of acceptance in a number of applications, including Nautilus and PGPfone.
Blowfish utilizes the idea of randomized S-boxes: while doing key scheduling, it generates large pseudo-random look-up tables by doing several encryptions. The tables depend on the user supplied key in a very complex way. This approach has proven highly resistant against many attacks, such as differential and linear cryptanalysis. Unfortunately, it also means that Blowfish is not the algorithm of choice for environments where a large memory space (something like 4096 bytes) is not available.
The only known attacks against Blowfish are based on its weak key
classes.
IDEA
IDEA (International Data Encryption Algorithm) is an algorithm developed at ETH Zurich in Switzerland by Xuejia Lai. It uses a 128 bit key, and it is generally considered to be very secure. It has been one of the best publicly known algorithms for some time (before the AES standard started its second round, see above). It has been around for several years, and no practical attacks on it have been published despite numerous attempts to analyze it.
One of the best attacks uses the impossible differential idea of
Biham, Shamir and Biryukov. They can attack only 4.5 rounds
of IDEA and this poses no threat to the total of 8.5 rounds
used in IDEA.
IDEA is patented in the United States and in most European
countries.
RC4
RC4 is a stream cipher designed by Ron Rivest at RSA Data Security, Inc. It used to be a trade secret until someone posted source code for an algorithm on Usenet, claiming it to be equivalent to RC4. There is very strong evidence that the posted algorithm is indeed equivalent to RC4. The algorithm is very fast. Its security is unknown, but breaking it does not seem trivial either. Because of its speed, it may have uses in certain applications. It accepts keys of arbitrary length.
RC4 is essentially a pseudo random number generator, and the
output of the generator is exclusive-ored with the data stream. For
this reason, it is very important that the same RC4 key never be
used to encrypt two different data streams.
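The following Python sketch follows the structure of the algorithm as publicly posted (the alleged RC4): a key-scheduled 256-byte permutation drives a pseudo-random byte stream, which is then XORed with the data.

    def rc4_keystream(key, length):
        S = list(range(256))
        j = 0
        for i in range(256):                        # key scheduling
            j = (j + S[i] + key[i % len(key)]) % 256
            S[i], S[j] = S[j], S[i]
        i = j = 0
        out = []
        for _ in range(length):                     # keystream generation
            i = (i + 1) % 256
            j = (j + S[i]) % 256
            S[i], S[j] = S[j], S[i]
            out.append(S[(S[i] + S[j]) % 256])
        return bytes(out)

    data = b"hello"
    stream = rc4_keystream(b"a fresh key", len(data))
    encrypted = bytes(d ^ k for d, k in zip(data, stream))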
Modes Of Operation
Many commonly used ciphers are block ciphers. Block ciphers transform a fixed-size block of data (usually 64 bits) into another fixed-size block (possibly 64 bits wide again) using a function selected by the key. If the key, input block and output block all have n bits, a block cipher essentially defines a mapping from n-bit keys to permutations of n-bit blocks.
If the same block is encrypted twice with the same key, the resulting ciphertext blocks are also the same (this mode of encryption is called electronic code book, or ECB). This information could be useful for an attacker. To make identical plaintext blocks encrypt to different ciphertext blocks, two standard modes are commonly used:
- CBC (cipher block chaining): a ciphertext block is obtained by first XORing the plaintext block with the previous ciphertext block, and encrypting the resulting value (a sketch follows this list). This way leading blocks influence all trailing blocks, which increases the number of plaintext bits one ciphertext bit depends on, but also leads to synchronization problems if one block is lost.
- CFB (cipher feedback): the kth ciphertext block is obtained by encrypting the (k-1)th ciphertext block and XORing the result onto the plaintext. Interestingly, a CFB feedback loop can also be used as a pseudo-random number generator if one simply feeds one block of true random data with trailing blocks of zeroes into the encryption routine (although the expected period of this PRNG would be only about 2^(n/2), where n is the block size of the cipher).
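Here is the CBC sketch promised above, in Python; encrypt_block and decrypt_block are hypothetical stand-ins for any block cipher keyed elsewhere.

    # CBC mode around a generic n-byte block cipher.

    def cbc_encrypt(blocks, iv, encrypt_block):
        ciphertext, prev = [], iv
        for block in blocks:
            xored = bytes(a ^ b for a, b in zip(block, prev))
            prev = encrypt_block(xored)     # c_i = E(p_i XOR c_{i-1})
            ciphertext.append(prev)
        return ciphertext

    def cbc_decrypt(blocks, iv, decrypt_block):
        plaintext, prev = [], iv
        for block in blocks:
            xored = decrypt_block(block)    # p_i = D(c_i) XOR c_{i-1}
            plaintext.append(bytes(a ^ b for a, b in zip(xored, prev)))
            prev = block
        return plaintext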
Block ciphers also have interesting relationships to other classes of ciphers. For example:
- Stream ciphers. A stream cipher consists of a state
machine that outputs at each state transition one bit of
information. This stream of output bits is commonly called the
running key. The encryption can be implemented by just exclusive-oring the running key with the plaintext message.
The state machine is nothing more than a pseudo-random number generator. For example, we can build one from a block cipher by repeatedly encrypting its own output. Typically, more elaborate constructions are used for stream ciphers to obtain high speed.
Some of the more well-known stream ciphers are RC4 and SEAL.
Several stream ciphers are based on linear-feedback shift
registers (LFSR), such as A5/1 used in GSM. These have the benefit
of being very fast (several times faster than usual block
ciphers).
- SSSC (self-synchronizing stream ciphers): The class of
self-synchronizing stream ciphers has the convenient
property that it corrects the output stream after bit flips or
even dropped bits after a short time (say, a few hundred bits).
SSSC's can be constructed using block ciphers in a CFB-related way, as sketched below. Assume that we have already encrypted n bits of the message and know that much of the ciphertext (where n denotes the block length of the cipher). We then produce a new running-key bit (see above) by encrypting the n ciphertext bits and taking one bit of the cipher's output as the new running-key bit. Moving one bit further, we can iterate this procedure for the whole length of the message.
It is easy to see that a one-bit error in the ciphertext cannot affect the decrypted plaintext after n bits. This makes the cipher self-synchronizing.
The block cipher used should have sufficiently large block size
to avoid substitution attacks, for example.
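A sketch of this construction in Python, with a hypothetical function cipher_bit (mapping the last n ciphertext bits to one running-key bit) standing in for the block cipher:

    # Self-synchronizing stream cipher sketch (bits as 0/1 integers).

    def sssc_encrypt(plaintext_bits, iv_bits, cipher_bit):
        shift = list(iv_bits)           # the last n ciphertext bits
        out = []
        for p in plaintext_bits:
            k = cipher_bit(shift)       # running-key bit from ciphertext history
            c = p ^ k
            out.append(c)
            shift = shift[1:] + [c]     # slide the window one bit forward
        return out

Decryption is the identical procedure, since XOR is self-inverse and the shift register is fed the same ciphertext bits; after a transmission error, both ends agree again once n correct ciphertext bits have passed through the register.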
More information on cipher modes can be found in
the Handbook of
Applied Cryptography by Menezes et al.
Cryptography before the 1970's
In this section some of the famous ciphers of the past are listed, with links to more complete information where possible.
- Fish was used by the German army in WWII to encipher high-command communications. The traffic was produced by a stream cipher machine called the Lorenz machine; Fish was the name given to it by British cryptanalysts. It was important because it caused difficulties for British analysts, who finally developed a machine called Colossus, which was arguably the first, or one of the first, digital computers.
The Colossus machine may have been an important factor in the planning and success of the Allied attack on Normandy. Given the intelligence produced by cryptanalysis of Fish, Allied forces knew the positions of pretty much every German division.
- Enigma was another cipher used by the Germans in World War II. The machine used several rotors and looked like a typewriter. However, first Polish and later British mathematicians were able to keep up with the development of the machine. Most communication using the basic version of Enigma was deciphered by British analysts at Bletchley Park within a few hours of interception. One of the strongest Enigma variants was used in submarine communications, but British analysts managed to break it as well, with great implications for the Battle of the Atlantic.
There are several good books about Enigma and Bletchley Park. In addition, the work of the major figure of British cryptanalysis, Alan Turing, has been explained in many articles and books. Recently his original notes about cryptanalysis from that time have been released to the public.
This cipher, or a variant of it, is used by the unix crypt(1) program. It is unlikely that any variant of Enigma could be considered very secure by today's standards.
- Vernam cipher is described in detail
above.
- Substitution cipher. This is one of the basic pencil-and-paper methods. Make a table by first listing your alphabet in order on the first row, and then making a random permutation of the alphabet on the second row. You can then encrypt any character of the alphabet by first looking it up on the first row, and then writing down the corresponding random character on the second row. The key of this method is the permutation of the alphabet on the second row. Decryption works in reverse. (A small sketch appears after this list.)
This method is susceptible to frequency analysis, as long as the alphabet size is small. Modern block ciphers can be seen as a variant of this idea, in the sense that they try to hide the message under a very large alphabet that depends on the key. In the block cipher case the permutation is generated by the secret key, and the key space might not cover all the possible permutations.
- Vigenere. This cipher uses clock
arithmetic to add together the key and the message. The difference
between OTP and Vigenere is that in Vigenere we explicitly reuse
the short key several times for one message.
Methods for attacking Vigenere ciphers include the Kasiski test, the index of coincidence, etc. These lead to effective methods which break even fairly short messages (relative to the key length, of course).
- Hill cipher. The Hill cipher uses matrices in clock arithmetic, and is highly susceptible to known plaintext attacks.
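Here is the small substitution cipher sketch promised above, in Python; the key is a random permutation of the alphabet.

    import random, string

    alphabet = string.ascii_uppercase
    permuted = list(alphabet)
    random.shuffle(permuted)            # the secret permutation (the key);
                                        # the random module is fine for this toy

    enc = {a: p for a, p in zip(alphabet, permuted)}
    dec = {p: a for a, p in zip(alphabet, permuted)}

    message = "ATTACKATDAWN"
    ciphertext = "".join(enc[ch] for ch in message)
    assert "".join(dec[ch] for ch in ciphertext) == message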
Hash functions
Cryptographic hash functions are used in various contexts, for example to compute the message digest when making a digital signature. A hash function compresses the bits of a message to a fixed-size hash value in a way that distributes the possible messages evenly among the possible hash values. A cryptographic hash function does this in a way that makes it extremely difficult to come up with a message that would hash to a particular hash value. Some of the best known and most widely used hash functions are briefly described below, followed by a short usage sketch.
- SHA-1 (Secure Hash Algorithm) (also SHS, Secure Hash Standard): This is a cryptographic hash algorithm published by the United States Government. It produces a 160 bit hash value from an arbitrary length string. Many people consider it quite good.
The official standard text is published as FIPS 180-1.
- Tiger is a recent hash algorithm developed by Anderson
and Biham. It is available from ftp://ftp.funet.fi/pub/crypt/hash/tiger.
- RIPEMD-160 is a hash algorithm designed to replace MD4
and MD5 (see below). It produces a digest of 20 bytes
(160 bits, hence the name), reportedly runs at 40
Mb/s on a 90 MHz Pentium and has been placed in the public
domain by its designers. The RIPEMD-160 homepage is at http://www.esat.kuleuven.ac.be/~bosselae/ripemd160.html
- MD5 (Message Digest Algorithm 5) is a
cryptographic hash algorithm developed at RSA Laboratories. It can be
used to hash an arbitrary length byte string into a 128 bit
value.
MD5's ancestor, MD4, has been broken, and there are some concerns about the safety of MD5 as well. For instance, "keyed MD5" (typically used for authentication by having a shared secret, and computing an authentication value by hashing first the secret (as a key) and then the data to be hashed) has been reported to be broken. It has also been reported that one could build a special-purpose machine costing a few million dollars to find a plaintext matching a given hash value in a few weeks.
Despite these problems, MD5 is still in wide use and reasonably
safe for non-cryptographic applications of hash-functions.
- MD2, MD4: These are older hash algorithms from
RSA Data Security. They have known flaws (Hans Dobbertin, FSE'96,
LNCS 1039), and their use is not recommended.
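As the usage sketch mentioned above, digests with hash functions such as these can be computed from Python's standard hashlib module:

    import hashlib

    message = b"abc"
    print(hashlib.sha1(message).hexdigest())   # 160-bit digest, 40 hex characters
    print(hashlib.md5(message).hexdigest())    # 128-bit digest, 32 hex characters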
Random numbers
Cryptographic systems need cryptographically strong (pseudo) random numbers that cannot be guessed by an attacker. Random numbers are typically used to generate session keys, and their quality is critical for the quality of the resulting systems. The random number generator is easily overlooked, and can easily become the weakest point of the cryptosystem.
Some machines may have special purpose hardware noise generators.
Noise from the leak current of a diode or transistor, least
significant bits of audio inputs, times between interrupts, etc. are
all good sources of randomness when processed with a suitable
cryptographical hash function. It is a good idea to acquire true
environmental noise whenever possible.
One cryptographical random number generator is Yarrow by Counterpane. A good page to
search for further material on (statistical) pseudo-random number
generators is the pLab
site. Any cryptographically good pseudo-random number generator
should pass all the basic tests for statistical randomness.
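As a minimal sketch, cryptographically strong random material can be drawn in Python from the operating system's entropy pool (which gathers exactly this kind of environmental noise) rather than from any ad hoc generator:

    import os, secrets

    session_key = os.urandom(16)      # 128 bits of OS-provided randomness
    token = secrets.token_hex(16)     # the same idea, hex-encoded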