首先要了解什么是cyclic group
In group theory, a cyclic group is a group that can be generated by a single element, in the sense that the group has an element g (called a "generator" of the group) such that, when written multiplicatively, every element of the group is a power of g (a multiple of g when the notation is additive)
Crypto Series: Discrete Logarithm
From last post, it becomes clear that at this stage we won't be able to make it without some maths. That's because we are dealing now with public key crypto, which is based on difficult mathematical problems (as in difficult to solve, not as in difficult to understand).
With symmetric crypto, we could understand the concepts of diffusion and confusion without needing to dive into maths. On the other hand, here we will need to understand the problems on which the algorithms rely in order to understand how they work.
In this post, we'll see what's the Discrete Logarithm problem, why it is difficult to solve based on a simple intuition, and finally a method to solve this kind of problems. Of course it's not the only (nor the best) existing method, but in my opinion it is the simplest one to understand.
The Discrete Logarithm problem
As I said before, the Discrete Logarithm problem is formulated as follows. Given a cyclic group G with a generator g, x is called a discrete logarithm of h over this group if it satisfies the following condition:
in G
So this is the equivalent to a logarithm, but instead of computing it over the real numbers it is computed over a finite cyclic group. And now, if you don't have any background on discrete maths, coding theory and the like, you are probably asking something on these lines: what the hell does that mean?
To keep it simple, a finite cyclic group G with a generator g means that the successive powers of g (i.e ) will generate the different elements of the group. At some point, after a finite number of elements ( ) the result will cycle over to the first element (i.e. ), and this is what gives the name to the groups . Now, this value, n, is called the order of the group and is obviously also the number of elements of the group, or cardinality.
I won't go any further with the explanation of the properties of cyclic groups and all the group theory behind this. I'll just say that a simple example of finite cyclic groups is that of the integers modulo some prime number p, excluding the zero element. This groups are usually noted as , where p is our prime number and the group order is p-1.
For instance, say we look at . Then, for this group we get that the group elements are these:
1,2,3,4,5,6
Since those are all the integers modulo 7. Now, a generator of this group would be for instance g=3. You can see that in this case the successive powers of 3 modulo 7 are:
1,3,2,6,4,5,1
And there you have that . Therefore, this is a cyclic group of order p-1=6.
Difficulty of the DL problem
Now, where is the difficulty of the DL problem? We'll just take an intuitive approach to it. When you think of a classical logarithm over the real numbers, it turns out that this is a continuous and monotonous function where \log y " alt="\log x > \log y " src="http://s.wordpress.com/latex.php?latex=%5Clog%20x%20%3E%20%5Clog%20y%20&bg=ffffff&fg=000000&s=0"> if y " alt="x > y " src="http://s.wordpress.com/latex.php?latex=x%20%3E%20y%20&bg=ffffff&fg=000000&s=0">. This means that if you know the logarithm of x, and y is pretty close to it, most likely the logarithm of y will be pretty close to it as well.
But when you look at the discrete logarithm, you can see that the behavior of this problem is not as predictable as that of the logarithm function. For instance, in the example above we have that , but . Extrapolating this to big numbers, you can see that it is probably not very easy to go back from a certain power of a prime number to the exponent itself (i.e., computing the DL).
Solving Discrete Logarithms: Baby Step Giant Step
All right, now we get to look at an actual method to compute discrete logarithms. The method is called Baby Step Giant Step due to the way we approach the problem: we create a table of a few powers of our generator: this are our baby steps. Then we take our target problem, and take big steps (of the size of the generated table) downwards until we hit the table. At that point, we know how many steps we took and we can compute the actual logarithm.
But of course, all this may sound like bullshit until you see an actual example. Let's take the following problem, which uses intentionally small numbers:
Given , compute its discrete logarithm in .
Ok, let's start then. We compute a table of a given size, let's say 100 elements. I used to do it with Mathematica, but I do not have it right now so I'm using for the first time ever the Sage open source program. I advise you to install it, because it will also come handy to verify other examples (such as RSA examples) in the future.
So we start by instantiating our cyclic group, and getting a generator and our value y:
sage: G = IntegerModRing(17627)
sage: g = G(6)
sage: g.multiplicative_order()
17626
sage: y=G(8938)
Now, we build our list of 100 powers and plot it:
sage: powers = [ g^i for i in range(100) ]
sage: list_plot(powers)
Note that sage directly applies modular exponentiation since g was created as an element of the finite field we are using for this problem. Also, note that the behavior is not really predictable and after a big number it often comes a small number, but of course not always. You can observe this behavior in the plot obtained:
Ok, let's continue with our search. First, we know that our number, y, is part of the finite field, and therefore we can write it as follows:
Where of course i is a number below 100. We can further develop this equation and write it the following way:
And here it comes the magic! If you look at this equation, is actually a member of our table of powers. Further, we can compute , which is easy. Then, we can take y and multiply it by a and check if the result is on the table. In that case, it means that and we can easily compute x!
If it was not the case (which is likely), then we will have to multiply again by g as many times as we need until we hit the table. Let's call that number k. At that point, we've found that . Since we know k (it's the number of times we applied our multiplication!) and i (we take it from the table), we can compute x.
All this can be translated into the following fragment of sage commands:
sage: j=1
sage: while not y*a^j in powers: j += 1
....:
sage: j
79
sage: i = powers.index(y*a^j)
sage: i
70
sage: x=100*j+i
sage: x
7970
sage: g^x == y
True
So what you see above means that after 79 steps we have found the value at position 70 in the list. Therefore, the discrete logarithm of y in G is x=7970. After that, I compare the x-th power of g with y to be sure that the result is correct, and sage returns a True. If you happen to know a bit of Python, you can notice that it has a pretty similar syntax (but not identical).
Of course, sage also provides easier ways to do it. You can just type y.log(g) to solve the problem here:
sage: y.log(g)
7970
Closing thoughts
The method explained above is not the only one nor the most efficient. Also, as usual the explanations here do not attempt to be a 100% accurate description of the problem from a mathematical point of view (I'm not a mathematician after all) but rather to explain crypto topics in a simple way so that most people can understand it.
If you want to go further with DL problems, get accurate descriptions of it and understand other methods of solving the problem, you can resort (once again) to the Handbook of Applied Cryptography. Specially chapters 2 and 3 treat this and related subjects, covering maths background and reference problems.