Euclid’s method is a classic algorithm for finding the greatest common divisor (gcd) of two integers. Donald Knuth referred to it as “the granddaddy of all algorithms, because it is the oldest nontrivial algorithm that has survived to the present day.”[1] There exists a more generalized form for Euclid’s method, which is known as the Extended Euclidean Algorithm. Given two integers a and b, this algorithm finds the integer d = gcd(a,b) and two integers x and y, such that a*x + b*y = d. The usefulness of that last result, a*x + b*y = d, may seem a bit questionable at first, but without going into any details it provides a way to compute the modular multiplicative inverse.
Knuth provides an excellent description and proof for this algorithm [2]. It’s a fascinating study in using induction and invariants. Translated into C++, the algorithm he describes takes this form:
template <class T>
void extended_euclidean(const T a, const T b, T* pGcd, T* pX, T* pY)
{
static_assert(std::numeric_limits<T>::is_integer, "");
static_assert(std::numeric_limits<T>::is_signed, "");
T x0=1, y0=0, a0=a;
T x1=0, y1=1, a1=b;
while (a1 != 0) {
T q = a0/a1;
T a2 = a0  q*a1;
T x2 = x0  q*x1;
T y2 = y0  q*y1;
x0=x1; y0=y1; a0=a1;
x1=x2; y1=y2; a1=a2;
}
*pX = x0;
*pY = y0;
*pGcd = a0;
}
Figure 1: C++ Extended Euclidean Function.
This is the best we can do for program code, but there seems to be a serious problem. The algorithm itself presents all variables as mathematical integers, and that’s an infinite set. We have to question whether the variables in Figure 1 are bounded to any finite maximum and minimum values. Particularly for x2 and y2, it’s nonobvious from Figure 1 that those variables can’t grow without bound. Until proven otherwise, we must assume that any of the variables in Figure 1 could overflow any type T we choose. (If you’d like to know the answer to this question of overflow without proof, skip ahead to the Conclusion.)
It’s fairly easy to establish bounds for a2 at least. Informally, if a and b are nonnegative, we can recognize that within the loop
a2 = a0  q*a1
is equivalent to
a2 = a0 % a1
For this case it’s easy to see that on every iteration of the loop a2 will decrease, until it reaches zero, at which point the loop terminates.
Other areas of potential overflow are harder to analyze. Potential overflow could occur for x2 and y2, and also the interim calculations q*a1, q*x1, and q*y1.
In his second edition book[3] Knuth provides an exercise for the reader in which he states two bounds, which he attributes to George E. Collins. Most likely this refers to Collins’ paper Computing Multiplicative Inverses in GF(p)[4]. Collins’ paper is the best source for bounds I’ve seen for the Extended Euclidean algorithm. It presents most of the invariants and observations needed to bound the algorithm in a full proof, and states bounds for the variables (though no interim calculation bounds). It’s restricted to 0 < b < a, and doesn’t explicitly prove its invariants, but identifying the invariants is the hardest part of this kind of proof. Collins observes that the following equalities remain invariant throughout all of the algorithm:

gcd(x0, y0) == 1

gcd(x1, y1) == 1

abs(x0*y1  y0*x1) == 1

x0*y1  y0*x1 == (x1*y2  y1*x2)

a*x0 + b*y0 == a0

a*x1 + b*y1 == a1
With the precondition 0 <= b < a, you can read a proof that these invariants are maintained in extended_euclidean_collins.h.
Collins also observes that x0, x1, and x2 alternate in sign, as do y0, y1, and y2. We’ll need to restate his observation as an invariant in order to use it. We might try something like
(x0 >= 0 ? x1 < 0 : x1 <= 0)
initially, but a conditional invariant complicates a proof – plus it just feels a bit ugly. We’ll achieve better success by introducing a sign variable, which we can call s0. Initializing s0 to 1 before the loop and negating it at the end of each loop iteration, will produce the following change to the C++ code of Figure 1 (solely for the purpose of a proof):
template <class T>
void extended_euclidean(const T a, const T b, T* pGcd, T* pX, T* pY)
{
T x0=1, y0=0, a0=a;
T x1=0, y1=1, a1=b;
T s0 = 1;
while (a1 != 0) {
T q = a0/a1;
T a2 = a0  q*a1;
T x2 = x0  q*x1;
T y2 = y0  q*y1;
x0=x1; y0=y1; a0=a1;
x1=x2; y1=y2; a1=a2;
s0 = s0;
}
*pX = x0;
*pY = y0;
*pGcd = a0;
}
Figure 2: C++ Extended Euclidean Function With Sign Variable.
Since s0 switches between the values 1 and 1 on each iteration, we can now write invariants that express the alternating sign of x and y variables on each iteration:

abs(s0) == 1

x0 == s0*abs(x0)

x1 == s0*abs(x1)

y0 == s0*abs(y0)

y1 == s0*abs(y1)
A sign variable lends itself well to a proof. It’s relatively simple, and it can be manipulated with the distributive property. For example, in extended_euclidean_collins.h (at annotation 57), the proof takes this assignment from Figure 2:
y2 = y0  q*y1
and substitutes the expressions of invariants 10 and 11:
y2 == s0*abs(y0)  q*(s0*abs(y1))
and then uses the distributive property:
y2 == s0*(abs(y0) + q*abs(y1))
The (Nearly) Full Proof
So long as we require a >= 0 and b >= 0, we can prove bounds for the Extended Euclidean algorithm. We’ll do it in three parts, and we’ve already discussed some of the first part – the case of a > b. The second part is the case a < b, and the third part is the case a == b. It would be nice to also consider cases where a or b could be negative, but I expect the proof’s size would grow by a factor of two or three. It’s convenient instead to restrict the input range to nonnegative values, given that even this restriction may be more general than applications need. For example, the very restricted case of 0 < b < a is sufficient to apply the Extended Euclidean algorithm to calculating the modular multiplicative inverse.
In case a reader wishes to explore the cases where a and/or b are negative, it’s worth noting that a proof that includes the negative cases will still need restrictions. If we define the constant min = std::numeric_limits<T>::min(), such a proof would at the least need to forbid the specific case where both a == min and b == min. A simple explanation is that given our definition of min, we can see that gcd(min, min) == min, yet the value of min is too large to represent with type T when T is represented with two’s complement.
Proof Part 1: a > b
For the case a > b, extended_euclidean_collins.h proves that all variables and interim calculations have specific bounds. This header file proves a few helper theorems at the top of the header, and then provides the Extended Euclidean function with annotations on every statement and assertion, to form the proof of bounds. Every bound is given either as a constant or in terms of a variable which is itself bounded. C asserts are used as a convenience in stating each proof step, since the asserts are testable. However the proof is analytic. There is a CMake file in the parent directory that will create the full proof’s test project if you wish to run it.
Due to the proof’s length, it’s convenient to prune everything from the header file but the algorithm statements and the essential assertion results, leaving only the results of the proof for Part 1. These essential results are the proven bounds assertions and the proven assertions that the outputs are indeed the gcd and the Bézout coefficients as required by the Extended Euclidean algorithm.
Proof Part 2: a < b
For the case a < b, a second header file proves that all variables and interim calculations have specific bounds. The logic of this proof is extremely similar to Part 1, with one large difference: the first iteration of the loop is peeledout and placed before the loop. Peeling out the first iteration is required to state some loop invariants that only become valid for case a < b after the second loop iteration begins.
Similarly to Part 1, it helps to have a pruned header file leaving only the results of the proof. However for this case of a < b, the proof modified the original Extended Euclidean function by peelingout the first loop iteration, so this simple pruning also uses that modified function. Ideally at this point we shouldn’t be exposed to the modification, since it’s just an artifact of the proof. Since we only care about the results of the proof and not the proof itself, we can rejoin the first iteration back into the loop. This leaves us the much more useful original function with proof results for Part 2.
Proof Part 3: a == b
The only tricky aspect to case a == b is how to define gcd(0,0). Otherwise, this part consists just of two very short proofs.
One way d = gcd(a,b) can be defined is that d is the integer that divides both a and b, such that for any integer c that divides both a and b, d >= c. Under this definition, gcd(0,0) would be undefined. A second way d = gcd(a,b) can be defined is that d is the nonnegative integer that divides both a and b, such that for any integer c that divides both a and b, c divides d. Under this definition, gcd(0,0) == 0. [5] If a caller of the Extended Euclidean function expects the first definition, calling with a == 0 and b == 0 would be an implicit precondition violation, since it would be a request for the undefined gcd(0,0). We will assume that callers will never violate their implicit preconditions, and therefore we assume a caller must expect the second definition if calling with arguments a == 0 and b == 0. If the last reasoning feels unsatisfactory or too fragile, we could simply say that this Extended Euclidean function defines gcd(0,0) == 0 (or we could as a precondition explicitly disallow the special case 0 == a == b).
There is a very short header file proof for special case 0 == a == b, along with a pruned version containing only the essential proof results. The more general case 0 < a == b also has a short proof, along with a trimmed version containing only essential proof results.
Combined Proof Results
Combining the proof results for all three cases, we have general proof results for the Extended Euclidean algorithm, reproduced below. As before, the bounds are given in terms of constants and other bounded variables (the next section expresses all bounds in terms of the input parameters a and b).
template <class T>
void extended_euclidean_results(const T a, const T b, T* pGcd, T* pX, T* pY)
{
static_assert(std::numeric_limits<T>::is_integer, "");
static_assert(std::numeric_limits<T>::is_signed, "");
assert(a >= 0 && b >= 0); // preconditions
T x0 = 1, y0 = 0, a0 = a;
T x1 = 0, y1 = 1, a1 = b;
while (a1 != 0) {
T q = a0/a1;
assert(0 <= q && q <= a0);
T a2 = a0  q*a1;
assert(0 <= (q*a1) && (q*a1) <= a0);
assert(0 <= a2 && a2 < a1);
if (a2 != 0) assert(q <= a0/2);
T x2 = x0  q*x1;
assert(abs(q*x1) <= abs(x2));
assert(abs(x1) <= abs(x2));
T y2 = y0  q*y1;
assert(abs(q*y1) <= abs(y2));
assert(y1 == 1  abs(y1) <= abs(y2));
x0=x1; y0=y1; a0=a1;
x1=x2; y1=y2; a1=a2;
}
if (a == 0 && b == 0) {
assert(x1 == 0);
assert(y1 == 1);
assert(x0 == 1);
assert(y0 == 0);
} else {
assert(gcd(a,b) >= 1);
assert(abs(x1) == b/gcd(a,b));
assert(abs(y1) == a/gcd(a,b));
assert(x0 == 1  abs(x0) <= (b/gcd(a,b))/2);
assert(y0 == 1  abs(y0) <= (a/gcd(a,b))/2);
}
assert(a0 == gcd(a,b));
assert(a*x0 + b*y0 == gcd(a,b));
*pX = x0;
*pY = y0;
*pGcd = a0;
}
Figure 3: Combined Proof Results, C++ Extended Euclidean Function.
It’s worthwhile noting that the combined results show that at the end of the algorithm, all three parts of the proof achieved a0 == gcd(a,b), and a*x0 + b*y0 == gcd(a,b) (showing that the final values of x0 and y0 are Bézout coefficients). These two assertions may be incidental to the proof of bounds, but they are important because they show that the algorithm remained valid for all three cases we considered.
Final Bounds
What we’d like best is if we could bound all variables and calculations in terms of constant values. We can can come very close to doing this. We can bound all variables and calculations in terms of the two input parameters a and b, which remain constant throughout the algorithm.
We’ll use two steps to get the desired bounds, starting with the results from Figure 3.
For the first step we replace the somewhat complicated if/else clause at the end of Figure 3 with simple assertions, and improve the bounds for a2, q, and (q*a1), showing that all throughout the algorithm they remain less than or equal to max(a,b). We can form the proof with these improvements, and then prune it as usual to leave only the essential assertion results.
For the second step we continue on from the assertion results of the first step, though by using a different proof technique from any used until now. Our method will be to start at the very end of the function and work our way backward to form new logical assertions. We can observe that at the end of the first step’s results, the final values for x0, x1, y0, and y1 are all bounded in terms of the input parameters a and b. We can propagate those bounds backward for all variables and calculations in the algorithm, by working our way backward through the loop iterations.
For example, consider x1 at the end of the first step results. We can see that at the function end,
abs(x1) <= b
Reasoning backward to the final loop iteration and working from the end of that iteration, we see the reassignment x1=x2, so we can reason that prior to that reassignment, it must have been true that
abs(x2) <= b
A few lines earlier in the loop, we can see an already established assertion that
abs(x1) <= abs(x2)
So we know that at that point in the code
abs(x1) <= abs(x2) <= b
or more simply,
abs(x1) <= b
Given this knowledge, we can easily reason that at the start of the final loop iteration, the absolute value of x1 was bounded by b.
Continuing to work backward to the end of the second to last loop iteration (if the loop is taken at least twice), we can use the same reasoning to again bound x2 and x1 for that loop iteration. Continuing to work backward through all remaining iterations bounds x2 and x1 for all iterations.
This method is used in the final proof to bound all variables and calculations in terms of the input parameters a and b. Pruning this proof to leave only the basic algorithm and bounds assertions, we have our final desired result, reproduced here:
template <class T>
void extended_euclidean_final(const T a, const T b, T* pGcd, T* pX, T* pY)
{
static_assert(std::numeric_limits<T>::is_integer, "");
static_assert(std::numeric_limits<T>::is_signed, "");
assert(a >= 0 && b >= 0); // precondition
T x0 = 1, y0 = 0, a0 = a;
T x1 = 0, y1 = 1, a1 = b;
while (a1 != 0) {
T q = a0/a1;
assert(0 <= q && q <= max(a,b));
T a2 = a0  q*a1;
assert(0 <= (q*a1) && (q*a1) <= max(a,b));
assert(0 <= a2 && a2 < max(a,b));
if (a2 != 0) assert(q <= max(a,b)/2);
T x2 = x0  q*x1;
assert(abs(q*x1) <= b);
assert(abs(x2) <= b);
if (a2 != 0) assert(abs(q*x1) <= max(1,b/2));
if (a2 != 0) assert(abs(x2) <= max(1,b/2));
T y2 = y0  q*y1;
assert(abs(q*y1) <= max(1,a));
assert(abs(y2) <= max(1,a));
if (a2 != 0) assert(abs(q*y1) <= max(1,a/2));
if (a2 != 0) assert(abs(y2) <= max(1,a/2));
x0=x1; y0=y1; a0=a1;
x1=x2; y1=y2; a1=a2;
}
assert(a0 == gcd(a,b));
assert(a*x0 + b*y0 == gcd(a,b));
*pX = x0;
*pY = y0;
*pGcd = a0;
}
Figure 4: Final Bounds in Terms of the Inputs, C++ Extended Euclidean Function.
Conclusion
Distilling the final proof results from Figure 4, we can see that all the variables and calculations in the Extended Euclidean algorithm are bounded to finite maximum and minimum values. More specifically, throughout every loop iteration, the variables and calculations of the algorithm (identically in Figure 1 and 4) have these following bounds:
0 <= q <= max(a,b)
0 <= (q*a1) <= max(a,b)
0 <= a2 < max(a,b)
abs(q*x1) <= b
abs(x2) <= b
abs(q*y1) <= max(1,a)
abs(y2) <= max(1,a)
And for every loop iteration other than the final iteration (when a2 == 0), a subset of those variables and calculations have these following even tighter bounds:
0 <= q <= max(a,b)/2
abs(q*x1) <= max(1,b/2)
abs(x2) <= max(1,b/2)
abs(q*y1) <= max(1,a/2)
abs(y2) <= max(1,a/2)
Sources
1. Donald Knuth, The Art of Computer Programming, Volume 2 (3rd ed.): Seminumerical Algorithms (Boston: AddisonWesley Professional, 1997), p.335.
2. Donald Knuth, The Art of Computer Programming, Volume 1 (3rd ed.): Fundamental Algorithms (Redwood City: AddisonWesley Professional, 1997), pp.1315.
3. Donald Knuth, The Art of Computer Programming, Volume 2 (2nd ed.): Seminumerical Algorithms (Reading: AddisonWesley, 1981), p.595 (answer 14).
4. George E. Collins, Computing multiplicative inverses in GF(p), Mathematics of Computation Vol. 23 (1969), pp. 197200.
5. https://math.stackexchange.com/a/495457
Pingback: Implementing the Extended Euclidean Algorithm with Unsigned Inputs  Jeff Hurchalla
we can see that gcd(min, min) == min, yet the value of min is too large to represent with type T.
Isn’t min ok if the computer is using one’s complement? I don’t know of any computers that use one’s complement though, so this is probably safe to assume.
LikeLike
Good catch – I updated that phrase to specify two’s complement. I was making conclusions from two’s complement without stating that. That said, an amusing (and cool) new law coming into effect in the C++20 standard: “There is One True Representation for signed integers, and that representation is two’s complement.”
LikeLike
Pingback: Computing the Modular Multiplicative Inverse  Jeffrey Hurchalla