Implementing the Extended Euclidean Algorithm with Unsigned Inputs

As the previous post showed, it’s possible to correctly implement the Extended Euclidean Algorithm using one signed integral type for all input parameters, intermediate variables, and output variables. None of the calculations will overflow. The implementation was given as follows:

template <class T>
void extended_euclidean(const T a, const T b, T* pGcd, T* pX, T* pY)
{
   static_assert(std::numeric_limits<T>::is_integer, "");
   static_assert(std::numeric_limits<T>::is_signed, "");
   assert(a >= 0 && b >= 0);   // precondition
   T x0=1, y0=0, a0=a;
   T x1=0, y1=1, a1=b;

   while (a1 != 0) {
      T q = a0/a1;
      T a2 = a0 - q*a1;
      T x2 = x0 - q*x1;
      T y2 = y0 - q*y1;
      x0=x1; y0=y1; a0=a1;
      x1=x2; y1=y2; a1=a2;
   }
   *pX = x0;
   *pY = y0;
   *pGcd = a0;
}

Figure 1: C++ Extended Euclidean Function, Signed Integer Inputs.

The precondition a >= 0 && b >= 0 is required because the proof from the last post didn’t consider any cases where a or b was negative. I haven’t investigated whether the Extended Euclidean algorithm might be valid for any and all cases of negative inputs, so I won’t speculate about what might be possible for those cases. Instead let’s focus on the proven case a >= 0 && b >= 0.

It seems a bit wasteful to use signed integers for inputs a and b when the function precondition disallows a and b from ever being negative. Why not just use unsigned integer inputs with the same precision that the signed integer inputs would have had? The one potential problem with doing this is that when we look at the bounds, it’s not quite as obvious as in the all-signed case that overflow can’t occur. Restating the bounds proven in the last post, throughout every loop iteration in Figure 1 we know the following:
0 <= q <= max(a,b)
0 <= (q*a1) <= max(a,b)
0 <= a2 < max(a,b)
abs(q*x1) <= b
abs(x2) <= b
abs(q*y1) <= max(1,a)
abs(y2) <= max(1,a)

Assuming we go ahead and change the type for inputs a and b to be unsigned, it should be immediately clear from the bounds and Figure 1 that in order to avoid overflow, q, a2, a1, a0, and *pGcd will all need to be changed to use the same unsigned integer type as the input parameters.

More problematic are x2 and y2 (and likewise x1, y1, x0, y0, *pX, and *pY). On the one hand, these variables need to remain signed types, since the bounds above don’t restrict them from taking on negative values. But on the other hand, the bounds above allow them to be as large as a or b, and since a and b are now unsigned types, the largest possible value for a or b would overflow the corresponding signed type. Fortunately we can resolve this conflict by using less generally applicable but tighter bounds for these variables. Let’s review the second set of proven bounds from the last post. For every loop iteration other than the final iteration (when a2 == 0), we know the following tighter bounds hold:
0 <= q <= max(a,b)/2
abs(q*x1) <= max(1,b/2)
abs(x2) <= max(1,b/2)
abs(q*y1) <= max(1,a/2)
abs(y2) <= max(1,a/2)

Superficially, it seems like these bounds should be enough to guarantee we can use signed types for x2 and y2 without any problems. The only loop iteration where they could overflow is the final iteration, and their values don’t affect anything on the last iteration, so seemingly overflow (if it happened) would be fine at that point. But in truth it’s unacceptable, because overflow of a signed integral type is undefined behavior in C and C++. Upon overflow on the last iteration, one reasonable possibility is that everything would be okay, but another reasonable possibility is that a trap would occur and crash the program. Anything is allowed after invoking undefined behavior, so in the end it’s hopeless to try to reason about whether the program would be okay. The only safe thing to do is to view any undefined behavior as a fatal bug that must be fixed.

Since we cannot allow undefined behavior, we must ensure signed integer overflow is completely impossible in the function. Fortunately this is easy. We can simply test whether we are in the final loop iteration (by checking whether a2 == 0 once it has been computed), and break out if so. Most of the work in the final iteration is irrelevant, since its results are never used later. The only exception is the reassignments (x0=x1; y0=y1; a0=a1), which still need to be done before the break.

Incorporating all the discussed changes, we get the following potential implementation:

template <class S, class U>
void extended_euclidean(const U a, const U b, U* pGcd, S* pX, S* pY)
{
   static_assert(std::numeric_limits<S>::is_integer, "");
   static_assert(std::numeric_limits<S>::is_signed, "");
   static_assert(std::numeric_limits<U>::is_integer, "");
   static_assert(!(std::numeric_limits<U>::is_signed), "");
   static_assert(std::is_same<typename std::make_signed<U>::type, S>::value, "");
   S x0=1, y0=0;
   U a0=a;
   S x1=0, y1=1;
   U a1=b;

   while (a1 != 0) {
      U q = a0/a1;
      U a2 = a0 - q*a1;
      if (a2 == 0) {
         x0=x1; y0=y1; a0=a1;
         break;
      }
      S x2 = x0 - q*x1;
      S y2 = y0 - q*y1;
      x0=x1; y0=y1; a0=a1;
      x1=x2; y1=y2; a1=a2;
   }
   *pX = x0;
   *pY = y0;
   *pGcd = a0;
}

Figure 2: Potential (But Broken) Implementation, Unsigned Integer Inputs.

But there’s still a problem. We determined that q must be unsigned and x1 must be signed, and Figure 2 reflects this. Therefore during the calculation in Figure 2 of (q*x1), the compiler will implicitly convert x1 to the unsigned type of q in order to carry out the calculation. Quoting from the C++17 standard section 8 [expr] (11.5.3):
“Many binary operators that expect operands of arithmetic type cause conversions… The following rules shall be applied… if the operand that has unsigned integer type has rank equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to the type of the operand with unsigned integer type.”
[The C standard specifies very similar behavior.] The problem is that we never wanted to convert x1 to unsigned. And even putting that aside, the conversion leads to further conversions which can eventually result in implementation-defined behavior. What we really wanted and intended is to convert q from unsigned to signed, which we can do with an explicit cast.
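
To see the conversion rule in action, here is a minimal standalone sketch (my own illustration, not code from the post), assuming a platform where int is 32 bits:

#include <cstdint>
#include <iostream>

int main()
{
   std::uint32_t q = 3;    // plays the role of the unsigned q
   std::int32_t x1 = -2;   // plays the role of the signed x1
   // x1 is implicitly converted to unsigned, so the product wraps modulo 2^32
   // instead of being the mathematical value -6
   std::cout << q * x1 << "\n";                             // prints 4294967290
   // explicitly casting q to the signed type gives the intended arithmetic,
   // provided the cast and the product cannot overflow
   std::cout << static_cast<std::int32_t>(q) * x1 << "\n";  // prints -6
}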

Explicitly casting q to signed integral type works well in the code of Figure 2. We know from the tight bounds given above that on every loop iteration except the final iteration,
0 <= q <= max(a,b)/2
Figure 2 breaks out of the final loop iteration before it reaches the calculations (q*x1 and q*y1) that would require converting q to the signed type. So the conversion is never performed on the final loop iteration, and on every other iteration q <= max(a,b)/2, which is at most half of the unsigned type’s maximum value and therefore always representable in the signed type of the same precision. As a result, no conversion of q will ever overflow the signed result type.

Incorporating the explicit casts of q, we get the following implementation:

template <class S, class U>
void extended_euclidean(const U a, const U b, U* pGcd, S* pX, S* pY)
{
   static_assert(std::numeric_limits<S>::is_integer, "");
   static_assert(std::numeric_limits<S>::is_signed, "");
   static_assert(std::numeric_limits<U>::is_integer, "");
   static_assert(!(std::numeric_limits<U>::is_signed), "");
   static_assert(std::is_same<typename std::make_signed<U>::type, S>::value, "");
   S x0=1, y0=0;
   U a0=a;
   S x1=0, y1=1;
   U a1=b;

   while (a1 != 0) {
      U q = a0/a1;
      U a2 = a0 - q*a1;
      if (a2 == 0) {
         x0=x1; y0=y1; a0=a1;
         break;
      }
      S x2 = x0 - static_cast<S>(q)*x1;
      S y2 = y0 - static_cast<S>(q)*y1;
      x0=x1; y0=y1; a0=a1;
      x1=x2; y1=y2; a1=a2;
   }
   *pX = x0;
   *pY = y0;
   *pGcd = a0;
}

Figure 3: Correct Implementation, Unsigned Integer Inputs.

This implementation in Figure 3 should be correct, but the conditional branch inside the loop isn’t ideal from a performance standpoint. It would be nice if we could remove the conditional branch. Indeed we can, by re-arranging the code.

Optimized Implementation

We’ll remove the conditional branch inside the loop of Figure 3 by rotating statements from the bottom of the loop to the top of the loop. Each rotation will require changing the code that precedes and follows the loop (and sometimes the loop condition too), so that the meaning of the function stays the same. Aside from the rotation, nothing inside the loop should change. The end goal of the rotations is to get the position of the nested conditional branch to be at the end of the loop, so that we can merge it into the loop condition.

Let’s first rotate the reassignments from the end of the loop in Figure 3 to the top of the loop. [If this description is hard to follow, it may help to compare Figure 3 with Figure 4.] The rotation affects the code after the loop, since x0, y0, and a0 will no longer be reassigned at the end of the final loop iteration. To compensate, *pX, *pY, and *pGcd will now need to be set to the values of x1, y1, and a1. This change must also be valid if the loop is never taken at all, so we’ll need to change the initial values of x1, y1, and a1 to the initial values that Figure 3 gave to x0, y0, and a0. Since the rotation places the reassignments at the top of the loop, this change also ensures (at least in part) that the correct values get reassigned upon the first entry to the loop.

There’s a second effect of having the reassignments at the top of the loop: they use x2, y2, and a2, which at that point haven’t been declared or initialized yet. We’ll fix this by adding declarations and initializations for x2, y2, and a2 at the start of the function, setting them to the initial values that had formerly been given to x1, y1, and a1 in Figure 3. As a result, upon first entry to the loop, the reassignments at the top will re-establish the correct initial values, leaving the meaning of the function unchanged.

There’s one last detail. We’ll need to change the loop condition to refer to a2 instead of a1, again due to the rotation of the reassignment statements. Since we’ll have already declared and initialized a2 at the top of the function (to the initial value formerly used in Figure 3 by a1), this preserves the meaning of the function perfectly.

It’s easiest to see all this in actual code. By rotating the loop’s reassignments in Figure 3, we’ll have changed the implementation to the following code:

template <class S, class U>
void extended_euclidean(const U a, const U b, U* pGcd, S* pX, S* pY)
{
   static_assert(std::numeric_limits<S>::is_integer, "");
   static_assert(std::numeric_limits<S>::is_signed, "");
   static_assert(std::numeric_limits<U>::is_integer, "");
   static_assert(!(std::numeric_limits<U>::is_signed), "");
   static_assert(std::is_same<typename std::make_signed<U>::type, S>::value, "");
   S x1=1, y1=0;
   U a1=a;
   S x2=0, y2=1;
   U a2=b;

   while (a2 != 0) {
      S x0=x1;
      S y0=y1;
      U a0=a1;
      x1=x2; y1=y2; a1=a2;

      U q = a0/a1;
      a2 = a0 - q*a1;
      if (a2 == 0)
         break;
      x2 = x0 - static_cast<S>(q)*x1;
      y2 = y0 - static_cast<S>(q)*y1;
   }
   *pX = x1;
   *pY = y1;
   *pGcd = a1;
}

Figure 4: Optimization Step 1, Unsigned Integer Inputs.

We haven’t yet removed the conditional branch inside the loop, but we’re closer to the goal. We still need to rotate the calculations of x2 and y2 from the bottom of the loop to the top, and again fix the code before and after the loop to preserve the meaning of the function. As it turns out, there’s no need to change any code after the loop. But now that the calculations are at the top of the loop, the variable q gets used before it has been declared or initialized. In order to preserve the meaning of the function, we’ll need to add a declaration and initialization of q at the start of the function. Initializing q to zero works quite well: on the first entry of the loop, the calculations of x2 and y2 will then contain a multiplication by zero (since q == 0), effectively reducing those calculations on the first iteration to reassignments from x0 and y0. This means we’ll need to add declarations and initializations for x0 and y0 at the start of the function, setting them to the initial values that Figure 4 had used for x2 and y2.

After making these changes to Figure 4, we have the following code:

template <class S, class U>
void extended_euclidean(const U a, const U b, U* pGcd, S* pX, S* pY)
{
   static_assert(std::numeric_limits<S>::is_integer, "");
   static_assert(std::numeric_limits<S>::is_signed, "");
   static_assert(std::numeric_limits<U>::is_integer, "");
   static_assert(!(std::numeric_limits<U>::is_signed), "");
   static_assert(std::is_same<typename std::make_signed<U>::type, S>::value, "");
   S x1=1, y1=0;
   U a1=a;
   S x0=0, y0=1;
   U a2=b;
   U q=0;

   while (a2 != 0) {
      S x2 = x0 - static_cast<S>(q)*x1;
      S y2 = y0 - static_cast<S>(q)*y1;
      x0=x1;
      y0=y1;
      U a0=a1;
      x1=x2; y1=y2; a1=a2;

      q = a0/a1;
      a2 = a0 - q*a1;
      if (a2 == 0)
         break;
   }
   *pX = x1;
   *pY = y1;
   *pGcd = a1;
}

Figure 5: Optimization Step 2, Unsigned Integer Inputs.

We can now get rid of the conditional branch at the end of the loop, since it’s redundant: the loop condition performs the same test. Making this change and cleaning up the code a little, we get the following optimized function for the Extended Euclidean Algorithm with unsigned integer inputs:

template <class S, class U>
void extended_euclidean(const U a, const U b, U* pGcd, S* pX, S* pY)
{
   static_assert(std::numeric_limits<S>::is_integer, "");
   static_assert(std::numeric_limits<S>::is_signed, "");
   static_assert(std::numeric_limits<U>::is_integer, "");
   static_assert(!(std::numeric_limits<U>::is_signed), "");
   static_assert(std::is_same<typename std::make_signed<U>::type, S>::value, "");
   S x1=1, y1=0;
   U a1=a;
   S x0=0, y0=1;
   U a2=b, q=0;

   while (a2 != 0) {
      S x2 = x0 - static_cast<S>(q)*x1;
      S y2 = y0 - static_cast<S>(q)*y1;
      x0=x1; y0=y1;
      U a0=a1;
      x1=x2; y1=y2; a1=a2;

      q = a0/a1;
      a2 = a0 - q*a1;
   }
   *pX = x1;
   *pY = y1;
   *pGcd = a1;
}

Figure 6: Optimized Implementation, Unsigned Integer Inputs.

Conclusion

The Extended Euclidean algorithm can be correctly and very efficiently implemented using unsigned integer inputs and signed integer outputs, so long as all the function parameters have the same precision. Figure 6 provides the optimized implementation. There is a header file with the complete implementation, and a CMake file that will create a test project for the implementation if you wish to run it.
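
As a quick illustration (a hypothetical usage sketch of my own, not part of the header file), the Figure 6 function could be called like this:

#include <cassert>
#include <cstdint>
#include <iostream>

int main()
{
   std::uint32_t a = 291, b = 252;
   std::uint32_t gcd;
   std::int32_t x, y;
   extended_euclidean(a, b, &gcd, &x, &y);   // U deduced as uint32_t, S as int32_t
   // the outputs satisfy the Bezout identity a*x + b*y == gcd(a,b)
   assert(static_cast<std::int64_t>(a)*x + static_cast<std::int64_t>(b)*y
          == static_cast<std::int64_t>(gcd));
   std::cout << "gcd=" << gcd << " x=" << x << " y=" << y << "\n";
}

For these inputs the greatest common divisor is 3, and x and y are one valid pair of Bézout coefficients.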


Proof of Bounds for the Extended Euclidean Algorithm

Euclid’s method is a classic algorithm for finding the greatest common divisor (gcd) of two integers. Donald Knuth referred to it as “the granddaddy of all algorithms, because it is the oldest nontrivial algorithm that has survived to the present day.”[1] There exists a more generalized form for Euclid’s method, which is known as the Extended Euclidean Algorithm. Given two integers a and b, this algorithm finds the integer d = gcd(a,b) and two integers x and y, such that a*x + b*y = d. The usefulness of that last result, a*x + b*y = d, may seem a bit questionable at first, but without going into any details it provides a way to compute the modular multiplicative inverse.
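
As a brief aside (a hypothetical sketch of my own, not part of this post), here is how the result a*x + b*y = d yields a modular inverse, assuming the extended_euclidean function of Figure 1 below is in scope:

#include <cassert>

// returns the inverse of a modulo m, assuming 0 < a < m and gcd(a, m) == 1
long long modular_inverse(long long a, long long m)
{
   long long gcd, x, y;
   extended_euclidean(a, m, &gcd, &x, &y);   // gives a*x + m*y == gcd
   assert(gcd == 1);                 // an inverse exists only when gcd(a,m) == 1
   return (x % m + m) % m;           // a*x == 1 (mod m), so normalize x into [0, m)
}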

Knuth provides an excellent description and proof for this algorithm [2]. It’s a fascinating study in using induction and invariants. Translated into C++, the algorithm he describes takes this form:

template <class T>
void extended_euclidean(const T a, const T b, T* pGcd, T* pX, T* pY)
{
   static_assert(std::numeric_limits<T>::is_integer, "");
   static_assert(std::numeric_limits<T>::is_signed, "");
   T x0=1, y0=0, a0=a;
   T x1=0, y1=1, a1=b;

   while (a1 != 0) {
      T q = a0/a1;
      T a2 = a0 - q*a1;
      T x2 = x0 - q*x1;
      T y2 = y0 - q*y1;
      x0=x1; y0=y1; a0=a1;
      x1=x2; y1=y2; a1=a2;
   }
   *pX = x0;
   *pY = y0;
   *pGcd = a0;
}

Figure 1: C++ Extended Euclidean Function.

This is the best we can do for program code, but there seems to be a serious problem. The algorithm itself presents all variables as mathematical integers, and that’s an infinite set. We have to question whether the variables in Figure 1 are bounded to any finite maximum and minimum values. Particularly for x2 and y2, it’s non-obvious from Figure 1 that those variables can’t grow without bound. Until proven otherwise, we must assume that any of the variables in Figure 1 could overflow any type T we choose. (If you’d like to know the answer to this question of overflow without proof, skip ahead to the Conclusion.)

It’s fairly easy to establish bounds for a2 at least. Informally, if a and b are non-negative, we can recognize that within the loop
a2 = a0 - q*a1
is equivalent to
a2 = a0 % a1
For this case it’s easy to see that on every iteration of the loop a2 will decrease, until it reaches zero, at which point the loop terminates.

Other areas of potential overflow are harder to analyze. Potential overflow could occur for x2 and y2, and also the interim calculations q*a1, q*x1, and q*y1.

In his second edition book[3] Knuth provides an exercise for the reader in which he states two bounds, which he attributes to George E. Collins. Most likely this refers to Collins’ paper Computing Multiplicative Inverses in GF(p)[4]. Collins’ paper is the best source for bounds I’ve seen for the Extended Euclidean algorithm. It presents most of the invariants and observations needed to bound the algorithm in a full proof, and states bounds for the variables (though no interim calculation bounds). It’s restricted to 0 < b < a, and doesn’t explicitly prove its invariants, but identifying the invariants is the hardest part of this kind of proof. Collins observes that the following equalities remain invariant throughout all of the algorithm:

  1.  gcd(x0, y0) == 1
  2.  gcd(x1, y1) == 1
  3.  abs(x0*y1 - y0*x1) == 1
  4.  x0*y1 - y0*x1 == -(x1*y2 - y1*x2)
  5.  a*x0 + b*y0 == a0
  6.  a*x1 + b*y1 == a1

With the precondition 0 <= b < a, you can read a proof that these invariants are maintained in extended_euclidean_collins.h.
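
As a rough illustration of what that proof checks (a hypothetical sketch of my own, not the contents of extended_euclidean_collins.h), the six invariants can also be asserted at runtime inside the loop of Figure 1, using a wide integer type and a small gcd helper, for inputs small enough that the wide intermediate products can’t overflow:

#include <cassert>
#include <cstdint>

static std::int64_t gcd64(std::int64_t m, std::int64_t n)
{
   while (n != 0) { std::int64_t r = m % n; m = n; n = r; }
   return (m < 0) ? -m : m;
}

void extended_euclidean_invariant_check(std::int64_t a, std::int64_t b)
{
   assert(0 <= b && b < a);                       // Collins' restriction
   std::int64_t x0=1, y0=0, a0=a;
   std::int64_t x1=0, y1=1, a1=b;
   while (a1 != 0) {
      assert(gcd64(x0, y0) == 1);                 // invariant 1
      assert(gcd64(x1, y1) == 1);                 // invariant 2
      std::int64_t cross = x0*y1 - y0*x1;
      assert(cross == 1 || cross == -1);          // invariant 3
      assert(a*x0 + b*y0 == a0);                  // invariant 5
      assert(a*x1 + b*y1 == a1);                  // invariant 6
      std::int64_t q = a0/a1;
      std::int64_t a2 = a0 - q*a1;
      std::int64_t x2 = x0 - q*x1;
      std::int64_t y2 = y0 - q*y1;
      assert(cross == -(x1*y2 - y1*x2));          // invariant 4
      x0=x1; y0=y1; a0=a1;
      x1=x2; y1=y2; a1=a2;
   }
}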

Collins also observes that x0, x1, and x2 alternate in sign, as do y0, y1, and y2. We’ll need to restate this observation as an invariant in order to use it. We might try something like
(x0 >= 0 ? x1 < 0 : x1 <= 0)
initially, but a conditional invariant complicates a proof, and it just feels a bit ugly. We’ll achieve better success by introducing a sign variable, which we can call s0. Initializing s0 to 1 before the loop and negating it at the end of each loop iteration produces the following change to the C++ code of Figure 1 (made solely for the purpose of a proof):

template <class T>
void extended_euclidean(const T a, const T b, T* pGcd, T* pX, T* pY)
{
   T x0=1, y0=0, a0=a;
   T x1=0, y1=1, a1=b;
   T s0 = 1;

   while (a1 != 0) {
      T q = a0/a1;
      T a2 = a0 - q*a1;
      T x2 = x0 - q*x1;
      T y2 = y0 - q*y1;
      x0=x1; y0=y1; a0=a1;
      x1=x2; y1=y2; a1=a2;
      s0 = -s0;
   }
   *pX = x0;
   *pY = y0;
   *pGcd = a0;
}

Figure 2: C++ Extended Euclidean Function With Sign Variable.

We can now write invariants that express the alternating signs of the x and y variables, continuing the numbering of the invariants listed above:

  7.  abs(s0) == 1
  8.  x0 == s0*abs(x0)
  9.  x1 == -s0*abs(x1)
  10.  y0 == -s0*abs(y0)
  11.  y1 == s0*abs(y1)

A sign variable lends itself well to a proof. It’s relatively simple, and it can be manipulated with the distributive property. For example, in extended_euclidean_collins.h (at annotation 57), the proof takes this assignment from Figure 2:
y2 = y0 - q*y1
and substitutes the expressions of invariants 10 and 11:
y2 == -s0*abs(y0) - q*(s0*abs(y1))
and then uses the distributive property:
y2 == -s0*(abs(y0) + q*abs(y1))

The (Nearly) Full Proof

So long as we require a >= 0 and b >= 0, we can prove bounds for the Extended Euclidean algorithm. We’ll do it in three parts, and we’ve already discussed some of the first part – the case of a > b. The second part is the case a < b, and the third part is the case a == b. It would be nice to also consider cases where a or b could be negative, but I expect the proof’s size would grow by a factor of two or three. It’s convenient instead to restrict the input range to non-negative values, given that even this restriction may be more general than applications need. For example, the very restricted case of 0 < b < a is sufficient to apply the Extended Euclidean algorithm to calculating the modular multiplicative inverse.

In case a reader wishes to explore the cases where a and/or b are negative, it’s worth noting that a proof that includes the negative cases will still need restrictions. If we define the constant min = std::numeric_limits<T>::min(), such a proof would need to restrict a and b to cases where a > min and b > min. A simple explanation is that given our definition of min, we can see that gcd(min, min) == -min, yet -min is impossible to represent with type T.

Proof Part 1: a > b

For the case a > b, extended_euclidean_collins.h proves that all variables and interim calculations have specific bounds. This header file proves a few helper theorems at the top of the header, and then provides the Extended Euclidean function with annotations on every statement and assertion, to form the proof of bounds. Every bound is given either as a constant or in terms of a variable which is itself bounded. C asserts are used as much as possible to make the proof testable. There is a CMake file in the parent directory that will create the full proof’s test project if you wish to run it.

Due to the proof’s length, it’s convenient to prune everything from the header file but the algorithm statements and the essential assertion results, leaving only the results of the proof for Part 1. These essential results are the proven bounds assertions and the proven assertions that the outputs are indeed the gcd and the Bézout coefficients as required by the Extended Euclidean algorithm.

Proof Part 2: a < b

For the case a < b, a second header file proves that all variables and interim calculations have specific bounds. The logic of this proof is extremely similar to Part 1, with one large difference: the first iteration of the loop is peeled-out and placed before the loop. Peeling out the first iteration is required to state some loop invariants that only become valid for case a < b after the second loop iteration begins.

Similarly to Part 1, it helps to have a pruned header file leaving only the results of the proof. However for this case of a < b, the proof modified the original Extended Euclidean function by peeling-out the first loop iteration, so this simple pruning also uses that modified function. Ideally at this point we shouldn’t be exposed to the modification, since it’s just an artifact of the proof. Since we only care about the results of the proof and not the proof itself, we can rejoin the first iteration back into the loop. This leaves us the much more useful original function with proof results for Part 2.

Proof Part 3: a == b

The only tricky aspect to case a == b is how to define gcd(0,0). Otherwise, this part consists just of two very short proofs.

One way d = gcd(a,b) can be defined is that d is the integer that divides both a and b, such that for any integer c that divides both a and b, d >= c. Under this definition, gcd(0,0) would be undefined. A second way d = gcd(a,b) can be defined is that d is the non-negative integer that divides both a and b, such that for any integer c that divides both a and b, c divides d. Under this definition, gcd(0,0) == 0. [5] If a caller of the Extended Euclidean function expects the first definition, calling with a == 0 and b == 0 would be an implicit precondition violation, since it would be a request for the undefined gcd(0,0). We will assume that callers will never violate their implicit preconditions, and therefore we assume a caller must expect the second definition if calling with arguments a == 0 and b == 0. If the last reasoning feels unsatisfactory or too fragile, we could simply say that this Extended Euclidean function defines gcd(0,0) == 0 (or we could as a precondition explicitly disallow the special case 0 == a == b). Nevertheless the last reasoning is mostly in keeping with how compilers treat undefined behavior.
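
As a small sanity check (my own snippet, assuming the Figure 1 function above is in scope), calling the function with both arguments equal to zero simply returns the initial values, which matches the second definition:

#include <cassert>

int main()
{
   int gcd, x, y;
   // with a == 0 and b == 0 the loop body never runs, so the outputs are the
   // initial values: gcd == 0, x == 1, y == 0, and indeed 0*1 + 0*0 == 0
   extended_euclidean(0, 0, &gcd, &x, &y);
   assert(gcd == 0 && x == 1 && y == 0);
}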

There is a very short header file proof for special case 0 == a == b, along with a pruned version containing only the essential proof results. The more general case 0 < a == b also has a short proof, along with a trimmed version containing only essential proof results.

Combined Proof Results

Combining the proof results for all three cases, we have general proof results for the Extended Euclidean algorithm, reproduced below. As before, the bounds are given in terms of constants and other bounded variables (the next section expresses all bounds in terms of the input parameters a and b).

template <class T>
void extended_euclidean_results(const T a, const T b, T* pGcd, T* pX, T* pY)
{
   static_assert(std::numeric_limits<T>::is_integer, "");
   static_assert(std::numeric_limits<T>::is_signed, "");
   assert(a >= 0 && b >= 0);   // preconditions
   T x0 = 1, y0 = 0, a0 = a;
   T x1 = 0, y1 = 1, a1 = b;

   while (a1 != 0) {
      T q = a0/a1;
         assert(0 <= q && q <= a0);
      T a2 = a0 - q*a1;
         assert(0 <= (q*a1) && (q*a1) <= a0);
         assert(0 <= a2 && a2 < a1);
         if (a2 != 0) assert(q <= a0/2);
      T x2 = x0 - q*x1;
         assert(abs(q*x1) <= abs(x2));
         assert(abs(x1) <= abs(x2));
      T y2 = y0 - q*y1;
         assert(abs(q*y1) <= abs(y2));
         assert(y1 == 1 || abs(y1) <= abs(y2));
      x0=x1; y0=y1; a0=a1;
      x1=x2; y1=y2; a1=a2;
   }
      if (a == 0 && b == 0) {
         assert(x1 == 0);
         assert(y1 == 1);
         assert(x0 == 1);
         assert(y0 == 0);
      } else {
         assert(gcd(a,b) >= 1);
         assert(abs(x1) == b/gcd(a,b));
         assert(abs(y1) == a/gcd(a,b));
         assert(x0 == 1 || abs(x0) <= (b/gcd(a,b))/2);
         assert(y0 == 1 || abs(y0) <= (a/gcd(a,b))/2);
      }
      assert(a0 == gcd(a,b));
      assert(a*x0 + b*y0 == gcd(a,b));
   *pX = x0;
   *pY = y0;
   *pGcd = a0;
}

Figure 3: Combined Proof Results, C++ Extended Euclidean Function.

It’s worthwhile noting that the combined results show that at the end of the algorithm, all three parts of the proof achieved a0 == gcd(a,b), and a*x0 + b*y0 == gcd(a,b) (showing that the final values of x0 and y0 are Bézout coefficients). These two assertions may be incidental to the proof of bounds, but they are important because they show that the algorithm remained valid for all three cases we considered.
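
One note on the assertions in Figure 3: they reference gcd, max, and abs helpers that aren’t shown. A minimal gcd suitable for checking the assertions with non-negative inputs (my own assumption of what such a helper might look like, not code from the proof headers) could be:

template <class T>
T gcd(T m, T n)
{
   // Euclid's basic method; assumes m >= 0 and n >= 0
   while (n != 0) {
      T r = m % n;
      m = n;
      n = r;
   }
   return m;
}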

Final Bounds

What we’d like best is to bound all variables and calculations in terms of constant values. We can come very close to doing this. We can bound all variables and calculations in terms of the two input parameters a and b, which remain constant throughout the algorithm.

We’ll use two steps to get the desired bounds, starting with the results from Figure 3.

For the first step we replace the somewhat complicated if/else clause at the end of Figure 3 with simple assertions, and improve the bounds for a2, q, and (q*a1), showing that all throughout the algorithm they remain less than or equal to max(a,b). We can form the proof with these improvements, and then prune it as usual to leave only the essential assertion results.

For the second step we continue on from the assertion results of the first step, though by using a different proof technique from any previous discussions. Our method will be to start at the very end of the function and work our way backward. We can observe that at the end of the first step’s results, the final values for x0, x1, y0, and y1 are all bounded in terms of the input parameters a and b. We can propagate those bounds backward for all variables and calculations in the algorithm, by working our way backward through the loop iterations.

For example, consider x1 at the end of the first step results. We can see that at the function end,
abs(x1) <= b
Reasoning backward to the final loop iteration and working from the end of that iteration, we see the reassignment x1=x2, so we can reason that prior to that reassignment, it must have been true that
abs(x2) <= b
A few lines earlier in the loop, we can see an already established assertion that
abs(x1) <= abs(x2)
So we know that at that point in the code
abs(x1) <= abs(x2) <= b
or more simply,
abs(x1) <= b
Given this knowledge, we can easily reason that at the start of the final loop iteration, the absolute value of x1 was bounded by b.

Continuing to work backward to the end of the second to last loop iteration (if the loop is taken at least twice), we can use the same reasoning to again bound x2 and x1 for that loop iteration. Continuing to work backward through all remaining iterations bounds x2 and x1 for all iterations.

This method is used in the final proof to bound all variables and calculations in terms of the input parameters a and b. Pruning this proof to leave only the basic algorithm and bounds assertions, we have our final desired result, reproduced here:

template <class T>
void extended_euclidean_final(const T a, const T b, T* pGcd, T* pX, T* pY)
{
   static_assert(std::numeric_limits<T>::is_integer, "");
   static_assert(std::numeric_limits<T>::is_signed, "");
   assert(a >= 0 && b >= 0);   // precondition
   T x0 = 1, y0 = 0, a0 = a;
   T x1 = 0, y1 = 1, a1 = b;

   while (a1 != 0) {
      T q = a0/a1;
         assert(0 <= q && q <= max(a,b));
      T a2 = a0 - q*a1;
         assert(0 <= (q*a1) && (q*a1) <= max(a,b));
         assert(0 <= a2 && a2 < max(a,b));
         if (a2 != 0) assert(q <= max(a,b)/2);
      T x2 = x0 - q*x1;
         assert(abs(q*x1) <= b);
         assert(abs(x2) <= b);
         if (a2 != 0) assert(abs(q*x1) <= max(1,b/2));
         if (a2 != 0) assert(abs(x2) <= max(1,b/2));
      T y2 = y0 - q*y1;
         assert(abs(q*y1) <= max(1,a));
         assert(abs(y2) <= max(1,a));
         if (a2 != 0) assert(abs(q*y1) <= max(1,a/2));
         if (a2 != 0) assert(abs(y2) <= max(1,a/2));

      x0=x1; y0=y1; a0=a1;
      x1=x2; y1=y2; a1=a2;
   }
      assert(a0 == gcd(a,b));
      assert(a*x0 + b*y0 == gcd(a,b));
   *pX = x0;
   *pY = y0;
   *pGcd = a0;
}

Figure 4: Final Bounds in Terms of the Inputs, C++ Extended Euclidean Function.

Conclusion

Distilling the final proof results, we can see that all the variables and calculations in the Extended Euclidean algorithm are bounded to finite maximum and minimum values. More specifically, throughout every loop iteration, the variables and calculations of the algorithm (as expressed in Figure 1) have these following bounds:

0 <= q <= max(a,b)
0 <= (q*a1) <= max(a,b)
0 <= a2 < max(a,b)
abs(q*x1) <= b
abs(x2) <= b
abs(q*y1) <= max(1,a)
abs(y2) <= max(1,a)

And for every loop iteration other than the final iteration (when a2 == 0), a subset of those variables and calculations have these following even tighter bounds:

0 <= q <= max(a,b)/2
abs(q*x1) <= max(1,b/2)
abs(x2) <= max(1,b/2)
abs(q*y1) <= max(1,a/2)
abs(y2) <= max(1,a/2)
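
These bounds were established by proof, but a brute-force spot check makes an easy complement. Here is a hypothetical sketch of my own (not one of the proof headers) that runs the Figure 1 algorithm over all small input pairs and asserts both sets of bounds, using a 64-bit type so the instrumentation itself can’t overflow:

#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdlib>

int main()
{
   using T = std::int64_t;
   for (T a = 0; a < 256; ++a) {
      for (T b = 0; b < 256; ++b) {
         T x0=1, y0=0, a0=a;
         T x1=0, y1=1, a1=b;
         while (a1 != 0) {
            T q = a0/a1;
            T a2 = a0 - q*a1;
            T x2 = x0 - q*x1;
            T y2 = y0 - q*y1;
            // bounds that hold on every iteration
            assert(0 <= q && q <= std::max(a,b));
            assert(0 <= q*a1 && q*a1 <= std::max(a,b));
            assert(0 <= a2 && a2 < std::max(a,b));
            assert(std::abs(q*x1) <= b && std::abs(x2) <= b);
            assert(std::abs(q*y1) <= std::max<T>(1,a) && std::abs(y2) <= std::max<T>(1,a));
            // tighter bounds that hold on every iteration except the final one
            if (a2 != 0) {
               assert(q <= std::max(a,b)/2);
               assert(std::abs(q*x1) <= std::max<T>(1,b/2) && std::abs(x2) <= std::max<T>(1,b/2));
               assert(std::abs(q*y1) <= std::max<T>(1,a/2) && std::abs(y2) <= std::max<T>(1,a/2));
            }
            x0=x1; y0=y1; a0=a1;
            x1=x2; y1=y2; a1=a2;
         }
      }
   }
}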

Sources

1. Donald Knuth, The Art of Computer Programming, Volume 2 (3rd ed.): Seminumerical Algorithms (Boston: Addison-Wesley Professional, 1997), p.335.
2. Donald Knuth, The Art of Computer Programming, Volume 1 (3rd ed.): Fundamental Algorithms (Redwood City: Addison-Wesley Professional, 1997), pp.13-15.
3. Donald Knuth, The Art of Computer Programming, Volume 2 (2nd ed.): Seminumerical Algorithms (Reading: Addison-Wesley, 1981), p.595 (answer 14).
4. George E. Collins, Computing multiplicative inverses in GF(p), Mathematics of Computation Vol. 23 (1969), pp. 197-200.
5. https://math.stackexchange.com/a/495457


Exhaustive Testing (And Certainty)

Exhaustive testing is the idea of testing every single possible combination of inputs to a function. It’s great if you can do it, but there’s a related idea we need to discuss alongside it: certainty in testing and certainty through testing, or in its rawest form, absolute certainty that what you created was, is, and will be correct.

Any sort of testing can pretty easily prove that a program has an error, but it’s probably no surprise that testing won’t prove the absence of any errors. There might even be catastrophic bugs left in a program that’s passed diligent and thorough testing.  Formal proofs seem like they might be a hope for gaining complete control and certainty, but a proof itself could contain an error, just like a program. Even with automated verification tools for proofs, the tool used for verification could have a bug.

So we come back to exhaustive testing. It involves testing every single possible combination of inputs, and verifying each and every output result is correct. It seems like exhaustive testing should be the one and maybe only way to prove, with 100% certainty, that some portion of a program is correct. Unfortunately, it too has insurmountable limitations. An obvious problem is that the tests themselves could have bugs. But even ignoring this, passing results under exhaustive testing are no guarantee of correctness. Significant bugs could still remain in the tested code, due to undefined behavior. When program code invokes undefined behavior, one possible result is that everything proceeds exactly as the author planned. This sounds encouraging… at first. Yet upon changing the compiler, or the hardware used for testing, or the time of day, or upon changing anything at all really, the executable could begin to fail. Even if program code invokes no undefined behavior at all, a bug in the compiler or a fault in the hardware could conceivably produce false negatives on one or more tests. In a sense, the tests might lie about correctness. And as for hardware faults, strange things can happen. One story I read was about a program created via machine learning, which found that the best solution was to use a short circuit that existed due to a defect in the substrate of the development board. The program worked perfectly on that one defective board, and nowhere else.

Absolute certainty is impossible. It’s also a distraction. What we really want to gain is better results and more confidence, given the same amount of time (or less). So in that light, if we can do exhaustive testing and it’s fast, why not do it?

Exhaustive testing provides the best test coverage possible, and when it’s possible at all, it’s typically very easy to write the exhaustive tests. As for feasibility, with today’s hardware it’s usually possible to exhaustively test any pure function that has just a single 32-bit input parameter, or any pure function that has two 16-bit input parameters. In fact, so long as the combined bit depth of all the input parameters isn’t much beyond 32 bits, a pure function is usually possible to exhaustively test. Perhaps the test might take fifteen minutes or more, but it could be done when you want to run long tests. More encouragingly, if you have a function with a total input parameter bit depth of 16 bits (or less), you might reasonably expect exhaustive testing of that function to take 10 milliseconds or less.
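
As a concrete example tied to the Extended Euclidean posts above (a hypothetical sketch of my own, assuming the unsigned-input extended_euclidean from the first post is in scope), a function with two 16-bit parameters can be exhaustively tested against the C++17 standard library’s std::gcd and the Bézout identity. With 2^32 total cases, expect it to run for a while:

#include <cassert>
#include <cstdint>
#include <numeric>   // std::gcd (C++17)

int main()
{
   for (std::uint32_t a = 0; a <= UINT16_MAX; ++a) {
      for (std::uint32_t b = 0; b <= UINT16_MAX; ++b) {
         std::uint16_t gcd;
         std::int16_t x, y;
         extended_euclidean(static_cast<std::uint16_t>(a),
                            static_cast<std::uint16_t>(b), &gcd, &x, &y);
         assert(gcd == std::gcd(a, b));   // agrees with the standard library
         // the outputs satisfy the Bezout identity
         assert(static_cast<std::int64_t>(a)*x + static_cast<std::int64_t>(b)*y == gcd);
      }
   }
}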


Google Test Projects in Visual Studio

This is a continuation of the blog entry on a quick start guide to setting up Google Test, this time specific to Visual Studio.

First Things First

If you used the quick start guide to set up Google Test, you may need to be aware of a bug in Visual Studio that occurs when using tests in static libraries or DLLs with Google Test. I’d recommend not using libraries for your tests, but if you do, Google provides some simple information on what you’ll need to know and how to work around the bug. If you don’t place your tests in libraries, then you won’t need to worry about any of this. I personally started out by putting my tests in libraries, since it let me compose larger tests out of small test groups. However, I ultimately found it unnecessary, since I could do the same thing with multiple test executables and CTest. Just keep in mind that if you plan to create any tests in a static library (or DLL), the extra steps documented by Google will probably be required in order for your tests to run.

Adding a Project

Here’s how to add a Google Test project to the Visual Studio solution that you created when you used CMake in the quick start guide.

In Visual Studio, open the solution and add a new project of type Visual C++ > Windows Desktop > Windows Console Application or Static Library. In the new project’s properties, change the following settings:

  1. Configuration Properties > C/C++ > General > “Additional Include Directories”:  Add
    $(GTEST_DIR)\googletest\include;$(GTEST_DIR)\googletest
  2. Configuration Properties > C/C++ > General > “Warning Level”:  Choose
    Level4 (/W4)
  3. Configuration Properties > C/C++ > General > “Treat Warnings as Errors”:  Choose
    Yes
  4. Configuration Properties > C/C++ > Code Generation > “Runtime Library”:  Depending on release/debug configuration, choose either
    Multi-threaded (/MT)   or  Multi-threaded Debug (/MTd)
  5. Configuration Properties > C/C++ > Precompiled Headers > “Precompiled Headers”:  Choose
    Not Using Precompiled Headers
  6. Configuration Properties > C/C++ > Precompiled Headers > “Precompiled Header File”: Remove any filename, so that this setting is blank.

 

If the project you added is a Windows console application (executable):

Expand the project name in the Solution Explorer window, right click on References, choose “Add Reference…”, and select gtest, gtest_main, and any other static library projects from the solution that this project will depend upon. [At the time of this writing, the Google Test CMake build creates a “gtest” static library/project that contains all the essential Google Test library functions, and a “gtest_main” static library/project that contains the main() function.]

In this project’s properties, set the following:

  1. Configuration Properties > Linker > General > “Link Library Dependencies”:  Choose
    Yes
  2. Configuration Properties > Linker > Optimization > “References”:  Choose
    No (/OPT:NOREF)
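
Once the project builds and links against gtest and gtest_main, a minimal test file might look like the following (a generic sketch, not tied to any particular project):

#include "gtest/gtest.h"

// a trivial test case; gtest_main supplies main(), so none is needed here
TEST(ExampleTestGroup, BasicChecks)
{
   EXPECT_EQ(2 + 2, 4);
   EXPECT_TRUE(1 < 2);
}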

Quick Start Guide to Setting Up Google Test

There are a few different ways that you can get Google Test and then set it up to begin using it. The Google Test documentation skims a bit over the basic starting steps, and then presents perhaps an overload of choices for many of the later steps. I have no doubt it’s the best overall reference available, but it’s not necessarily the best guide for someone who just wants to get started. You might find Google Test integrated into some of your tools (for example Visual Studio 2017), so that may be easier to use than this guide, depending on your needs. The method in this guide gives you flexibility and control over how you use Google Test.

Since my own interest was in setting up Google Test on Windows, this guide adds Windows-specific details in parentheses. This may help clarify some steps regardless of OS. But aside from the parenthetical information, the steps should (hopefully) work for any OS.

Quick Steps:

  1. Get the git repository for googletest.
    1. Open a command prompt and navigate to the directory where you want the googletest repository to exist (On Windows, I created a new directory for repositories, called C:\Users\Jeff\Documents\repositories, which I then cd’d into)
    2. On the command line, type
      git clone https://github.com/google/googletest.git
  2. Set an environment variable called GTEST_DIR to the path of the new “googletest” directory that was just created by git clone.
    (On Windows, to reflect my own googletest path, I set GTEST_DIR to C:\Users\Jeff\Documents\repositories\googletest. You should set GTEST_DIR as a permanent environment variable, which on Windows you can do via Control Panel > System > Advanced system settings > Environment Variables… then “New…” under System variables. If you still have the command window open, you’ll need to relaunch it in order to use the GTEST_DIR variable.)
  3. If you don’t have CMake, download and install it.
  4. For an expanded version of this final step, see Building Google Test as a Standalone CMake Project.
    On the command line, navigate to a directory where you would like to place a new build of Google Test. On the command line, type
    mkdir mybuild
    cd mybuild
    Then if you are on Unix or OSX, type
    cmake -Dgtest_build_samples=ON $GTEST_DIR
    If you are on Windows, type
    cmake -Dgtest_build_samples=ON %GTEST_DIR%
    If you don’t want any of the Google Test sample projects, you can omit -Dgtest_build_samples=ON from the cmake command above.

You should now have Google Test built on your system. To go from here, if you are using Visual Studio, I’d suggest continuing on to read Google Test Projects in Visual Studio. Otherwise, and in general, I’d suggest reading the Google Test documentation.
