*“There are far too many integer types, there are far too lenient rules for mixing them together, and it’s a major bug source, which is why I’m saying stay as simple as you can, use [signed] integers til you really really need something else.”* -Bjarne Stroustrup, (Q&A at 43:00)

*“Use [signed] ints unless you need something different, then still use something signed until you really need something different, then resort to unsigned.”* -Herb Sutter, (same Q&A)

This is good and easy advice. Though if you’re curious, you probably also want to know **why** it’s good advice. And if you really do need an unsigned type, you probably want to know whether there’s anything you can or should do to avoid bugs; for solutions, skip ahead to the Recommendations.

## Surprises

You could run into a few problems with unsigned integer types. The following code might be quite surprising:

```
#include <limits>
#include <iostream>

int main()
{
    // assume this static assert passes
    static_assert(sizeof(unsigned short) < sizeof(int));
    unsigned short one = 1;
    unsigned short max = std::numeric_limits<unsigned short>::max();
    unsigned short sum = one + max;
    if (sum == one + max)
        std::cout << "sum = one + max, and sum == one + max\n";
    else
        std::cout << "sum = one + max, but sum != one + max\n";
    return 0;
}
```

*Figure 1*

If you run it you’ll get the output

```
sum = one + max, but sum != one + max
```

You can try the program yourself on an online compiler such as wandbox. There’s no undefined behavior in the program, and no compiler bug at work.

The surprising result occurs due to “integral promotion”. The precise integral promotion rules are given in the standard (excerpted in the Reference section below), but in practice they mean that during a math operation or comparison, any integer type smaller (in bit-width) than type *int* will be implicitly converted by the compiler to type *int*.

This means that in Figure 1, if the static_assert passes, the assignment

```
unsigned short sum = one + max;
```

will be translated by the compiler into

```
unsigned short sum = (unsigned short)((int)one + (int)max);
```

Let’s work with concrete numbers and assume your compiler uses a 16 bit *unsigned short* type and a 32 bit *int* type (this is very common, though not universal). In Figure 1, the *unsigned short* variable **max** will be assigned the value 65535, and will retain this value when converted to type *int*. The variable **one** will be assigned the value 1, and will retain this value after being converted to type *int*. The addition of these two (converted/promoted) type *int* values results in the value 65536, which is easily representable in a 32 bit *int*, so there won’t be any overflow or undefined behavior from the addition. The compiler will then convert that result from type *int* back to type *unsigned short* in order to assign it to the variable **sum**. The value 65536 isn’t representable in a 16 bit *unsigned short* (**sum**’s type), but the conversion is well-defined in C and C++: it is performed modulo 2^N, where N is the bit width of type *unsigned short*. In this example N=16, and so the conversion of 65536 results in the value 0, which is assigned to **sum**.

A similar process occurs on the line

```
if (sum == one + max)
```

except that there isn’t any final narrowing conversion back to *unsigned short*. Here’s what happens: as before, **one** and **max** are promoted to type *int* prior to the addition, resulting in a type *int* summation value of 65536. When evaluating the conditional, the left hand side (**sum**) is promoted to type *int*, and the right hand side (the summation 65536) is already type *int*. A narrowing conversion to *unsigned short* took place when **sum** was assigned, so **sum** holds the value 0; but the equality operator works with operands promoted to type *int*, and so the right hand side summation never gets a similar narrowing conversion down to *unsigned short*. It stays type *int* with the value 65536, which compares unequal to 0. We end up with the unexpected output `sum = one + max, but sum != one + max`.

Hidden integral promotions and narrowing conversions are subtle, and the results can be surprising, which is usually a very bad thing. For *signed* integral types, there generally isn’t any problem with promotion. It’s the promotion of *unsigned* integral types that’s problematic and bug-prone.

Let’s look at a second surprise from unsigned integer promotion:

```
#include <limits>
#include <iostream>

int main()
{
    unsigned short one = 1;
    unsigned short max = std::numeric_limits<unsigned short>::max();
    unsigned int sum = one + max;
    std::cout << "sum == " << sum << "\n";
    return 0;
}
```

*Figure 2*

If you run Figure 2 on a system where *unsigned short* and *int* are both 16 bit types, the program will output “sum == 0”. Since *unsigned short* and *int* are the same size, the operands **one** and **max** will not be promoted, and the addition will wrap around in a well-defined manner, resulting in 0. If on the other hand you run Figure 2 on a system where *unsigned short* is a 16 bit type and *int* is a 32 bit type, the operands **one** and **max** will be promoted to type *int* prior to the addition and no overflow will occur; the program will output “sum == 65536”. The integral promotion results in non-portable code.

## Undefined Behavior

Now that we’re familiar with integral promotion, let’s look at a simple function:

```
unsigned short multiply(unsigned short x, unsigned short y)
{
    // assume this static assert passes
    static_assert(sizeof(unsigned short) * 2 == sizeof(int));
    unsigned short result = x * y;
    return result;
}
```

*Figure 3*

Despite all lines seeming to involve only type *unsigned short*, there is a potential for undefined behavior in Figure 3 in the line that computes **x** * **y**, due to possible signed integer overflow on type *int*. The compiler will implicitly perform integral promotion on that line, so the multiplication will involve two (promoted/converted) operands of type *int*, not of type *unsigned short*. If for our compiler *unsigned short* is 16 bit and *int* is 32 bit, then any product of **x** and **y** larger than 2^31 - 1 (the maximum value of a 32 bit *int*) will overflow the signed type *int*. And unfortunately, signed integer overflow is undefined behavior. It doesn’t matter that overflow of unsigned integer types is well-defined behavior in C and C++; no multiplication of values of type *unsigned short* ever occurs in this function.

Let’s finally look at a contrived toy function:

```
unsigned short toy_shift(unsigned short x, unsigned short y)
{
    // assume this static assert passes
    static_assert(sizeof(unsigned short) < sizeof(int));
    unsigned short result = (x - y) << 1;
    if (x >= y)
        return 0;
    return result;
}
```

*Figure 4*

The subtraction operator in Figure 4 has two unsigned short operands **x** and **y**, both of which will be promoted to type *int*. If **x** is less than **y** then the result of the subtraction will be a negative number, and left shifting a negative number is undefined behavior. Keep in mind that if the subtraction had involved unsigned integral types (as it would appear on the surface), the result would have underflowed in a well-defined manner and wrapped around to become a large positive number, and the left shift would have been well-defined. But since integral promotion occurs, the result of a left shift when **x** is less than **y** would be undefined behavior.

An interesting consequence of the potential for undefined behavior in Figure 4 is that any compiler would be within its rights to generate “optimized” object code for the function (if the static_assert succeeds) that is very fast and almost certainly unintended by the programmer, equivalent to

```
unsigned short toy_shift(unsigned short x, unsigned short y)
{
    return 0;
}
```

To see why, we need to understand how modern compilers can use undefined behavior. For better or worse, modern C/C++ compilers commonly exploit undefined behavior to optimize, taking advantage of the fact that undefined behavior is impossible in any valid code. It’s somewhat controversial whether compilers *really ought to* do this, but the reality is that today it’s an extremely common optimization technique, and nothing in the C/C++ standards forbids it.

With regard to Figure 4, this means a compiler could assume the conditional **(x >= y)** in toy_shift() will always succeed, because the alternative would be that the function had undefined behavior from left shifting a negative number, and the compiler knows that undefined behavior is impossible for valid code. The compiler always assumes that we have written valid code unless it can prove otherwise (in which case we’d get a compiler error message). We might incorrectly think that the compiler can’t make any assumptions about the arguments to toy_shift() because it can’t predict what arbitrary calling code might do, but the compiler can make some limited predictions: it can assume that calling code will never pass any arguments that result in undefined behavior, because getting undefined behavior would be impossible from *valid* calling code.

The compiler can therefore conclude that, given valid code, there is no scenario in which the conditional could possibly fail, and it could use this knowledge to “optimize” the function, producing object code that simply returns 0. [For scant reassurance, I haven’t seen a compiler do this (yet) for Figure 4.]

## The Integral Types Which May be Promoted

Integral promotion involves some implementation-defined behavior. It’s up to the compiler to define the exact sizes for the types *char*, *unsigned char*, *signed char*, *short*, *unsigned short*, *int*, *unsigned int*, *long*, *unsigned long*, *long long*, and *unsigned long long*. The only way to know whether one of these types has a larger bit-width than another is to check your compiler’s documentation, or to compile and run a program that outputs the sizeof() result for the types. Thus it’s implementation-defined whether *int* has a larger bit-width than *unsigned short*, and by extension it’s implementation-defined whether *unsigned short* will be promoted to type *int*. The standard does effectively guarantee that the types *int*, *unsigned int*, *long*, *unsigned long*, *long long*, and *unsigned long long* will never be promoted. Floating point types, of course, are never subject to integral promotion.

That leaves more integral types than you might expect that may, at least in principle, be promoted. A non-exhaustive list of types that might be promoted is

*char, unsigned char, signed char, short, unsigned short, int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, int64_t, uint64_t, int_fast8_t, uint_fast8_t, int_least8_t, uint_least8_t, int_fast16_t, uint_fast16_t, int_least16_t, uint_least16_t, int_fast32_t, uint_fast32_t, int_least32_t, uint_least32_t, int_fast64_t, uint_fast64_t, int_least64_t, uint_least64_t*

Surprisingly, all the sized integral types (int32_t, uint64_t, etc.) are open to possible integral promotion, depending on the implementation-defined size of *int*. For example, it’s plausible that there could someday be a compiler that defines *int* as a 64 bit type, and if so, *int32_t* and *uint32_t* will be subject to promotion to that larger *int* type. Compiler writers are likely to understand this could break older code that implicitly assumes *uint32_t* won’t be promoted, but there’s no guarantee. In theory, nothing in the standard would prevent a future compiler from defining *int* as even a 128 bit type, so we have to include *int64_t* and *uint64_t* in the list of types that could, at least in theory, be promoted, all depending on how the compiler defines type *int*.

Very realistically, in code today, *unsigned char*, *unsigned short*, *uint8_t*, and *uint16_t* (and also *uint_least8_t*, *uint_least16_t*, *uint_fast8_t*, *uint_fast16_t*) should be considered a minefield for programmers and maintainers. On most compilers (which define *int* as at least 32 bit), these types don’t behave as expected: they will usually be promoted to type *int* during operations and comparisons, making them vulnerable to all the undefined behavior of the signed type *int*. They won’t be protected by the well-defined behavior of the original unsigned type, since after promotion the types are no longer unsigned.

## Recommendations

Sometimes you really do need unsigned integers. *Unsigned int*, *unsigned long*, and *unsigned long long* are all more or less safe since they’re never promoted. But if you use an unsigned type from the last section, or if you use generic code that expects an unsigned integer type of unknown size, that type can be dangerous to use due to promotion.

For C, there seems to only be a partial solution to the unsigned promotion problem. When you use an unsigned type smaller than *unsigned int*, explicitly cast it to *unsigned int* during mathematical operations and comparisons so that it won’t get implicitly promoted to type *int*. This doesn’t perfectly help with fixed width integer types (e.g. uint16_t, uint32_t) since *unsigned int* has implementation defined size; you can’t strictly assume that a fixed width type is larger or smaller than *unsigned int*, though worst case you could use something like _Static_assert(sizeof(unsigned int) == 4) to cause your program to fail to compile if your assumption is invalid.
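In C, the technique looks something like the following sketch (written so it also compiles as C++; the function name is mine). The compile-time size check turns a wrong assumption into a build error rather than a silent bug:

```
#include <assert.h>  /* provides static_assert in C11; a keyword in C++ */

/* Fail to compile if the assumption about unsigned int's size is wrong. */
static_assert(sizeof(unsigned int) >= 4,
              "this code assumes a 32 bit or wider unsigned int");

unsigned int add_u16(unsigned short x, unsigned short y)
{
    /* The casts keep the arithmetic in unsigned int, preventing any
       implicit promotion of the operands to the signed type int. */
    return (unsigned int)x + (unsigned int)y;
}
```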

For C++, there’s a fairly good solution. You can use the following helper class to get a safe type that you can use as a destination type for explicit casts on your unsigned (or generic) integer types during mathematical operations and comparisons. Explicitly casting prevents implicit promotion of unsigned types to *int.* The effect is that unsigned types smaller than *unsigned int* will be (manually) promoted to *unsigned int*, and signed types smaller than *int* will be promoted to *int*. Types larger than *int* or *unsigned int* will be unchanged. This helper class provides a safe and relatively easy way to achieve well-defined behavior with all unsigned and signed integer types, as we’ll see by example.

```
#include <type_traits>
#include <limits>

template <class T>
struct safely_promote {
    static_assert(std::numeric_limits<T>::is_integer, "");
    // if T is unsigned and can be promoted, then 'type'
    // will be the unsigned version of T's promoted type.
    // Otherwise 'type' will be the same as T.
private:
    using UT = std::make_unsigned_t<decltype((T)1 * (T)1)>;
public:
    using type = std::conditional_t<std::is_unsigned_v<T>, UT, T>;
};

template <class T>
using safely_promote_t = typename safely_promote<T>::type;
```

To illustrate the use of *safely_promote_t*, let’s write a template function version of Figure 3 that is free from any undefined behavior:

```
template <class T>
T multiply(T x, T y)
{
    static_assert(std::numeric_limits<T>::is_integer, "");
    using U = safely_promote_t<T>;
    T result = static_cast<U>(x) * static_cast<U>(y);
    return result;
}
```

Of course the best solution of all came from the introductory advice: use a signed integral type instead of unsigned types whenever you can.

## Reference

The C++17 standard has multiple sections that involve integral promotion. For reference, here are the excerpts/summaries from the relevant parts of the C++17 standard draft:

7.6 Integral promotions [conv.prom]

1 A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion rank (7.15) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.

8 Expressions [expr]

11 Many binary operators that expect operands of arithmetic or enumeration type cause conversions […These are] called the usual arithmetic conversions.

[… If neither operand has scoped enumeration type, type long double, double, or float,] the integral promotions (7.6) shall be performed on both operands.

8.3.1 Unary operators [expr.unary.op] (parts 7, 8, 10)

[For the unary operators +, -, ~, the operands are subject to integral promotion.]

8.6 Multiplicative operators [expr.mul]

[Binary operators *, /, %]

2 The usual arithmetic conversions are performed on the operands and determine the type of the result.

8.7 Additive operators [expr.add]

1 The additive [binary] operators + and - group left-to-right. The usual arithmetic conversions are performed for operands of arithmetic or enumeration type.

8.8 Shift operators [expr.shift]

[For the binary operators << and >>, the operands are subject to integral promotion.]

8.9 Relational operators [expr.rel]

[<, <=, >, >=]

2 The usual arithmetic conversions are performed on operands of arithmetic or enumeration type

8.10 Equality operators [expr.eq]

[==, !=]

6 If both operands are of arithmetic or enumeration type, the usual arithmetic conversions are performed on both operands

8.11 Bitwise AND operator [expr.bit.and]

1 The usual arithmetic conversions are performed;

8.12 Bitwise exclusive OR operator [expr.xor]

1 The usual arithmetic conversions are performed;

8.13 Bitwise inclusive OR operator [expr.or]

1 The usual arithmetic conversions are performed;
