gcc -O0 still optimizes out “unused” code that should raise an FP exception. Is there a compile flag to change that?

Why does gcc not emit the specified instruction?

A compiler produces code that must have the observable behavior specified by the Standard. Anything that is not observable can be changed (and optimized) at will, as it does not change the behavior of the program (as specified).

How can you beat it into submission?

The trick is to make the compiler believe that the behavior of the particular piece of code is actually observable.

Since this is a problem frequently encountered in micro-benchmarks, I advise you to look at how (for example) Google-Benchmark addresses it. From benchmark_api.h we get:

template <class Tp>
inline void DoNotOptimize(Tp const& value) {
    asm volatile("" : : "g"(value) : "memory");
}

The details of this syntax are boring; for our purposes we only need to know:

  • "g"(value) tells the compiler that value is used as an input to the statement
  • "memory" is a compile-time read/write barrier: the compiler must assume the statement may read or write any memory
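To make the two points above concrete, here is a minimal sketch of how such a helper is typically used. It assumes GCC or Clang (the asm syntax is a GNU extension); run_iterations and check_run_iterations are hypothetical names invented for this illustration and are not part of Google-Benchmark.

```cpp
#include <cassert>
#include <cstddef>

// Same helper as quoted above from Google-Benchmark.
template <class Tp>
inline void DoNotOptimize(Tp const& value) {
    asm volatile("" : : "g"(value) : "memory");
}

// Hypothetical workload: the product is never read by the program, so
// without DoNotOptimize an optimizer would be free to drop the
// multiplication entirely.
std::size_t run_iterations(std::size_t iterations) {
    std::size_t completed = 0;
    for (std::size_t i = 0; i < iterations; ++i) {
        double product = static_cast<double>(i) * 1.5;
        DoNotOptimize(product);  // the compiler must treat product as used
        ++completed;
    }
    return completed;
}

// Sanity check: the barrier changes nothing about the computed values.
void check_run_iterations() {
    assert(run_iterations(1000) == 1000);
}
```

The empty asm emits no instruction of its own; it only constrains what the compiler may assume about the operand and about memory.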

So, we can change the code to:

asm volatile("" : : : "memory");

__m128 result = _mm_div_ss(s1, s2);

asm volatile("" : : "g"(result) : );

Which:

  • forces the compiler to consider that s1 and s2 may have been modified between their initialization and use (strictly, this covers values held in memory that the statement could reach)
  • forces the compiler to consider that the result of the operation is used

There is no need for any flag, and it should work at any level of optimization (I tested it on https://gcc.godbolt.org/ at -O3).
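If you want to check the effect at run time rather than by reading the assembly, something along these lines should work. This is a sketch under several assumptions: x86 with SSE, GCC or Clang, and a C library whose fetestexcept also reports the SSE status flags (I believe glibc on x86-64 does). The function names are hypothetical, and the "+x" operands on the first barrier are my own addition to keep the inputs opaque to the optimizer; they are not part of the snippet above.

```cpp
#include <cassert>
#include <cfenv>
#include <xmmintrin.h>

// Hypothetical helper, not from the original question: divides two floats
// with _mm_div_ss between two barriers and reports whether the
// divide-by-zero flag was raised.
bool division_raises_divbyzero(float numerator, float denominator) {
    __m128 s1 = _mm_set_ss(numerator);
    __m128 s2 = _mm_set_ss(denominator);

    std::feclearexcept(FE_ALL_EXCEPT);

    // Assumption beyond the snippet above: listing s1 and s2 as read-write
    // "x" (SSE register) operands makes their values opaque, so the
    // division cannot be constant-folded even after inlining.
    asm volatile("" : "+x"(s1), "+x"(s2) : : "memory");

    __m128 result = _mm_div_ss(s1, s2);

    // Pretend the result is used, so the division is not dead code.
    asm volatile("" : : "g"(result) : "memory");

    return std::fetestexcept(FE_DIVBYZERO) != 0;
}

// Small self-check: 1/0 raises the flag, 1/2 does not.
void check_division_flags() {
    assert(division_raises_divbyzero(1.0f, 0.0f));
    assert(!division_raises_divbyzero(1.0f, 2.0f));
}
```

Calling check_division_flags() from main at -O0 and at -O3 is a quick way to confirm that the division survives at both levels.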
