Empty loop is slower than a non-empty one in C

Assuming that your code uses a 32 bit integer int type (which your system probably does), then nothing can be determined from your code. Instead, it exhibits undefined behavior.

foo.c:5:5: error: first parameter of 'main' (argument count) must be of type 'int'
int main (char argc, char * argv[]) {
    ^
foo.c:13:26: warning: overflow in expression; result is 2147483647 with type 'int' [-Winteger-overflow]
    for (i = 0; i<(1<<31)-1; i++);
                         ^
foo.c:19:26: warning: overflow in expression; result is 2147483647 with type 'int' [-Winteger-overflow]
    for (i = 0; i<(1<<31)-1; i++) {
                         ^

Let’s try to fix that:

#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <limits.h>

int main (int argc, char * argv[]) {
    time_t begin, end;
    uint64_t i;
    double total_time, free_time;
    int A = 1;
    int B = 1;

    begin = clock();
    for (i = 0; i<INT_MAX; i++);
    end = clock();
    free_time = (double)(end-begin)/CLOCKS_PER_SEC;
    printf("%f\n", free_time);

    begin = clock();
    for (i = 0; i<INT_MAX; i++) {
        A += B%2;
    }
    end = clock();
    free_time = (double)(end-begin)/CLOCKS_PER_SEC;
    printf("%f\n", free_time);

    return(0);
}

Now, let’s look at the assembly output of this code. Personally, I find LLVM’s internal assembly very readable, so I’m going to show that. I’ll produce it by running:

clang -O3 foo.c -S -emit-llvm -std=gnu99

Here’s the relevant portion of the output (the main function):

define i32 @main(i32 %argc, i8** nocapture readnone %argv) #0 {
  %1 = tail call i64 @"\01_clock"() #3
  %2 = tail call i64 @"\01_clock"() #3
  %3 = sub nsw i64 %2, %1
  %4 = sitofp i64 %3 to double
  %5 = fdiv double %4, 1.000000e+06
  %6 = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([4 x i8]* @.str, i64 0, i64 0), double %5) #3
  %7 = tail call i64 @"\01_clock"() #3
  %8 = tail call i64 @"\01_clock"() #3
  %9 = sub nsw i64 %8, %7
  %10 = sitofp i64 %9 to double
  %11 = fdiv double %10, 1.000000e+06
  %12 = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([4 x i8]* @.str, i64 0, i64 0), double %11) #3
  ret i32 0
}

Note that there are no operations between the calls to clock() for either case. So they are both compiled to the exact same thing.

Leave a Comment

Hata!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)