Which is faster: while(1) or while(2)?

Both loops are infinite, but we can see which one takes more instructions/resources per iteration.

Using gcc, I compiled the two following programs to assembly at varying levels of optimization:

int main(void) {
    while(1) {}
    return 0;
}

int main(void) {
    while(2) {}
    return 0;
}

Even with no optimizations (-O0), the generated assembly was identical for both programs. Therefore, there is no speed difference between the two loops.

For reference, here is the generated assembly (using gcc main.c -S -masm=intel with an optimization flag):

With -O0:

    .file   "main.c"
    .intel_syntax noprefix
    .def    __main; .scl    2;  .type   32; .endef
    .text
    .globl  main
    .def    main;   .scl    2;  .type   32; .endef
    .seh_proc   main
main:
    push    rbp
    .seh_pushreg    rbp
    mov rbp, rsp
    .seh_setframe   rbp, 0
    sub rsp, 32
    .seh_stackalloc 32
    .seh_endprologue
    call    __main
.L2:
    jmp .L2
    .seh_endproc
    .ident  "GCC: (tdm64-2) 4.8.1"

With -O1:

    .file   "main.c"
    .intel_syntax noprefix
    .def    __main; .scl    2;  .type   32; .endef
    .text
    .globl  main
    .def    main;   .scl    2;  .type   32; .endef
    .seh_proc   main
main:
    sub rsp, 40
    .seh_stackalloc 40
    .seh_endprologue
    call    __main
.L2:
    jmp .L2
    .seh_endproc
    .ident  "GCC: (tdm64-2) 4.8.1"

With -O2 and -O3 (same output):

    .file   "main.c"
    .intel_syntax noprefix
    .def    __main; .scl    2;  .type   32; .endef
    .section    .text.startup,"x"
    .p2align 4,,15
    .globl  main
    .def    main;   .scl    2;  .type   32; .endef
    .seh_proc   main
main:
    sub rsp, 40
    .seh_stackalloc 40
    .seh_endprologue
    call    __main
.L2:
    jmp .L2
    .seh_endproc
    .ident  "GCC: (tdm64-2) 4.8.1"

In fact, the assembly generated for the loop is identical for every level of optimization:

 .L2:
    jmp .L2
    .seh_endproc
    .ident  "GCC: (tdm64-2) 4.8.1"

The important bits being:

.L2:
    jmp .L2

I can’t read assembly very well, but this is obviously an unconditional loop. The jmp instruction unconditionally resets the program back to the .L2 label without even comparing a value against true, and of course immediately does so again until the program is somehow ended. This directly corresponds to the C/C++ code:

L2:
    goto L2;

Edit:

Interestingly enough, even with no optimizations, the following loops all produced the exact same output (unconditional jmp) in assembly:

while(42) {}

while(1==1) {}

while(2==2) {}

while(4<7) {}

while(3==3 && 4==4) {}

while(8-9 < 0) {}

while(4.3 * 3e4 >= 2 << 6) {}

while(-0.1 + 02) {}

And even to my amazement:

#include<math.h>

while(sqrt(7)) {}

while(hypot(3,4)) {}

Things get a little more interesting with user-defined functions:

int x(void) {
    return 1;
}

while(x()) {}

#include<math.h>

double x(void) {
    return sqrt(7);
}

while(x()) {}

At -O0, these two examples actually call x and perform a comparison for each iteration.

First example (returning 1):

.L4:
    call    x
    testl   %eax, %eax
    jne .L4
    movl    $0, %eax
    addq    $32, %rsp
    popq    %rbp
    ret
    .seh_endproc
    .ident  "GCC: (tdm64-2) 4.8.1"

Second example (returning sqrt(7)):

.L4:
    call    x
    xorpd   %xmm1, %xmm1
    ucomisd %xmm1, %xmm0
    jp  .L4
    xorpd   %xmm1, %xmm1
    ucomisd %xmm1, %xmm0
    jne .L4
    movl    $0, %eax
    addq    $32, %rsp
    popq    %rbp
    ret
    .seh_endproc
    .ident  "GCC: (tdm64-2) 4.8.1"

However, at -O1 and above, they both produce the same assembly as the previous examples (an unconditional jmp back to the preceding label).

TL;DR

Under GCC, the different loops are compiled to identical assembly. The compiler evaluates the constant values and doesn’t bother performing any actual comparison.

The moral of the story is:

  • There exists a layer of translation between C++ source code and CPU instructions, and this layer has important implications for performance.
  • Therefore, performance cannot be evaluated by only looking at source code.
  • The compiler should be smart enough to optimize such trivial cases. Programmers should not waste their time thinking about them in the vast majority of cases.

Leave a Comment

Hata!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)