The interactions with optimisations are explained about halfway down the “Assembler Instructions with C Expression Operands” page in the documentation.
GCC doesn’t try to understand any of the actual assembly inside the asm; the only thing it knows about the content is what you (optionally) tell it in the output and input operand specification and the register clobber list.
In particular, note:
“An asm instruction without any output operands will be treated identically to a volatile asm instruction.”
and
“The volatile keyword indicates that the instruction has important side-effects […]”
So the presence of the asm inside your loop has inhibited a vectorisation optimisation, because GCC assumes it has side effects.
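As a rough illustration (not the code from the question; the function names here are made up), the two loops below differ only in an empty asm statement with no output operands. Because GCC treats that asm as volatile, it has to keep one asm per iteration, in order, whereas the plain loop is a candidate for vectorisation. Whether a given loop is actually vectorised depends on the GCC version, target and flags, so treat this as a sketch to experiment with rather than a guaranteed result.

```c
#include <stddef.h>

/* Hypothetical example: an asm with no output operands inside the loop.
   GCC treats it as volatile, so it must be executed once per iteration. */
void scale_with_asm(int *a, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] *= 2;
        __asm__("");   /* empty template, no outputs: implicitly volatile */
    }
}

/* The same loop without the asm: GCC is free to vectorise it. */
void scale_plain(int *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] *= 2;
}
```

Both functions compute the same result; only the generated code may differ. Comparing the output of `gcc -O3 -S` for the two functions, or compiling with `-O3 -fopt-info-vec`, should show which loops GCC chose to vectorise.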