In both cases the arguments and results are passed only in registers, as per the respective calling conventions on Windows and GNU/Linux.
In the GNU/Linux variant, xmm1 is used for accumulating the sum. Since it’s a call-clobbered (a.k.a. caller-saved) register, the caller spills it to its own stack frame before each call and reloads it afterwards.
In the Windows variant, xmm6 is used for accumulating the sum. This register is callee-saved in the Windows calling convention (but not in the GNU/Linux one), so the caller can keep the sum in it across calls without spilling.
So, in summary, the GNU/Linux version saves/restores both xmm0 (in the callee[1]) and xmm1 (in the caller), whereas the Windows version saves/restores only xmm6 (in the callee).
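To make the register discussion concrete, here is a minimal sketch of the kind of loop that produces this pattern: a floating-point accumulator that must stay live across a function call. It is a hypothetical reconstruction, not the code the assembly above was generated from; the name `sum_erf` and the choice of `std::erf` as the callee are assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical example: accumulate the results of a math-library call.
// `sum` has to survive every call to std::erf, so the compiler must keep it
// either in a callee-saved register or in memory across the call.
float sum_erf(const float* xs, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        // Argument and result travel in xmm0 under both conventions.
        // System V AMD64 has no callee-saved xmm registers, so `sum` is
        // typically spilled to the stack around this call. Windows x64
        // treats xmm6-xmm15 as callee-saved, so `sum` can stay in one of them.
        sum += std::erf(xs[i]);
    }
    return sum;
}
```

Compiling a function like this with optimizations for each target and comparing the output (for example with `-S`) should show the difference described above, although the exact registers chosen depend on the compiler and its version.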
[1] I still need to look at std::errf to figure out why xmm0 is saved there.