What’s the purpose of the rotate instructions (ROL, RCL on x86)?

Rotates are required for bit shifts across multiple words. When you SHL the lower word, the high-order bit spills out into the carry. To complete the operation, you need to shift the higher word(s) while bringing in the carry to the low-order bit. RCL is the instruction that accomplishes this. High word Low word CF …
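
The SHL-then-RCL chain described above can be modelled in portable C. This is only a sketch under stated assumptions: the function name `shl1_multiword`, the 32-bit word size, and the least-significant-word-first layout are hypothetical choices for illustration, not taken from the original answer. The `carry` variable stands in for the CPU's carry flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shift a multi-word integer left by one bit.
 * words[0] is the least significant word (assumed layout).
 * Returns the bit shifted out of the most significant word,
 * i.e. what the carry flag would hold after the last RCL. */
unsigned shl1_multiword(uint32_t *words, size_t nwords)
{
    assert(words != NULL || nwords == 0);

    unsigned carry = 0; /* the first step acts like SHL: a 0 enters bit 0 */
    for (size_t i = 0; i < nwords; i++) {
        unsigned out = words[i] >> 31;      /* high bit spills into "CF" */
        words[i] = (words[i] << 1) | carry; /* like RCL: old CF enters bit 0 */
        carry = out;
    }
    return carry;
}
```

With the two-word value `{0x80000001, 0x00000001}`, the high bit of the low word is carried into bit 0 of the high word, giving `{0x00000002, 0x00000003}` and a final carry of 0.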

Why isn’t the instruction pointer a normal register usable with MOV or ADD?

You can’t access it directly because there’s no legitimate use case. Having any arbitrary instruction change eip would make branch prediction very difficult, and would probably open up a whole host of security issues. You can edit eip using jmp, call or ret. You just can’t directly read from or write to eip using normal …

Why is reading a volatile and writing to a field member not scalable in Java?

This is what I think is happening (keep in mind I’m not familiar with HotSpot):

    0xf36c9fd0: mov 0x6c(%ecx),%ebp ; vfoo
    0xf36c9fd3: test %ebp,%ebp ; vfoo is null?
    0xf36c9fd5: je 0xf36c9ff7 ; throw NullPointerException (I guess)
    0xf36c9fd7: movl $0x1,0x8(%ebp) ; vfoo.x = 1
    0xf36c9fde: mov 0x68(%ecx),%ebp ; sz
    0xf36c9fe1: inc %ebx ; i++
    0xf36c9fe2: test %edi,0xf7725000 …

What are the best instruction sequences to generate vector constants on the fly?

All-zero: pxor xmm0,xmm0 (or xorps xmm0,xmm0, one instruction-byte shorter.) There isn’t much difference on modern CPUs, but on Nehalem (before xor-zero elimination), the xorps uop could only run on port 5. I think that’s why compilers favour pxor-zeroing even for registers that will be used with FP instructions. All-ones: pcmpeqw xmm0,xmm0. This is the usual …
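
Why these two idioms work regardless of the register's previous contents can be seen in a scalar C model. This is a rough sketch, not real SIMD code: the `vec128` type and the `model_pxor` / `model_pcmpeqw` helpers are hypothetical names invented here, and they only imitate the lane-wise semantics of the instructions on eight 16-bit lanes.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit XMM register: eight 16-bit lanes. */
typedef struct { uint16_t lane[8]; } vec128;

/* Models pxor dst,src: bitwise XOR of every lane. */
vec128 model_pxor(vec128 a, vec128 b)
{
    vec128 r;
    for (int i = 0; i < 8; i++)
        r.lane[i] = (uint16_t)(a.lane[i] ^ b.lane[i]);
    return r;
}

/* Models pcmpeqw dst,src: a lane becomes all-ones if equal, else zero. */
vec128 model_pcmpeqw(vec128 a, vec128 b)
{
    vec128 r;
    for (int i = 0; i < 8; i++)
        r.lane[i] = (a.lane[i] == b.lane[i]) ? 0xFFFFu : 0x0000u;
    return r;
}
```

Passing the same value as both operands mirrors `pxor xmm0,xmm0` and `pcmpeqw xmm0,xmm0`: XOR of anything with itself is zero, and every lane compares equal to itself, so the results are all-zero and all-ones respectively.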

Homoiconic and “unrestricted” self modifying code + Is lisp really self modifying?

In the first version (+ 1 2 3) is raw code, whereas in the second version it is data. By assuming the truth of this statement it can be argued that Lisp isn’t even homoiconic. The code has the same representation as data in the sense that they are both lists/trees/S-expressions. But the fact that …

Modern x86 cost model

The best reference is the Intel Optimization Manual, which provides fairly detailed information on architectural hazards and instruction latencies for all recent Intel cores, as well as a good number of optimization examples. Another excellent reference is Agner Fog’s optimization resources, which have the virtue of also covering AMD cores. Note that specific cost models …
