Is n negative, positive, or zero? Return 1, 2, or 4

Question

First, if this variable is to be updated after (nearly) every instruction, the obvious piece of advice is this:

don’t

Only update it when the subsequent instructions need its value. At any other time, there’s no point in updating it.
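One way to picture the "don't" advice is lazy evaluation: remember the last result, and derive CR0 only when something actually reads it. The sketch below assumes an interpreter-style setup; the `cpu_state` struct and the `record_result` / `read_cr0` helpers are hypothetical names invented for illustration, not part of the original question. The expression inside `read_cr0` is the branch-free formula developed further down.

```c
#include <stdint.h>

/* Hypothetical CPU state: we keep the raw result instead of CR0 itself. */
typedef struct {
    int32_t last_result;   /* result of the most recent flag-setting instruction */
} cpu_state;

/* Called after (nearly) every instruction: a single cheap store. */
static void record_result(cpu_state *cpu, int32_t r)
{
    cpu->last_result = r;
}

/* Called only by instructions that actually consume CR0. */
static uint32_t read_cr0(const cpu_state *cpu)
{
    int32_t r = cpu->last_result;
    /* negative -> 0b001, positive -> 0b010, zero -> 0b100 */
    return 1u << ((r >= 0) + (r == 0));
}
```

Whether this pays off depends on how often CR0 is read compared with how often it would otherwise be written, so it is worth measuring on a realistic workload.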

But anyway, when we update it, what we want is this behavior:

R < 0  => CR0 == 0b001 
R > 0  => CR0 == 0b010
R == 0 => CR0 == 0b100

Ideally, we won’t need to branch at all. Here’s one possible approach:

1. Set CR0 to the value 1. (If you really want speed, investigate whether this can be done without fetching the constant from memory. Even if you have to spend a couple of instructions on it, it may well be worth it.)
2. If R >= 0, left shift by one bit.
3. If R == 0, left shift by one bit.

Steps 2 and 3 can be rewritten to eliminate the “if” part, because a comparison evaluates to 0 or 1 and can be used directly as the shift count:

CR0 <<= (R >= 0);
CR0 <<= (R == 0);
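Put together as a complete function, the three steps might look like the sketch below. It assumes R is a signed 32-bit value and that CR0 can be modelled as a plain unsigned integer; the function name `cr0_stepwise` is made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: compute the CR0 encoding for a signed result r. */
static uint32_t cr0_stepwise(int32_t r)
{
    uint32_t cr0 = 1;     /* step 1: start at 0b001, the "negative" encoding */
    cr0 <<= (r >= 0);     /* step 2: non-negative moves 0b001 -> 0b010 */
    cr0 <<= (r == 0);     /* step 3: zero moves 0b010 -> 0b100 */
    return cr0;
}
```

In C, each comparison yields an `int` that is exactly 0 or 1, so every shift moves the bit by either zero or one position.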

Is this faster? I don’t know. As always, when you are concerned about performance, you need to measure, measure, measure.

However, I can see a couple of advantages of this approach:

We avoid branches completely.
We avoid memory loads/stores.
The instructions we rely on (bit shifting and comparison) should have low latency, which isn’t always the case for multiplication, for example.

The downside is that we have a dependency chain between all three lines: each one modifies CR0, which is then used by the next line. This limits instruction-level parallelism somewhat.

To minimize this dependency chain, we could do something like this instead:

CR0 <<= ((R >= 0) + (R == 0));

so we only have to modify CR0 once, after its initialization.

Or, doing everything in a single line:

CR0 = 1 << ((R >= 0) + (R == 0));
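As a sanity check, the single-line form can be wrapped in a function and compared against a straightforward branching version. This is only a sketch: it assumes a signed 32-bit R, and the names `cr0_single` and `cr0_branching` are hypothetical.

```c
#include <stdint.h>

/* Branch-free form: the shift count is 0, 1, or 2. */
static inline uint32_t cr0_single(int32_t r)
{
    return 1u << ((r >= 0) + (r == 0));
}

/* Plain branching reference, useful for testing and as a benchmark baseline. */
static uint32_t cr0_branching(int32_t r)
{
    if (r < 0) return 1u;   /* 0b001 */
    if (r > 0) return 2u;   /* 0b010 */
    return 4u;              /* 0b100 */
}
```

Having both versions side by side also makes it easy to time them against each other. A compiler may well turn the branching version into branch-free code on its own, which is one more reason to measure rather than guess.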

Of course, there are a lot of possible variations of this theme, so go ahead and experiment.
