I’d put my money on Intel’s branch predictor. http://en.wikipedia.org/wiki/Branch_predictor
The processor assumes (time – cb_last_orbital_update > 5000000) to be false most of the time and loads up the execution pipeline accordingly.
Once the condition (time – cb_last_orbital_update > 5000000) comes true. The misprediction delay is hitting you. You may loose 10 to 20 cycles.
if (time - cb_last_orbital_update > 5000000)
{
cb_last_orbital_update = time;
}