2013-06-25

izard: (Default)
2013-06-25 06:10 pm

Bugz [2/2]

In a previous post, the code sequence I used to illustrate an issue was itself buggy.

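Here is the corrected version:
volatile float f1 = 2e-40f;        /* subnormal: below the smallest normal float */
volatile float f2 = 3000000000.0f;
volatile float f3;

f3 = f1/f2;                        /* result underflows */
f3 = f2/f1;                        /* result overflows */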

Whether it is compiled for x87 or SSE, the code above takes ~800 cycles without factor X, and anywhere from 800 to 5k cycles with factor X.
(The same holds for REP MOVS, another example of complex microcode.)
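
To make it easier to see why these particular operands are interesting, here is a minimal sketch that classifies the operand and both quotients with the standard fpclassify() macro from math.h. The function names float_class, small_over_large and large_over_small are made up for this illustration and are not from the earlier post, and it assumes default rounding with no flush-to-zero mode enabled. It only shows which floating-point classes are involved; it does not measure cycles.

```c
#include <math.h>

/* Hypothetical helper (not from the original post): returns the name of
   the floating-point class of x, using the standard fpclassify() macro. */
static const char *float_class(float x)
{
    switch (fpclassify(x)) {
    case FP_ZERO:      return "zero";
    case FP_SUBNORMAL: return "subnormal";
    case FP_INFINITE:  return "infinite";
    case FP_NAN:       return "nan";
    default:           return "normal";
    }
}

/* Same operands as in the post. volatile keeps the compiler from
   folding the division away at compile time. */
static float small_over_large(void)
{
    volatile float f1 = 2e-40f;         /* below FLT_MIN, so subnormal */
    volatile float f2 = 3000000000.0f;
    volatile float f3 = f1 / f2;        /* too small for any float: underflow */
    return f3;
}

static float large_over_small(void)
{
    volatile float f1 = 2e-40f;
    volatile float f2 = 3000000000.0f;
    volatile float f3 = f2 / f1;        /* larger than FLT_MAX: overflow */
    return f3;
}
```

Calling float_class(2e-40f) should report a subnormal, float_class(small_over_large()) a zero, and float_class(large_over_small()) an infinity, so the two divisions exercise a subnormal input, an underflow and an overflow.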

P.S. Factor X is very rare, so the issue is not too bad. Besides, 5k cycles is less than 2 microseconds at any clock rate above 2.5 GHz, so it is not a big deal for common PC/server uses either.