I have a good customer. I really like the guy, and especially the code we optimize together. But sometimes he complains about the compiler:
1. The compiler is stupid! The loop is so simple, and yet the compiler vectorized it much worse than my intrinsics did.
2. The compiler is too smart! I made an intrinsics version of the loop, and it is only 5% faster than the compiler-generated one.
:)