memcpy strikes back
Jan. 16th, 2015 04:58 pm
Three years ago I posted an article on Habr about memcpy performance, with a trivial finding.
Just received an interesting link in a comment. Good measurements, but the results are trivial: invoking microcode has a fixed cost that has to be amortized over the data volume. If the copy loop does not fit in the LSD (Loop Stream Detector), front-end stalls will make the hand-coded sequence slower than the microcode.
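To make the amortization argument concrete, here is a rough sketch of how one might time both paths over a range of sizes. It is not the benchmark from the linked measurements. The names `copy_loop`, `copy_rep_movsb` and `ns_per_copy` are made up for illustration, the inline assembly assumes x86-64 with GCC or Clang, and the timer assumes a POSIX system. On other targets the "microcode" path falls back to plain `memcpy`, which is not microcode at all.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime */
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hand-coded copy: a plain byte loop (the compiler is free to unroll
 * or vectorize it, so inspect the generated code before trusting numbers). */
static void copy_loop(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
}

/* Microcoded copy: `rep movsb` on x86-64 (assumes the direction flag is
 * clear, as the ABI requires). Elsewhere this is only a memcpy stand-in. */
static void copy_rep_movsb(void *dst, const void *src, size_t n)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    memcpy(dst, src, n);
#endif
}

typedef void (*copy_fn)(void *, const void *, size_t);

/* Keeps the compiler from discarding the copies as dead stores. */
static volatile unsigned char sink;

/* Average nanoseconds per call of `fn` copying `n` bytes, over `reps` calls. */
static double ns_per_copy(copy_fn fn, void *dst, const void *src,
                          size_t n, int reps)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < reps; i++) {
        fn(dst, src, n);
        if (n)
            sink ^= ((unsigned char *)dst)[n - 1];
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return reps > 0 ? ns / reps : 0.0;
}
```

Calling `ns_per_copy` with both functions for sizes from a few bytes up to a few megabytes and dividing by the size should show the per-byte cost of the `rep movsb` path falling as the fixed startup cost is spread over more data. Where the two curves cross depends on the CPU, so I am not quoting any numbers here.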