int f(int x, int y) { return x > y ? x : y; }

When compiling with -O3 -mtune=core2 -msse4.1, GCC outputs this:

f(int, int):
        movd    xmm0, edi
        movd    xmm1, esi
        pmaxsd  xmm0, xmm1
        movd    eax, xmm0
        ret

It seems rather doubtful that this is a better way to compute max than a simple compare+cmov.
We can now move integer operations over to SSE, which in some cases (max/min vs. cmov) can be quite a bit faster. This isolated case is probably not one of them (but how can you know w/o benchmarking ...). On AMD archs the GPR <-> xmm moves make this unprofitable, but those are "free" on Intel, which makes the cost model prefer the pmaxsd variant.
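Since the question of which sequence wins comes down to benchmarking, here is a minimal sketch of how the isolated case could be timed. It is not taken from the report: max_chain and bench_max are hypothetical helper names, and the noinline attribute is there only to keep f as a standalone call, on the assumption that the compiler then emits the same body as shown above. Each call feeds its result into the next, so the loop measures the latency of the whole sequence, including the GPR <-> xmm moves.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <time.h>

/* The function from the report. noinline (a GCC/Clang attribute) keeps it an
   isolated call so it is not folded into the loop or vectorized. */
__attribute__((noinline)) int f(int x, int y) { return x > y ? x : y; }

/* Hypothetical helper: folds f over data as a dependent chain starting from
   init, so every call has to wait for the previous result. */
int max_chain(const int *data, size_t n, int init) {
    int acc = init;
    for (size_t i = 0; i < n; i++)
        acc = f(acc, data[i]);
    return acc;
}

/* Hypothetical helper: runs max_chain `rounds` times and returns the CPU time
   in seconds. The accumulator is read from and written to a volatile each
   round, so the compiler cannot hoist or drop the calls. */
double bench_max(const int *data, size_t n, int rounds, int *result) {
    assert(data != NULL && result != NULL);
    volatile int sink = INT_MIN;
    clock_t start = clock();
    for (int r = 0; r < rounds; r++)
        sink = max_chain(data, n, sink);
    clock_t stop = clock();
    *result = sink;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

To compare, one could call bench_max from a small main over a large array, build once with -O3 -mtune=core2 -msse4.1 and once with plain -O3 (which, lacking SSE4.1, should not be able to use pmaxsd), and check the disassembly of f in both builds before trusting the timings. Results will differ between Intel and AMD parts, which is the point made above.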
Confirmed.