Bug 94915 - MAX_EXPR weirdly optimized on x86 with -mtune=core2
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 10.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2020-05-02 08:20 UTC by Gabriel Ravier
Modified: 2023-08-24 21:44 UTC

See Also:
Host:
Target: x86_64-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2021-08-16 00:00:00


Description Gabriel Ravier 2020-05-02 08:20:48 UTC
int f(int x, int y)
{
    return x > y ? x : y;
}

When compiling with -O3 -mtune=core2 -msse4.1, GCC outputs this:

f(int, int):
  movd xmm0, edi
  movd xmm1, esi
  pmaxsd xmm0, xmm1
  movd eax, xmm0
  ret

It seems doubtful that this is a better way to compute max than a simple compare+cmov.
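
For reference, the emitted sequence corresponds roughly to the intrinsics below, shown next to the scalar form. This is only a sketch, not GCC's internal transformation: the names max_scalar, max_sse and check_pair are hypothetical, and the target attribute is an assumption used here so the snippet builds without -msse4.1 on the command line.

```c
#include <assert.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define HAVE_X86_SSE41_SKETCH 1
#endif

/* Scalar form from the report; the reporter expected compare + cmov here. */
static int max_scalar(int x, int y)
{
    return x > y ? x : y;
}

#ifdef HAVE_X86_SSE41_SKETCH
/* Hypothetical hand-written equivalent of the movd/pmaxsd/movd sequence. */
__attribute__((target("sse4.1")))
static int max_sse(int x, int y)
{
    __m128i a = _mm_cvtsi32_si128(x);   /* movd xmm0, edi    */
    __m128i b = _mm_cvtsi32_si128(y);   /* movd xmm1, esi    */
    __m128i m = _mm_max_epi32(a, b);    /* pmaxsd xmm0, xmm1 */
    return _mm_cvtsi128_si32(m);        /* movd eax, xmm0    */
}
#else
/* Non-x86 fallback so the sketch still compiles elsewhere. */
static int max_sse(int x, int y)
{
    return max_scalar(x, y);
}
#endif

/* Both forms should agree for any pair of ints. */
static void check_pair(int x, int y)
{
    assert(max_scalar(x, y) == max_sse(x, y));
}
```

Only lane 0 of each vector carries data; the upper lanes are zeroed by the movd and ignored by the final extract.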
Comment 1 Richard Biener 2020-05-04 06:35:30 UTC
We can now move integer operations over to SSE, which in some cases (max/min vs.
cmov) can be quite a bit faster.  This isolated case is probably not of that
kind (but how can you know without benchmarking ...).  On AMD archs the
GPR <-> xmm moves make this unprofitable, but those are "free" on Intel, which
makes costing prefer the pmaxsd variant.
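
One way to check the isolated case is a small latency-bound harness like the one below. This is only a sketch under stated assumptions: max_chain and time_max_chain are hypothetical helpers, and the noinline attribute and the empty asm barrier are GCC/Clang extensions used to keep f as a separate call and to stop the repeated passes from being folded into one.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* noinline keeps f as an isolated function, as in the report. */
__attribute__((noinline))
int f(int x, int y)
{
    return x > y ? x : y;
}

/* Serial reduction: each call depends on the previous result, so the
   loop tends to expose the latency of f rather than its throughput. */
int max_chain(const int *data, size_t n)
{
    int acc = data[0];
    for (size_t i = 1; i < n; i++)
        acc = f(acc, data[i]);
    return acc;
}

/* Hypothetical harness: runs `reps` passes over the data, stores the
   last result in *result, and returns the elapsed CPU time in seconds. */
double time_max_chain(const int *data, size_t n, int reps, int *result)
{
    clock_t start = clock();
    int r = 0;
    for (int k = 0; k < reps; k++) {
        r = max_chain(data, n);
        /* Compiler barrier: memory may have changed, so redo the pass. */
        __asm__ volatile("" ::: "memory");
    }
    *result = r;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Building the same file once with -O3 -msse4.1 -mtune=core2 and once with -O3 -msse4.1 -mtune=generic, then comparing the timings on the machine of interest, would give a first indication. The call overhead is included in both measurements.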
Comment 2 Andrew Pinski 2021-08-16 21:32:00 UTC
Confirmed.