This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug tree-optimization/80854] New: hot path is slowed down when the cold return path is merged into it

From: "nsz at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Mon, 22 May 2017 10:11:33 +0000
Subject: [Bug tree-optimization/80854] New: hot path is slowed down when the cold return path is merged into it
Auto-submitted: auto-generated

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80854

            Bug ID: 80854
           Summary: hot path is slowed down when the cold return path is
                    merged into it
           Product: gcc
           Version: 7.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: nsz at gcc dot gnu.org
  Target Milestone: ---

I see suboptimal code generation for

float foo (float x)
{
  if (__builtin_expect (x > 0, 0))
    if (x>2) return 0;
  return x*x;
}

because merging the return paths causes an extra register move in the hot path:
https://godbolt.org/g/AZxxrR

x86_64:

foo:
        pxor    %xmm1, %xmm1
        ucomiss %xmm1, %xmm0
        ja      .L8
.L2:
        movaps  %xmm0, %xmm1  // extra reg move
        mulss   %xmm0, %xmm1
.L1:
        movaps  %xmm1, %xmm0  // extra reg move
        ret
.L8:
        ucomiss .LC1(%rip), %xmm0
        jbe     .L2
        jmp     .L1           // need not jmp back
.LC1:
        .long   1073741824


aarch64:

foo:
        fcmpe   s0, #0.0
        bgt     .L8
.L2:
        fmul    s1, s0, s0
.L1:
        fmov    s0, s1   // extra reg move
        ret
        .p2align 3
.L8:
        fmov    s2, 2.0e+0
        movi    v1.2s, #0
        fcmpe   s0, s2
        ble     .L2
        b       .L1    // need not jmp back

I wonder whether GCC could do better when it has information about hot/cold
paths (by not merging the hot and cold return paths in some cases).
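
For comparison, here is a source-level sketch of what "not merging the return paths" would mean: the cold branch is moved into a separate function so that the hot path keeps its own return. The names foo_split and foo_cold are hypothetical, and I have not checked whether this actually yields better code with GCC 7.1.0; it only shows the intended shape and computes the same values as foo.

```c
/* Hypothetical helper: the cold branch of foo, kept out of line.
   The cold and noinline attributes are GCC function attributes.  */
__attribute__ ((cold, noinline))
static float foo_cold (float x)
{
  if (x > 2)
    return 0;
  return x * x;
}

/* Same result as foo, but the hot path has its own return and the
   cold path is reached through a call instead of a shared return block.  */
float foo_split (float x)
{
  if (__builtin_expect (x > 0, 0))
    return foo_cold (x);
  return x * x;
}
```

This is only a manual workaround under the stated assumptions; the report is about the compiler making this choice itself from the __builtin_expect hint.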
