This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/77468] [7 Regression] C-ray regression on Aarch64


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77468

--- Comment #19 from James Greenhalgh <jgreenhalgh at gcc dot gnu.org> ---
(In reply to Aldy Hernandez from comment #18)
> (In reply to Aldy Hernandez from comment #17)
> > Created attachment 40573 [details]
> > preprocessed testcase
> 
> Here's the preprocessed testcase generated on:
> 
> openSUSE Leap 42.1 (aarch64)

It would be interesting to see the x86_64 analysis, but if shade is the hot
function in your reduced testcase (I still can't run the binary, so I would
appreciate your help with the analysis here), then I think the problem is
sched1 moving a sqrt out from inside an if in a loop, i.e. something like:

  while (foo)
  {
    if (bar) {
      j = sqrt(x);
      [...]
    }
  }

becomes:

  while (foo)
  {
    j = sqrt(x);
    if (bar) {
      [...]
    }
  }

GCC might do that if it thinks the if branch is mostly taken. The basic
block frequencies suggest that the body of the if is entered 80.9% of the
time.

That transformation is not helpful here: a sqrt is one of the more expensive
operations, and certainly too expensive to move from conditionally executed
to always executed on a core like Cortex-A53.
