This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: GCC missing -flto optimizations? SPEC lbm benchmark

From: Joel Sherrill <joel at rtems dot org>
To: Ian Lance Taylor <iant at golang dot org>
Cc: Hi-Angel <hiangel999 at gmail dot com>, Jun Ma <majun4950646 at gmail dot com>, "Bin.Cheng" <amker dot cheng at gmail dot com>, Steve Ellcey <sellcey at marvell dot com>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
Date: Fri, 15 Feb 2019 12:09:45 -0600
Subject: Re: GCC missing -flto optimizations? SPEC lbm benchmark
References: <92bfe075168981ee45e525875ac6a15f5e318034.camel@marvell.com> <CAHFci2_tSRtnA38KJjG+kWDDh387NGvY2owyUrmfZjS03def0Q@mail.gmail.com> <CABT63J4+=ihYHkEWy6aZwawYu5Z6Y4wErCmZVLkzLBLv3tVE9w@mail.gmail.com> <CAHGDjgB+jtuum0u7RF5-BeamO+hAsUvSyiXdsPKEmJBLaZQb_Q@mail.gmail.com> <CAKOQZ8yKbiLZdnBwDjXPsDXgwYoBChsOh6qHpnSmKEr2LckNog@mail.gmail.com>
Reply-to: joel at rtems dot org

On Fri, Feb 15, 2019 at 9:02 AM Ian Lance Taylor <iant@golang.org> wrote:

> On Fri, Feb 15, 2019 at 4:46 AM Hi-Angel <hiangel999@gmail.com> wrote:
> >
> > I never could understand why field reordering was removed from GCC. I
> > mean, I know that it's prohibited in C and C++, but surely GCC can
> > detect whether it could possibly influence application behavior, and if
> > not, just do the reordering.
> >
> > The prohibition matters to C/C++ as programming languages, but not to
> > the machine code generated from them. As long as the app can't
> > detect that its fields were reordered through means defined by C/C++,
> > field reordering by the compiler is fine, isn't it?
>
> In my opinion field reordering is very hard for the compiler to do
> correctly and trivial for a human programmer to do correctly.  So in
> practice the best approach is for the compiler, or some other tool, to
> say "you should reorder the fields here."  As far as I can see, the
> only real reason to implement field reordering in a compiler is for
> benchmark cracking, since benchmarks typically don't let you modify
> the source code.  It's not a useful optimization in practice other
> than for benchmarks.
>

Hasn't GNAT sorted Ada elements in records (i.e. structures) by size
since near its initial addition to GCC in the mid-90s? This puts the
largest elements up front and minimizes the need for alignment gaps.

I know Ada is traditionally more strongly typed than C/C++, but if it can
be done reliably for Ada programs, why could it not be reliable in C?
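
To make the alignment-gap point concrete, here is a minimal C sketch that
compares the same four fields in two orders. The struct names, field names
and helper functions are hypothetical, chosen only for illustration; the
exact sizes depend on the target ABI, so the code reports them rather than
assuming particular numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record with fields in the order a programmer might write them. */
struct as_written {
    char   flag;
    double value;
    char   tag;
    int    count;
};

/* The same fields, largest first. */
struct largest_first {
    double value;
    int    count;
    char   flag;
    char   tag;
};

/* Bytes actually occupied by the fields, ignoring any padding. */
static size_t payload_bytes(void)
{
    return sizeof(double) + sizeof(int) + 2 * sizeof(char);
}

/* Padding is whatever the struct occupies beyond its fields. */
size_t padding_as_written(void)
{
    return sizeof(struct as_written) - payload_bytes();
}

size_t padding_largest_first(void)
{
    return sizeof(struct largest_first) - payload_bytes();
}

/* Print both layouts so the effect can be checked on the local target. */
void print_layout_report(void)
{
    /* Assumption: holds on mainstream ABIs, where each scalar is aligned
       to at most its own size. */
    assert(sizeof(struct largest_first) <= sizeof(struct as_written));

    printf("as written:    size %zu, padding %zu, offsetof(value) %zu\n",
           sizeof(struct as_written), padding_as_written(),
           offsetof(struct as_written, value));
    printf("largest first: size %zu, padding %zu, offsetof(value) %zu\n",
           sizeof(struct largest_first), padding_largest_first(),
           offsetof(struct largest_first, value));
}
```

Calling print_layout_report() from main shows how much padding each order
costs on the machine at hand. Note that a C compiler must keep members in
declaration order, so in C this reordering has to be done in the source.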

>
> (Array transformations and struct splitting, on the other hand, can be
> useful.)
>

--joel

>
> Ian
>
>
>
> > On Fri, 15 Feb 2019 at 12:49, Jun Ma <majun4950646@gmail.com> wrote:
> > >
> > > > Bin.Cheng <amker.cheng@gmail.com> wrote on Fri, Feb 15, 2019 at 5:12 PM:
> > >
> > > > > On Fri, Feb 15, 2019 at 3:30 AM Steve Ellcey <sellcey@marvell.com> wrote:
> > > > > >
> > > > > > I have a question about SPEC CPU 2017 and what GCC can and cannot do
> > > > > > with -flto.  As part of some SPEC analysis I am doing I found that with
> > > > > > -Ofast, ICC and GCC were not that far apart (especially spec int rate,
> > > > > > spec fp rate was a slightly larger difference).
> > > > > >
> > > > > > But when I added -ipo to the ICC command and -flto to the GCC command,
> > > > > > the difference got larger.  In particular the 519.lbm_r was more than
> > > > > > twice as fast with ICC and -ipo, but -flto did not help GCC at all.
> > > > > >
> > > > > > There are other tests that also show this type of improvement with -ipo
> > > > > > like 538.imagick_r, 544.nab_r, 525.x264_r, 531.deepsjeng_r, and
> > > > > > 548.exchange2_r, but none are as dramatic as 519.lbm_r.  Anyone have
> > > > > > any idea on what ICC is doing that GCC is missing?  Is GCC just not
> > > > > > aggressive enough with its inlining?
> > > >
> > > > IIRC Jun did some investigation before? CCing.
> > > >
> > > > Thanks,
> > > > bin
> > > > >
> > > > > Steve Ellcey
> > > > > sellcey@marvell.com
> > >
> > > ICC is doing much more than GCC in ipo, especially memory layout
> > > optimizations. See https://software.intel.com/en-us/node/522667.
> > > ICC is more aggressive in array transposition/structure splitting
> > > > /field reordering. However, these optimizations were removed
> > > > from GCC a long time ago.
> > > > As for lbm_r, IIRC the most time-consuming part is a loop whose
> > > > memory accesses have a stride of 20.  ICC will optimize the array
> > > > (maybe structure?) and vectorize the loop under ipo.
> > >
> > > Thanks
> > > Jun
>
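
Jun's remark above about a stride-20 loop can be illustrated with a small C
sketch. This is not the lbm_r source and not a description of what ICC
actually does; it only shows the general idea of an array transposition that
turns a strided access into a unit-stride one, which is the kind of loop a
vectorizer handles more easily. The constant FIELDS_PER_CELL and all
function names are hypothetical; the value 20 is taken from the stride
mentioned in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 20 values per cell, matching the stride quoted above. */
enum { FIELDS_PER_CELL = 20 };

/* Cell-major layout: field f of cell i is grid[i * FIELDS_PER_CELL + f],
   so a loop over cells for a fixed f touches every 20th element. */
double sum_field_strided(const double *grid, size_t n_cells, size_t f)
{
    assert(f < FIELDS_PER_CELL);
    double sum = 0.0;
    for (size_t i = 0; i < n_cells; i++)
        sum += grid[i * FIELDS_PER_CELL + f];
    return sum;
}

/* Transpose into field-major layout: field f of cell i is out[f * n_cells + i]. */
void transpose_to_field_major(const double *grid, double *out, size_t n_cells)
{
    for (size_t i = 0; i < n_cells; i++)
        for (size_t f = 0; f < FIELDS_PER_CELL; f++)
            out[f * n_cells + i] = grid[i * FIELDS_PER_CELL + f];
}

/* Field-major layout: all values of field f are contiguous (unit stride). */
double sum_field_contiguous(const double *field_major, size_t n_cells, size_t f)
{
    assert(f < FIELDS_PER_CELL);
    const double *field = field_major + f * n_cells;
    double sum = 0.0;
    for (size_t i = 0; i < n_cells; i++)
        sum += field[i];
    return sum;
}
```

Both sum functions visit the cells in the same order, so they return the
same result; only the memory access pattern differs. A compiler can only do
this transformation itself if it can prove that no other code depends on the
original layout, which is why it is tied to whole-program analysis.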

Follow-Ups:
- Re: GCC missing -flto optimizations? SPEC lbm benchmark
  - From: Richard Kenner
- Re: GCC missing -flto optimizations? SPEC lbm benchmark
  - From: Eric Botcazou

References:
- GCC missing -flto optimizations? SPEC lbm benchmark
  - From: Steve Ellcey
- Re: GCC missing -flto optimizations? SPEC lbm benchmark
  - From: Bin.Cheng
- Re: GCC missing -flto optimizations? SPEC lbm benchmark
  - From: Jun Ma
- Re: GCC missing -flto optimizations? SPEC lbm benchmark
  - From: Hi-Angel
- Re: GCC missing -flto optimizations? SPEC lbm benchmark
  - From: Ian Lance Taylor
