This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Enabling -ftree-slp-vectorize on -O2/Os
- From: Segher Boessenkool <segher at kernel dot crashing dot org>
- To: Allan Sandfeld Jensen <linux at carewolf dot com>
- Cc: gcc at gcc dot gnu dot org
- Date: Sat, 26 May 2018 20:23:36 -0500
- Subject: Re: Enabling -ftree-slp-vectorize on -O2/Os
- References: <2659301.XPQk3P0qmd@twilight> <20180526220532.GS17342@gate.crashing.org> <20109354.MqXXt4BNHg@twilight>
On Sun, May 27, 2018 at 01:25:25AM +0200, Allan Sandfeld Jensen wrote:
> On Sunday, 27 May 2018 00:05:32 CEST Segher Boessenkool wrote:
> > On Sat, May 26, 2018 at 11:32:29AM +0200, Allan Sandfeld Jensen wrote:
> > > I brought this subject up earlier, and was told to suggest it again for
> > > gcc 9, so I have attached the preliminary changes.
> > >
> > > My studies have shown that with generic x86-64 optimization it reduces
> > > binary size by around 0.5%, and when optimizing for x64 targets with
> > > SSE4 or better, it reduces binary size by 2-3% on average. The
> > > performance changes are negligible however*, and I haven't been able to
> > > detect changes in compile time big enough to penetrate general noise on
> > > my platform, but perhaps someone has a better setup for that?
> > >
> > > * I believe that is because it currently works best on non-optimized code:
> > > it is better at big basic blocks doing all kinds of things than at tightly
> > > written inner loops.
> > >
> > > Anything else I should test or report?
> >
> > What does it do on other architectures?
> >
> I believe NEON would do the same as SSE4, but I can do a check. For
> architectures without SIMD it essentially does nothing.
Sorry, I wasn't clear. What does it do to performance on other
architectures? Is it (almost) always a win (or neutral)? If not, it
doesn't belong in -O2, not for the generic options at least.
(We'll test it on Power soon, it's the weekend now :-) ).
Segher
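
For reference, a minimal sketch of the kind of straight-line code that basic-block (SLP) vectorization is aimed at, as opposed to a tightly written inner loop. The type `pixel4` and the function `pixel4_add` are hypothetical names chosen for illustration and do not come from the thread or the attached patch. Whether the four additions actually become one vector instruction depends on the target, the GCC version, and the cost model.

```c
#include <stdint.h>

/* Hypothetical example type; not taken from the thread. */
typedef struct {
    uint32_t r, g, b, a;
} pixel4;

/* Four independent additions on adjacent fields in a single basic block,
   with no loop. An SLP vectorizer may merge them into one 128-bit vector
   add on targets with suitable SIMD support. `restrict` is an assumption
   added here so the compiler does not have to consider overlap between
   the destination and the inputs. */
void pixel4_add(pixel4 *restrict dst,
                const pixel4 *restrict x,
                const pixel4 *restrict y)
{
    dst->r = x->r + y->r;
    dst->g = x->g + y->g;
    dst->b = x->b + y->b;
    dst->a = x->a + y->a;
}
```

One way to check the effect locally is to compile this file once with `gcc -O2 -S` and once with `gcc -O2 -ftree-slp-vectorize -S`, then compare the generated assembly and the object sizes.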