This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/84114] global reassociation pass prevents fma usage, generates slower code
- From: "wilco at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 06 Mar 2018 13:00:52 +0000
- Subject: [Bug tree-optimization/84114] global reassociation pass prevents fma usage, generates slower code
- Auto-submitted: auto-generated
- References: <bug-84114-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84114
--- Comment #8 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Steve Ellcey from comment #6)
> (In reply to Wilco from comment #5)
> > (In reply to Steve Ellcey from comment #4)
> > > While teaching the reassociation pass about fma's seems like the right
> > > answer would it be reasonable (and simpler) to do the fma pass
> > > (pass_optimize_widening_mul) before
> > > the reassociation pass (pass_reassoc) to get the most fma's?
> > >
> > > That fixes my small test case but I haven't done a bigger performance check
> > > to see what the overall impact would be.
> >
> > I don't know what else that would affect since the reassociation phase runs
> > very early - and it's late at this stage. My patch seems much safer. Even
> > easier might be to return 1 for FLOAT_MODE PLUS_EXPR in
> > aarch64_reassociation_width. Then we can fix the reassociation phase in GCC9.
>
> Moving the fma phase did not have a good performance impact (it was worse).

So it looks like it's best to teach the reassociation phase about FMA for GCC9.

> Your patch of setting the reassociation width to 1 did help performance on
> ThunderX2.

Can you let me know if my workaround helped? If useful I could backport it to
GCC7 as well.