This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] Fix PR46728 (move pow/powi folds to tree phases)
On Fri, 2011-05-13 at 17:26 +0200, Richard Guenther wrote:
> On Fri, May 13, 2011 at 5:01 PM, Nathan Froyd <froydnj@codesourcery.com> wrote:
> > On 05/13/2011 10:52 AM, William J. Schmidt wrote:
> >> This patch addresses PR46728, which notes that pow and powi need to be
> >> lowered in tree phases to restore lost FMA opportunities and expose
> >> vectorization opportunities.
> >>
> >> +struct gimple_opt_pass pass_lower_pow =
> >> +{
> >> + {
> >> + GIMPLE_PASS,
> >> + "lower_pow", /* name */
> >> + NULL, /* gate */
> >
> > Please make this controlled by an option; this pass doesn't need to be run all
> > the time.
> >
> > IMHO, the pass shouldn't run at anything less than -O3, but that's for other
> > people to decide.
>
> It was run unconditionally before, so unless we preserve the code at
> expansion time we have to do it here.

Right. A number of tests fail at -O0 if it's not done unconditionally.
This seemed better than having duplicate code remain in expand.

>
> I will have a closer look at the patch early next week.

Much obliged!

> Btw, I thought
> of adding a POW_EXPR tree code that can take mixed-mode operands
> to make foldings (eventually) simpler, but I'm not sure it's worth the
> trouble.
>
> The position of the pass is odd - why did you place it there? I would
> have placed it alongside pass_cse_sincos and pass_optimize_bswap.

That was where I wanted it initially as well, but the earlier position
seems necessary for the pass to run unconditionally. If I recall
correctly, gate_all_optimizations() was kicking in at -O0, so I had to
move the pass earlier.

> The foldings should probably be done via fold-stmt only (where they
> should already apply), and you won't catch things like pow(sqrt(...))
> there because you only see the outer call. That said, I'd be happier
> if the patch just did the powi expansion and left the rest to somebody
> else.

I'm not sure I understand this last part. The original concern in
PR46728 was that __builtin_pow(x, 0.75) is lowered too late for the
FMA optimization to kick in, so I needed to address that. I'm probably
misunderstanding you.
>
> Richard.
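
As a point of reference for the "powi expansion" mentioned above, here
is a minimal sketch of expanding an integer power into multiplies using
ordinary binary exponentiation. The function name is hypothetical and
this is not the patch's code; GCC's own expansion may pick a different
(shorter) multiplication sequence for a given exponent.

```c
/* Hypothetical helper: compute x**n with repeated squaring, the same
   general idea as expanding __builtin_powi (x, n) into multiplies.  */
static double
powi_by_squaring (double x, int n)
{
  /* Take the magnitude in unsigned arithmetic so INT_MIN is handled.  */
  unsigned int e = n < 0 ? 0u - (unsigned int) n : (unsigned int) n;
  double result = 1.0;
  double base = x;

  while (e != 0)
    {
      if (e & 1u)
        result *= base;   /* this bit of the exponent is set */
      base *= base;       /* square for the next bit */
      e >>= 1;
    }

  /* A negative exponent is the reciprocal of the positive power.  */
  return n < 0 ? 1.0 / result : result;
}
```

For a constant exponent the loop unrolls into a fixed chain of
multiplies, which is what makes the result visible to later tree passes
such as the vectorizer.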