This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PING][PATCH][REVISED] Fix PR middle-end/PR28690, modify swap_commutative_operands_p
- From: Peter Bergner <bergner at vnet dot ibm dot com>
- To: "H.J. Lu" <hjl at lucon dot org>
- Cc: Jakub Jelinek <jakub at redhat dot com>, bonzini at gnu dot org, Pat Haugen <pthaugen at us dot ibm dot com>, Dave Korn <dave dot korn at artimi dot com>, gcc-patches at gcc dot gnu dot org, Ian Lance Taylor <iant at google dot com>, Rask Ingemann Lambertsen <rask at sygehus dot dk>, Richard Guenther <richard dot guenther at gmail dot com>, michael dot meissner at amd dot com
- Date: Fri, 13 Jul 2007 23:00:00 -0500
- Subject: Re: [PING][PATCH][REVISED] Fix PR middle-end/PR28690, modify swap_commutative_operands_p
- References: <20070711223208.GA31985@lucon.org> <1184203327.7983.60.camel@otta> <46971979.1080104@lu.unisi.ch> <1184348142.8319.75.camel@otta> <20070713184737.GA32407@lucon.org> <4697D3B1.3040304@lu.unisi.ch> <20070713194935.GA32665@lucon.org> <20070713195833.GN2063@devserv.devel.redhat.com> <20070713201841.GA379@lucon.org> <1184360613.8319.104.camel@otta> <20070713214537.GA790@lucon.org>
On Fri, 2007-07-13 at 14:45 -0700, H.J. Lu wrote:
> Why not break it into 2 parts with simplify_plus_minus_op_data_cmp
> calling swap_commutative_operands_p?
That would only make sense if we were to remove the REGNO sorting from
swap_commutative_operands_p. I don't think you or anyone else has
determined that the REGNO sorting is causing any real performance
problems. By "real", I mean anything other than alignment related
slowdowns.
>> The remainder of the patch isn't really a collection of independent
>> parts. If we remove one part, then we'll starting failing to correctly
>> order the operands for indexed load/store operands. As mentioned in
>
> What do you mean by "failing"?
By "failing", I mean we'll get the order of the indexed load/store
operands in the non optimal order. Either order is correct in the
sense that the instruction will still load/store data from the same
address. It's just that the optimal order (base_reg + offset_reg)
is MUCH faster than the non optimal order (offset_reg + base_reg).
> What other changes does this
>
> Index: tree-ssa-address.c
> ===================================================================
> --- tree-ssa-address.c (revision 125863)
> +++ tree-ssa-address.c (working copy)
> @@ -125,7 +125,7 @@ gen_addr_rtx (rtx symbol, rtx base, rtx
> if (base)
> {
> if (*addr)
> - *addr = gen_rtx_PLUS (Pmode, *addr, base);
> + *addr = simplify_gen_binary (PLUS, Pmode, base, *addr);
> else
> *addr = base;
> }
>
> depend on?
Paolo suggested using simplify_gen_binary instead of gen_rtx_PLUS in
one of the _many_ previous notes. To be honest, I'm not sure why.
I'm sure he can explain. However, for me, the important part of that
part of the patch was the swapping of the order of the function args
with "base" now preceding "*addr". Without that change, we again fail
to correctly order the indexed load/store operands in the optimal order.
> I assume by "wrong", you meant very bad performance on Power 6, not
> nessarily codes which generate incorrect results.
Correct.
> Changes in commutative_operand_precedence may be partially responsible
> for performance degradation on x86. How did you come up with those
> numbers? What are the absolute minium changes for
> commutative_operand_precedence? Your ChangeLog says
>
> * rtlanal.c (commutative_operand_precedence): Prefer both REG_POINTER
> and MEM_POINTER operands over REG and MEM operands.
>
> It looks more than that.
Actually, that is all that part of the patch does. The actual return
values of commutative_operand_precedence are unimportant. All that is
important is whether one return value of c_o_p is higher or lower than
another return value of c_o_p. If you look closely at the changes,
you'll notice all I have done is adjust the values of everything with a
lower precedence than RTX_OBJ by exactly -3. This in itself doesn't
change the precedence of anything relative to one another. All it does,
is give me 4 consecutive values for RTX_OBJ's to use so that I can now
sort the different RTX_OBJ's (ie, REG, REG_POINTER, MEM and MEM_POINTER)
between themselves, while still holding that anything that had a lower
precedence to a RTX_OBJ previously, still has a lower precedence and
anything that previously has a higher precedence than a RTX_OBJ, still
has a higher precedence.
So the changing of the actual return values is just cosmetic. The only
real important part of that hunk of the patch is:
+ if (REG_P (op))
+ return (REG_POINTER (op)) ? -1 : -3;
+ else
+ return (MEM_P (op) && MEM_POINTER (op)) ? -2 : -4;
Peter
- References:
- Re: [PING][PATCH][REVISED] Fix PR middle-end/PR28690, modify swap_commutative_operands_p
- Re: [PING][PATCH][REVISED] Fix PR middle-end/PR28690, modify swap_commutative_operands_p
- Re: [PING][PATCH][REVISED] Fix PR middle-end/PR28690, modify swap_commutative_operands_p
- Re: [PING][PATCH][REVISED] Fix PR middle-end/PR28690, modify swap_commutative_operands_p
- Re: [PING][PATCH][REVISED] Fix PR middle-end/PR28690, modify swap_commutative_operands_p
- Re: [PING][PATCH][REVISED] Fix PR middle-end/PR28690, modify swap_commutative_operands_p
- Re: [PING][PATCH][REVISED] Fix PR middle-end/PR28690, modify swap_commutative_operands_p
- Re: [PING][PATCH][REVISED] Fix PR middle-end/PR28690, modify swap_commutative_operands_p
- Re: [PING][PATCH][REVISED] Fix PR middle-end/PR28690, modify swap_commutative_operands_p
- Re: [PING][PATCH][REVISED] Fix PR middle-end/PR28690, modify swap_commutative_operands_p
- Re: [PING][PATCH][REVISED] Fix PR middle-end/PR28690, modify swap_commutative_operands_p