[PATCH] Add (x & ~m) | (y & m) folding (PR middle-end/63568)
Richard Biener
rguenther@suse.de
Wed Dec 17 13:05:00 GMT 2014
On December 17, 2014 12:29:44 PM CET, Marek Polacek <polacek@redhat.com> wrote:
>On Wed, Dec 17, 2014 at 10:12:06AM +0100, Richard Biener wrote:
>> On Wed, 17 Dec 2014, Jakub Jelinek wrote:
>>
>> > On Wed, Dec 17, 2014 at 09:46:44AM +0100, Marek Polacek wrote:
>> > > This adds a transformation of (x & ~m) | (y & m), which (on
>GIMPLE)
>> > > has 4 ops to x ^ ((x ^ y) & m) that has 3 ops on GIMPLE. In
>fact,
>> > > the latter is then transformed to (a ^ b) & m ^ a, which also has
>3
>> > > ops.
>> >
>> > So why don't you transform it to ((x ^ y) & m) ^ x directly (i.e.
>swap
>> > @0 with the (bit_and ...) ?
>>
>> Yeah, I'd prefer that as well. Also you probably should make sure
>> the (x & ~m) and (y & m) exprs are single-use (see other examples
>> in match.pd using has_single_use).
>
>Ok. We have only one use of has_single_use so far in match.pd, so I
>hope what I have is correct... I suppose the point is not to hinder
>CSEing those two subexpressions if they have multiple uses.
>
>> > BTW, the advantage of (x & ~m) | (y & m) form is that there are
>fewer
>> > dependencies, at least if the target has andn instruction (e.g.
>SPARC,
>> > Alpha, PA, MMIX, IA64 have them), just the final or depends on the
>result of
>> > both and and andnot instructions. While in the 2 xor forms, and
>> > depends on the first xor result and the second xor depends on the
>and
>> > result, so if there are multiple ALU units available, the and |
>andn
>> > form can use both the units, while the second one is unnecessarily
>> > serialized.
>>
>> Right - I've pointed this out as well. Of course this simply
>> asks for more clever expansion of ((x ^ y) & m) ^ x rather
>> than disabling this transform. Should be doable with TER
>> in the BIT_XOR_EXPR expansion code. Or figure out if teaching
>> combine/simplify RTX is better. Of course this form requires
>> one more register ... which means with high register pressure
>> or on register starved machines the GIMPLE canonical form might
>> be better (in some cases). Which means that LRA should know
>> how to transform it back? (interesting kind of "reload" ;))
>
>I'd rather leave the expansion/combine stuff to someone else ;).
>
>Bootstrapped/regtested on x86_64-linux and ppc64-linux, ok for trunk?
OK.
Thanks,
Richard.
>2014-12-17 Marek Polacek <polacek@redhat.com>
>
> PR middle-end/63568
> * match.pd: Add (x & ~m) | (y & m) -> ((x ^ y) & m) ^ x pattern.
>
> * gcc.dg/pr63568.c: New test.
>
>diff --git gcc/match.pd gcc/match.pd
>index dbca99e..4d4bc9f 100644
>--- gcc/match.pd
>+++ gcc/match.pd
>@@ -382,6 +382,13 @@ along with GCC; see the file COPYING3. If not see
> (bit_not (bit_not @0))
> @0)
>
>+/* (x & ~m) | (y & m) -> ((x ^ y) & m) ^ x */
>+(simplify
>+ (bit_ior:c (bit_and:c@3 @0 (bit_not @2)) (bit_and:c@4 @1 @2))
>+ (if ((TREE_CODE (@3) != SSA_NAME || has_single_use (@3))
>+ && (TREE_CODE (@4) != SSA_NAME || has_single_use (@4)))
>+ (bit_xor (bit_and (bit_xor @0 @1) @2) @0)))
>+
>
> /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)). */
> (simplify
>diff --git gcc/testsuite/gcc.dg/pr63568.c
>gcc/testsuite/gcc.dg/pr63568.c
>index e69de29..fb42bea 100644
>--- gcc/testsuite/gcc.dg/pr63568.c
>+++ gcc/testsuite/gcc.dg/pr63568.c
>@@ -0,0 +1,54 @@
>+/* PR middle-end/63568 */
>+/* { dg-do compile } */
>+/* { dg-options "-fdump-tree-original" } */
>+
>+int
>+fn1 (int a, int b, int m)
>+{
>+ return (a & ~m) | (b & m);
>+}
>+
>+int
>+fn2 (int a, int b, int m)
>+{
>+ return (a & ~m) | (m & b);
>+}
>+
>+int
>+fn3 (int a, int b, int m)
>+{
>+ return (~m & a) | (m & b);
>+}
>+
>+int
>+fn4 (int a, int b, int m)
>+{
>+ return (~m & a) | (b & m);
>+}
>+
>+int
>+fn5 (int a, int b, int m)
>+{
>+ return (b & m) | (a & ~m);
>+}
>+
>+int
>+fn6 (int a, int b, int m)
>+{
>+ return (m & b) | (a & ~m);
>+}
>+
>+int
>+fn7 (int a, int b, int m)
>+{
>+ return (m & b) | (~m & a);
>+}
>+
>+int
>+fn8 (int a, int b, int m)
>+{
>+ return (b & m) | (~m & a);
>+}
>+
>+/* { dg-final { scan-tree-dump-not " \\| " "original" } } */
>+/* { dg-final { cleanup-tree-dump "original" } } */
>
> Marek
More information about the Gcc-patches
mailing list