This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Add (x & ~m) | (y & m) folding (PR middle-end/63568)


On December 17, 2014 12:29:44 PM CET, Marek Polacek <polacek@redhat.com> wrote:
>On Wed, Dec 17, 2014 at 10:12:06AM +0100, Richard Biener wrote:
>> On Wed, 17 Dec 2014, Jakub Jelinek wrote:
>> 
>> > On Wed, Dec 17, 2014 at 09:46:44AM +0100, Marek Polacek wrote:
>> > > This adds a transformation of (x & ~m) | (y & m), which (on
>GIMPLE)
>> > > has 4 ops to x ^ ((x ^ y) & m) that has 3 ops on GIMPLE.  In
>fact,
>> > > the latter is then transformed to (a ^ b) & m ^ a, which also has
>3
>> > > ops.
>> > 
>> > So why don't you transform it to ((x ^ y) & m) ^ x directly (i.e.
>swap
>> > @0 with the (bit_and ...) ?
>> 
>> Yeah, I'd prefer that as well.  Also you probably should make sure
>> the (x & ~m) and (y & m) exprs are single-use (see other examples
>> in match.pd using has_single_use).
> 
>Ok.  We have only one use of has_single_use so far in match.pd, so I
>hope what I have is correct...  I suppose the point is not to hinder
>CSEing those two subexpressions if they have multiple uses.
>
>> > BTW, the advantage of (x & ~m) | (y & m) form is that there are
>fewer
>> > dependencies, at least if the target has andn instruction (e.g.
>SPARC,
>> > Alpha, PA, MMIX, IA64 have them), just the final or depends on the
>result of
>> > both and and andnot instructions.  While in the 2 xor forms, and
>> > depends on the first xor result and the second xor depends on the
>and
>> > result, so if there are multiple ALU units available, the and |
>andn
>> > form can use both the units, while the second one is unnecessarily
>> > serialized.
>> 
>> Right - I've pointed this out as well.  Of course this simply
>> asks for more clever expansion of ((x ^ y) & m) ^ x rather
>> than disabling this transform.  Should be doable with TER
>> in the BIT_XOR_EXPR expansion code.  Or figure out if teaching
>> combine/simplify RTX is better.  Of course this form requires
>> one more register ... which means with high register pressure
>> or on register starved machines the GIMPLE canonical form might
>> be better (in some cases).  Which means that LRA should know
>> how to transform it back? (interesting kind of "reload" ;))
>
>I'd rather leave the expansion/combine stuff to someone else ;).
>
>Bootstrapped/regtested on x86_64-linux and ppc64-linux, ok for trunk?

OK.

Thanks,
Richard.

>2014-12-17  Marek Polacek  <polacek@redhat.com>
>
>	PR middle-end/63568
>	* match.pd: Add (x & ~m) | (y & m) -> ((x ^ y) & m) ^ x pattern.
>
>	* gcc.dg/pr63568.c: New test.
>
>diff --git gcc/match.pd gcc/match.pd
>index dbca99e..4d4bc9f 100644
>--- gcc/match.pd
>+++ gcc/match.pd
>@@ -382,6 +382,13 @@ along with GCC; see the file COPYING3.  If not see
>   (bit_not (bit_not @0))
>   @0)
> 
>+/* (x & ~m) | (y & m) -> ((x ^ y) & m) ^ x */
>+(simplify
>+  (bit_ior:c (bit_and:c@3 @0 (bit_not @2)) (bit_and:c@4 @1 @2))
>+  (if ((TREE_CODE (@3) != SSA_NAME || has_single_use (@3))
>+	&& (TREE_CODE (@4) != SSA_NAME || has_single_use (@4)))
>+   (bit_xor (bit_and (bit_xor @0 @1) @2) @0)))
>+
> 
> /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */
> (simplify
>diff --git gcc/testsuite/gcc.dg/pr63568.c
>gcc/testsuite/gcc.dg/pr63568.c
>index e69de29..fb42bea 100644
>--- gcc/testsuite/gcc.dg/pr63568.c
>+++ gcc/testsuite/gcc.dg/pr63568.c
>@@ -0,0 +1,54 @@
>+/* PR middle-end/63568 */
>+/* { dg-do compile } */
>+/* { dg-options "-fdump-tree-original" } */
>+
>+int
>+fn1 (int a, int b, int m)
>+{
>+  return (a & ~m) | (b & m);
>+}
>+
>+int
>+fn2 (int a, int b, int m)
>+{
>+  return (a & ~m) | (m & b);
>+}
>+
>+int
>+fn3 (int a, int b, int m)
>+{
>+  return (~m & a) | (m & b);
>+}
>+
>+int
>+fn4 (int a, int b, int m)
>+{
>+  return (~m & a) | (b & m);
>+}
>+
>+int
>+fn5 (int a, int b, int m)
>+{
>+  return (b & m) | (a & ~m);
>+}
>+
>+int
>+fn6 (int a, int b, int m)
>+{
>+  return (m & b) | (a & ~m);
>+}
>+
>+int
>+fn7 (int a, int b, int m)
>+{
>+  return (m & b) | (~m & a);
>+}
>+
>+int
>+fn8 (int a, int b, int m)
>+{
>+  return (b & m) | (~m & a);
>+}
>+
>+/* { dg-final { scan-tree-dump-not " \\| " "original" } } */
>+/* { dg-final { cleanup-tree-dump "original" } } */
>
>	Marek



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]