[PATCH] match.pd: Fix up the recent bitmask_inv_cst_vector_p simplification [PR103417]

Richard Biener rguenther@suse.de
Thu Nov 25 08:31:18 GMT 2021


On Thu, 25 Nov 2021, Tamar Christina wrote:

> Hi Jakub,
> 
> > -----Original Message-----
> > From: Jakub Jelinek <jakub@redhat.com>
> > Sent: Thursday, November 25, 2021 8:19 AM
> > To: Richard Biener <rguenther@suse.de>
> > Cc: Tamar Christina <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org
> > Subject: [PATCH] match.pd: Fix up the recent bitmask_inv_cst_vector_p
> > simplification [PR103417]
> > 
> > Hi!
> > 
> > The following testcase is miscompiled since the r12-5489-g0888d6bbe97e10
> > changes.
> > The simplification triggers on
> > (x & 4294967040U) >= 0U
> > and turns it into:
> > x <= 255U
> > which is incorrect, it should fold to 1 because unsigned >= 0U is always true
> > and normally the
> > /* Non-equality compare simplifications from fold_binary  */
> >      (if (wi::to_wide (cst) == min)
> >        (if (cmp == GE_EXPR)
> >         { constant_boolean_node (true, type); }) simplification folds that, but
> > this simplification was done earlier.
> > 
> > The simplification correctly doesn't include lt which has the same reason why
> > it shouldn't be handled, we'll fold it to 0 elsewhere.
> 
> 
> Yes this was a bug, sorry I'm not sure why I didn't catch it...
> 
> > 
> > But, IMNSHO while it isn't incorrect to handle le and gt there, it is
> > unnecessary.  Because (x & cst) <= 0U and (x & cst) > 0U should never appear,
> > again in
> > /* Non-equality compare simplifications from fold_binary  */ we have a
> > simplification for it:
> >        (if (cmp == LE_EXPR)
> >         (eq @2 @1))
> >        (if (cmp == GT_EXPR)
> >         (ne @2 @1))))
> > This is done for
> >   (cmp (convert?@2 @0) uniform_integer_cst_p@1) and so should be done
> > for both integers and vectors.
> > As the bitmask_inv_cst_vector_p simplification only handles eq and ne for
> > signed types, I think it can be simplified to just following patch.

Note that would mean the transform should be ordered _after_ the above,
even if we retain it for vector le/gt.

> As I mentioned on the PR I don't think LE and GT should be removed, the patch
> Is attempting to simplify the bitmask used because most vector ISAs can create
> the simpler mask much easier than the complex mask.
> 
> It. 0xFFFFFF00 is harder to create than 0xFF.   So while for scalar it doesn't matter
> as much, it does for vector code.
>
> Regards,
> Tamar
> 
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > 
> > I wonder if (for cst (VECTOR_CST INTEGER_CST) is good for the best size of *-
> > match.c, wouldn't accepting just CONSTANT_CLASS_P@1 and then just say in
> > bitmask_inv_cst_vector_p return NULL_TREE if it isn't INTEGER_CST or
> > VECTOR_CST?

In the end that should be recoverable by genmatch.  I do have some ideas
to improve it for size in this area, maybe during stage4.  Originally
genmatch was trying to optimize for matching speed but with now honoring
ordering of patterns that very much became secondary (note re-ordering
patterns in match.pd can also improve *-match.c size greatly!  maybe
some script can try to brute-force the "optimal" order - but note some
pattern order matters ;))

> > Also, without/with this patch I see on i686-linux (can be reproduced with
> > RUNTESTFLAGS="--target_board=unix/-m32/-mno-sse dg.exp='bic-bitmask*
> > signbit-2*'"
> > too):
> > FAIL: gcc.dg/bic-bitmask-10.c scan-tree-dump dce7 "<=\\\\s*.+{ 255,.+}"
> > FAIL: gcc.dg/bic-bitmask-11.c scan-tree-dump dce7 ">\\\\s*.+{ 255,.+}"
> > FAIL: gcc.dg/bic-bitmask-12.c scan-tree-dump dce7 "<=\\\\s*.+{ 255,.+}"
> > FAIL: gcc.dg/bic-bitmask-2.c scan-tree-dump-times dce7
> > "<=\\\\s*.+{ 255,.+}" 1
> > FAIL: gcc.dg/bic-bitmask-23.c (test for excess errors)
> > FAIL: gcc.dg/bic-bitmask-23.c scan-tree-dump dce7 "<=\\\\s*.+{ 255, 15, 1,
> > 65535 }"
> > FAIL: gcc.dg/bic-bitmask-3.c scan-tree-dump-times dce7
> > "<=\\\\s*.+{ 255,.+}" 1
> > FAIL: gcc.dg/bic-bitmask-4.c scan-tree-dump-times dce7 "=\\\\s*.+{ 1,.+}" 1
> > FAIL: gcc.dg/bic-bitmask-5.c scan-tree-dump-times dce7 ">\\\\s*.+{ 255,.+}"
> > 1
> > FAIL: gcc.dg/bic-bitmask-6.c scan-tree-dump-times dce7
> > "<=\\\\s*.+{ 255,.+}" 1
> > FAIL: gcc.dg/bic-bitmask-8.c scan-tree-dump-times dce7 ">\\\\s*.+{ 1,.+}" 1
> > FAIL: gcc.dg/bic-bitmask-9.c scan-tree-dump dce7
> > "&\\\\s*.+{ 4294967290,.+}"
> > FAIL: gcc.dg/signbit-2.c scan-tree-dump optimized "\\\\s+>\\\\s+{ 0(, 0)+ }"
> > Those tests use vect_int effective target, but AFAIK that can be used only in
> > *.dg/vect/ because it relies on vect.exp enabling options to support
> > vectorization on the particular target (e.g. for i686-linux that -msse2).
> > I think there isn't other way to get the DEFAULT_VECTCFLAGS into dg-
> > options other than having the test driven by vect.exp.
> > 
> > And, finally, I've noticed incorrect formatting in the new
> > bitmask_inv_cst_vector_p routine:
> >   do {
> >     if (idx > 0)
> >       cst = vector_cst_elt (t, idx);
> > ...
> >     builder.quick_push (newcst);
> >   } while (++idx < nelts);
> > It should be
> >   do
> >     {
> >       if (idx > 0)
> > 	cst = vector_cst_elt (t, idx);
> > ...
> >       builder.quick_push (newcst);
> >     }
> >   while (++idx < nelts);
> > 
> > 2021-11-25  Jakub Jelinek  <jakub@redhat.com>
> > 
> > 	PR tree-optimization/103417
> > 	* match.pd ((X & Y) CMP 0): Only handle eq and ne.  Commonalize
> > 	common tests.
> > 
> > 	* gcc.c-torture/execute/pr103417.c: New test.
> > 
> > --- gcc/match.pd.jj	2021-11-24 11:46:03.191918052 +0100
> > +++ gcc/match.pd	2021-11-24 22:33:43.852575772 +0100
> > @@ -5214,20 +5214,16 @@ (define_operator_list SYNC_FETCH_AND_AND
> >  /* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z
> >     where ~Y + 1 == pow2 and Z = ~Y.  */  (for cst (VECTOR_CST INTEGER_CST)
> > - (for cmp (le eq ne ge gt)
> > -      icmp (le le gt le gt)
> > - (simplify
> > -  (cmp (bit_and:c@2 @0 cst@1) integer_zerop)
> > -   (with { tree csts = bitmask_inv_cst_vector_p (@1); }
> > -     (switch
> > -      (if (csts && TYPE_UNSIGNED (TREE_TYPE (@1))
> > -	   && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
> > -       (icmp @0 { csts; }))
> > -      (if (csts && !TYPE_UNSIGNED (TREE_TYPE (@1))
> > -	   && (cmp == EQ_EXPR || cmp == NE_EXPR)
> > -	   && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
> > + (for cmp (eq ne)
> > +      icmp (le gt)
> > +  (simplify
> > +   (cmp (bit_and:c@2 @0 cst@1) integer_zerop)
> > +    (with { tree csts = bitmask_inv_cst_vector_p (@1); }
> > +     (if (csts && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
> > +      (if (TYPE_UNSIGNED (TREE_TYPE (@1)))
> > +       (icmp @0 { csts; })
> >         (with { tree utype = unsigned_type_for (TREE_TYPE (@1)); }
> > -	(icmp (convert:utype @0) { csts; }))))))))
> > +	 (icmp (convert:utype @0) { csts; }))))))))
> > 
> >  /* -A CMP -B -> B CMP A.  */
> >  (for cmp (tcc_comparison)
> > --- gcc/testsuite/gcc.c-torture/execute/pr103417.c.jj	2021-11-24
> > 22:36:00.732626424 +0100
> > +++ gcc/testsuite/gcc.c-torture/execute/pr103417.c	2021-11-24
> > 22:35:43.964865218 +0100
> > @@ -0,0 +1,11 @@
> > +/* PR tree-optimization/103417 */
> > +
> > +struct { int a : 8; int b : 24; } c = { 0, 1 };
> > +
> > +int
> > +main ()
> > +{
> > +  if (c.b && !c.b)
> > +    __builtin_abort ();
> > +  return 0;
> > +}
> > 
> > 	Jakub
> 
> 


More information about the Gcc-patches mailing list