This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: Remove redundant AND from count reduction loop

From: Richard Sandiford <richard dot sandiford at arm dot com>
To: Richard Biener <richard dot guenther at gmail dot com>
Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
Date: Wed, 24 Jun 2015 10:57:18 +0100
Subject: Re: Remove redundant AND from count reduction loop
Authentication-results: sourceware.org; auth=none
References: <87pp4m8mkp dot fsf at e105548-lin dot cambridge dot arm dot com> <alpine dot DEB dot 2 dot 20 dot 1506232307180 dot 1715 at laptop-mg dot saclay dot inria dot fr> <CAFiYyc0F3YqAjr2+EMf8STUBrVN+bE3aYpC+1nnstSaf2oiaDg at mail dot gmail dot com>

Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Jun 23, 2015 at 11:27 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> On Tue, 23 Jun 2015, Richard Sandiford wrote:
>>
>>> +/* Vector comparisons are defined to produce all-one or all-zero results.
>>> */
>>> +(simplify
>>> + (vec_cond @0 integer_all_onesp@1 integer_zerop@2)
>>> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>>> +   (convert @0)))
>>
>>
>> I am trying to understand why the test tree_nop_conversion_p is the right
>> one (at least for the transformations not using VIEW_CONVERT_EXPR). By
>> definition of VEC_COND_EXPR, type and TREE_TYPE (@0) are both integer vector
>> types of the same size and number of elements. It thus seems like a
>> conversion is always fine. For vectors, tree_nop_conversion_p apparently
>> only checks that they have the same mode (quite often VOIDmode I guess).
>
> The only conversion we seem to allow is changing the signed vector from
> the comparison result to an unsigned vector (same number of elements
> and same mode of the elements).  That is, a check using
> TYPE_MODE (type) == TYPE_MODE (TREE_TYPE (@0)) would probably
> be better (well, technically a TYPE_VECTOR_SUBPARTS && element
> mode compare should be better as generic vectors might not have a vector mode).

OK.  The reason I was being paranoid was that I couldn't see anywhere
where we enforced that the vector condition in a VEC_COND had to have
the same element width as the values being selected.  tree-cfg.c
only checks that rhs2 and rhs3 are compatible with the result.
There doesn't seem to be any checking of rhs1 vs. the other types.
So I wasn't sure whether anything stopped us from, e.g., comparing two
V4HIs and using the result to select between two V4SIs.

> I'm fine with using tree_nop_conversion_p for now.

I like the suggestion about checking TYPE_VECTOR_SUBPARTS and the element
mode.  How about:

 (if (VECTOR_INTEGER_TYPE_P (type)
      && TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))
      && (TYPE_MODE (TREE_TYPE (type))
          == TYPE_MODE (TREE_TYPE (TREE_TYPE (@0)))))

(But is it really OK to be adding more mode-based compatibility checks?
I thought you were hoping to move away from modes in the middle end.)

>>> +/* We could instead convert all instances of the vec_cond to negate,
>>> +   but that isn't necessarily a win on its own.  */
>
> so p ? 1 : 0 -> -p?  Why isn't that a win on its own?  It looks more compact
> at least ;)  It would also simplify the patterns below.

In the past I've dealt with processors where arithmetic wasn't handled
as efficiently as logical ops.  Seems like an especial risk for 64-bit
elements, from a quick scan of the i386 scheduling models.

> I'm missing a comment on the transform done by the following patterns.

Heh.  The comment was supposed to be describing all four at once.
I originally had then bunched together without whitespace, but it
looked bad.

>>> +(simplify
>>> + (plus:c @3 (vec_cond @0 integer_each_onep@1 integer_zerop@2))
>>> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>>> +  (minus @3 (convert @0))))
>>> +
>>> +(simplify
>>> + (plus:c @3 (view_convert_expr
>>
>>
>> Aren't we suppose to drop _expr in match.pd?
>
> Yes.  I probably should adjust genmatch.c to reject the _expr variants ;)

OK.

>>> +            (vec_cond @0 integer_each_onep@1 integer_zerop@2)))
>>> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>>> +  (minus @3 (convert @0))))
>>> +
>>> +(simplify
>>> + (minus @3 (vec_cond @0 integer_each_onep@1 integer_zerop@2))
>>> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>>> +  (plus @3 (convert @0))))
>>> +
>>> +(simplify
>>> + (minus @3 (view_convert_expr
>>> +           (vec_cond @0 integer_each_onep@1 integer_zerop@2)))
>>> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>>> +  (plus @3 (convert @0))))
>>> +
>
> Generally for sign-conversions of vectors you should use view_convert.

OK.

> The above also hints at missing conditional view_convert support
> and a way to iterate over commutative vs. non-commutative ops so
> we could write
>
> (for op (plus:c minus)
>      rop (minus plus)
>   (op @3 (view_convert? (vec_cond @0 integer_each_onep@1 integer_zerop@2)))
>   (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>    (rop @3 (view_convert @0)))))
>
> I'll see implementing that.

Looks good. :-)

I also realised later that:

/* Vector comparisons are defined to produce all-one or all-zero results.  */
(simplify
 (vec_cond @0 integer_all_onesp@1 integer_zerop@2)
 (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
   (convert @0)))

is redundant with some fold-const.c code.

Thanks,
Richard

Follow-Ups:
- Re: Remove redundant AND from count reduction loop
  - From: Richard Biener

References:
- Remove redundant AND from count reduction loop
  - From: Richard Sandiford
- Re: Remove redundant AND from count reduction loop
  - From: Marc Glisse
- Re: Remove redundant AND from count reduction loop
  - From: Richard Biener

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]