This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Remove redundant AND from count reduction loop
- From: Richard Sandiford <richard dot sandiford at arm dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 24 Jun 2015 10:57:18 +0100
- Subject: Re: Remove redundant AND from count reduction loop
- Authentication-results: sourceware.org; auth=none
- References: <87pp4m8mkp dot fsf at e105548-lin dot cambridge dot arm dot com> <alpine dot DEB dot 2 dot 20 dot 1506232307180 dot 1715 at laptop-mg dot saclay dot inria dot fr> <CAFiYyc0F3YqAjr2+EMf8STUBrVN+bE3aYpC+1nnstSaf2oiaDg at mail dot gmail dot com>
Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Jun 23, 2015 at 11:27 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> On Tue, 23 Jun 2015, Richard Sandiford wrote:
>>
>>> +/* Vector comparisons are defined to produce all-one or all-zero results.
>>> */
>>> +(simplify
>>> + (vec_cond @0 integer_all_onesp@1 integer_zerop@2)
>>> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>>> + (convert @0)))
>>
>>
>> I am trying to understand why the test tree_nop_conversion_p is the right
>> one (at least for the transformations not using VIEW_CONVERT_EXPR). By
>> definition of VEC_COND_EXPR, type and TREE_TYPE (@0) are both integer vector
>> types of the same size and number of elements. It thus seems like a
>> conversion is always fine. For vectors, tree_nop_conversion_p apparently
>> only checks that they have the same mode (quite often VOIDmode I guess).
>
> The only conversion we seem to allow is changing the signed vector from
> the comparison result to an unsigned vector (same number of elements
> and same mode of the elements). That is, a check using
> TYPE_MODE (type) == TYPE_MODE (TREE_TYPE (@0)) would probably
> be better (well, technically a TYPE_VECTOR_SUBPARTS && element
> mode compare should be better as generic vectors might not have a vector mode).
OK. The reason I was being paranoid was that I couldn't see anywhere
where we enforced that the vector condition in a VEC_COND had to have
the same element width as the values being selected. tree-cfg.c
only checks that rhs2 and rhs3 are compatible with the result.
There doesn't seem to be any checking of rhs1 vs. the other types.
So I wasn't sure whether anything stopped us from, e.g., comparing two
V4HIs and using the result to select between two V4SIs.
> I'm fine with using tree_nop_conversion_p for now.
I like the suggestion about checking TYPE_VECTOR_SUBPARTS and the element
mode. How about:
(if (VECTOR_INTEGER_TYPE_P (type)
&& TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))
&& (TYPE_MODE (TREE_TYPE (type))
== TYPE_MODE (TREE_TYPE (TREE_TYPE (@0)))))
(But is it really OK to be adding more mode-based compatibility checks?
I thought you were hoping to move away from modes in the middle end.)
>>> +/* We could instead convert all instances of the vec_cond to negate,
>>> + but that isn't necessarily a win on its own. */
>
> so p ? 1 : 0 -> -p? Why isn't that a win on its own? It looks more compact
> at least ;) It would also simplify the patterns below.
In the past I've dealt with processors where arithmetic wasn't handled
as efficiently as logical ops. Seems like an especial risk for 64-bit
elements, from a quick scan of the i386 scheduling models.
> I'm missing a comment on the transform done by the following patterns.
Heh. The comment was supposed to be describing all four at once.
I originally had then bunched together without whitespace, but it
looked bad.
>>> +(simplify
>>> + (plus:c @3 (vec_cond @0 integer_each_onep@1 integer_zerop@2))
>>> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>>> + (minus @3 (convert @0))))
>>> +
>>> +(simplify
>>> + (plus:c @3 (view_convert_expr
>>
>>
>> Aren't we suppose to drop _expr in match.pd?
>
> Yes. I probably should adjust genmatch.c to reject the _expr variants ;)
OK.
>>> + (vec_cond @0 integer_each_onep@1 integer_zerop@2)))
>>> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>>> + (minus @3 (convert @0))))
>>> +
>>> +(simplify
>>> + (minus @3 (vec_cond @0 integer_each_onep@1 integer_zerop@2))
>>> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>>> + (plus @3 (convert @0))))
>>> +
>>> +(simplify
>>> + (minus @3 (view_convert_expr
>>> + (vec_cond @0 integer_each_onep@1 integer_zerop@2)))
>>> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>>> + (plus @3 (convert @0))))
>>> +
>
> Generally for sign-conversions of vectors you should use view_convert.
OK.
> The above also hints at missing conditional view_convert support
> and a way to iterate over commutative vs. non-commutative ops so
> we could write
>
> (for op (plus:c minus)
> rop (minus plus)
> (op @3 (view_convert? (vec_cond @0 integer_each_onep@1 integer_zerop@2)))
> (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
> (rop @3 (view_convert @0)))))
>
> I'll see implementing that.
Looks good. :-)
I also realised later that:
/* Vector comparisons are defined to produce all-one or all-zero results. */
(simplify
(vec_cond @0 integer_all_onesp@1 integer_zerop@2)
(if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
(convert @0)))
is redundant with some fold-const.c code.
Thanks,
Richard