This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Remove redundant AND from count reduction loop
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, richard dot sandiford at arm dot com
- Date: Wed, 24 Jun 2015 12:21:06 +0200
- Subject: Re: Remove redundant AND from count reduction loop
- Authentication-results: sourceware.org; auth=none
- References: <87pp4m8mkp dot fsf at e105548-lin dot cambridge dot arm dot com> <alpine dot DEB dot 2 dot 20 dot 1506232307180 dot 1715 at laptop-mg dot saclay dot inria dot fr> <CAFiYyc0F3YqAjr2+EMf8STUBrVN+bE3aYpC+1nnstSaf2oiaDg at mail dot gmail dot com> <87egl1sa2p dot fsf at e105548-lin dot cambridge dot arm dot com>
On Wed, Jun 24, 2015 at 11:57 AM, Richard Sandiford
<richard.sandiford@arm.com> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Tue, Jun 23, 2015 at 11:27 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>> On Tue, 23 Jun 2015, Richard Sandiford wrote:
>>>
>>>> +/* Vector comparisons are defined to produce all-one or all-zero results.
>>>> */
>>>> +(simplify
>>>> + (vec_cond @0 integer_all_onesp@1 integer_zerop@2)
>>>> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>>>> + (convert @0)))
>>>
>>>
>>> I am trying to understand why the test tree_nop_conversion_p is the right
>>> one (at least for the transformations not using VIEW_CONVERT_EXPR). By
>>> definition of VEC_COND_EXPR, type and TREE_TYPE (@0) are both integer vector
>>> types of the same size and number of elements. It thus seems like a
>>> conversion is always fine. For vectors, tree_nop_conversion_p apparently
>>> only checks that they have the same mode (quite often VOIDmode I guess).
>>
>> The only conversion we seem to allow is changing the signed vector from
>> the comparison result to an unsigned vector (same number of elements
>> and same mode of the elements). That is, a check using
>> TYPE_MODE (type) == TYPE_MODE (TREE_TYPE (@0)) would probably
>> be better (well, technically a TYPE_VECTOR_SUBPARTS && element
>> mode compare should be better as generic vectors might not have a vector mode).
>
> OK. The reason I was being paranoid was that I couldn't see anywhere
> where we enforced that the vector condition in a VEC_COND had to have
> the same element width as the values being selected.
We don't require that indeed.
> tree-cfg.c
> only checks that rhs2 and rhs3 are compatible with the result.
> There doesn't seem to be any checking of rhs1 vs. the other types.
> So I wasn't sure whether anything stopped us from, e.g., comparing two
> V4HIs and using the result to select between two V4SIs.
Nothing does (or should).
>> I'm fine with using tree_nop_conversion_p for now.
>
> I like the suggestion about checking TYPE_VECTOR_SUBPARTS and the element
> mode. How about:
>
> (if (VECTOR_INTEGER_TYPE_P (type)
> && TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))
> && (TYPE_MODE (TREE_TYPE (type))
> == TYPE_MODE (TREE_TYPE (TREE_TYPE (@0)))))
>
> (But is it really OK to be adding more mode-based compatibility checks?
> I thought you were hoping to move away from modes in the middle end.)
The TYPE_MODE check makes the VECTOR_INTEGER_TYPE_P check redundant
(the type of a comparison is always a signed vector integer type).
Yes, mode-based
checks are ok. I don't see us moving away from them.
>>>> +/* We could instead convert all instances of the vec_cond to negate,
>>>> + but that isn't necessarily a win on its own. */
>>
>> so p ? 1 : 0 -> -p? Why isn't that a win on its own? It looks more compact
>> at least ;) It would also simplify the patterns below.
>
> In the past I've dealt with processors where arithmetic wasn't handled
> as efficiently as logical ops. Seems like an especial risk for 64-bit
> elements, from a quick scan of the i386 scheduling models.
But then expansion could undo this ...
>> I'm missing a comment on the transform done by the following patterns.
>
> Heh. The comment was supposed to be describing all four at once.
> I originally had then bunched together without whitespace, but it
> looked bad.
>
>>>> +(simplify
>>>> + (plus:c @3 (vec_cond @0 integer_each_onep@1 integer_zerop@2))
>>>> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>>>> + (minus @3 (convert @0))))
>>>> +
>>>> +(simplify
>>>> + (plus:c @3 (view_convert_expr
>>>
>>>
>>> Aren't we suppose to drop _expr in match.pd?
>>
>> Yes. I probably should adjust genmatch.c to reject the _expr variants ;)
>
> OK.
>
>>>> + (vec_cond @0 integer_each_onep@1 integer_zerop@2)))
>>>> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>>>> + (minus @3 (convert @0))))
>>>> +
>>>> +(simplify
>>>> + (minus @3 (vec_cond @0 integer_each_onep@1 integer_zerop@2))
>>>> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>>>> + (plus @3 (convert @0))))
>>>> +
>>>> +(simplify
>>>> + (minus @3 (view_convert_expr
>>>> + (vec_cond @0 integer_each_onep@1 integer_zerop@2)))
>>>> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>>>> + (plus @3 (convert @0))))
>>>> +
>>
>> Generally for sign-conversions of vectors you should use view_convert.
>
> OK.
>
>> The above also hints at missing conditional view_convert support
>> and a way to iterate over commutative vs. non-commutative ops so
>> we could write
>>
>> (for op (plus:c minus)
>> rop (minus plus)
>> (op @3 (view_convert? (vec_cond @0 integer_each_onep@1 integer_zerop@2)))
>> (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>> (rop @3 (view_convert @0)))))
>>
>> I'll see implementing that.
>
> Looks good. :-)
>
> I also realised later that:
>
> /* Vector comparisons are defined to produce all-one or all-zero results. */
> (simplify
> (vec_cond @0 integer_all_onesp@1 integer_zerop@2)
> (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
> (convert @0)))
>
> is redundant with some fold-const.c code.
If so then you should remove the fold-const.c at the time you add the pattern.
Note that ISTR code performing exactly the opposite transform in
fold-const.c ...
Richard.
> Thanks,
> Richard
>