This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC][PR82479] missing popcount builtin detection


Hi Richard,

On 31 January 2018 at 21:39, Richard Biener <richard.guenther@gmail.com> wrote:
> On Wed, Jan 31, 2018 at 11:28 AM, Kugan Vivekanandarajah
> <kugan.vivekanandarajah@linaro.org> wrote:
>> Hi Richard,
>>
>> Thanks for the review.
>> On 25 January 2018 at 20:04, Richard Biener <richard.guenther@gmail.com> wrote:
>>> On Wed, Jan 24, 2018 at 10:56 PM, Kugan Vivekanandarajah
>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>> Hi All,
>>>>
>>>> Here is a patch for popcount builtin detection similar to LLVM. I
>>>> would like to queue this for review for next stage 1.
>>>>
>>>> 1. This is done part of loop-distribution and effective for -O3 and above.
>>>> 2. This does not distribute loop to detect popcount (like
>>>> memcpy/memmove). I dont think that happens in practice. Please correct
>>>> me if I am wrong.
>>>
>>> But then it has no business inside loop distribution but instead is
>>> doing final value
>>> replacement, right?  You are pattern-matching the whole loop after all.  I think
>>> final value replacement would already do the correct thing if you
>>> teached number of
>>> iteration analysis that niter for
>>>
>>>   <bb 3> [local count: 955630224]:
>>>   # b_11 = PHI <b_5(5), b_8(6)>
>>>   _1 = b_11 + -1;
>>>   b_8 = _1 & b_11;
>>>   if (b_8 != 0)
>>>     goto <bb 6>; [89.00%]
>>>   else
>>>     goto <bb 8>; [11.00%]
>>>
>>>   <bb 6> [local count: 850510900]:
>>>   goto <bb 3>; [100.00%]
>>
>> I am looking into this approach. What should be the scalar evolution
>> for b_8 (i.e. b & (b -1) in a loop) should be? This is not clear to me
>> and can this be represented with the scev?
>
> No, it's not affine and thus cannot be represented.  You only need the
> scalar evolution of the counting IV which is already handled and
> the number of iteration analysis needs to handle the above IV - this
> is the missing part.
Thanks for the clarification. I am now matching this loop pattern in
number_of_iterations_exit when number_of_iterations_exit_assumptions
fails. If the pattern matches, I am inserting the _builtin_popcount in
the loop preheater and setting the loop niter with this. This will be
used by the final value replacement. Is this what you wanted?

In general, there is also a condition gating the loop like

if (b_4 != 0)
  goto loop;
else
  end:

This of course will not be removed by final value replacement. Since
popcount (0) is defined, this is redundant and could be removed but
not removed.

Thanks,
Kugan

>
> Richard.
>
>> Thanks,
>> Kugan
>>>
>>> is related to popcount (b_5).
>>>
>>> Richard.
>>>
>>>> Bootstrapped and regression tested on aarch64-linux-gnu with no new regressions.
>>>>
>>>> Thanks,
>>>> Kugan
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> 2018-01-25  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>>
>>>>     PR middle-end/82479
>>>>     * tree-loop-distribution.c (handle_popcount): New.
>>>>     (pass_loop_distribution::execute): Use handle_popcount.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> 2018-01-25  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>>
>>>>     PR middle-end/82479
>>>>     * gcc.dg/tree-ssa/popcount.c: New test.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]