This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [RFC] Combine vectorized loops with its scalar remainder.
- From: Yuri Rumyantsev <ysrumyan at gmail dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: Ilya Enkovich <enkovich dot gnu at gmail dot com>, gcc-patches <gcc-patches at gcc dot gnu dot org>, Jeff Law <law at redhat dot com>, Igor Zamyatin <izamyatin at gmail dot com>
- Date: Mon, 23 Nov 2015 18:52:59 +0300
- Subject: Re: [RFC] Combine vectorized loops with its scalar remainder.
Hi Richard,
Did you have a chance to look at this?
Thanks.
Yuri.
2015-11-13 13:35 GMT+03:00 Yuri Rumyantsev <ysrumyan@gmail.com>:
> Hi Richard,
>
> Here is an updated version of the patch which (1) is in sync with the trunk
> compiler and (2) contains a simple cost model to estimate the profitability
> of scalar epilogue elimination. The part related to vectorization of
> loops with a small trip count is still under development. Note that the
> implemented cost model has not been tuned well for HASWELL and KNL, but we
> got a ~6% speed-up on 436.cactusADM from the SPEC 2006 suite on HASWELL.
>
> 2015-11-10 17:52 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>> On Tue, Nov 10, 2015 at 2:02 PM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
>>> 2015-11-10 15:30 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>> On Tue, Nov 3, 2015 at 1:08 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>> Richard,
>>>>>
>>>>> It looks like a misunderstanding - we assumed that for GCC 6 the simple
>>>>> remainder scheme would be used, by introducing a new IV:
>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01435.html
>>>>>
>>>>> Is that true, or did we miss something?
>>>>
>>>> <quote>
>>>>> > Do you have an idea how "masking" is better be organized to be usable
>>>>> > for both 4b and 4c?
>>>>>
>>>>> Do 2a ...
>>>> Okay.
>>>> </quote>
>>>
>>> 2a was 'transform already vectorized loop as a separate
>>> post-processing'. Isn't that what this prototype patch implements?
>>> The current version only masks the loop body, which in practice is
>>> applicable only to AVX-512 in most cases. With AVX-512 it's easier to see
>>> how profitable masking might be, and it is the main target for the first
>>> masking version. Extending it to prologues/epilogues, and thus making
>>> it more profitable for other targets, is the next step and is out of
>>> the scope of this patch.
>>
>> Ok, technically the prototype transforms the already vectorized loop.
>> Of course I meant that the vectorized loop be copied and masked, and that
>> result used as the epilogue...
>>
>> I'll queue a more detailed look into the patch for this week.
>>
>> Did you perform any measurements with this patch, such as the number of
>> masked epilogues in SPEC 2006 FP (and any speedup)?
>>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> Ilya
>>>
>>>>
>>>> Richard.
>>>>
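
For readers following the thread, the difference between a vectorized loop
with a scalar remainder and one whose body is masked can be sketched in plain
C. This is only an illustration of the general idea, not the code in the
patch: the vectorization factor VF, the function names, and the lane-by-lane
emulation of a vector operation are all assumptions made for the example. On
a target such as AVX-512 the per-lane test would presumably become a mask
register applied to the vector loads and stores.

```c
#include <stddef.h>

#define VF 8  /* assumed vectorization factor, for illustration only */

/* Conventional shape: a vector main loop followed by a scalar epilogue.
   The inner lane loop stands in for a single vector operation. */
void add_with_scalar_epilogue(float *dst, const float *a, const float *b,
                              size_t n)
{
    size_t i = 0;
    for (; i + VF <= n; i += VF)              /* full vector iterations */
        for (size_t lane = 0; lane < VF; lane++)
            dst[i + lane] = a[i + lane] + b[i + lane];
    for (; i < n; i++)                        /* scalar remainder */
        dst[i] = a[i] + b[i];
}

/* Masked shape: the remainder is folded into the vector loop. Each lane
   is guarded by a mask bit, so the last (partial) iteration touches only
   the elements that actually exist. */
void add_with_masked_body(float *dst, const float *a, const float *b,
                          size_t n)
{
    for (size_t i = 0; i < n; i += VF) {
        for (size_t lane = 0; lane < VF; lane++) {
            int active = (i + lane) < n;      /* per-lane mask bit */
            if (active)
                dst[i + lane] = a[i + lane] + b[i + lane];
        }
    }
}
```

Both functions compute the same result for any n. The masked form has no
separate scalar loop, at the price of evaluating the mask on every
iteration, which is the kind of trade-off a cost model for epilogue
elimination would have to weigh.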