This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC] Combine vectorized loops with its scalar remainder.


Hi Richard,

Did you have a chance to look at this?

Thanks.
Yuri.

2015-11-13 13:35 GMT+03:00 Yuri Rumyantsev <ysrumyan@gmail.com>:
> Hi Richard,
>
> Here is updated version of the patch which 91) is in sync with trunk
> compiler and (2) contains simple cost model to estimate profitability
> of scalar epilogue elimination. The part related to vectorization of
> loops with small trip count is in process of developing. Note that
> implemented cost model was not tuned  well for HASWELL and KNL but we
> got  ~6% speed-up on 436.cactusADM from spec2006 suite for HASWELL.
>
> 2015-11-10 17:52 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>> On Tue, Nov 10, 2015 at 2:02 PM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
>>> 2015-11-10 15:30 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>> On Tue, Nov 3, 2015 at 1:08 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>> Richard,
>>>>>
>>>>> It looks like misunderstanding - we assume that for GCCv6 the simple
>>>>> scheme of remainder will be used through introducing new IV :
>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01435.html
>>>>>
>>>>> Is it true or we missed something?
>>>>
>>>> <quote>
>>>>> > Do you have an idea how "masking" is better be organized to be usable
>>>>> > for both 4b and 4c?
>>>>>
>>>>> Do 2a ...
>>>> Okay.
>>>> </quote>
>>>
>>> 2a was 'transform already vectorized loop as a separate
>>> post-processing'. Isn't it what this prototype patch implements?
>>> Current version only masks loop body which is in practice applicable
>>> for AVX-512 only in the most cases.  With AVX-512 it's easier to see
>>> how profitable masking might be and it is a main target for the first
>>> masking version.  Extending it to prologues/epilogues and thus making
>>> it more profitable for other targets is the next step and is out of
>>> the scope of this patch.
>>
>> Ok, technically the prototype transforms the already vectorized loop.
>> Of course I meant the vectorized loop be copied, masked and that
>> result used as epilogue...
>>
>> I'll queue a more detailed look into the patch for this week.
>>
>> Did you perform any measurements with this patch like # of
>> masked epilogues in SPEC 2006 FP (and any speedup?)
>>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> Ilya
>>>
>>>>
>>>> Richard.
>>>>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]