[PATCH, vec-tails 03/10] Support epilogues vectorization with no masking

Jeff Law law@redhat.com
Fri Jun 17 16:46:00 GMT 2016

On 06/17/2016 08:33 AM, Ilya Enkovich wrote:
>> Hmm, there seems to be a level of indirection I'm missing here.  We're
>> smuggling LOOP_VINFO_ORIG_LOOP_INFO around in loop->aux.  Ewww.  I thought
>> the whole point of LOOP_VINFO_ORIG_LOOP_INFO was to smuggle the VINFO from
>> the original loop to the vectorized epilogue.  What am I missing?  Rather
>> than smuggling around in the aux field, is there some inherent reason why we
>> can't just copy the info from the original loop directly into
>> LOOP_VINFO_ORIG_LOOP_INFO for the vectorized epilogue?
> LOOP_VINFO_ORIG_LOOP_INFO is used for several things:
>  - mark this loop as epilogue
>  - get VF of original loop (required for both mask and nomask modes)
>  - get decision about epilogue masking
> That's all.  When epilogue is created it has no LOOP_VINFO.  Also when we
> vectorize loop we create and destroy its LOOP_VINFO multiple times.  When
> loop has LOOP_VINFO loop->aux points to it and original LOOP_VINFO is in
> LOOP_VINFO_ORIG_LOOP_INFO.  When loop has no LOOP_VINFO associated I have no
> place to bind it with the original loop and therefore I use vacant loop->aux
> for that.  Any other way to bind epilogue with its original loop would work
> as well.  I just chose loop->aux to avoid new fields and data structures.
I was starting to draw the conclusion that the smuggling in the aux
field was for cases when there was no LOOP_VINFO.  But it was rather late
at night and I didn't follow that idea through the code.  Thanks for
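
To make the binding described above concrete, here is a minimal C++ sketch.
It is not the patch's code: VInfoSketch, LoopSketch, bind_epilogue, attach,
detach and is_epilogue are hypothetical stand-ins for LOOP_VINFO, struct loop
and the bookkeeping around loop->aux, and the assumption is that a freshly
created vinfo simply takes over whatever pointer was parked in aux.

```cpp
#include <cassert>

// Hypothetical stand-in for a LOOP_VINFO; not GCC's real definition.
struct VInfoSketch {
  int vf;              // vectorization factor chosen for this loop
  bool mask_epilogue;  // decision about epilogue masking
  VInfoSketch *orig;   // plays the role of LOOP_VINFO_ORIG_LOOP_INFO
};

// Hypothetical stand-in for struct loop; only the aux field matters here.
struct LoopSketch {
  void *aux;
};

// A new epilogue has no vinfo yet, so park the original loop's vinfo
// in the otherwise vacant aux field.
void bind_epilogue(LoopSketch &epilogue, VInfoSketch *orig) {
  epilogue.aux = orig;
}

// Creating a vinfo for a loop: move the parked pointer into the vinfo,
// then let aux point at the loop's own vinfo.
void attach(LoopSketch &loop, VInfoSketch &vinfo) {
  vinfo.orig = static_cast<VInfoSketch *>(loop.aux);
  loop.aux = &vinfo;
}

// Destroying a vinfo: put the original loop's vinfo back into aux so the
// next analysis attempt can find it again.
void detach(LoopSketch &loop) {
  VInfoSketch *vinfo = static_cast<VInfoSketch *>(loop.aux);
  assert(vinfo != nullptr);
  loop.aux = vinfo->orig;
}

// A loop is an epilogue exactly when its vinfo records an original loop.
bool is_epilogue(const VInfoSketch &vinfo) {
  return vinfo.orig != nullptr;
}
```

In this sketch the original loop's VF and masking decision stay reachable
through vinfo.orig across repeated attach/detach cycles, which is the
property the explanation above relies on.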

>> And something just occurred to me -- is there some inherent reason why SLP
>> doesn't vectorize the epilogue, particularly for the cases where we can
>> vectorize the epilogue using smaller vectors?  Sorry if you've already
>> answered this somewhere or it's a dumb question.
> IIUC this may happen only if we unroll epilogue into a single BB which happens
> only when epilogue iterations count is known. Right?
Probably.  The need to make sure the epilogue is unrolled probably makes 
this a non-starter.

I have a soft spot for SLP as I stumbled on the idea while rewriting a 
presentation in the wee hours of the morning for the next day. 
Essentially it was a "poor man's" vectorizer that could be done for 
dramatically less engineering cost than a traditional vectorizer.  The 
MIT paper outlining the same ideas came out a couple years later...

>>> +       /* Add new loop to a processing queue.  To make it easier
>>> +          to match loop and its epilogue vectorization in dumps
>>> +          put new loop as the next loop to process.  */
>>> +       if (new_loop)
>>> +         {
>>> +           loops.safe_insert (i + 1, new_loop->num);
>>> +           vect_loops_num = number_of_loops (cfun);
>>> +         }
>>> +
>> So just to be clear, the only reason to do this is for dumps -- other than
>> processing the loop before its epilogue, there's no other inherently
>> necessary ordering of the loops, right?
> Right, I don't see other reasons to do it.
Perfect.  Thanks for confirming.
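
For readers following along, the queue handling in the quoted hunk can be
sketched outside of GCC as below.  This is only an illustration:
process_loops and the epilogue_of map are hypothetical, and std::vector
stands in for the vec<> that safe_insert operates on in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Visit loop numbers in queue order.  When a loop produces an epilogue
// (looked up in the hypothetical epilogue_of map), insert the epilogue
// right after the current position so it is processed next.
std::vector<int> process_loops(std::vector<int> queue,
                               const std::map<int, int> &epilogue_of) {
  std::vector<int> visited;
  // Iterate by index: inserting may reallocate the vector, which would
  // invalidate iterators but leaves indices meaningful.
  for (std::size_t i = 0; i < queue.size(); ++i) {
    int num = queue[i];
    visited.push_back(num);
    auto it = epilogue_of.find(num);
    if (it != epilogue_of.end()) {
      // A loop cannot be its own epilogue; this also rules out the
      // simplest way to loop forever here.
      assert(it->second != num);
      queue.insert(queue.begin() + static_cast<std::ptrdiff_t>(i + 1),
                   it->second);
    }
  }
  return visited;
}
```

With loops 1, 2, 3 queued and epilogues 4 (for loop 1) and 5 (for loop 3),
the visit order is 1, 4, 2, 3, 5, so each epilogue sits next to its loop in
the dump.  Appending the epilogues at the end instead would visit the same
loops, only in a different order.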

