This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [RFC] Induction variable candidates not sufficiently general


On Fri, Jul 13, 2018 at 6:04 AM, Kelvin Nilsen <kdnilsen@linux.ibm.com> wrote:
> A somewhat old "issue report" pointed me to the code generated for a 4-fold manually unrolled version of the following loop:
>
>>                       while (++len != len_limit) /* this is loop */
>>                               if (pb[len] != cur[len])
>>                                       break;
>
> As unrolled, the loop appears as:
>
>>                 while (++len != len_limit) /* this is loop */ {
>>                   if (pb[len] != cur[len])
>>                     break;
>>                   if (++len == len_limit)  /* unrolled 2nd iteration */
>>                     break;
>>                   if (pb[len] != cur[len])
>>                     break;
>>                   if (++len == len_limit)  /* unrolled 3rd iteration */
>>                     break;
>>                   if (pb[len] != cur[len])
>>                     break;
>>                   if (++len == len_limit)  /* unrolled 4th iteration */
>>                     break;
>>                   if (pb[len] != cur[len])
>>                     break;
>>                 }
>
> In examining the behavior of tree-ssa-loop-ivopts.c, I've discovered the only induction variable candidates that are being considered are all forms of the len variable.  We are not considering any induction variables to represent the address expressions &pb[len] and &cur[len].
>
> I rewrote the source code for this loop to make the addressing expressions more explicit, as in the following:
>
>>       cur++;
>>       while (++pb != last_pb) /* this is loop */ {
>>         if (*pb != *cur)
>>           break;
>>         ++cur;
>>         if (++pb == last_pb)  /* unrolled 2nd iteration */
>>           break;
>>         if (*pb != *cur)
>>           break;
>>         ++cur;
>>         if (++pb == last_pb)  /* unrolled 3rd iteration */
>>           break;
>>         if (*pb != *cur)
>>           break;
>>         ++cur;
>>         if (++pb == last_pb)  /* unrolled 4th iteration */
>>           break;
>>         if (*pb != *cur)
>>           break;
>>         ++cur;
>>       }
>
> Now, gcc does a better job of identifying the "address expression induction variables".  This version of the loop runs about 10% faster than the original on my target architecture.
>
> This would seem to be a textbook pattern for the induction variable analysis.  Does anyone have any thoughts on the best way to add these candidates to the set of induction variables that are considered by tree-ssa-loop-ivopts.c?
>
> Thanks in advance for any suggestions.
>
Hi,
Could you please file a bug with your original slow test code
attached?  I tried to construct a meaningful test case from your code
snippet but was not successful.  There is a difference in the generated
assembly, but it's not that fundamental.  So a bug with a preprocessed
test would be highly appreciated.
I think there are two potential issues in the cost computation for such
a case: invariant expressions, and iv uses outside of the loop being
handled as inside uses.
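
For reference, a minimal sketch of what a standalone test case built from
the quoted snippets might look like.  It is not the original slow code.
The function names, the unsigned char element type, and the size_t type
for len are all assumptions, since the thread does not state them, and
the actual types may well change what ivopts does.  Both functions keep
the loop rolled for brevity and return the final length, so the induction
variable is used after the loop exits.

```c
#include <assert.h>
#include <stddef.h>

/* Indexed form, same shape as the original loop.
   Hypothetical wrapper: name and parameter types are assumptions. */
size_t match_len_indexed(const unsigned char *pb, const unsigned char *cur,
                         size_t len, size_t len_limit)
{
    assert(len < len_limit);      /* the loop pre-increments len */
    while (++len != len_limit)    /* this is loop */
        if (pb[len] != cur[len])
            break;
    return len;                   /* len is used outside the loop */
}

/* Pointer form, same shape as the rewritten loop.
   Hypothetical wrapper: name and parameter types are assumptions. */
size_t match_len_pointer(const unsigned char *pb, const unsigned char *cur,
                         size_t len, size_t len_limit)
{
    assert(len < len_limit);
    const unsigned char *p = pb + len;
    const unsigned char *c = cur + len;
    const unsigned char *last_p = pb + len_limit;

    c++;                          /* matches the "cur++" before the loop */
    while (++p != last_p) {       /* this is loop */
        if (*p != *c)
            break;
        ++c;
    }
    return (size_t)(p - pb);      /* recover the length from the pointer */
}
```

Compiling a file like this with -O2 -fdump-tree-ivopts-details should show
which candidates ivopts considers for each form, and -save-temps keeps the
preprocessed source that could be attached to a bug report.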

Thanks,
bin

