This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Question about vectorization limit


On Fri, May 31, 2013 at 6:54 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Fri, May 31, 2013 at 03:48:59PM +0200, Toon Moene wrote:
>> >If you rewrite the above into:
>> >SUBROUTINE XYZ(A, B, N)
>> >DIMENSION A(N), B(N)
>> >DO I = 1, N
>> >    C = B(I)
>> >    IF (A(I)>  0.0) THEN
>> >       A(I) = C / A(I)
>> >    ELSE
>> >       A(I) = C
>> >    ENDIF
>> >ENDDO
>> >END
>> >
>> >then it is vectorized just fine.
>>
>> But this "inner loop" has at least 3 basic blocks - so what does the
>> "loop->num_nodes != 2" test exactly codify ?
>
> With the above testcase it has just 2.
> Before ifcvt pass it still has 4:
>   <bb 4>:
>   # i_1 = PHI <1(3), i_18(7)>
>   _8 = (integer(kind=8)) i_1;
>   _9 = _8 + -1;
>   c_11 = *b_10(D)[_9];
>   _13 = *a_12(D)[_9];
>   if (_13 > 0.0)
>     goto <bb 5>;
>   else
>     goto <bb 6>;
>
>   <bb 5>:
>   _14 = c_11 / _13;
>
>   <bb 6>:
>   # cstore_17 = PHI <_14(5), c_11(4)>
>   *a_12(D)[_9] = cstore_17;
>   i_18 = i_1 + 1;
>   if (i_1 == _7)
>     goto <bb 8>;
>   else
>     goto <bb 7>;
>
>   <bb 7>:
>   goto <bb 4>;
> but ifcvt transforms that into:
>   <bb 4>:
>   # i_1 = PHI <1(3), i_18(5)>
>   _8 = (integer(kind=8)) i_1;
>   _9 = _8 + -1;
>   c_11 = *b_10(D)[_9];
>   _13 = *a_12(D)[_9];
>   _14 = c_11 / _13;
>   cstore_17 = _13 > 0.0 ? _14 : c_11;
>   *a_12(D)[_9] = cstore_17;
>   i_18 = i_1 + 1;
>   if (i_1 == _7)
>     goto <bb 6>;
>   else
>     goto <bb 5>;
>
>   <bb 5>:
>   goto <bb 4>;
> which is already generally vectorizable.  Guess ifcvt can be certainly
> taught if it finds a possibly trapping statement to check if the same
> statement isn't present in all possible branches, though the question is if
> this is best done in ifcvt or some other pass.

This is code hoisting, which should be done earlier.  Also
missing in GCC is speculative PRE for loads and stores that are safe
to control speculate, such as accesses to global scalars and in-bounds
accesses to global arrays.  Fancier cases require some analysis.  For
instance, if a field reference is fully available or fully anticipated,
then it is safe to control speculate a mem ref to a different field
of the same object:

  ... p[i].a ....

  if (cond0)
    {
      // frequent
      t = p[i].b;
      ..
    }

  if (cond1)
    {
      // frequent
      .. = p[i].b;
    }

This should be PREed into:

  ... p[i].a ....

  if (cond0)
    {
      // frequent
      t = p[i].b;
      ..
    }
  else
    {
      t = p[i].b;
    }

  if (cond1)
    {
      // frequent
      .. = t;
    }
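
For concreteness, here is a compilable C sketch of the same
transformation done by hand.  It is only an illustration, not GCC
output: the struct layout and the names rec, sum_before and sum_after
are made up, and the ".." placeholders are replaced by a simple
accumulation so that the two versions can be compared.  The assumption
is that the unconditional access to p[i].a shows that the object p[i]
is accessible, so the extra load of p[i].b on the else path cannot trap.

```c
#include <stddef.h>

/* Hypothetical record type; fields a and b mirror the example above. */
struct rec {
    int a;
    int b;
};

/* Before: p[i].b is loaded separately under each condition. */
int sum_before(const struct rec *p, size_t i, int cond0, int cond1)
{
    int acc = p[i].a;       /* unconditional access to p[i].a */

    if (cond0)
        acc += p[i].b;      /* first conditional load of p[i].b */

    if (cond1)
        acc += p[i].b;      /* second conditional load of p[i].b */

    return acc;
}

/* After: the load is made available on both arms of cond0, so the
   use under cond1 reads the temporary instead of reloading. */
int sum_after(const struct rec *p, size_t i, int cond0, int cond1)
{
    int acc = p[i].a;       /* unconditional access to p[i].a */
    int t;

    if (cond0) {
        t = p[i].b;
        acc += t;
    } else {
        t = p[i].b;         /* speculative load; assumed safe because
                               p[i].a was already accessed above */
    }

    if (cond1)
        acc += t;           /* reuses t, no second load */

    return acc;
}
```

Both functions return the same value for every combination of cond0
and cond1.  The second version pays for one load on the infrequent
else path and in exchange removes the reload on the frequent cond1
path.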



David


>
>         Jakub

