This is the mail archive of the mailing list for the GCC project.


Re: [RFC] [patch] Support vectorization of min/max location pattern

On 07/08/2010 11:19 AM, Ira Rosen wrote:
> It's the minloc pattern, i.e., a loop that finds the location of the minimum:
>   float arr[N];
>   for (i = 0; i < N; i++)
>     if (arr[i] < limit)
>       {
>         pos = i + 1;
>         limit = arr[i];
>       }
> Vectorizer's input code:
>   # pos_22 = PHI <pos_1(4), 1(2)>
>   # limit_24 = PHI <limit_4(4), 0(2)>
>   ...
>   pos_1 = [cond_expr] limit_9 < limit_24 ? pos_10 : pos_22;       // location
>   limit_4 = [cond_expr] limit_9 < limit_24 ? limit_9 : limit_24;  // min

Ok, I get it now.

So your thinking was that you needed the builtin to replace the
comparison portion of the VEC_COND_EXPR?  Or, looking again I see
that you don't actually use VEC_COND_EXPR, you use ...

> +  /* Create: VEC_DEST = (VEC_OPRND1 & MASK) | (VEC_OPRND2 & !MASK).  */ 

... explicit masking.  I.e. you assume that the return value of
the builtin is a bit mask of the full width, and that there's no
better way to implement the VEC_COND.
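
To make the masking assumption concrete, here is a minimal scalar sketch in C of what that sequence computes, one lane at a time. The names cmp_lt_mask, select_by_mask and LANES are hypothetical and not from the patch; the sketch assumes 4 lanes of 32 bits and a compare that yields all-ones or all-zeros per lane. It selects with ~MASK (bitwise complement) for the second operand.

```c
#include <stdint.h>

#define LANES 4  /* assumed vector width: four 32-bit elements */

/* Hypothetical model of a compare builtin that returns a full-width
   bit mask: all-ones in lanes where a[i] < b[i], zero elsewhere.  */
void cmp_lt_mask(const float a[LANES], const float b[LANES],
                 uint32_t mask[LANES])
{
    for (int i = 0; i < LANES; i++)
        mask[i] = (a[i] < b[i]) ? UINT32_MAX : 0u;
}

/* dest = (x & mask) | (y & ~mask), lane by lane.  */
void select_by_mask(const uint32_t mask[LANES], const uint32_t x[LANES],
                    const uint32_t y[LANES], uint32_t dest[LANES])
{
    for (int i = 0; i < LANES; i++)
        dest[i] = (x[i] & mask[i]) | (y[i] & ~mask[i]);
}
```

Every step here is visible as a separate operation, which is exactly what a target with a single select instruction would rather not see.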

I wonder if it wouldn't be better to extend the definition
of VEC_COND_EXPR so that the comparison values can be of a 
different type than the data operands (with the caveat that the
number of elements should be the same -- i.e. 4-wide compare must
match 4-wide data movement).
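
As a rough illustration of the semantics I have in mind, here is a scalar model in C. It is only a sketch: the function names and LANES are made up for the example, and nothing here is existing GCC code. The comparison operands are float, the data operands are int in one case and float in the other, and only the lane count has to agree.

```c
#define LANES 4  /* assumed: 4-wide compare paired with 4-wide data */

/* Model of VEC_COND_EXPR <ca < cb, x, y> with float comparison
   operands and int data operands.  */
void vec_cond_lt_f32_i32(const float ca[LANES], const float cb[LANES],
                         const int x[LANES], const int y[LANES],
                         int dest[LANES])
{
    for (int i = 0; i < LANES; i++)
        dest[i] = (ca[i] < cb[i]) ? x[i] : y[i];
}

/* Same comparison, float data operands.  */
void vec_cond_lt_f32_f32(const float ca[LANES], const float cb[LANES],
                         const float x[LANES], const float y[LANES],
                         float dest[LANES])
{
    for (int i = 0; i < LANES; i++)
        dest[i] = (ca[i] < cb[i]) ? x[i] : y[i];
}

/* One vector iteration of the minloc loop: both the location and the
   minimum are selected by the same float comparison.  */
void minloc_step(const float val[LANES], const int idx[LANES],
                 float limit[LANES], int pos[LANES])
{
    int new_pos[LANES];
    float new_limit[LANES];

    vec_cond_lt_f32_i32(val, limit, idx, pos, new_pos);
    vec_cond_lt_f32_f32(val, limit, val, limit, new_limit);

    for (int i = 0; i < LANES; i++) {
        pos[i] = new_pos[i];
        limit[i] = new_limit[i];
    }
}
```

With this shape the comparison never appears as a separate value, so each target is free to pick its own way of implementing the select.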

I can think of two portability problems with your current solution:

(1) SSE4.1 would prefer to use BLEND instructions, which perform
    that entire (X & M) | (Y & ~M) operation in one insn.

(2) The MIPS C.cond.PS instruction does *not* produce a bitmask
    like altivec or sse do.  Instead it sets multiple condition
    codes.  One then uses MOV[TF].PS to merge the elements based
    on the individual condition codes.  While there's no direct
    corresponding instruction that will operate on integers, I
    don't think it would be too difficult to use MOV[TF].G or
    BC1ANY2[FT] instructions to emulate it.  In any case, this
    is again a case where you don't want to expose any part of
    the VEC_COND at the gimple level.
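
To show why a mask-based expansion does not fit the second case, here is a simplified C model of a compare that produces one condition bit per element, followed by a conditional move that merges on those bits. This is a sketch only: cmp_lt_cc, mov_true_i32 and PAIR are hypothetical names, the two-element width is an assumption modelled on paired-single, and the code does not claim to reproduce the exact behaviour of any MIPS instruction.

```c
#include <stdint.h>

#define PAIR 2  /* assumed: two elements per vector, as in paired-single */

/* Model of a compare that sets one condition bit per element instead
   of producing a full-width mask.  Bit i is set when a[i] < b[i].  */
unsigned cmp_lt_cc(const float a[PAIR], const float b[PAIR])
{
    unsigned cc = 0;
    for (int i = 0; i < PAIR; i++)
        if (a[i] < b[i])
            cc |= 1u << i;
    return cc;
}

/* Model of a move-on-true merge for integer data: dest[i] is
   overwritten with src[i] only where condition bit i is set.  */
void mov_true_i32(unsigned cc, const int32_t src[PAIR], int32_t dest[PAIR])
{
    for (int i = 0; i < PAIR; i++)
        if (cc & (1u << i))
            dest[i] = src[i];
}
```

There is no value here that could be fed to an AND or an OR, so a target of this kind can only do a good job if it sees the whole conditional select as one operation.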

