[Scalar masks 2/x] Use bool masks in if-conversion

Ilya Enkovich enkovich.gnu@gmail.com
Wed Aug 26 11:13:00 GMT 2015


2015-08-26 0:26 GMT+03:00 Jeff Law <law@redhat.com>:
> On 08/21/2015 06:17 AM, Ilya Enkovich wrote:
>>>
>>>
>>> Hmm, I don't see how vector masks are more difficult to operate with.
>>
>>
>> There are just no instructions for that, but you have to pretend you
>> have them to get the code vectorized.
>>
>>>
>>>> Also according to vector ABI integer mask should be used for mask
>>>> operand in case of masked vector call.
>>>
>>>
>>> What ABI?  The function signature of the intrinsics?  How would that
>>> come into play here?
>>
>>
>> Not intrinsics. I mean OpenMP vector functions which require integer
>> arg for a mask in case of 512-bit vector.
>
> That's what I assumed -- you can pass in a mask as an argument and it's
> supposed to be a simple integer, right?

Depending on the target, the ABI requires either a vector mask or a
simple integer value.

>
>
>>
>>>
>>>> Current implementation of masked loads, masked stores and bool
>>>> patterns in vectorizer just reflect SSE4 and AVX. Can (and should) we
>>>> really call it a canonical representation for all targets?
>>>
>>>
>>> No idea - we'll revisit when another target adds a similar capability.
>>
>>
>> AVX-512 is such a target. The current representation forces multiple
>> scalar mask -> vector mask and back transformations which are
>> artificially introduced by the current bool patterns and are hard to
>> optimize out.
>
> I'm a bit surprised they're so prevalent and hard to optimize away. ISTM PRE
> ought to handle this kind of thing with relative ease.

Most vector comparisons are UNSPECs, and I doubt PRE could help much
even if we got rid of the UNSPECs somehow. Is there really any
redundancy in:

if ((v1 cmp v2) && (v3 cmp v4))
  load

v1 cmp v2 -> mask1
select mask1 vec_cst_-1 vec_cst_0 -> vec_mask1
v3 cmp v4 -> mask2
select mask2 vec_mask1 vec_cst_0 -> vec_mask2
vec_mask2 NE vec_cst_0 -> mask3
load by mask3

It looks to me more like an i386-specific instruction selection problem.

Ilya

>
>
>>> Fact is GCC already copes with vector masks generated by vector compares
>>> just fine everywhere and I'd rather leave it as that.
>>
>>
>> Nope. Currently a vector mask is obtained from a vec_cond <A op B, {0 ..
>> 0}, {-1 .. -1}>. AND and IOR on bools are also expressed via
>> additional vec_conds. I don't think the vectorizer ever generates a
>> vector comparison.
>>
>> And I wouldn't say it's fine 'everywhere' because there is a single
>> target utilizing them. Masked loads and stores for AVX-512 just don't
>> work now, and if we extend the existing MASK_LOAD and MASK_STORE
>> optabs to 512-bit vectors then we get ugly, inefficient code. The
>> question is where to fight this inefficiency: in RTL or in GIMPLE. I
>> want to fight it where it appears, i.e. in GIMPLE, by preventing bool
>> -> int conversions from being applied everywhere even if the target
>> doesn't need them.
>
> You should expect pushback anytime target dependencies are added to gimple,
> even if it's stuff in the vectorizer, which is infested with target
> dependencies.
>
>>
>> If we don't want to support both types of masks in GIMPLE then it's
>> more reasonable to do the bool -> int conversion at expand time for
>> targets that require it, rather than do it for everyone and then
>> leave it to the target to transform it back and try to get rid of all
>> those redundant transformations. I'd give vector<bool> a chance to
>> become the canonical mask representation.
>
> Might be worth some experimentation.
>
> Jeff
