This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: New GCC options for loop vectorization


On Tue, Sep 17, 2013 at 1:20 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Mon, Sep 16, 2013 at 10:24 PM, Xinliang David Li <davidxl@google.com> wrote:
>> On Mon, Sep 16, 2013 at 3:13 AM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Fri, Sep 13, 2013 at 5:16 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>> On Fri, Sep 13, 2013 at 1:30 AM, Richard Biener
>>>> <richard.guenther@gmail.com> wrote:
>>>>> On Thu, Sep 12, 2013 at 10:31 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>>> Currently -ftree-vectorize turns on both loop and slp vectorizations,
>>>>>> but there is no simple way to turn on loop vectorization alone. The
>>>>>> logic for default O3 setting is also complicated.
>>>>>>
>>>>>> In this patch, two new options are introduced:
>>>>>>
>>>>>> 1) -ftree-loop-vectorize
>>>>>>
>>>>>> This option is used to turn on loop vectorization only. option
>>>>>> -ftree-slp-vectorize also becomes a first class citizen, and no funny
>>>>>> business of Init(2) is needed.  With this change, -ftree-vectorize
>>>>>> becomes a simple alias to -ftree-loop-vectorize +
>>>>>> -ftree-slp-vectorize.
>>>>>>
>>>>>> For instance, to turn on only slp vectorize at O3, the old way is:
>>>>>>
>>>>>>      -O3 -fno-tree-vectorize -ftree-slp-vectorize
>>>>>>
>>>>>> With the new change it becomes:
>>>>>>
>>>>>>     -O3 -fno-loop-vectorize
>>>>>>
>>>>>>
>>>>>> To turn on only loop vectorize at O2, the old way is
>>>>>>
>>>>>>     -O2 -ftree-vectorize -fno-slp-vectorize
>>>>>>
>>>>>> The new way is
>>>>>>
>>>>>>     -O2 -ftree-loop-vectorize
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2) -ftree-vect-loop-peeling
>>>>>>
>>>>>> This option is used to turn on/off loop peeling for alignment.  In the
>>>>>> long run, this should be folded into the cheap cost model proposed by
>>>>>> Richard.  This option is also useful in scenarios where peeling can
>>>>>> introduce runtime problems:
>>>>>> http://gcc.gnu.org/ml/gcc/2005-12/msg00390.html  which happens to be
>>>>>> common in practice.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Patch attached. Compiler boostrapped. Ok after testing?
>>>>>
>>>>> I'd like you to split 1) and 2), mainly because I agree on 1) but not on 2).
>>>>
>>>> Ok. Can you also comment on 2) ?
>>>
>>> I think we want to decide how granular we want to control the vectorizer
>>> and using which mechanism.  My cost-model re-org makes
>>> ftree-vect-loop-version a no-op (basically removes it), so 2) looks like
>>> a step backwards in this context.
>>
>> Using cost model to do a coarse grain control/configuration is
>> certainly something we want, but having a fine grain control is still
>> useful.
>>
>>>
>>> So, can you summarize what pieces (including versioning) of the vectorizer
>>> you'd want to be able to disable separately?
>>
>> Loop peeling seems to be the main one. There is also a correctness
>> issue related. For instance, the following code is common in practice,
>> but loop peeling wrongly assumes initial base-alignment and generates
>> aligned mov instruction after peeling, leading to SEGV.  Peeling is
>> not something we can blindly turned on -- even when it is on, there
>> should be a way to turn it off explicitly:
>>
>> char a[10000];
>>
>> void foo(int n)
>> {
>>   int* b = (int*)(a+n);
>>   int i = 0;
>>   for (; i < 1000; ++i)
>>     b[i] = 1;
>> }
>>
>> int main(int argn, char** argv)
>> {
>>   foo(argn);
>> }
>
> But that's just a bug that should be fixed (looking into it).

This kind of code is not uncommon for certain applications (e.g, group
varint decoding).  Besides, the code like this may be built with
-fno-strict-aliasing.


>
>>>  Just disabling peeling for
>>> alignment may get you into the versioning for alignment path (and thus
>>> an unvectorized loop at runtime).
>>
>> This is not true for target supporting mis-aligned access. I have not
>> seen a case where alignment driver loop version happens on x86.
>>
>>>Also it's know that the alignment peeling
>>> code needs some serious TLC (it's outcome depends on the order of DRs,
>>> the cost model it uses leaves to be desired as we cannot distinguish
>>> between unaligned load and store costs).
>>
>> Yet another reason to turn it off as it is not effective anyways?
>
> As said I'll disable all remains of -ftree-vect-loop-version with the cost model
> patch because it wasn't guarding versioning for aliasing but only versioning
> for alignment.
>
> We have to be consistent here - if we add a way to disable peeling for
> alignment then we certainly don't want to remove the ability to disable
> versioning for alignment, no?

yes, for consistency, the version control flag may also be useful to be kept.

David

>
> Richard.
>
>>
>> thanks,
>>
>> David
>>
>>>
>>> Richard.
>>>
>>>>>
>>>>> I've stopped a quick try doing 1) myself because
>>>>>
>>>>> @@ -1691,6 +1695,12 @@ common_handle_option (struct gcc_options
>>>>>          opts->x_flag_ipa_reference = false;
>>>>>        break;
>>>>>
>>>>> +    case OPT_ftree_vectorize:
>>>>> +      if (!opts_set->x_flag_tree_loop_vectorize)
>>>>> + opts->x_flag_tree_loop_vectorize = value;
>>>>> +      if (!opts_set->x_flag_tree_slp_vectorize)
>>>>> + opts->x_flag_tree_slp_vectorize = value;
>>>>> +      break;
>>>>>
>>>>> doesn't look obviously correct.  Does that handle
>>>>>
>>>>>   -ftree-vectorize -fno-tree-loop-vectorize -ftree-vectorize
>>>>>
>>>>> or
>>>>>
>>>>>   -ftree-loop-vectorize -fno-tree-vectorize
>>>>>
>>>>> properly?  Currently at least
>>>>>
>>>>>   -ftree-slp-vectorize -fno-tree-vectorize
>>>>>
>>>>> doesn't "work".
>>>>
>>>>
>>>> Right -- same is true for -fprofile-use option. FDO enables some
>>>> passes, but can not re-enable them if they are flipped off before.
>>>>
>>>>>
>>>>> That said, the option machinery doesn't handle an option being an alias
>>>>> for two other options, so it's mechanism to contract positives/negatives
>>>>> doesn't work here and the override hooks do not work reliably for
>>>>> repeated options.
>>>>>
>>>>> Or am I wrong here?  Should we care at all?  Joseph?
>>>>
>>>> We should probably just document the behavior. Even better, we should
>>>> deprecate the old option.
>>>>
>>>> thanks,
>>>>
>>>> David
>>>>
>>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> David


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]