This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: New GCC options for loop vectorization


On Mon, Sep 23, 2013 at 10:37 PM, Xinliang David Li <davidxl@google.com> wrote:
> Thanks. I modified the patch so that the max allowed peel iterations
> can be specified via a parameter. Testing on going. Ok for trunk ?

+DEFPARAM(PARAM_VECT_MAX_PEELING_FOR_ALIGNMENT,
+         "vect-max-peeling-for-alignment",
+         "Max number of loop peels to enhancement alignment of data
references in a loop",
+         -1, 0, 0)

I believe this doesn't allow --param vect-max-peeling-for-alignment=-1 as you
specify 0 as the minimum.

So, ok with changing minimum and maxmum to -1. (double check that this works)

Thanks,
Richard.



> David
>
> On Mon, Sep 23, 2013 at 4:31 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Wed, Sep 18, 2013 at 10:21 PM, Xinliang David Li <davidxl@google.com> wrote:
>>> On Tue, Sep 17, 2013 at 1:20 AM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> On Mon, Sep 16, 2013 at 10:24 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>> On Mon, Sep 16, 2013 at 3:13 AM, Richard Biener
>>>>> <richard.guenther@gmail.com> wrote:
>>>>>> On Fri, Sep 13, 2013 at 5:16 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>>>> On Fri, Sep 13, 2013 at 1:30 AM, Richard Biener
>>>>>>> <richard.guenther@gmail.com> wrote:
>>>>>>>> On Thu, Sep 12, 2013 at 10:31 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>>>>>> Currently -ftree-vectorize turns on both loop and slp vectorizations,
>>>>>>>>> but there is no simple way to turn on loop vectorization alone. The
>>>>>>>>> logic for default O3 setting is also complicated.
>>>>>>>>>
>>>>>>>>> In this patch, two new options are introduced:
>>>>>>>>>
>>>>>>>>> 1) -ftree-loop-vectorize
>>>>>>>>>
>>>>>>>>> This option is used to turn on loop vectorization only. option
>>>>>>>>> -ftree-slp-vectorize also becomes a first class citizen, and no funny
>>>>>>>>> business of Init(2) is needed.  With this change, -ftree-vectorize
>>>>>>>>> becomes a simple alias to -ftree-loop-vectorize +
>>>>>>>>> -ftree-slp-vectorize.
>>>>>>>>>
>>>>>>>>> For instance, to turn on only slp vectorize at O3, the old way is:
>>>>>>>>>
>>>>>>>>>      -O3 -fno-tree-vectorize -ftree-slp-vectorize
>>>>>>>>>
>>>>>>>>> With the new change it becomes:
>>>>>>>>>
>>>>>>>>>     -O3 -fno-loop-vectorize
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> To turn on only loop vectorize at O2, the old way is
>>>>>>>>>
>>>>>>>>>     -O2 -ftree-vectorize -fno-slp-vectorize
>>>>>>>>>
>>>>>>>>> The new way is
>>>>>>>>>
>>>>>>>>>     -O2 -ftree-loop-vectorize
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2) -ftree-vect-loop-peeling
>>>>>>>>>
>>>>>>>>> This option is used to turn on/off loop peeling for alignment.  In the
>>>>>>>>> long run, this should be folded into the cheap cost model proposed by
>>>>>>>>> Richard.  This option is also useful in scenarios where peeling can
>>>>>>>>> introduce runtime problems:
>>>>>>>>> http://gcc.gnu.org/ml/gcc/2005-12/msg00390.html  which happens to be
>>>>>>>>> common in practice.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Patch attached. Compiler boostrapped. Ok after testing?
>>>>>>>>
>>>>>>>> I'd like you to split 1) and 2), mainly because I agree on 1) but not on 2).
>>>>>>>
>>>>>>> Ok. Can you also comment on 2) ?
>>>>>>
>>>>>> I think we want to decide how granular we want to control the vectorizer
>>>>>> and using which mechanism.  My cost-model re-org makes
>>>>>> ftree-vect-loop-version a no-op (basically removes it), so 2) looks like
>>>>>> a step backwards in this context.
>>>>>
>>>>> Using cost model to do a coarse grain control/configuration is
>>>>> certainly something we want, but having a fine grain control is still
>>>>> useful.
>>>>>
>>>>>>
>>>>>> So, can you summarize what pieces (including versioning) of the vectorizer
>>>>>> you'd want to be able to disable separately?
>>>>>
>>>>> Loop peeling seems to be the main one. There is also a correctness
>>>>> issue related. For instance, the following code is common in practice,
>>>>> but loop peeling wrongly assumes initial base-alignment and generates
>>>>> aligned mov instruction after peeling, leading to SEGV.  Peeling is
>>>>> not something we can blindly turned on -- even when it is on, there
>>>>> should be a way to turn it off explicitly:
>>>>>
>>>>> char a[10000];
>>>>>
>>>>> void foo(int n)
>>>>> {
>>>>>   int* b = (int*)(a+n);
>>>>>   int i = 0;
>>>>>   for (; i < 1000; ++i)
>>>>>     b[i] = 1;
>>>>> }
>>>>>
>>>>> int main(int argn, char** argv)
>>>>> {
>>>>>   foo(argn);
>>>>> }
>>>>
>>>> But that's just a bug that should be fixed (looking into it).
>>>>
>>>>>>  Just disabling peeling for
>>>>>> alignment may get you into the versioning for alignment path (and thus
>>>>>> an unvectorized loop at runtime).
>>>>>
>>>>> This is not true for target supporting mis-aligned access. I have not
>>>>> seen a case where alignment driver loop version happens on x86.
>>>>>
>>>>>>Also it's know that the alignment peeling
>>>>>> code needs some serious TLC (it's outcome depends on the order of DRs,
>>>>>> the cost model it uses leaves to be desired as we cannot distinguish
>>>>>> between unaligned load and store costs).
>>>>>
>>>>> Yet another reason to turn it off as it is not effective anyways?
>>>>
>>>> As said I'll disable all remains of -ftree-vect-loop-version with the cost model
>>>> patch because it wasn't guarding versioning for aliasing but only versioning
>>>> for alignment.
>>>>
>>>> We have to be consistent here - if we add a way to disable peeling for
>>>> alignment then we certainly don't want to remove the ability to disable
>>>> versioning for alignment, no?
>>>
>>> We already have the ability to turn off versioning -- via --param. It
>>> is a more natural way to fine tune a pass instead of introducing a -f
>>> option. For this reason, your planned deprecation of the option is a
>>> good thing to do.
>>>
>>> For consistency, I think we should introduce a new parameter to turn
>>> on/off peeling. This can also be tied closely with the cost model
>>> change.
>>>
>>> The proposed patch attached. Does this one look ok?
>>
>> I'd principally support adding a param but rather would for example make
>> it limit the number of peeled iterations (zero turning off the feature).  As
>> you implemented it it will give false messages for
>>
>>       supportable_dr_alignment = vect_supportable_dr_alignment (dr, true);
>>       do_peeling = vector_alignment_reachable_p (dr);
>>       if (do_peeling)
>>         {
>> ...
>>         }
>>       else
>>         {
>>           if (!aligned_access_p (dr))
>>             {
>>               if (dump_enabled_p ())
>>                 dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>>                                  "vector alignment may not be reachable\n");
>>               break;
>>             }
>>         }
>>
>> the else path.  I'd restructure this as
>>
>>      if (!vector_alignment_reachable_p (dr))
>>        {
>>           if (!aligned_access_p (dr))
>>          ....
>>           continue;
>>        }
>>
>>      if (known_alignment_for_access_p (dr))
>>       {
>> ... check param against the number of iterations to peel for this DR ....
>>       }
>>      else
>>       {
>> ... check param against max iterations ...
>>       }
>>
>> an alternative is a tri-state param that would either allow no peeling at all,
>> just peeling with known number of iterations or all kind of peelings.
>>
>> Thanks,
>> Richard.
>>
>>
>>> thanks,
>>>
>>> David
>>>
>>>
>>>
>>>>
>>>> Richard.
>>>>
>>>>>
>>>>> thanks,
>>>>>
>>>>> David
>>>>>
>>>>>>
>>>>>> Richard.
>>>>>>
>>>>>>>>
>>>>>>>> I've stopped a quick try doing 1) myself because
>>>>>>>>
>>>>>>>> @@ -1691,6 +1695,12 @@ common_handle_option (struct gcc_options
>>>>>>>>          opts->x_flag_ipa_reference = false;
>>>>>>>>        break;
>>>>>>>>
>>>>>>>> +    case OPT_ftree_vectorize:
>>>>>>>> +      if (!opts_set->x_flag_tree_loop_vectorize)
>>>>>>>> + opts->x_flag_tree_loop_vectorize = value;
>>>>>>>> +      if (!opts_set->x_flag_tree_slp_vectorize)
>>>>>>>> + opts->x_flag_tree_slp_vectorize = value;
>>>>>>>> +      break;
>>>>>>>>
>>>>>>>> doesn't look obviously correct.  Does that handle
>>>>>>>>
>>>>>>>>   -ftree-vectorize -fno-tree-loop-vectorize -ftree-vectorize
>>>>>>>>
>>>>>>>> or
>>>>>>>>
>>>>>>>>   -ftree-loop-vectorize -fno-tree-vectorize
>>>>>>>>
>>>>>>>> properly?  Currently at least
>>>>>>>>
>>>>>>>>   -ftree-slp-vectorize -fno-tree-vectorize
>>>>>>>>
>>>>>>>> doesn't "work".
>>>>>>>
>>>>>>>
>>>>>>> Right -- same is true for -fprofile-use option. FDO enables some
>>>>>>> passes, but can not re-enable them if they are flipped off before.
>>>>>>>
>>>>>>>>
>>>>>>>> That said, the option machinery doesn't handle an option being an alias
>>>>>>>> for two other options, so it's mechanism to contract positives/negatives
>>>>>>>> doesn't work here and the override hooks do not work reliably for
>>>>>>>> repeated options.
>>>>>>>>
>>>>>>>> Or am I wrong here?  Should we care at all?  Joseph?
>>>>>>>
>>>>>>> We should probably just document the behavior. Even better, we should
>>>>>>> deprecate the old option.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> David


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]