[PATCH] rs6000/test: Adjust some cases due to O2 vect [PR102658]

Wed Oct 13 07:43:11 GMT 2021

on 2021/10/13 下午2:29, Hongtao Liu via Gcc-patches wrote:
> On Wed, Oct 13, 2021 at 11:34 AM Hongtao Liu <crazylht@gmail.com> wrote:
>>
>> On Tue, Oct 12, 2021 at 11:49 PM Martin Sebor <msebor@gmail.com> wrote:
>>>
>>> On 10/11/21 8:31 PM, Hongtao Liu wrote:
>>>> On Tue, Oct 12, 2021 at 4:08 AM Martin Sebor via Gcc-patches
>>>> <gcc-patches@gcc.gnu.org> wrote:
>>>>>
>>>>> On 10/11/21 11:43 AM, Segher Boessenkool wrote:
>>>>>> On Mon, Oct 11, 2021 at 10:23:03AM -0600, Martin Sebor wrote:
>>>>>>> On 10/11/21 9:30 AM, Segher Boessenkool wrote:
>>>>>>>> On Mon, Oct 11, 2021 at 10:47:00AM +0800, Kewen.Lin wrote:
>>>>>>>>> - For generic test cases, it follows the existing suggested
>>>>>>>>> practice with necessary target/xfail selector.
>>>>>>>>
>>>>>>>> Not such a great choice.  Many of those tests do not make sense with
>>>>>>>> vectorisation enabled.  This should have been thought about, in some
>>>>>>>> cases resulting in not running the test with vectorisation enabled, and
>>>>>>>> in some cases duplicating the test, once with and once without
>>>>>>>> vectorisation.
>>>>>>>
>>>>>>> The tests detect bugs that are present both with and without
>>>>>>> vetctorization, so they should pass both ways.
>>>>>>
>>>>>> Then it should be tested both ways!  This is my point.
>>>>>
>>>>> Agreed.  (Most warnings are tested with just one set of options,
>>>>> but it's becoming apparent that the middle end ones should be
>>>>> exercised more extensively.)
>>>>>
>>>>>>
>>>>>>> That they don't
>>>>>>> tells us that that the warnings need work (they were written with
>>>>>>> an assumption that doesn't hold anymore).
>>>>>>
>>>>>> They were written in world A.  In world B many things behave
>>>>>> differently.  Transplanting the testcases from A to B without any extra
>>>>>> analysis will not test what the testcases wanted to test, and possibly
>>>>>> nothing at all anymore.
>>>>>
>>>>> Absolutely.
>>>>>
>>>>>>
>>>>>>> We need to track that
>>>>>>> work somehow, but simply xfailing them without making a record
>>>>>>> of what underlying problem the xfails correspond to isn't the best
>>>>>>> way.  In my experience, what works well is opening a bug for each
>>>>>>> distinct limitation (if one doesn't already exist) and adding
>>>>>>> a reference to it as a comment to the xfail.
>>>>>>
>>>>>> Probably, yes.
>>>>>>
>>>>>>>> But you are just following established practice, so :-)
>>>>>>
>>>>>> I also am okay with this.  If it was decided x86 does not have to deal
>>>>>> with these (generic!) problems, then why should we do other people's
>>>>>> work?
>>>>>
>>>>> I don't know that anything was decided.  I think those changes
>>>>> were made in haste, and (as you noted in your review of these
>>>>> updates to them), were incomplete (missing comments referencing
>>>>> the underlying bugs or limitations).  Now that we've noticed it
>>>>> we should try to fix it.  I'm not expecting you (or Kwen) to do
>>>>> other people's work, but it would help to let them/us know that
>>>>> there is work for us to do.  I only noticed the problem by luck.
>>>>>
>>>>>>>>> -  struct A1 a = { 0, { 1 } };   // { dg-warning
>>>>>>>>> "\\\[-Wstringop-overflow" "" { target { i?86-*-* x86_64-*-* } } }
>>>>>>>>> +  struct A1 a = { 0, { 1 } };   // { dg-warning
>>>>>>>>> "\\\[-Wstringop-overflow" "" { target { i?86-*-* x86_64-*-* powerpc*-*-*
>>>>>>>>> } } }
>>>>>>>
>>>>>>> As I mentioned in the bug, when adding xfails for regressions
>>>>>>> please be sure to reference the bug that tracks the underlying
>>>>>>> root cause.]
>>>>>>
>>>>>> You are saying this to whoever added that x86 xfail I hope.
>>>>>
>>>>> In general it's an appeal to both authors and reviewers of such
>>>>> changes.  Here, it's mostly for Hongtao who apparently added all
>>>>> these undocumented xfails.
>>>>>
>>>>>>> There may be multiple problems, and we need to
>>>>>>> identify what it is in each instance.  As the author of
>>>>>>> the tests I can help with that but not if I'm not in the loop
>>>>>>> on these changes (it would seem prudent to get the author's
>>>>>>> thoughts on such sweeping changes to their work).
>>>>>>
>>>>>> Yup.
>>>>>>
>>>>>>> I discussed one of these failures with Hongtao in detail at
>>>>>>> the time autovectorization was being enabled and made the same
>>>>>>> request then but I didn't realize the problem was so pervasive.
>>>>>>>
>>>>>>> In addition, the target-specific conditionals in the xfails are
>>>>>>> going to be difficult to maintain.
>>>>>>
>>>>>> It is a cop-out.  Especially because it makes no comment why it is
>>>>>> xfailed (which should *always* be explained!)
>>>>>>
>>>>>>> It might be okay for one or
>>>>>>> two in a single test but for so many we need a better solution
>>>>>>> than that.  If autovectorization is only enabled for a subset
>>>>>>> of targets then a solution might be to add a new DejagGNU test
>>>>>>> for it and conditionalize the xfails on it.
>>>>>>
>>>>>> That, combined with duplicating these tests and still testing the
>>>>>> -fno-vectorization situation properly.  Those tests tested something.
>>>>>> With vectorisation enabled they might no longer test that same thing,
>>>>>> especially if the test fails now!
>>>>>
>>>>> Right.  The original autovectorization change was made either
>>>>> without a full analysis of its impact on the affected warnings,
>>>>> or its impact wasn't adequately captured (either in the xfails
>>>>> comments or by opening bugs for them).  Now that we know about
>>>>> this we should try to fix it.  The first step toward that is
>>>>> to review the xfailed test cases and for each add a comment with
>>>>> the bug that captures its root cause.
>>>>>
>>>>> Hongtao, please let me know if you are going to work on that.
>>>> I will make a copy of the tests to test the -fno-tree-vectorize
>>>> scenario(the original test).
>>>> For the xfails, they're analyzed and recorded in pr102462/pr102697,
>>>> sorry for not adding comments to them.
>>>
>>> Thanks for raising pr102697!  It captures the essence of the bug
>>> that's masked by the vectorization of the invalid store.  This is
>>> due to the hack I pointed to in the discussion below:
>>>
>>> https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580172.html
>>>
>>>> The root causes for those xfails are divided into 2 categories:
>>>>
>>>> 1. All accesses are out of bound, and after vectorization, there are
>>>> some warnings missing.(Because there is only 1 access after
>>>> vectorization, 2 accesses w/o vectorization, and diagnostic is based
>>>> on access).
>>>
>>> If these involve -Wstringop-overflow for accesses that span
>>> multiple subobjects, as in writing past the end of one member
>>> and over the following member, then that would be due to
>>> pr102697 (the hack above).
>>>
>>>> 2. Part of accesses are inbound, part of accesses are out of bound,
>>>> and after vectorization, the warning goes from out of bound line to
>>>> inbound line.
>>>
>>> Right, this is the issue we talked about during the review of
>>> your patch, and I think is captured in the test case in comment
>>> #4 on pr102462.
>>>
>>>>
>>>> for pr102697, it looks like the testcase is not well written.
>>>
>>> The test case is correct.  I've added my comments to the PR
>>> and confirmed it as a GCC 12 regression.  (I may not have
>>> the time to fix it for GCC 12 but I will plan to get to it
>>> for GCC 13 unless someone beats me to it.)
>>>
>>> I think it might be helpful to open a bug just for case (2)
>>> and reference it in all the corresponding xfails.
>>>
>>> pr102462 talks about three distinct cases and mentions
>>> -Warray-bounds as well as -Wstringop-overflow.  It's not clear
>>> from it exactly which of the three cases it's meant to be about.
>>>
>>> There is also an undocumented xfail in
>>> g++.dg/warn/Wuninitialized-13.C.  It should get its own bug
>>> even if the essence of the problem is the same (the warning
>>> doesn't share an implementation with -Warray-bounds or
>>> -Wstringop-overflow so a fix will most likely need to be
>>> separate from one for the other bugs).
>>>
>>> Coming back to the xfail conditionals, do you think you'll
>>> be able to put together some target-supports magic so they
>>> don't have to enumerate all the affected targets?
>>>
>> Those failure testcases(exposed by x86 part)can be extracted and
>> categorized into 3 below testcases.
>> Question is can we check vectorization ability in
>> dg-require-effective-target for those testcase?
>> If we can, we  can dynamically check whether each target supports this xfail.
>>
> How about
> +# Return true if vectorization of v2qi store is enabed.
> +# Return zero if the desirable pattern isn't found.
> +# It's used by Warray-bounds/Wstringop-overflow testcases which are
> +# regressed by O2 vectorization, refer to PR102697/PR102462/PR102706
> +proc check_vect_slp_v2qi_store_usage { } {
> +    global tool
> +
> +    return [check_cached_effective_target slp_v2qi_store_usage {
> +      set result [check_compile slp_v2qi_store_usage assembly {
> +   char p[4] __attribute__ ((aligned (4)));
> +   void
> +   foo ()
> +   {
> +       p[0] = 1;
> +       p[1] = 2;
> +       p[2] = 3;
> +   }
> +      } "-O2 -fopt-info-all" ]
> +
> +      # Get compiler emitted messages and delete generated file.
> +      set lines [lindex $result 0]
> +      set output [lindex $result 1]
> +      remote_file build delete $output
> +
> +      set pattern1 {optimized: basic block part vectorized using
> [0-9]+ byte vectors}
> +      set pattern2 {add new stmt: MEM <vector\(2\) char>}
> +      # Capture the vectorized info of v2qi, set it to zero if not found.
> + if { ![regexp $pattern1 $lines whole val]
> +      || ![regexp $pattern2 $lines whole val] } then {
> +   set val 0
> +      }
> +
> +      return $val
> +    }]
> +}
> +
> +# Return the true if target support vectorization of v2qi store.
> +proc check_effective_target_vect_slp_v2qi_store { } {
> +    return [expr { [check_vect_slp_v2qi_store_usage] != 0 }]
> +}
> 
> similar for vect_slp_v4qi_store/vec_slp_v2hi_store, and add these
> target selector to xfail/target cases.

Nice!  Thanks for doing this.  It seems we can have one general proc
check_vect_slp_store_usage, and pass the different arguments to it
according to v2qi, v4qi and v2hi.

And I assume all these kinds of test cases are simple, it won't have
the possibility that this pre-test says slp-ed while the actual case
says no when there are some other stmts affecting the profitability
evaluation.  Otherwise, we might have to force unlimited cost model
for both.

BR,
Kewen

>> foo is used for vector(2) char, foo1 vector(4) char, foo2 vector(2) short.
>>
>> char p[4] __attribute__ ((aligned (4)));
>> void
>> foo ()
>> {
>>   p[0] = 1;
>>   p[1] = 2;
>>   p[2] = 3;
>> }
>>
>> void
>> foo1 ()
>> {
>>   p[0] = 0;
>>   p[1] = 1;
>>   p[2] = 2;
>>   p[3] = 3;
>> }
>>
>> void
>> foo2 (short* q)
>> {
>>   q[0] = 0;
>>   q[1] = 1;
>> }
>>> Martin
>>
>>
>>
>>
>> --
>> BR,
>> Hongtao
> 
> 
>