This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [Patch V2] Fix SLP PR58135.


On Wed, May 18, 2016 at 5:29 PM, Kumar, Venkataramanan
<Venkataramanan.Kumar@amd.com> wrote:
> Hi Richard,
>
>> -----Original Message-----
>> From: Richard Biener [mailto:richard.guenther@gmail.com]
>> Sent: Tuesday, May 17, 2016 5:40 PM
>> To: Kumar, Venkataramanan <Venkataramanan.Kumar@amd.com>
>> Cc: gcc-patches@gcc.gnu.org
>> Subject: Re: [Patch V2] Fix SLP PR58135.
>>
>> On Tue, May 17, 2016 at 1:56 PM, Kumar, Venkataramanan
>> <Venkataramanan.Kumar@amd.com> wrote:
>> > Hi Richard,
>> >
>> > I created the patch by passing -b option to git. Now the patch is more
>> readable.
>> >
>> > As per your suggestion I tried to fix the PR by splitting the SLP store group at
>> vector boundary after the SLP tree is built.
>> >
>> > Boot strap PASSED on x86_64.
>> > Checked the patch with check_GNU_style.sh.
>> >
>> > The gfortran.dg/pr46519-1.f test now does SLP vectorization. Hence it
>> generated 2 more vzeroupper.
>> > As recommended I adjusted the test case by adding -fno-tree-slp-vectorize
>> to make it as expected after loop vectorization.
>> >
>> > The following tests are now passing.
>> >
>> > ------ Snip-----
>> > Tests that now work, but didn't before:
>> >
>> > gcc.dg/vect/bb-slp-19.c -flto -ffat-lto-objects  scan-tree-dump-times
>> > slp2 "basic block vectorized" 1
>> >
>> > gcc.dg/vect/bb-slp-19.c scan-tree-dump-times slp2 "basic block
>> > vectorized" 1
>> >
>> > New tests that PASS:
>> >
>> > gcc.dg/vect/pr58135.c (test for excess errors) gcc.dg/vect/pr58135.c
>> > -flto -ffat-lto-objects (test for excess errors)
>> >
>> > ------ Snip-----
>> >
>> > ChangeLog
>> >
>> > 2016-05-14  Venkataramanan Kumar
>> <Venkataramanan.kumar@amd.com>
>> >      PR tree-optimization/58135
>> >     * tree-vect-slp.c:  When group size is not multiple of vector size,
>> >      allow splitting of store group at vector boundary.
>> >
>> > Test suite  ChangeLog
>> > 2016-05-14  Venkataramanan Kumar
>> <Venkataramanan.kumar@amd.com>
>> >     * gcc.dg/vect/bb-slp-19.c:  Remove XFAIL.
>> >     * gcc.dg/vect/pr58135.c:  Add new.
>> >     * gfortran.dg/pr46519-1.f: Adjust test case.
>> >
>> > The attached patch Ok for trunk?
>>
>>
>> Please avoid the excessive vertical space around the vect_build_slp_tree call.
> Yes fixed in the attached patch.
>>
>> +      /* Calculate the unrolling factor.  */
>> +      unrolling_factor = least_common_multiple
>> +                         (nunits, group_size) / group_size;
>> ...
>> +      else
>>         {
>>           /* Calculate the unrolling factor based on the smallest type.  */
>>           if (max_nunits > nunits)
>> -        unrolling_factor = least_common_multiple (max_nunits, group_size)
>> -                           / group_size;
>> +           unrolling_factor
>> +               = least_common_multiple (max_nunits,
>> + group_size)/group_size;
>>
>> please compute the "correct" unroll factor immediately and move the
>> "unrolling of BB required" error into the if() case by post-poning the nunits <
>> group_size check (and use max_nunits here).
>>
> Yes fixed in the attached patch.
>
>> +      if (is_a <bb_vec_info> (vinfo)
>> +         && nunits < group_size
>> +         && unrolling_factor != 1
>> +         && is_a <bb_vec_info> (vinfo))
>> +       {
>> +         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>> +                          "Build SLP failed: store group "
>> +                          "size not a multiple of the vector size "
>> +                          "in basic block SLP\n");
>> +         /* Fatal mismatch.  */
>> +         matches[nunits] = false;
>>
>> this is too pessimistic - you want to add the extra 'false' at group_size /
>> max_nunits * max_nunits.
> Yes fixed in attached patch.
>
>>
>> It looks like you leak 'node' in the if () path as well.  You need
>>
>>           vect_free_slp_tree (node);
>>           loads.release ();
>>
>> thus treat it as a failure case.
>
> Yes fixed. I added an else part before scalar_stmts.release call for the case when SLP tree is not built. This avoids double freeing.
> Bootstrapped and reg tested on X86_64.
>
> Ok for trunk ?

+      /*Calculate the unrolling factor based on the smallest type.  */
+      unrolling_factor

Space after /* missing.

+      unrolling_factor
+       = least_common_multiple (max_nunits, group_size)/group_size;

spaces before/after the '/'

+      if (max_nunits > nunits
+         && unrolling_factor != 1
+         && is_a <bb_vec_info> (vinfo))
        {
          if (dump_enabled_p ())
-            dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-                            "Build SLP failed: unrolling required in basic"
-                            " block SLP\n");
+             dump_printf_loc (MSG_MISSED_OPTIMIZATION,
+                              vect_location,
+                              "Build SLP failed: unrolling "
+                              "required in basic block SLP\n");
          vect_free_slp_tree (node);
          loads.release ();
          return false;
        }

+      if (is_a <bb_vec_info> (vinfo)
+         && max_nunits < group_size
+         && unrolling_factor != 1)

please merge these.  I think you have a not covered case of max_nunits
== nunits, max_nunits > group_size.
So do

    if (unrolling_factor != 1
        && is_a <...>)
     {
        if (max_nunits >= group_size)
          {
             first error
          }
        ... signal fatal mismatch instead...
     }
    else
     {

Ok with that change.

Thanks,
Richard.

>>
>> Thanks,
>> Richard.
>>
>> > Regards,
>> > Venkat.
>> >
>
> Regards,
> Venkat.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]