Fwd: [PATCH] Scheduling result adjustment to enable macro-fusion

Wei Mi wmi@google.com
Mon Nov 25 21:47:00 GMT 2013


On Mon, Nov 25, 2013 at 10:36 AM, Jeff Law <law@redhat.com> wrote:
> On 11/24/13 00:30, Wei Mi wrote:
>>
>> Sorry about the problem.
>>
>> For the failed testcase, it was compiled using -fmodulo-sched.
>> modulo-sched phase set SCHED_GROUP_P of a jump insn to be true, which
>> means the jump insn should be scheduled with prev insn as a group.
>> When modulo scheduling is finished, the flag of SCHED_GROUP_P is not
>> cleaned up. After that, pass_jump2 phase split the bb and move the
>> prev insn to another bb. Then pass_sched2 see the flag and mistakenly
>> try to bind the jump insn with a code label.
>>
>> I am thinking other cases setting SCHED_GROUP_P should have the same
>> problem because SCHED_GROUP_P is not cleaned up after scheduling is
>> done. The flag may become inconsistent after some optimizations and
>> may cause problem if it is used by later scheduling passes. I don't
>> know why similar problem was never exposed before.
>>
>> The fix is to simply cleanup SCHED_GROUP_P flag in sched_finish.
>
> I think this is showing up because this is the first time we have used
> SCHED_GROUP_P in cases where we merely want to keep two instructions
> consecutive vs cases where we are required to keep certain instructions
> consecutive.  For example, all the RTL passes already know they need to keep
> a cc0 setter and cc0 user consecutive on a HAVE_cc0 target.
>
> In the latter case passes should already be doing what is necessary to keep
> those instructions consecutive.  In the former case, we'd have to audit &
> fix passes to honor the desire to keep certain instructions consecutive.
>

I see. Thanks for showing me the reason.

>>
>> bootstrap is ok. regression test is going on. Is it ok if regression
>> passes?
>>
>> Thanks,
>> Wei.
>>
>> 2013-11-23  Wei Mi  <wmi@google.com>
>>
>>          PR rtl-optimization/59020
>>          * haifa-sched.c (cleanup_sched_group): New function.
>>          (sched_finish): Call cleanup_sched_group to cleanup
>> SCHED_GROUP_P.
>>
>> 2013-11-23  Wei Mi  <wmi@google.com>
>>          PR rtl-optimization/59020
>>          * testsuite/gcc.dg/pr59020.c (void f):
>
> I'll note you're doing an extra pass over all the RTL here.   Is there any
> clean way you can clean SCHED_GROUP_P without that extra pass over the RTL?
> Perhaps when the group actually gets scheduled?
>
> jeff
>

With your help to understand that sched group will not be broken by
other passes in other cases, I can cleanup SCHED_GROUP_P for
macrofusion only by checking every condjump insn which is at the end
of BB. Then the cost will be in the same scale with bb nums. Do you
think it is ok?

Thanks,
Wei.

2013-11-25  Wei Mi  <wmi@google.com>

        PR rtl-optimization/59020
        * haifa-sched.c (cleanup_sched_group): New function.
        (sched_finish): Call cleanup_sched_group to cleanup SCHED_GROUP_P.

2013-11-25  Wei Mi  <wmi@google.com>
        PR rtl-optimization/59020
        * testsuite/gcc.dg/pr59020.c (void f):

Index: haifa-sched.c
===================================================================
--- haifa-sched.c       (revision 204923)
+++ haifa-sched.c       (working copy)
@@ -6598,6 +6598,25 @@ group_insns_for_macro_fusion ()
     try_group_insn (BB_END (bb));
 }

+/* Cleanup SCHED_GROUP_P after scheduling is done. This is necessary because
+   bb may be changed by other optimizations and the flag from last scheduling
+   may become invalid. If later scheduler see the flag generated from last
+   scheduling, it may produces incorrect result.  */
+
+static void
+cleanup_sched_group ()
+{
+  basic_block bb;
+  rtx insn;
+
+  FOR_EACH_BB (bb)
+    {
+      insn = BB_END (bb);
+      if (INSN_P (insn) && SCHED_GROUP_P (insn))
+       SCHED_GROUP_P (insn) = 0;
+    }
+}
+
 /* Initialize some global state for the scheduler.  This function works
    with the common data shared between all the schedulers.  It is called
    from the scheduler specific initialization routine.  */
@@ -6841,6 +6860,8 @@ sched_finish (void)
     }
   free (curr_state);

+  cleanup_sched_group ();
+
   if (targetm.sched.finish_global)
     targetm.sched.finish_global (sched_dump, sched_verbose);

Index: testsuite/gcc.dg/pr59020.c
===================================================================
--- testsuite/gcc.dg/pr59020.c  (revision 0)
+++ testsuite/gcc.dg/pr59020.c  (revision 0)
@@ -0,0 +1,15 @@
+/* PR rtl-optimization/59020 */
+
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 -fmodulo-sched -fno-inline -march=corei7" } */
+
+int a, b, d;
+unsigned c;
+
+void f()
+{
+  unsigned q;
+  for(; a; a++)
+    if(((c %= d && 1) ? : 1) & 1)
+      for(; b; q++);
+}



More information about the Gcc-patches mailing list