Bug 45352 - ICE: in reset_sched_cycles_in_current_ebb, at sel-sched.c:7058
Summary: ICE: in reset_sched_cycles_in_current_ebb, at sel-sched.c:7058
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 4.6.0
: P3 normal
Target Milestone: ---
Assignee: Andrey Belevantsev
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-08-20 11:13 UTC by Zdenek Sojka
Modified: 2011-04-07 07:04 UTC (History)
2 users (show)

See Also:
Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed: 2010-09-22 14:29:22


Attachments
reduced testcase (from gcc.dg/vect/no-vfa-vect-43.c) (108 bytes, text/plain)
2010-08-20 11:15 UTC, Zdenek Sojka
Details
different testcase (200 bytes, text/plain)
2010-08-22 17:15 UTC, Zdenek Sojka
Details
another testcase that doesn't need many flags to reproduce (154 bytes, text/plain)
2010-09-20 18:00 UTC, Zdenek Sojka
Details
another testcase (205 bytes, text/plain)
2010-09-22 21:54 UTC, Zdenek Sojka
Details
another testcase (124 bytes, text/plain)
2010-09-29 19:00 UTC, Zdenek Sojka
Details
Latest patch (2.50 KB, patch)
2010-09-30 08:55 UTC, Andrey Belevantsev
Details | Diff
another testcase (728 bytes, text/x-c)
2010-10-05 16:13 UTC, Zdenek Sojka
Details
Patch (3.08 KB, patch)
2010-10-15 10:02 UTC, Andrey Belevantsev
Details | Diff
Updated patch (3.07 KB, patch)
2010-11-03 12:23 UTC, Andrey Belevantsev
Details | Diff
testcase failing in r166433 (277 bytes, text/plain)
2010-11-08 17:58 UTC, Zdenek Sojka
Details
Another patch (763 bytes, patch)
2010-12-21 12:10 UTC, Andrey Belevantsev
Details | Diff
testcase failing in r168214 (115 bytes, text/plain)
2010-12-23 20:56 UTC, Zdenek Sojka
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Zdenek Sojka 2010-08-20 11:13:45 UTC
Command line:
$ gcc -O1 -frerun-cse-after-loop -ftree-vectorize -fschedule-insns -fschedule-insns2 -fselective-scheduling2 -fsel-sched-pipelining -funroll-loops -fprefetch-loop-arrays testcase.c
or
$ gcc -O3 -fschedule-insns -fschedule-insns2 -fselective-scheduling2 -fsel-sched-pipelining -funroll-loops -fprefetch-loop-arrays testcase.c

I came to this problem in several testcases, but the set of options could never be reduced more.

Compiler output:
$ gcc -O1 -frerun-cse-after-loop -ftree-vectorize -fschedule-insns -fschedule-insns2 -fselective-scheduling2 -fsel-sched-pipelining -funroll-loops -fprefetch-loop-arrays testcase.c
testcase.c: In function 'main1':
testcase.c:10:1: internal compiler error: in reset_sched_cycles_in_current_ebb, at sel-sched.c:7058
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.

Tested revisions:
r163371 - crash
Comment 1 Zdenek Sojka 2010-08-20 11:15:13 UTC
Created attachment 21528 [details]
reduced testcase (from gcc.dg/vect/no-vfa-vect-43.c)

The first loop is not needed to reproduce the problem, it just makes values initialized.
Comment 2 Zdenek Sojka 2010-08-22 17:15:08 UTC
Created attachment 21544 [details]
different testcase

This one crashes with:
$ gcc -Os -fselective-scheduling2 -fsel-sched-pipelining -fprofile-generate pr45352-2.c     
pr45352-2.c: In function 'df_md_alloc':
pr45352-2.c:21:1: internal compiler error: in reset_sched_cycles_in_current_ebb, at sel-sched.c:7058
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.

This set of flags might be more useful.
Comment 3 Zdenek Sojka 2010-09-20 18:00:01 UTC
Created attachment 21849 [details]
another testcase that doesn't need many flags to reproduce

$ gcc -fselective-scheduling2 -fsel-sched-pipelining -funroll-all-loops -march=amdfam10 -O3 pr45352-5.c 
pr45352-5.c: In function 'foo':
pr45352-5.c:22:1: internal compiler error: in reset_sched_cycles_in_current_ebb, at sel-sched.c:7077
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.

The failing assert is:
	  gcc_assert (cost < 0);
Comment 4 Andrey Belevantsev 2010-09-22 14:29:22 UTC
Confirmed.

All testcases except the first with the -O3 flags are fixed by the below patch.  The bug that the patch "fixes" is actually PR37360 all over again but in sel-sched instead of haifa.  We have the process of resetting sched-cycles for insns (that may be wrong because of pipelining) for the targets that may use them in their sched_finish hook.  (E.g. ia64 does bundling in this hook.)  The process is just scheduling insns in the same order as they are already, calling all the necessary hooks and massaging DFA so we get the correct cycles from it.  The assert triggered means that the selective scheduling and this resetting process got out of sync.  And this happened guess why, because (for the last test I had actually analyzed) the target claims issue_rate of 3 while issuing 4 insns on the same cycle!  I'm actually surprised that the GCC target lying to the backend may still surprise me, but I guess ten more years of GCC work would do the trick.

I will be looking at the remaining failure shortly.

diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c
index 041c471..aff7eae 100644
--- a/gcc/sel-sched.c
+++ b/gcc/sel-sched.c
@@ -4402,7 +4402,8 @@ find_best_expr (av_set_t *av_vliw_ptr, blist_t bnds, fence_t fence,
     {
       can_issue_more = invoke_aftermath_hooks (fence, EXPR_INSN_RTX (best),
                                                can_issue_more);
-      if (can_issue_more == 0)
+      if (targetm.sched.variable_issue
+         && can_issue_more == 0)
         *pneed_stall = 1;
     }

Comment 5 Andrey Belevantsev 2010-09-22 15:34:19 UTC
The remaining problem is another case where we don't try to issue more insns because we believe from issue_rate that this is impossible.  Full patch that fixes all the tests with all the flags for me is below.  What it does is to fix the situation when we don't try to issue an insn not because some target hook said so, but because we believe that issue_rate is achieved already.  In this case, we still try.  There is some related debug dump improvements in the patch.

The patch will need a round of testing on a number of arches to check that I didn't broke the honest targets and a big comment explaining why we do this before I will submit to gcc-patches.  

diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c
index 041c471..aee298a 100644
--- a/gcc/sel-sched.c
+++ b/gcc/sel-sched.c
@@ -4051,10 +4051,11 @@ sel_dfa_new_cycle (insn_t insn, fence_t fence)
 /* Invoke reorder* target hooks on the ready list.  Return the number of insns
    we can issue.  FENCE is the current fence.  */
 static int
-invoke_reorder_hooks (fence_t fence)
+invoke_reorder_hooks (fence_t fence, bool *pran_hook)
 {
   int issue_more;
-  bool ran_hook = false;
+
+  *pran_hook = false;
 
   /* Call the reorder hook at the beginning of the cycle, and call
      the reorder2 hook in the middle of the cycle.  */
@@ -4077,7 +4078,7 @@ invoke_reorder_hooks (fence_t fence)
           if (pipelining_p)
             ++ready.n_ready;
 
-          ran_hook = true;
+          *pran_hook = true;
         }
       else
         /* Initialize can_issue_more for variable_issue.  */
@@ -4106,14 +4107,14 @@ invoke_reorder_hooks (fence_t fence)
             ++ready.n_ready;
         }
 
-      ran_hook = true;
+      *pran_hook = true;
     }
   else
     issue_more = FENCE_ISSUE_MORE (fence);
 
   /* Ensure that ready list and vec_av_set are in line with each other,
      i.e. vec_av_set[i] == ready_element (&ready, i).  */
-  if (issue_more && ran_hook)
+  if (issue_more && *pran_hook)
     {
       int i, j, n;
       rtx *arr = ready.vec;
@@ -4313,7 +4314,7 @@ get_expr_cost (expr_t expr, fence_t fence)
 /* Find the best insn for scheduling, either via max_issue or just take
    the most prioritized available.  */
 static int
-choose_best_insn (fence_t fence, int privileged_n, int *index)
+choose_best_insn (fence_t fence, int privileged_n, bool ran_hook, int *index)
 {
   int can_issue = 0;
 
@@ -4338,6 +4339,8 @@ choose_best_insn (fence_t fence, int privileged_n, int *index)
 	  if (get_expr_cost (expr, fence) < 1)
 	    {
 	      can_issue = can_issue_more;
+	      if (!ran_hook && !can_issue)
+		can_issue = 1;
 	      *index = i;
 
 	      if (sched_verbose >= 2)
@@ -4366,6 +4369,7 @@ find_best_expr (av_set_t *av_vliw_ptr, blist_t bnds, fence_t fence,
                 int *pneed_stall)
 {
   expr_t best;
+  bool ran_hook;
 
   /* Choose the best insn for scheduling via:
      1) sorting the ready list based on priority;
@@ -4376,8 +4380,8 @@ find_best_expr (av_set_t *av_vliw_ptr, blist_t bnds, fence_t fence,
     {
       int privileged_n, index;
 
-      can_issue_more = invoke_reorder_hooks (fence);
-      if (can_issue_more > 0)
+      can_issue_more = invoke_reorder_hooks (fence, &ran_hook);
+      if (can_issue_more > 0 || !ran_hook)
         {
           /* Try choosing the best insn until we find one that is could be
              scheduled due to liveness restrictions on its destination register.
@@ -4385,7 +4389,7 @@ find_best_expr (av_set_t *av_vliw_ptr, blist_t bnds, fence_t fence,
              in the order of their priority.  */
           invoke_dfa_lookahead_guard ();
           privileged_n = calculate_privileged_insns ();
-          can_issue_more = choose_best_insn (fence, privileged_n, &index);
+          can_issue_more = choose_best_insn (fence, privileged_n, ran_hook, &index);
           if (can_issue_more)
             best = find_expr_for_ready (index, true);
         }
@@ -4402,7 +4406,8 @@ find_best_expr (av_set_t *av_vliw_ptr, blist_t bnds, fence_t fence,
     {
       can_issue_more = invoke_aftermath_hooks (fence, EXPR_INSN_RTX (best),
                                                can_issue_more);
-      if (can_issue_more == 0)
+      if (targetm.sched.variable_issue
+	  && can_issue_more == 0)
         *pneed_stall = 1;
     }
 
@@ -7046,6 +7051,8 @@ reset_sched_cycles_in_current_ebb (void)
 	    }
 
 	  haifa_clock += i;
+          if (sched_verbose >= 2)
+            sel_print ("haifa clock: %d\n", haifa_clock);
 	}
       else
 	gcc_assert (haifa_cost == 0);
@@ -7064,6 +7071,7 @@ reset_sched_cycles_in_current_ebb (void)
               {
                 sel_print ("advance_state (dfa_new_cycle)\n");
                 debug_state (curr_state);
+		sel_print ("haifa clock: %d\n", haifa_clock + 1);
               }
           }
 
@@ -7072,8 +7080,11 @@ reset_sched_cycles_in_current_ebb (void)
 	  cost = state_transition (curr_state, insn);
 
           if (sched_verbose >= 2)
-            debug_state (curr_state);
-
+	    {
+	      sel_print ("scheduled insn %d, clock %d\n", INSN_UID (insn),
+			 haifa_clock + 1);
+              debug_state (curr_state);
+	    }
 	  gcc_assert (cost < 0);
 	}
 
Comment 6 Zdenek Sojka 2010-09-22 21:54:18 UTC
Created attachment 21867 [details]
another testcase

Thank you for having a look!

I tried your patch, it bootstrapped fine. It fixed uploaded testcases, but some still ICE:
(r164527 + patch from comment #5, x86_64-linux):

$ gcc -mtune=amdfam10 -O3 -fpeel-loops -fselective-scheduling2 -fsel-sched-pipelining -fPIC pr45352-7.c 
pr45352-7.c: In function 'V_Pass_Avrg_16_C_ref':
pr45352-7.c:16:1: internal compiler error: in reset_sched_cycles_in_current_ebb, at sel-sched.c:7088
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugs.gentoo.org/> for instructions.

I don't know how much, if at all, is this related.
Comment 7 Andrey Belevantsev 2010-09-28 11:47:33 UTC
Thanks for the test, it shows one more case of using issue_rate that I have missed.  We need to distinguish between the stalls caused by data dependencies and by lack of functional units, i.e. DFA related stalls.  This is because when resetting the cycles, we know that the latter should remain as is (which is actually is checked by the failing assert) but not the former.  Fixed by allowing need_stall to be negative and handling this situation in stall_for_cycles.  This again fixes all tests with all option combinations.

Zdenek, if you can, please test this in your environment, and I will check ia64 and possibly ppc/ppc64, because with this kind of patch I need to make sure that other (honest) targets are unaffected.


diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c
index 041c471..5bf0e19 100644
--- a/gcc/sel-sched.c
+++ b/gcc/sel-sched.c
@@ -4051,10 +4051,11 @@ sel_dfa_new_cycle (insn_t insn, fence_t fence)
 /* Invoke reorder* target hooks on the ready list.  Return the number of insns
    we can issue.  FENCE is the current fence.  */
 static int
-invoke_reorder_hooks (fence_t fence)
+invoke_reorder_hooks (fence_t fence, bool *pran_hook)
 {
   int issue_more;
-  bool ran_hook = false;
+
+  *pran_hook = false;
 
   /* Call the reorder hook at the beginning of the cycle, and call
      the reorder2 hook in the middle of the cycle.  */
@@ -4077,7 +4078,7 @@ invoke_reorder_hooks (fence_t fence)
           if (pipelining_p)
             ++ready.n_ready;
 
-          ran_hook = true;
+          *pran_hook = true;
         }
       else
         /* Initialize can_issue_more for variable_issue.  */
@@ -4106,14 +4107,14 @@ invoke_reorder_hooks (fence_t fence)
             ++ready.n_ready;
         }
 
-      ran_hook = true;
+      *pran_hook = true;
     }
   else
     issue_more = FENCE_ISSUE_MORE (fence);
 
   /* Ensure that ready list and vec_av_set are in line with each other,
      i.e. vec_av_set[i] == ready_element (&ready, i).  */
-  if (issue_more && ran_hook)
+  if (issue_more && *pran_hook)
     {
       int i, j, n;
       rtx *arr = ready.vec;
@@ -4313,7 +4314,7 @@ get_expr_cost (expr_t expr, fence_t fence)
 /* Find the best insn for scheduling, either via max_issue or just take
    the most prioritized available.  */
 static int
-choose_best_insn (fence_t fence, int privileged_n, int *index)
+choose_best_insn (fence_t fence, int privileged_n, bool ran_hook, int *index)
 {
   int can_issue = 0;
 
@@ -4338,6 +4339,8 @@ choose_best_insn (fence_t fence, int privileged_n, int *index)
 	  if (get_expr_cost (expr, fence) < 1)
 	    {
 	      can_issue = can_issue_more;
+	      if (!ran_hook && !can_issue)
+		can_issue = 1;
 	      *index = i;
 
 	      if (sched_verbose >= 2)
@@ -4360,12 +4363,15 @@ choose_best_insn (fence_t fence, int privileged_n, int *index)
 /* Choose the best expr from *AV_VLIW_PTR and a suitable register for it.
    BNDS and FENCE are current boundaries and scheduling fence respectively.
    Return the expr found and NULL if nothing can be issued atm.
-   Write to PNEED_STALL the number of cycles to stall if no expr was found.  */
+   Write to PNEED_STALL the number of cycles to stall if no expr was found.
+   The positive number of cycles means a data dependency stall, while
+   the negative one means a functional stall (DFA stall).  */
 static expr_t
 find_best_expr (av_set_t *av_vliw_ptr, blist_t bnds, fence_t fence,
                 int *pneed_stall)
 {
   expr_t best;
+  bool ran_hook;
 
   /* Choose the best insn for scheduling via:
      1) sorting the ready list based on priority;
@@ -4376,8 +4382,8 @@ find_best_expr (av_set_t *av_vliw_ptr, blist_t bnds, fence_t fence,
     {
       int privileged_n, index;
 
-      can_issue_more = invoke_reorder_hooks (fence);
-      if (can_issue_more > 0)
+      can_issue_more = invoke_reorder_hooks (fence, &ran_hook);
+      if (can_issue_more > 0 || !ran_hook)
         {
           /* Try choosing the best insn until we find one that is could be
              scheduled due to liveness restrictions on its destination register.
@@ -4385,7 +4391,7 @@ find_best_expr (av_set_t *av_vliw_ptr, blist_t bnds, fence_t fence,
              in the order of their priority.  */
           invoke_dfa_lookahead_guard ();
           privileged_n = calculate_privileged_insns ();
-          can_issue_more = choose_best_insn (fence, privileged_n, &index);
+          can_issue_more = choose_best_insn (fence, privileged_n, ran_hook, &index);
           if (can_issue_more)
             best = find_expr_for_ready (index, true);
         }
@@ -4394,7 +4400,7 @@ find_best_expr (av_set_t *av_vliw_ptr, blist_t bnds, fence_t fence,
       if (can_issue_more == 0)
         {
           best = NULL;
-          *pneed_stall = 1;
+          *pneed_stall = -1;
         }
     }
 
@@ -4402,8 +4408,9 @@ find_best_expr (av_set_t *av_vliw_ptr, blist_t bnds, fence_t fence,
     {
       can_issue_more = invoke_aftermath_hooks (fence, EXPR_INSN_RTX (best),
                                                can_issue_more);
-      if (can_issue_more == 0)
-        *pneed_stall = 1;
+      if (targetm.sched.variable_issue
+	  && can_issue_more == 0)
+        *pneed_stall = -1;
     }
 
   if (sched_verbose >= 2)
@@ -5471,13 +5478,19 @@ schedule_expr_on_boundary (bnd_t bnd, expr_t expr_vliw, int seqno)
   return insn;
 }
 
-/* Stall for N cycles on FENCE.  */
+/* Stall for N cycles on FENCE.  N > 0 means we want a stall because
+   of an unsatisfied data dependence, N < 0 means the DFA stall.
+   The difference is that we need to set the AFTER_STALL_P bit only
+   for a data dependence stall.  */
 static void
 stall_for_cycles (fence_t fence, int n)
 {
   int could_more;
 
-  could_more = n > 1 || FENCE_ISSUED_INSNS (fence) < issue_rate;
+  if (n > 0)
+    could_more = 1;
+  else
+    n = -n;
   while (n--)
     advance_one_cycle (fence);
   if (could_more)
@@ -5510,7 +5523,7 @@ fill_insns (fence_t fence, int seqno, ilist_t **scheduled_insns_tailpp)
       blist_t *bnds_tailp1, *bndsp;
       expr_t expr_vliw;
       int need_stall;
-      int was_stall = 0, scheduled_insns = 0, stall_iterations = 0;
+      int was_stall = 0, scheduled_insns = 0;
       int max_insns = pipelining_p ? issue_rate : 2 * issue_rate;
       int max_stall = pipelining_p ? 1 : 3;
       bool last_insn_was_debug = false;
@@ -5529,17 +5542,16 @@ fill_insns (fence_t fence, int seqno, ilist_t **scheduled_insns_tailpp)
       do
         {
           expr_vliw = find_best_expr (&av_vliw, bnds, fence, &need_stall);
-          if (!expr_vliw && need_stall)
+          if (! expr_vliw && need_stall != 0)
             {
               /* All expressions required a stall.  Do not recompute av sets
                  as we'll get the same answer (modulo the insns between
                  the fence and its boundary, which will not be available for
-                 pipelining).  */
-              gcc_assert (! expr_vliw && stall_iterations < 2);
-              was_stall++;
-	      /* If we are going to stall for too long, break to recompute av
+                 pipelining).
+		 If we are going to stall for too long, break to recompute av
 		 sets and bring more insns for pipelining.  */
-	      if (need_stall <= 3)
+              was_stall++;
+	      if (need_stall < 0 || need_stall <= 3)
 		stall_for_cycles (fence, need_stall);
 	      else
 		{
@@ -5548,7 +5560,7 @@ fill_insns (fence_t fence, int seqno, ilist_t **scheduled_insns_tailpp)
 		}
             }
         }
-      while (! expr_vliw && need_stall);
+      while (! expr_vliw && need_stall != 0);
 
       /* Now either we've selected expr_vliw or we have nothing to schedule.  */
       if (!expr_vliw)
@@ -7046,6 +7058,8 @@ reset_sched_cycles_in_current_ebb (void)
 	    }
 
 	  haifa_clock += i;
+          if (sched_verbose >= 2)
+            sel_print ("haifa clock: %d\n", haifa_clock);
 	}
       else
 	gcc_assert (haifa_cost == 0);
@@ -7064,6 +7078,7 @@ reset_sched_cycles_in_current_ebb (void)
               {
                 sel_print ("advance_state (dfa_new_cycle)\n");
                 debug_state (curr_state);
+		sel_print ("haifa clock: %d\n", haifa_clock + 1);
               }
           }
 
@@ -7072,8 +7087,11 @@ reset_sched_cycles_in_current_ebb (void)
 	  cost = state_transition (curr_state, insn);
 
           if (sched_verbose >= 2)
-            debug_state (curr_state);
-
+	    {
+	      sel_print ("scheduled insn %d, clock %d\n", INSN_UID (insn),
+			 haifa_clock + 1);
+              debug_state (curr_state);
+	    }
 	  gcc_assert (cost < 0);
 	}
Comment 8 Zdenek Sojka 2010-09-29 19:00:20 UTC
Created attachment 21914 [details]
another testcase

I applied patch from comment #7 to r164716. It seems to work fine (no regressions), but there are still some cases that fail. One of them is attached. It fails with clean trunk the same way.
Comment 9 Zdenek Sojka 2010-09-29 19:02:45 UTC
Sorry, forgot to mention command line:

$ gcc -O1 -freorder-blocks -fschedule-insns2 -funswitch-loops -fselective-scheduling2 -fsel-sched-pipelining -funroll-all-loops pr45352-8.c 
pr45352-8.c: In function 'foo':
pr45352-8.c:15:1: internal compiler error: in reset_sched_cycles_in_current_ebb, at sel-sched.c:7077
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
Comment 10 Andrey Belevantsev 2010-09-30 08:50:14 UTC
Are you sure you have applied the right patch?  With the patch from comment #7, the test doesn't fail for me, and moreover there is no assert on line 7077.  I'm attaching the patch to make sure that it's not garbled by inlining in the message.
Comment 11 Andrey Belevantsev 2010-09-30 08:55:20 UTC
Created attachment 21921 [details]
Latest patch
Comment 12 Zdenek Sojka 2010-09-30 11:03:51 UTC
Yes, sorry, I applied correct patch, but pasted assert from unpatched r164716:

The correct assert is:

$ FLAGS="-O1 -freorder-blocks -fschedule-insns2 -funswitch-loops -fselective-scheduling2 -fsel-sched-pipelining -funroll-all-loops"
$ x86_64-pc-linux-gnu-gcc $FLAGS pr45352-8.c 
pr45352-8.c: In function 'foo':
pr45352-8.c:15:1: internal compiler error: in reset_sched_cycles_in_current_ebb, at sel-sched.c:7095
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugs.gentoo.org/> for instructions.

I am now bootstrapping with the patch from comment #11, I will let you know if there is any difference.

Thanks
Comment 13 Andrey Belevantsev 2010-10-05 12:45:21 UTC
Any updates on this?  The last test still does not fail with the patch from comment #11.  I am about to submit the patch, so are there any new failing tests or flag combinations?
Comment 14 Zdenek Sojka 2010-10-05 16:13:58 UTC
Created attachment 21965 [details]
another testcase

Hello, yes, this testcase as well as testcase from comment #12 are still failing for me with the patch from comment #11. (r164716, x86_64-pc-linux-gnu)

For this case, there are quite many flags needed, so I didn't upload it earlier:

$ FLAGS="-O1 -mtune=amdfam10 -fexpensive-optimizations -fgcse -foptimize-register-move -freorder-blocks -fschedule-insns2 -funswitch-loops -fgcse-las -fselective-scheduling2 -fsel-sched-pipelining -funroll-all-loops"
$ /mnt/svn/gcc-trunk/binary-164716-lto-fortran-checking-yes-rtl-df-pr45352-3/bin/gcc $FLAGS pr45352-9.c 
pr45352-9.c: In function 'slice_xvmc_init':
pr45352-9.c:55:7: warning: assignment makes integer from pointer without a cast [enabled by default]
pr45352-9.c: In function 'mpeg2_xvmc_slice':
pr45352-9.c:90:3: warning: passing argument 2 of 'bitstream_init' makes pointer from integer without a cast [enabled by default]
pr45352-9.c:40:8: note: expected 'void *' but argument is of type 'uint8_t'
pr45352-9.c:90:24: warning: initialization makes pointer from integer without a cast [enabled by default]
pr45352-9.c:105:1: internal compiler error: in reset_sched_cycles_in_current_ebb, at sel-sched.c:7095
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
Comment 15 Andrey Belevantsev 2010-10-15 10:02:48 UTC
Created attachment 22050 [details]
Patch

The last issue was a different one, we have failed to co-ordinate the EBBs used when emulating Haifa scheduler and the blocks to reschedule after pipelining/bookkeeping/etc.  The existing code wasn't enough, we need to force the ebb start not only on the blocks for rescheduling, but also on the successors of the rescheduling "region".  Also, the patch that turned bitmap_bit_p into bitmap_clear_bit for the corresponding bitmap was wrong, this contributed to the bug.

The attached patch again fixes all flags/test combinations.  I'll be waiting for more test cases for a few days and then will submit.
Comment 16 Zdenek Sojka 2010-10-22 11:17:55 UTC
I gave this patch a lot of testing (compiled ~800 packages) and it seems to solve all crashes in reset_sched_cycles_in_current_ebb, without introducing any new problems.
Comment 17 Andrey Belevantsev 2010-11-03 12:23:09 UTC
Created attachment 22248 [details]
Updated patch

The updated patch is posted at http://gcc.gnu.org/ml/gcc-patches/2010-11/msg00273.html.  It is simpler as we now agreed that the scheduler should not issue more than issue_rate insns, so we fix the resetting loop itself instead of the core scheduler, retaining the needed bits from the previous patch.
Comment 18 Andrey Belevantsev 2010-11-08 08:11:43 UTC
Author: abel
Date: Mon Nov  8 08:11:38 2010
New Revision: 166429

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=166429
Log:
        PR rtl-optimization/45352
        * sel-sched.c (find_best_expr): Do not set pneed_stall when
        the variable_issue hook is not implemented.
        (fill_insns): Remove dead variable stall_iterations.
        (init_seqno_1): Force EBB start for resetting sched cycles on any
        successor blocks of the rescheduled region.
        (sel_sched_region_1): Use bitmap_bit_p instead of bitmap_clear_bit.
        (reset_sched_cycles_in_current_ebb): Add debug printing.
        New variable issued_insns.  Advance state when we have issued
        issue_rate insns.

        gcc.dg/pr45352.c, gcc.dg/pr45352-1.c, gcc.dg/pr45352-2.c: New tests.
        gcc.target/i386/pr45352.c, gcc.target/i386/pr45352-1.c,
        gcc.target/i386/pr45352-2.c: New tests.


Added:
    trunk/gcc/testsuite/gcc.dg/pr45352-1.c
    trunk/gcc/testsuite/gcc.dg/pr45352-2.c
    trunk/gcc/testsuite/gcc.dg/pr45352.c
    trunk/gcc/testsuite/gcc.target/i386/pr45352-1.c
    trunk/gcc/testsuite/gcc.target/i386/pr45352-2.c
    trunk/gcc/testsuite/gcc.target/i386/pr45352.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/sel-sched.c
    trunk/gcc/testsuite/ChangeLog
Comment 19 Zdenek Sojka 2010-11-08 17:58:36 UTC
Created attachment 22334 [details]
testcase failing in r166433

Thanks for fixing this! It seems there is one further failing testcase:

$ gcc -O -fcse-follow-jumps -fpartial-inlining -freorder-blocks -frerun-cse-after-loop -fschedule-insns2 -fsel-sched-pipelining -fselective-scheduling2 -funroll-loops -funswitch-loops testcase.C
testcase.C: In function 'void __fill_bvector(_Bit_iterator_base, _Bit_iterator_base, bool)':
testcase.C:41:1: internal compiler error: in reset_sched_cycles_in_current_ebb, at sel-sched.c:7092
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
Comment 20 Zdenek Sojka 2010-11-17 16:15:34 UTC
Comment on attachment 22334 [details]
testcase failing in r166433

Opened PR46521 and PR46522 for the new two testcases. This one wasn't really reduced anyway. I also verified the new testcases don't give any warning with -Wall -Wextra.

I hope this will make the situation more clear and allow this PR to be closed.
Comment 21 Andrey Belevantsev 2010-11-17 16:47:00 UTC
(In reply to comment #20)
> Comment on attachment 22334 [details]
> testcase failing in r166433
> 
> Opened PR46521 and PR46522 for the new two testcases. This one wasn't really
> reduced anyway. I also verified the new testcases don't give any warning with
> -Wall -Wextra.
> 
> I hope this will make the situation more clear and allow this PR to be closed.

I'm looking at this testcase now, it is not too big.  The problem seems indeed be a bit different from other test cases.  I don't yet have a solution.
Comment 22 Andrey Belevantsev 2010-12-21 12:10:32 UTC
Created attachment 22834 [details]
Another patch

This patch should fix the last attachment from this bug and also the tests from 46521/46522.  The patch solves two problems:
- propagates the rescheduling bits also through empty blocks;
- fixes the (wrong) assumption that if we need to stall for N cycles because of the DFA, and for M > N cycles because of the data dependency, then after M cycles the DFA will be ready to issue.  Actually, we need to recheck the DFA after M cycles again.

I will submit the patch tomorrow after further testing and I can commit it either until Friday or next January, so Zdenek, please let me know if there are further problems with this patch.
Comment 23 Andrey Belevantsev 2010-12-22 07:46:57 UTC
Author: abel
Date: Wed Dec 22 07:46:53 2010
New Revision: 168164

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=168164
Log:
	PR rtl-optimization/45352
	PR rtl-optimization/46521
	PR rtl-optimization/46522
	* sel-sched.c (reset_sched_cycles_in_current_ebb): Recheck the DFA state
	on the last iteration of the advancing loop.
	(sel_sched_region_1): Propagate the rescheduling bit to the next block
	also for empty blocks.

	* gcc.dg/pr46521.c: New.
	* gcc.dg/pr46522.c: New. 

Added:
    trunk/gcc/testsuite/gcc.dg/pr46521.c
    trunk/gcc/testsuite/gcc.dg/pr46522.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/sel-sched.c
    trunk/gcc/testsuite/ChangeLog
Comment 24 Zdenek Sojka 2010-12-23 20:56:46 UTC
Created attachment 22848 [details]
testcase failing in r168214

Thank you for fixing all the problem so far, but there seems to be further problem, this time with array of volatile floats and insane set of flags. I hope to find a more sane testcase...

$ gcc -O -fprofile-generate -fgcse -fno-gcse-lm -fgcse-sm -fno-ivopts -fno-tree-loop-im -ftree-pre -funroll-loops -fno-web -fschedule-insns2 -fselective-scheduling2 -fsel-sched-pipelining pr45352_r168214.c 
pr45352_r168214.c: In function 'foo':
pr45352_r168214.c:13:1: internal compiler error: in reset_sched_cycles_in_current_ebb, at sel-sched.c:7105
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
Comment 25 Andrey Belevantsev 2010-12-24 08:23:07 UTC
Zdenek, please don't worry about the set of flags, it does not make fixing the problem any harder, and your work on finding these is invaluable for us.

Sigh, in this case I forgot that we now also stall when we have issued exactly issue_rate instructions, so in this case we also need to recheck the DFA, not only after the data dependency stall.  So the last patch should be amended like below.

diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c
index edd6cb9..2f721fb 100644
--- a/gcc/sel-sched.c
+++ b/gcc/sel-sched.c
@@ -6990,7 +6990,7 @@ reset_sched_cycles_in_current_ebb (void)
     {
       int cost, haifa_cost;
       int sort_p;
-      bool asm_p, real_insn, after_stall;
+      bool asm_p, real_insn, after_stall, all_issued;
       int clock;

       if (!INSN_P (insn))
@@ -7026,8 +7026,8 @@ reset_sched_cycles_in_current_ebb (void)
           haifa_cost = cost;
           after_stall = 1;
         }
-      if (haifa_cost == 0
-         && issued_insns == issue_rate)
+      all_issued = issued_insns == issue_rate;
+      if (haifa_cost == 0 && all_issued)
        haifa_cost = 1;
       if (haifa_cost > 0)
        {
@@ -7059,7 +7059,7 @@ reset_sched_cycles_in_current_ebb (void)
                  become unavailable  to the DFA restrictions.  Looks strange
                  but happens e.g. on x86-64.  So recheck DFA on the last
                  iteration.  */
-              if (after_stall
+              if ((after_stall || all_issued)
                   && real_insn
                   && haifa_cost == 0)
                 haifa_cost = estimate_insn_cost (insn, curr_state);
Comment 26 Zdenek Sojka 2011-01-10 16:39:43 UTC
(In reply to comment #25)
> Sigh, in this case I forgot that we now also stall when we have issued exactly
> issue_rate instructions, so in this case we also need to recheck the DFA, not
> only after the data dependency stall.  So the last patch should be amended like
> below.
> 

During a lot of testing, I didn't get any ICE or miscompilation with that patch.
Comment 27 Andrey Belevantsev 2011-01-13 09:29:13 UTC
Author: abel
Date: Thu Jan 13 09:29:09 2011
New Revision: 168742

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=168742
Log:
        PR rtl-optimization/45352
        * sel-sched.c: Update copyright years.
        (reset_sched_cycles_in_current_ebb): Also recheck the DFA state
        in the advancing loop when we have issued issue_rate insns.

	* gcc.dg/pr45352-3.c: New.

Added:
    trunk/gcc/testsuite/gcc.dg/pr45352-3.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/sel-sched.c
    trunk/gcc/testsuite/ChangeLog
Comment 28 Andrey Belevantsev 2011-01-13 09:33:28 UTC
Fixed.  Thanks Zdenek, this work was only made possible by your testing efforts!
Comment 29 Andrey Belevantsev 2011-04-07 07:04:09 UTC
Author: abel
Date: Thu Apr  7 07:04:02 2011
New Revision: 172088

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172088
Log:
        Backport from mainline
        2011-01-13  Andrey Belevantsev  <abel@ispras.ru>

        PR rtl-optimization/45352
        * sel-sched.c: Update copyright years.
        (reset_sched_cycles_in_current_ebb): Also recheck the DFA state
        in the advancing loop when we have issued issue_rate insns.

        Backport from mainline
        2010-12-22  Andrey Belevantsev  <abel@ispras.ru>

        PR rtl-optimization/45352
        PR rtl-optimization/46521
        PR rtl-optimization/46522
        * sel-sched.c (reset_sched_cycles_in_current_ebb): Recheck the DFA state
        on the last iteration of the advancing loop.
        (sel_sched_region_1): Propagate the rescheduling bit to the next block
        also for empty blocks.

        Backport from mainline
        2010-11-08  Andrey Belevantsev  <abel@ispras.ru>

        PR rtl-optimization/45352
        * sel-sched.c (find_best_expr): Do not set pneed_stall when
        the variable_issue hook is not implemented.
        (fill_insns): Remove dead variable stall_iterations.
        (init_seqno_1): Force EBB start for resetting sched cycles on any
        successor blocks of the rescheduled region.
        (sel_sched_region_1): Use bitmap_bit_p instead of bitmap_clear_bit.
        (reset_sched_cycles_in_current_ebb): Add debug printing.
        New variable issued_insns.  Advance state when we have issued
        issue_rate insns.


Added:
    branches/gcc-4_5-branch/gcc/testsuite/gcc.dg/pr45352-1.c
    branches/gcc-4_5-branch/gcc/testsuite/gcc.dg/pr45352-2.c
    branches/gcc-4_5-branch/gcc/testsuite/gcc.dg/pr45352-3.c
    branches/gcc-4_5-branch/gcc/testsuite/gcc.dg/pr45352.c
    branches/gcc-4_5-branch/gcc/testsuite/gcc.dg/pr46521.c
    branches/gcc-4_5-branch/gcc/testsuite/gcc.dg/pr46522.c
    branches/gcc-4_5-branch/gcc/testsuite/gcc.target/i386/pr45352-1.c
    branches/gcc-4_5-branch/gcc/testsuite/gcc.target/i386/pr45352-2.c
    branches/gcc-4_5-branch/gcc/testsuite/gcc.target/i386/pr45352.c
Modified:
    branches/gcc-4_5-branch/gcc/ChangeLog
    branches/gcc-4_5-branch/gcc/sel-sched.c
    branches/gcc-4_5-branch/gcc/testsuite/ChangeLog