This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Haifa Scheduler] Fix latent bug in macro-fusion/instruction grouping


Hi,

While trying to force some particular conditions out of the scheduler
on a private branch, I was toying with some very aggressive
macro-fusion patterns. In doing so, I hit on a latent bug which can
cause miscompilation.

The exact conditions for the bug seem to be:

We go in to macro fusion with a previous pattern which can be rewritten
as a function of the current pattern. That is to say, we have something
like:

  x21 = f00f
  x1 = load (x21 + 16)
  x21 = x21 + 8

Which could be rewritten as:

  x21 = f00f
  x21 = x21 + 8
  x1 = load (x21 + 8)

If we choose to fuse the instructions, we set the SCHED_GROUP_P
flag on x21 = 21 + 8.

recompute_todo_spec will potentially apply the rewriting above, and
will apply it regardless of SCHED_GROUP_P. It also will not clear
the flag, so we end up pulling the fused instruction to the head of
the wrong scheduling group.

After scheduling, this gives the code:

  x21 = x21 + 8
  x21 = f00f
  x1 = load (x21 + 8)

Which is clearly bogus. To show this concretely with scheduler dumps
(with some additional debug information):

;;   ======================================================
;;   -- basic block 8 from 339 to 344 -- after reload
;;   ======================================================

;;	| insn | prio |
;;	|  339 |    3 | x21=x19+low(`*.LANCHOR0')      ca57_sx1|ca57_sx2
;;	|  341 |    2 | x21=x21+0x8                    ca57_sx1|ca57_sx2
;;	|  342 |    1 | x1=[x21+0x10]                  ca57_load_model
;;	|  204 |    0 | x21=x21+0x8                    ca57_sx1|ca57_sx2
;;	|  344 |    0 | pc={(x1==0)?L4717:pc}          ca57_bx
;;   --------------- forward dependences: ------------

;;   --- Region Dependences --- b 8 bb 0
;;      insn  code    bb   dep  prio  cost   reservation
;;      ----  ----    --   ---  ----  ----   -----------
;;      339   739     8     0     3     2   ca57_sx1|ca57_sx2	: 344 341m
;;      341    87     8     1     2     2   ca57_sx1|ca57_sx2	: 344 342m
;;      342    40     8     1     1     5   ca57_load_model	: 344m 204
;;   +  204    87     8     1     0     2   ca57_sx1|ca57_sx2	: 344
;;      344    17     8     4     0     0   ca57_bx	:

;;		Looking at: insn 204
;;	 Can rewrite predecessor of :: 204 :: from:

  (mem/f/c:DI (plus:DI (reg/f:DI 21 x21 [1023])
          (const_int 16 [0x10])) [3 MEM[(struct _Rb_tree_node_baseD.35435 * *)&_ZN12_GLOBAL__N_120section_workload_mapED.48605 + 16B]+0 S8 A64])

;;	 	TO

  (mem/f/c:DI (plus:DI (reg/f:DI 21 x21 [1023])
          (const_int 8 [0x8])) [3 MEM[(struct _Rb_tree_node_baseD.35435 * *)&_ZN12_GLOBAL__N_120section_workload_mapED.48605 + 16B]+0 S8 A64])

;;		dependencies resolved: insn 204
;;		tick updated: insn 204 into ready
;;		Ready list after queue_to_ready:  204:42:prio=0  339:39:prio=3
;;		Ready list after ready_sort:      204:42:prio=0  339:39:prio=3
;;	Ready list (t =   0):    204:42:prio=0  339:39:prio=3
;;		max_issue among 2 insns:  204:42:prio=0  339:39:prio=3

I'm fairly sure that it can't trigger on trunk with any of the
patterns in the AArch64/i386 back ends, and I can't give a reduced
testcase showing the bug (miscompilation of proprietary code :( ).

My fix is simply to clear the SCHED_GROUP_P flag if we do a rewrite
as above, this seems sensible, but it may not be - I don't know this
code, and I might be pessimising (or worse, breaking) code which
requires fusion between the re-ordered instructions.

I've bootstrapped the patch on x86_64 and aarch64-none-linux-gnu
with no issues.

Does this look sensible? I'm fairly sure my analysis of the bug is
sound, but I could imagine other possible fixes to the same issue.
I guess the most obvious would be to not allow the rewriting at all
for instructions with SCHED_GROUP_P set.

Thanks,
James

---
2015-02-06  James Greenhalgh  <james.greenhalgh@arm.com>

	* haifa-sched.c (recompute_todo_spec): After applying a
	replacement and cancelling a dependency, also clear the
	SCHED_GROUP_P flag.
diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index 75d2421..730a8db 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -1293,6 +1293,10 @@ recompute_todo_spec (rtx_insn *next, bool for_backtrack)
 		  apply_replacement (dep, true);
 		}
 	      DEP_STATUS (dep) |= DEP_CANCELLED;
+	      /* Also cancel any scheduler grouping, or we can
+		 end up falsely claiming we are independent of
+		 previous instructions.  */
+	      SCHED_GROUP_P (next) = 0;
 	    }
 	}
       return 0;

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]