[PATCH v2, target]: Fix PR 88070, ICE in create_pre_exit, at mode-switching.c:438

Uros Bizjak <ubizjak@gmail.com>
Tue Nov 20 19:40:00 GMT 2018


Hello!

The attached patch takes a different approach to the problem of split
return copies in create_pre_exit. It turns out that for the
vzeroupper insertion pass we don't actually need to insert a mode
switch before the return value copy; it is enough to split the
fallthrough edge to the exit block, so we can emit the vzeroupper on
the function exit edge.
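
For illustration (this is a simplified sketch using GCC's internal
CFG routines, not the exact mode-switching.c code), the fallback that
create_pre_exit takes once the return copy search is skipped boils
down to splitting the fallthrough edge into the exit block:

  /* Simplified sketch: with the search for the return value copy
     skipped, the fallthrough edge into the exit block is split and
     the new empty block becomes the pre-exit block where the final
     mode switch (the vzeroupper on x86) is emitted.  */
  edge eg;
  edge_iterator ei;
  basic_block pre_exit = NULL;

  FOR_EACH_EDGE (eg, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
    if (eg->flags & EDGE_FALLTHRU)
      /* split_edge inserts a fresh basic block on the edge, so insns
         emitted there execute exactly on the path to the exit.  */
      pre_exit = split_edge (eg);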

Since x86 is the only target that uses optimize mode switching after
reload, I took the liberty of guarding the search for the return
value copy with !reload_completed, so the search is simply skipped
after reload. Naturally, this comes with a big comment, as is evident
from the patch.
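
As a purely illustrative aside (not part of the patch), the kind of
function the post-reload vzeroupper pass cares about looks like the
example below; when compiled with -mavx, a vzeroupper is typically
inserted on the path to the function exit, which with this patch
lands on the split exit edge:

  #include <immintrin.h>

  /* Illustrative only: the 256-bit add dirties the upper YMM halves,
     so the vzeroupper insertion pass emits a vzeroupper before
     returning to possibly legacy-SSE callers.  */
  float
  first_of_sum (__m256 a, __m256 b)
  {
    __m256 t = _mm256_add_ps (a, b);
    return _mm_cvtss_f32 (_mm256_castps256_ps128 (t));
  }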

2018-11-20  Uros Bizjak  <ubizjak@gmail.com>

    PR target/88070
    * mode-switching.c (create_pre_exit): After reload, always split the
    fallthrough edge to the exit block.

testsuite/ChangeLog:

2018-11-20  Uros Bizjak  <ubizjak@gmail.com>

    PR target/88070
    * gcc.target/i386/pr88070.c: New test.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: mode-switching.c
===================================================================
--- mode-switching.c	(revision 266278)
+++ mode-switching.c	(working copy)
@@ -248,8 +248,22 @@ create_pre_exit (int n_entities, int *entity_map,
 	gcc_assert (!pre_exit);
 	/* If this function returns a value at the end, we have to
 	   insert the final mode switch before the return value copy
-	   to its hard register.  */
-	if (EDGE_COUNT (EXIT_BLOCK_PTR_FOR_FN (cfun)->preds) == 1
+	   to its hard register.
+
+	   x86 targets use mode-switching infrastructure to
+	   conditionally insert vzeroupper instruction at the exit
+	   from the function where there is no need to switch the
+	   mode before the return value copy.  The vzeroupper insertion
+	   pass runs after reload, so use !reload_completed as a stand-in
+	   for x86 to skip the search for the return value copy insn.
+
+	   N.b.: the code below assumes that the return copy insn
+	   immediately precedes its corresponding use insn.  This
+	   assumption does not hold after reload, since sched1 pass
+	   can schedule the return copy insn away from its
+	   corresponding use insn.  */
+	if (!reload_completed
+	    && EDGE_COUNT (EXIT_BLOCK_PTR_FOR_FN (cfun)->preds) == 1
 	    && NONJUMP_INSN_P ((last_insn = BB_END (src_bb)))
 	    && GET_CODE (PATTERN (last_insn)) == USE
 	    && GET_CODE ((ret_reg = XEXP (PATTERN (last_insn), 0))) == REG)
Index: testsuite/gcc.target/i386/pr88070.c
===================================================================
--- testsuite/gcc.target/i386/pr88070.c	(nonexistent)
+++ testsuite/gcc.target/i386/pr88070.c	(working copy)
@@ -0,0 +1,12 @@
+/* PR target/88070 */
+/* { dg-do compile } */
+/* { dg-options "-O -fexpensive-optimizations -fnon-call-exceptions -fschedule-insns -fno-dce -fno-dse -mavx" } */
+
+typedef float vfloat2 __attribute__ ((__vector_size__ (2 * sizeof (float))));
+
+vfloat2
+test1float2 (float c)
+{
+  vfloat2 v = { c, c };
+  return v;
+}

