[PATCH v2, target]: Fix PR 88070, ICE in create_pre_exit, at mode-switching.c:438
Uros Bizjak
ubizjak@gmail.com
Tue Nov 20 19:40:00 GMT 2018
Hello!

The attached patch takes a different approach to the problem of split return
value copies in create_pre_exit. It turns out that for the vzeroupper
insertion pass we don't actually need to insert a mode switch before the
return value copy; it is enough to split the fallthrough edge to the exit
block, so that the vzeroupper can be emitted on the function's exit edge.

Since x86 is the only target that runs optimize mode switching after
reload, I took the liberty of using !reload_completed as the condition
guarding the search for the return value copy, so the search is skipped
after reload. This is documented with a big comment, as evident from the
patch.
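For reference, here is the kind of vector-returning function involved, as a
standalone sketch: the function body is essentially the new test case below,
and the driver in main is a hypothetical addition of mine just to make it
runnable (no AVX options are needed merely to execute it; the ICE required
the specific flags in the dg-options line).

```c
#include <assert.h>

/* The small vector type from the PR 88070 test case.  */
typedef float vfloat2 __attribute__ ((__vector_size__ (2 * sizeof (float))));

/* Returning a vector by value: the final copy of v into the return
   register is the "return value copy" that create_pre_exit searched
   for before this patch.  Under the options in the new test
   (-mavx -fschedule-insns ...), sched1 could move that copy away from
   its USE insn, which is what triggered the ICE.  */
vfloat2
test1float2 (float c)
{
  vfloat2 v = { c, c };
  return v;
}
```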
2018-11-20 Uros Bizjak <ubizjak@gmail.com>
PR target/88070
* mode-switching.c (create_pre_exit): After reload, always split the
fallthrough edge to the exit block.
testsuite/ChangeLog:
2018-11-20 Uros Bizjak <ubizjak@gmail.com>
PR target/88070
* gcc.target/i386/pr88070.c: New test.
Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
Committed to mainline SVN.
Uros.
-------------- next part --------------
Index: mode-switching.c
===================================================================
--- mode-switching.c (revision 266278)
+++ mode-switching.c (working copy)
@@ -248,8 +248,22 @@ create_pre_exit (int n_entities, int *entity_map,
gcc_assert (!pre_exit);
/* If this function returns a value at the end, we have to
insert the final mode switch before the return value copy
- to its hard register. */
- if (EDGE_COUNT (EXIT_BLOCK_PTR_FOR_FN (cfun)->preds) == 1
+ to its hard register.
+
+ x86 targets use mode-switching infrastructure to
+ conditionally insert vzeroupper instruction at the exit
+ from the function where there is no need to switch the
+ mode before the return value copy. The vzeroupper insertion
+ pass runs after reload, so use !reload_completed as a stand-in
+ for x86 to skip the search for the return value copy insn.
+
+ N.b.: the code below assumes that the return copy insn
+ immediately precedes its corresponding use insn. This
+ assumption does not hold after reload, since sched1 pass
+ can schedule the return copy insn away from its
+ corresponding use insn. */
+ if (!reload_completed
+ && EDGE_COUNT (EXIT_BLOCK_PTR_FOR_FN (cfun)->preds) == 1
&& NONJUMP_INSN_P ((last_insn = BB_END (src_bb)))
&& GET_CODE (PATTERN (last_insn)) == USE
&& GET_CODE ((ret_reg = XEXP (PATTERN (last_insn), 0))) == REG)
Index: testsuite/gcc.target/i386/pr88070.c
===================================================================
--- testsuite/gcc.target/i386/pr88070.c (nonexistent)
+++ testsuite/gcc.target/i386/pr88070.c (working copy)
@@ -0,0 +1,12 @@
+/* PR target/88070 */
+/* { dg-do compile } */
+/* { dg-options "-O -fexpensive-optimizations -fnon-call-exceptions -fschedule-insns -fno-dce -fno-dse -mavx" } */
+
+typedef float vfloat2 __attribute__ ((__vector_size__ (2 * sizeof (float))));
+
+vfloat2
+test1float2 (float c)
+{
+ vfloat2 v = { c, c };
+ return v;
+}