This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
- From: Hans-Peter Nilsson <hp at bitrange dot com>
- To: gcc-patches at gcc dot gnu dot org
- Date: Sun, 11 Nov 2012 23:49:34 -0500 (EST)
- Subject: [RFA:] fix PR55030, wrong code from __builtin_setjmp_receiver
The problem exposed in PR55030 (repeatable on x86_64-linux with -m32 at
r192676) is that the fake-frame-pointer "frame" is replaced with the
actual-frame-pointer "bp" in cse1, around the critical insn in
__builtin_setjmp_receiver that restores their defined offset. The
patch in PR55030/r192676 removed a clobber of the frame-pointer, and
cse1 felt free to replace the former register with the latter, as in
the following expansion of __builtin_setjmp_receiver (in builtins.c:
expand_builtin_setjmp_receiver) from a reduced test-case, see the PR.
(You might find the setup/restore code having an unexpected order;
there are two __builtin_setjmps in series and the setup of the second
one is storing the wrong value for the frame-pointer.)
The pre-transformation dump in cse1 says:
(code_label/s 13 51 14 3 2 "" [2 uses])
(note 14 13 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 15 14 78 3 (use (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 -1
(nil))
(insn 78 15 17 3 (set (reg/f:SI 20 frame)
(reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 -1
(nil))
(insn 17 78 56 3 (unspec_volatile [
(const_int 0 [0])
] UNSPECV_BLOCKAGE) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 643 {blockage}
(nil))
(insn 56 17 57 3 (set (reg/f:SI 74)
(symbol_ref:SI ("chk_fail_buf") [flags 0x40] <var_decl 0x7ffff7512688 chk_fail_buf>)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:37 65 {*movsi_internal}
(nil))
(insn 57 56 58 3 (set (mem:SI (reg/f:SI 74) [5 S4 A8])
(reg/f:SI 20 frame)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:37 65 {*movsi_internal}
(nil))
Before r192676 and after the reversion in r192701, there was/is that
weird extra "(clobber (reg/f:SI 6 bp)" inserted before insn 17. But
without that, we notice post-cse1-transformation, a change in insn 57:
(code_label/s 13 51 14 3 2 "" [3 uses])
(note 14 13 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 15 14 78 3 (use (reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 -1
(nil))
(insn 78 15 17 3 (set (reg/f:SI 20 frame)
(reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 -1
(nil))
(insn 17 78 56 3 (unspec_volatile [
(const_int 0 [0])
] UNSPECV_BLOCKAGE) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:31 643 {blockage}
(nil))
(insn 56 17 57 3 (set (reg/f:SI 74)
(symbol_ref:SI ("chk_fail_buf") [flags 0x40] <var_decl 0x7ffff7512688 chk_fail_buf>)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:37 65 {*movsi_internal}
(nil))
(insn 57 56 58 3 (set (mem:SI (reg/f:SI 74) [5 S4 A8])
(reg/f:SI 6 bp)) /home/hp/gcctop/tmp/base/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/pr55030-chk.c:37 65 {*movsi_internal}
(nil))
It seemed wrong for this change to be stopped *only* by that (clobber bp).
Stepping through the code for insn 78 and 57, I was looking for
related special-casing of frame- and other similar registers in cse.c
and was a bit confused when I found none (not counting hash_rtx_cb).
(I admit lack of special-casing is a good sign, though. :) Then I was
blinded by the obvious; the next insn, insn 17, is a "blockage" insn.
This is a target-specific blockage insn, but with the general form
found in all targets defining it. The default blockage is an empty
asm-volatile, which is what cse_insn recognized. The blockage insn is
there to "prevent scheduling" of the critical insns and register
values. It's almost obvious that an unspec_volatile should have that
effect "too"; at least that's intended by the code in
expand_builtin_setjmp_receiver. Luckily (or unluckily regarding the
presence of the bug) *this* cse code is not run post-frame-layout
(post-reload-cse does not use this code), or it seems people would
soon notice register values used from the wrong side of the blockage,
considering its critical use at the restore of the stack-pointer.
As mentioned in the previous patch,
<http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01172.html>, clobbering
the frame-pointer (and then using it) does not seem the logical
alternative to the patch below; the blockage insn should just do its job.
I updated comments and documentation so it's more apparent that
register uses should also not be moved across the blockage; see the
patched comments.
Tested native x86_64-unknown-linux-gnu w./wo. -m32 and before/after
192677. Ok to commit?
gcc:
PR middle-end/55030
* builtins.c (expand_builtin_setjmp_receiver): Update comment
regarding purpose of blockage.
* emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for
the head comment.
* doc/md.texi (blockage): Update similarly. Change wording to
require one of two forms, rather than implying a wider choice.
* cse.c (cse_insn): Where checking for blocking insns, treat
UNSPEC_VOLATILE as blocking, besides volatile ASM.
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c (revision 192677)
+++ gcc/builtins.c (working copy)
@@ -963,7 +963,8 @@ expand_builtin_setjmp_receiver (rtx rece
/* We must not allow the code we just generated to be reordered by
scheduling. Specifically, the update of the frame pointer must
- happen immediately, not later. */
+ happen immediately, not later. Similarly, we must block
+ (frame-related) register values to be used across this code. */
emit_insn (gen_blockage ());
}
Index: gcc/cse.c
===================================================================
--- gcc/cse.c (revision 192677)
+++ gcc/cse.c (working copy)
@@ -5660,10 +5660,11 @@ cse_insn (rtx insn)
invalidate (XEXP (dest, 0), GET_MODE (dest));
}
- /* A volatile ASM invalidates everything. */
+ /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything. */
if (NONJUMP_INSN_P (insn)
- && GET_CODE (PATTERN (insn)) == ASM_OPERANDS
- && MEM_VOLATILE_P (PATTERN (insn)))
+ && ((GET_CODE (PATTERN (insn)) == ASM_OPERANDS
+ && MEM_VOLATILE_P (PATTERN (insn)))
+ || GET_CODE (PATTERN (insn)) == UNSPEC_VOLATILE))
flush_hash_table ();
/* Don't cse over a call to setjmp; on some machines (eg VAX)
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c (revision 192677)
+++ gcc/emit-rtl.c (working copy)
@@ -364,8 +364,8 @@ get_reg_attrs (tree decl, int offset)
#if !HAVE_blockage
-/* Generate an empty ASM_INPUT, which is used to block attempts to schedule
- across this insn. */
+/* Generate an empty ASM_INPUT, which is used to block attempts to schedule,
+ and to block register equivalences to be seen across this insn. */
rtx
gen_blockage (void)
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi (revision 192677)
+++ gcc/doc/md.texi (working copy)
@@ -5832,8 +5832,9 @@ the values of operands 1 and 2.
@item @samp{blockage}
This pattern defines a pseudo insn that prevents the instruction
-scheduler from moving instructions across the boundary defined by the
-blockage insn. Normally an UNSPEC_VOLATILE pattern.
+scheduler and other passes from moving instructions and using register
+equivalences across the boundary defined by the blockage insn.
+This needs to be an UNSPEC_VOLATILE pattern or a volatile ASM.
@cindex @code{memory_barrier} instruction pattern
@item @samp{memory_barrier}
brgds, H-P