Consider the testcase: void foo(unsigned a, float b) { unsigned c = (unsigned) b; register unsigned r0 __asm__("r0") = a; register unsigned r1 __asm__("r1") = c; __asm__ volatile( "str %[r1], [%[r0]]\n" : : [r0] "r" (r0), [r1] "r" (r1)); } Compile for an arm target with (for example) -Os -mfloat-abi=soft -march=armv7-a -marm. This generates: foo: @ args = 0, pretend = 0, frame = 0 @ frame_needed = 0, uses_anonymous_args = 0 push {r4, lr} mov r0, r1 bl __aeabi_f2uiz mov r1, r0 .syntax divided @ 7 "foo.c" 1 str r1, [r0] @ 0 "" 2 .syntax unified pop {r4, pc} The register r0 holding 'a' gets overwritten as part of the setup for the rounding libcall so when it gets tied to the variable 'r0' expecting it to hold 'a' it gets garbage instead. This testcase is warned about in: https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables but the code above explicitly follows the workaround suggested on that page so I'd expect it to work. Is this indeed a supported usecase of local register variables? and if so, is there a workaround for this bug?
Things are bad already at expand time: 2: r111:SI=r0:SI 3: r112:SF=r1:SF 4: NOTE_INSN_FUNCTION_BEG 7: r0:SI=r111:SI 8: r0:SF=r112:SF 9: r0:SI=call [`__aeabi_f2uiz'] argc:0 REG_CALL_DECL `__aeabi_f2uiz' REG_EH_REGION 0xffffffff80000000 10: r113:SI=r0:SI REG_EQUAL uns_fix(r112:SF) 11: r1:SI=r113:SI 12: asm_operands
Does -fno-ter fix the issue? How does the tree level look?
(In reply to Andrew Pinski from comment #2) > Does -fno-ter fix the issue? > How does the tree level look? You mean -fno-tree-ter? That gives: foo: @ args = 0, pretend = 0, frame = 0 @ frame_needed = 0, uses_anonymous_args = 0 push {r4, lr} mov r4, r0 mov r0, r1 bl __aeabi_f2uiz mov r1, r0 mov r0, r4 .syntax divided @ 7 "foo.c" 1 str r1, [r0] @ 0 "" 2 .syntax unified pop {r4, pc} Which looks correct, thanks! The optimized tree dump looks right to me with or without -fno-tree-ter ;; Function foo (foo, funcdef_no=0, decl_uid=4090, cgraph_uid=0, symbol_order=0) foo (unsigned int a, float b) { register unsigned int r1 __asm__ (*r1); register unsigned int r0 __asm__ (*r0); unsigned int c; <bb 2>: c_2 = (unsigned int) b_1(D); r0 = a_4(D); r1 = c_2; __asm__ __volatile__("str %1, [%0] " : : "r0" "r" r0, "r1" "r" r1); return; }
With TER we delay expanding of (unsigned b) until you require its expansion during asm op expansion (and thus may be interleaved with asm expansion code). I believe we had this issue in the past for other archs and libcalls. Don't remember what the solution there was though ;) IIRC it was libcalls expanded during libcall expansion or so like for a + (b * c) and both + and * resulting in a libcall.
Looks similar to PR48863.
Confirmed - but not sure if this is a dup of PR48863
(In reply to Richard Biener from comment #4) > With TER we delay expanding of (unsigned b) until you require its expansion > during asm op expansion (and thus may be interleaved with asm expansion > code). > > I believe we had this issue in the past for other archs and libcalls. Don't > remember what the solution there was though ;) IIRC it was libcalls > expanded during libcall expansion or so like for a + (b * c) and both > + and * resulting in a libcall. I see that tree-ssa-ter.c:find_replaceable_in_bb already tries to avoid replacing across calls but it doesn't take into account expressions that expand into libcalls and doesn't consider register variables. I don't think there's an easy way to query from gimple if an arbitrary expression will result in a libcall, maybe the solution here is to not allow TER across definitions of register variables? Or is that too big a hammer?
Author: ktkachov Date: Thu Nov 24 15:22:34 2016 New Revision: 242840 URL: https://gcc.gnu.org/viewcvs?rev=242840&root=gcc&view=rev Log: [TER] PR target/48863 : Don't replace expressions across local register variable definitions PR target/48863 PR inline-asm/70184 * tree-ssa-ter.c (temp_expr_table): Add reg_vars_cnt field. (new_temp_expr_table): Initialise reg_vars_cnt. (free_temp_expr_table): Release reg_vars_cnt. (process_replaceable): Add reg_vars_cnt argument, set reg_vars_cnt field of TAB. (find_replaceable_in_bb): Use the above to record register variable write occurrences and cancel replacement across them. * gcc.target/arm/pr48863.c: New test. Added: trunk/gcc/testsuite/gcc.target/arm/pr48863.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-ter.c
Author: ktkachov Date: Thu Dec 1 13:47:13 2016 New Revision: 243109 URL: https://gcc.gnu.org/viewcvs?rev=243109&root=gcc&view=rev Log: [TER] PR target/48863: Don't replace expressions across local register variable definitions Backport from mainline 2016-11-24 Kyrylo Tkachov <kyrylo.tkachov@arm.com> PR target/48863 PR inline-asm/70184 * tree-ssa-ter.c (temp_expr_table): Add reg_vars_cnt field. (new_temp_expr_table): Initialise reg_vars_cnt. (free_temp_expr_table): Release reg_vars_cnt. (process_replaceable): Add reg_vars_cnt argument, set reg_vars_cnt field of TAB. (find_replaceable_in_bb): Use the above to record register variable write occurrences and cancel replacement across them. * gcc.target/arm/pr48863.c: New test. Added: branches/gcc-6-branch/gcc/testsuite/gcc.target/arm/pr48863.c Modified: branches/gcc-6-branch/gcc/ChangeLog branches/gcc-6-branch/gcc/testsuite/ChangeLog branches/gcc-6-branch/gcc/tree-ssa-ter.c
Fixed.
Fixed in 6.3