Bug 70184 - Explicit register variables holding function arguments overwritten by conversion libcall
Summary: Explicit register variables holding function arguments overwritten by convers...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: inline-asm (show other bugs)
Version: 6.0
: P3 normal
Target Milestone: 6.3
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2016-03-11 09:55 UTC by ktkachov
Modified: 2017-06-21 11:11 UTC (History)
2 users (show)

See Also:
Host:
Target: arm
Build:
Known to work:
Known to fail: 4.8.5, 4.9.4, 5.3.1, 6.0
Last reconfirmed: 2016-03-31 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description ktkachov 2016-03-11 09:55:09 UTC
Consider the testcase:
void foo(unsigned a, float b)
{
  unsigned c = (unsigned) b;
  register unsigned r0 __asm__("r0") = a;
  register unsigned r1 __asm__("r1") = c;
    __asm__ volatile( "str %[r1], [%[r0]]\n"
                      :
                      : [r0] "r" (r0),
                        [r1] "r" (r1));
}

Compile for an arm target with (for example) -Os -mfloat-abi=soft -march=armv7-a -marm.
This generates:
foo:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        push    {r4, lr}
        mov     r0, r1
        bl      __aeabi_f2uiz
        mov     r1, r0
        .syntax divided
@ 7 "foo.c" 1
        str r1, [r0]

@ 0 "" 2
        .syntax unified
        pop     {r4, pc}

The register r0 holding 'a' gets overwritten as part of the setup for the rounding libcall so when it gets tied to the variable 'r0' expecting it to hold 'a' it gets garbage instead.

This testcase is warned about in:
https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables

but the code above explicitly follows the workaround suggested on that page so I'd expect it to work.
Is this indeed a supported usecase of local register variables? and if so, is there a workaround for this bug?
Comment 1 ktkachov 2016-03-11 10:01:02 UTC
Things are bad already at expand time:
    2: r111:SI=r0:SI
    3: r112:SF=r1:SF
    4: NOTE_INSN_FUNCTION_BEG
    7: r0:SI=r111:SI
    8: r0:SF=r112:SF
    9: r0:SI=call [`__aeabi_f2uiz'] argc:0
      REG_CALL_DECL `__aeabi_f2uiz'
      REG_EH_REGION 0xffffffff80000000
   10: r113:SI=r0:SI
      REG_EQUAL uns_fix(r112:SF)
   11: r1:SI=r113:SI
   12: asm_operands
Comment 2 Andrew Pinski 2016-03-11 10:02:41 UTC
Does -fno-ter fix the issue?
How does the tree level look?
Comment 3 ktkachov 2016-03-11 10:07:46 UTC
(In reply to Andrew Pinski from comment #2)
> Does -fno-ter fix the issue?
> How does the tree level look?

You mean -fno-tree-ter?
That gives:
foo:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        push    {r4, lr}
        mov     r4, r0
        mov     r0, r1
        bl      __aeabi_f2uiz
        mov     r1, r0
        mov     r0, r4
        .syntax divided
@ 7 "foo.c" 1
        str r1, [r0]

@ 0 "" 2
        .syntax unified
        pop     {r4, pc}

Which looks correct, thanks!

The optimized tree dump looks right to me with or without -fno-tree-ter

;; Function foo (foo, funcdef_no=0, decl_uid=4090, cgraph_uid=0, symbol_order=0)

foo (unsigned int a, float b)
{
  register unsigned int r1 __asm__ (*r1);
  register unsigned int r0 __asm__ (*r0);
  unsigned int c;

  <bb 2>:
  c_2 = (unsigned int) b_1(D);
  r0 = a_4(D);
  r1 = c_2;
  __asm__ __volatile__("str %1, [%0]
" :  : "r0" "r" r0, "r1" "r" r1);
  return;

}
Comment 4 Richard Biener 2016-03-11 11:43:33 UTC
With TER we delay expanding of (unsigned b) until you require its expansion
during asm op expansion (and thus may be interleaved with asm expansion code).

I believe we had this issue in the past for other archs and libcalls.  Don't
remember what the solution there was though ;)  IIRC it was libcalls
expanded during libcall expansion or so like for a + (b * c) and both
+ and * resulting in a libcall.
Comment 5 Mikael Pettersson 2016-03-12 08:57:16 UTC
Looks similar to PR48863.
Comment 6 Ramana Radhakrishnan 2016-03-31 14:01:52 UTC
Confirmed - but not sure if this is a dup of PR48863
Comment 7 ktkachov 2016-11-23 10:29:50 UTC
(In reply to Richard Biener from comment #4)
> With TER we delay expanding of (unsigned b) until you require its expansion
> during asm op expansion (and thus may be interleaved with asm expansion
> code).
> 
> I believe we had this issue in the past for other archs and libcalls.  Don't
> remember what the solution there was though ;)  IIRC it was libcalls
> expanded during libcall expansion or so like for a + (b * c) and both
> + and * resulting in a libcall.

I see that tree-ssa-ter.c:find_replaceable_in_bb already tries to avoid replacing across calls but it doesn't take into account expressions that expand into libcalls and doesn't consider register variables.
I don't think there's an easy way to query from gimple if an arbitrary expression will result in a libcall, maybe the solution here is to not allow TER across definitions of register variables? Or is that too big a hammer?
Comment 8 ktkachov 2016-11-24 15:23:05 UTC
Author: ktkachov
Date: Thu Nov 24 15:22:34 2016
New Revision: 242840

URL: https://gcc.gnu.org/viewcvs?rev=242840&root=gcc&view=rev
Log:
[TER] PR target/48863 : Don't replace expressions across local register variable definitions

	PR target/48863
	PR inline-asm/70184
	* tree-ssa-ter.c (temp_expr_table): Add reg_vars_cnt field.
	(new_temp_expr_table): Initialise reg_vars_cnt.
	(free_temp_expr_table): Release reg_vars_cnt.
	(process_replaceable): Add reg_vars_cnt argument, set reg_vars_cnt
	field of TAB.
	(find_replaceable_in_bb): Use the above to record register variable
	write occurrences and cancel replacement across them.

	* gcc.target/arm/pr48863.c: New test.


Added:
    trunk/gcc/testsuite/gcc.target/arm/pr48863.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-ter.c
Comment 9 ktkachov 2016-12-01 13:47:45 UTC
Author: ktkachov
Date: Thu Dec  1 13:47:13 2016
New Revision: 243109

URL: https://gcc.gnu.org/viewcvs?rev=243109&root=gcc&view=rev
Log:
[TER] PR target/48863: Don't replace expressions across local register variable definitions

	Backport from mainline
	2016-11-24  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

	PR target/48863
	PR inline-asm/70184
	* tree-ssa-ter.c (temp_expr_table): Add reg_vars_cnt field.
	(new_temp_expr_table): Initialise reg_vars_cnt.
	(free_temp_expr_table): Release reg_vars_cnt.
	(process_replaceable): Add reg_vars_cnt argument, set reg_vars_cnt
	field of TAB.
	(find_replaceable_in_bb): Use the above to record register variable
	write occurrences and cancel replacement across them.

	* gcc.target/arm/pr48863.c: New test.


Added:
    branches/gcc-6-branch/gcc/testsuite/gcc.target/arm/pr48863.c
Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/testsuite/ChangeLog
    branches/gcc-6-branch/gcc/tree-ssa-ter.c
Comment 10 ktkachov 2017-06-19 14:57:30 UTC
Fixed.
Comment 11 Ramana Radhakrishnan 2017-06-21 11:11:55 UTC
Fixed in 6.3