Bug 86635 - [avr] Miscompilation with __memx and libgcc float function __gtsf2
Summary: [avr] Miscompilation with __memx and libgcc float function __gtsf2
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: unknown
: P3 normal
Target Milestone: ---
Assignee: Senthil Kumar Selvaraj
Keywords: wrong-code
: 65657 87376 (view as bug list)
Depends on:
Reported: 2018-07-23 06:51 UTC by Senthil Kumar Selvaraj
Modified: 2019-02-19 17:51 UTC (History)
2 users (show)

See Also:
Target: avr
Known to work:
Known to fail:
Last reconfirmed: 2018-07-23 00:00:00

pr86635.patch (276 bytes, patch)
2018-07-23 13:05 UTC, Senthil Kumar Selvaraj
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Senthil Kumar Selvaraj 2018-07-23 06:51:05 UTC
A libgcc float function invocation (_e.g. _gtsf2) with one of its arguments in the __memx address space is miscompiled - that argument is never loaded/passed to the function.

In the below case, a does not get loaded from memory, and it's value is not set in the argument registers.

$ cat test.c
extern const  __memx float a;
extern const float b;

int diff () { return a > b; }

$ avr-gcc -Os -mmcu=atmega328p -S -o -
        push r28
        push r29
/* prologue: function */
/* frame size = 0 */
/* stack size = 2 */
        ldi r28,lo8(1)
        ldi r29,0
        lds r18,b
        lds r19,b+1
        lds r20,b+2
        lds r21,b+3
        call __gtsf2
        cp __zero_reg__,r24
        brlt .L2
        ldi r29,0
        ldi r28,0
        movw r24,r28
/* epilogue start */
        pop r29
        pop r28
Comment 1 Senthil Kumar Selvaraj 2018-07-23 13:05:32 UTC
Created attachment 44422 [details]

Looks like ud_dce removes the insn that sets reg:SF r22 because the insn says r22 is clobbered. The below insn is in the previous pass dump (init-regs), and ud_dce deletes insns 8, presumably because the output register is clobbered by the insn.

(insn 8 7 9 2 (parallel [
            (set (reg:SF 22 r22)
                (mem/u/c:SF (reg/f:PSI 47) [1 a+0 S4 A8 AS7]))
            (clobber (reg:SF 22 r22))
            (clobber (reg:QI 21 r21))
            (clobber (reg:HI 30 r30))
        ]) "test.c":4 36 {xloadsf_A}
     (expr_list:REG_DEAD (reg/f:PSI 47)
        (expr_list:REG_UNUSED (reg:HI 30 r30)
            (expr_list:REG_EQUAL (mem/u/c:SF (symbol_ref:PSI ("a") [flags 0xe40]  <var_decl 0x7fabf444c900 a>) [1 a+0 S4 A8 AS7])

This pattern is generated via a gen_xload<mode>_A call in mov<mode> expander, so adding constraints to xload<mode>_A will not help.

Forcing the dest to be a pseudo (attached patch) fixes the problem - other passes see the clobber, remove it and use reg:SF r22 as the output reg.
Comment 2 Georg-Johann Lay 2018-07-23 14:30:31 UTC
Hi, in expand dump there is

(insn 8 7 9 (parallel [
            (set (reg:SF 22 r22)
                (mem/u/c:SF (reg/f:PSI 47) [1 a+0 S4 A8 AS7]))
            (clobber (reg:SF 22 r22))
            (clobber (reg:QI 21 r21))
            (clobber (reg:HI 30 r30))
        ]) "foo.c":4 -1

so the problem is that the middle-end provides a hard reg as target that overlaps one of the interface regs.  This reminds me of PR63633 / PR65657.  PR63633 fixed the 3-operand insns case that use hard regs used in the transparent libgcc calls.

The intention of the clobber of reg 22 (and the other clobbers) is to keep passes from propagating anything that overlaps the clobbers into an operand of the insn, xload<mode>_A at that time. The very libcall (xload_<mode>_libgcc) with its proper operands is generated from that insn during .split1.

FYI, I tried the "proper" solution (use 1-reg constraints etc. and let reg-alloc do the job) several times and with different versions of gcc, and I always failed miserably: the code bloat was not acceptable, in particular with DImode (cf. also PR85805).  Lest alone all the spill fails to expect with such a solution.

Thank you for taking care of this; at the moment all I can do is commenting from the distance and try to help with analysis and the rationals behind the current code.
Comment 3 Georg-Johann Lay 2018-07-23 14:53:40 UTC
As a work-around -fno-tree-ter appears to work.
Comment 4 Georg-Johann Lay 2018-10-14 18:12:01 UTC
*** Bug 87376 has been marked as a duplicate of this bug. ***
Comment 5 Georg-Johann Lay 2018-10-14 18:13:36 UTC
*** Bug 65657 has been marked as a duplicate of this bug. ***