This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



[RFC, LRA] Incorrect subreg resolution?


Hi,

When I relax CANNOT_CHANGE_MODE_CLASS by leaving it undefined for AArch64, gcc.c-torture/execute/copysign1.c is miscompiled because LRA does not seem to handle subregs like

 (subreg:DI (reg:TF hard_reg) 8)

on hard registers where the subreg byte offset is not aligned to a hard register boundary (16 bytes for the AArch64 SIMD registers). It seems to quietly ignore the offset of 8 and resolves the subreg to an incorrect hard register during reload.

When I compile this test with -O3,

long double
cl (long double x, long double y)
{
  return __builtin_copysignl (x, y);
}

cs.c.213r.ira:

(insn 26 10 33 2 (set (reg:DI 87 [ y+8 ])
        (subreg:DI (reg:TF 33 v1 [ y ]) 8)) cs.c:4 34 {*movdi_aarch64}
     (expr_list:REG_DEAD (reg:TF 33 v1 [ y ])
        (nil)))
(insn 33 26 35 2 (set (reg:TF 93)
        (reg:TF 32 v0 [ x ])) cs.c:4 40 {*movtf_aarch64}
     (expr_list:REG_DEAD (reg:TF 32 v0 [ x ])
        (nil)))
(insn 35 33 34 2 (set (reg:DI 92 [ x+8 ])
        (subreg:DI (reg:TF 93) 8)) cs.c:4 34 {*movdi_aarch64}
     (nil))
(insn 34 35 23 2 (set (reg:DI 91 [ x ])
        (subreg:DI (reg:TF 93) 0)) cs.c:4 34 {*movdi_aarch64}
     (expr_list:REG_DEAD (reg:TF 93)
        (nil)))
....

cs.c.214r.reload:

(insn 26 10 33 2 (set (reg:DI 2 x2 [orig:87 y+8 ] [87])
        (reg:DI 33 v1 [ y+8 ])) cs.c:4 34 {*movdi_aarch64}
     (nil))
(insn 33 26 35 2 (set (reg:TF 0 x0 [93])
        (reg:TF 32 v0 [ x ])) cs.c:4 40 {*movtf_aarch64}
     (nil))
(insn 35 33 34 2 (set (reg:DI 1 x1 [orig:92 x+8 ] [92])
        (reg:DI 1 x1 [+8 ])) cs.c:4 34 {*movdi_aarch64}
     (nil))
(insn 34 35 8 2 (set (reg:DI 0 x0 [orig:91 x ] [91])
        (reg:DI 0 x0 [93])) cs.c:4 34 {*movdi_aarch64}
     (nil))
.....

You can see how insn 26 changes across reload: the SUBREG_BYTE offset of 8 seems to have been translated to v1 instead of v1.d[1] by get_hard_regno ().

What's interesting here is that the SUBREG_BYTE that is generated for

(subreg:DI (reg:TF 33 v1 [ y ]) 8)

isn't aligned to a hard register boundary on the SIMD registers, where UNITS_PER_VREG for AArch64 is 16. Therefore, when this subreg is resolved, it resolves to v1 instead of v1.d[1]. Is something going wrong in LRA, or is there a more fundamental problem with generating subregs of hard registers with unaligned subreg byte offsets? The same subreg on a pseudo works because, in insn 33, the TF-mode pseudo is allocated integer registers (x0/x1), where a byte offset of 8 falls exactly on a register boundary.
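
To make the arithmetic concrete, here is a minimal sketch of what I think is happening. This is a simplified model, not GCC's actual code: resolve_subreg and struct subreg_result are hypothetical names of my own, and I am assuming the resolution amounts to an integer division of the byte offset by the register size.

```c
#include <assert.h>

/* Hypothetical, simplified model of resolving (subreg (reg hard_reg) BYTE)
   to a hard register number.  Not taken from GCC sources.  */
struct subreg_result
{
  int regno;       /* hard register the subreg resolves to */
  int lost_bytes;  /* part of the offset that no register step can express */
};

static struct subreg_result
resolve_subreg (int base_regno, int subreg_byte, int units_per_reg)
{
  struct subreg_result r;

  assert (units_per_reg > 0 && subreg_byte >= 0);
  /* Whole registers skipped by the offset.  */
  r.regno = base_regno + subreg_byte / units_per_reg;
  /* A non-zero remainder means the offset points inside a register;
     if it is dropped, the subreg silently becomes the low part.  */
  r.lost_bytes = subreg_byte % units_per_reg;
  return r;
}
```

With 8-byte general registers, resolve_subreg (0, 8, 8) gives regno 1 with nothing lost, which matches x1 in insn 35. With 16-byte vector registers, resolve_subreg (33, 8, 16) gives regno 33 with 8 bytes lost, which matches v1 replacing v1.d[1] in insn 26.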

Thanks,
Tejas Belagod
ARM.

