This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
[analysis] PR target/17990: reload produces unaligned stack reference
- From: Uros Bizjak <uros at kss-loka dot si>
- To: gcc at gcc dot gnu dot org
- Date: Tue, 26 Oct 2004 08:27:31 +0200
Hello!
This simple testcase from PR target/17990 shows a bug in the reload pass:
float hmag[128],hphase[128];
void prepare(float *oscilFFTfreqs)
{
  int i;
  for (i=0;i<128;i++)
    *oscilFFTfreqs=-hmag[i]*sin(hphase[i])/2.0;
}
When compiled with '-O2 -msse', it emits an unaligned access to the stack:
...
movl $hmag, %ebx
subl $76, %esp
movss .LC0, %xmm0
movaps %xmm0, -56(%ebp) <--- here!
.p2align 4,,15
.L2:
flds (%ebx)
addl $4, %ebx
...
This problem is analyzed in detail in PR target/17990. In short: when the
frame register is eliminated [actually substituted with ebp] in the
eliminate_regs() function, offsets change and an aligned access can
become unaligned. This can be seen by adding a couple of debug_rtx()
calls around line 950 in reload1.c to dump the input and output of
eliminate_regs(). The debug output says everything:
IN:
(mem:V4SF (plus:SI (reg/f:SI 20 frame)
(const_int -32 [0xffffffe0])) [6 S16 A8])
OUT:
(mem:V4SF (plus:SI (reg/f:SI 6 bp)
(const_int -56 [0xffffffc8])) [6 S16 A8])
The problem is in the PLUS case of eliminate_regs(), where the following code is triggered:
...
else
  return gen_rtx_PLUS (Pmode, ep->to_rtx,
                       plus_constant (XEXP (x, 1),
                                      ep->previous_offset));
ep->previous_offset is (-24) in this case. When this value is added to the
original (aligned) offset of -32, -56 is produced. That offset is no longer a
multiple of 16, which is certainly wrong for a 16-byte movaps access.
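The arithmetic above can be checked with a small standalone sketch. This is
not GCC code: offset_is_aligned() and eliminate_offset() are hypothetical
helpers that only mimic the blind addition done in the PLUS case, using the
values from the debug output.

```c
#include <assert.h>

/* Hypothetical helper: nonzero when OFFSET is a multiple of ALIGN.
   ALIGN must be a positive power of two.  */
int offset_is_aligned (long offset, long align)
{
  assert (align > 0 && (align & (align - 1)) == 0);
  return offset % align == 0;
}

/* Hypothetical helper: mimics the PLUS case, which simply adds the
   elimination offset to the original displacement.  */
long eliminate_offset (long offset, long previous_offset)
{
  return offset + previous_offset;
}
```

With the values from this report, offset_is_aligned (-32, 16) holds before
elimination, eliminate_offset (-32, -24) yields -56, and
offset_is_aligned (-56, 16) no longer holds.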
This bug is also described in http://www.cygwin.com/ml/cygwin/2004-04/msg01103.html.
It also affects the other modes covered by the ALIGN_MODE_128 macro (however,
an unaligned access to XFmode IMHO won't crash the application):
#define ALIGN_MODE_128(MODE) \
((MODE) == XFmode || (MODE) == TFmode || SSE_REG_MODE_P (MODE))
I don't know what the best solution to this problem is. The PLUS case of
eliminate_regs() just blindly adds ep->previous_offset, without aligning the
resulting offset. It looks as if the resulting stack frame should be
_rearranged_ according to the alignment requirements of the data, and not
just moved, when the frame register is eliminated.
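To illustrate what "aligning the resulting offset" would mean numerically,
here is a minimal sketch. align_offset_down() is a hypothetical helper, not
something that exists in reload1.c, and rounding a single offset like this is
only an illustration: on its own it could make stack slots overlap, which is
why the frame layout as a whole would have to be rearranged.

```c
#include <assert.h>

/* Hypothetical helper: round OFFSET down (towards more negative values)
   to the nearest multiple of ALIGN.  ALIGN must be a positive power of
   two.  Written with % so that it does not depend on how negative
   numbers are represented.  */
long align_offset_down (long offset, long align)
{
  long rem;

  assert (align > 0 && (align & (align - 1)) == 0);
  rem = offset % align;      /* In C99, the sign follows OFFSET.  */
  if (rem < 0)
    rem += align;            /* Make the remainder non-negative.  */
  return offset - rem;
}
```

For the offset from this report, align_offset_down (-56, 16) gives -64, while
an already aligned offset such as -32 is returned unchanged.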
Uros.