This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/12902] Invalid assembly generated when using SSE / xmmintrin.h
- From: "uros at kss-loka dot si" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 14 Dec 2004 10:52:38 -0000
- Subject: [Bug target/12902] Invalid assembly generated when using SSE / xmmintrin.h
- References: <20031105013127.12902.kbowers@lanl.gov>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Additional Comments From uros at kss-loka dot si 2004-12-14 10:52 -------
The problem here is in the combiner, which, in combination with the reload
pass, produces a subtly incorrect pattern.
The line that segfaults is:
c->v = _mm_loadl_pi(c->v,((__m64 *)a0)+1);
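For readers without the attached test case, the statement can be wrapped in a small stand-alone function. This is only a sketch: the struct name `cell`, the function name `load_low_pair` and the type chosen for `a0` are assumptions made for illustration; only the member `v` and the expression itself come from the report. It needs an x86 target with SSE enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>

/* Hypothetical stand-in for the reporter's structure; only the member
   name `v` is taken from the original statement. */
struct cell {
    __m128 v;
};

/* Replace the two low floats of c->v with the 64 bits found 8 bytes past
   a0, leaving the two high floats untouched. The intrinsic maps to movlps,
   which reads only 8 bytes from its memory operand. */
static void load_low_pair(struct cell *c, const float *a0)
{
    assert(c != NULL && a0 != NULL);
    c->v = _mm_loadl_pi(c->v, ((const __m64 *)a0) + 1);
}
```

Called with a0 pointing at four floats {1, 2, 3, 4}, the function copies 3 and 4 into the low half of c->v.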
That statement is represented by the following RTL sequence (pr12902.c.00.expand):
(insn 26 24 27 1 (parallel [
            (set (reg:SI 80)
                (plus:SI (reg:SI 70 [ a0.26 ])
                    (const_int 8 [0x8])))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil)
    (nil))

(insn 27 26 28 1 (set (reg:SI 81)
        (reg:SI 80)) -1 (nil)
    (nil))

(insn 28 27 30 1 (set (reg:V4SF 60 [ D.3679 ])
        (vec_merge:V4SF (mem/s:V4SF (reg/v/f:SI 77 [ c ]) [0 <variable>.v+0 S16 A128])
            (mem:V4SF (reg:SI 81) [0 S16 A8])
            (const_int 3 [0x3]))) -1 (nil)
    (nil))

(insn 30 28 32 1 (set (mem/s:V4SF (reg/v/f:SI 77 [ c ]) [0 <variable>.v+0 S16 A128])
        (reg:V4SF 60 [ D.3679 ])) -1 (nil)
    (nil))
This whole sequence is combined into one RTL insn (pr12902.c.17.combine) that
satisfies the "sse_movlps" pattern constraints:
(insn 30 28 35 0 (set (mem/s:V4SF (reg/v/f:SI 77 [ c ]) [0 <variable>.v+0 S16 A128])
        (vec_merge:V4SF (mem/s:V4SF (reg/v/f:SI 77 [ c ]) [0 <variable>.v+0 S16 A128])
            (mem:V4SF (plus:SI (reg/v/f:SI 71 [ a0 ])
                    (const_int 8 [0x8])) [0 S16 A8])
            (const_int 3 [0x3]))) 541 {sse_movlps} (insn_list:REG_DEP_TRUE 12 (nil))
    (expr_list:REG_DEAD (reg/v/f:SI 71 [ a0 ])
        (nil)))
Following this, reload generates what it thinks is the best reg/mem combination
to satisfy the register constraints of the "sse_movlps" pattern
(pr12902.c.24.postreload):
(insn 80 28 30 0 (set (reg:V4SF 21 xmm0)
        (mem:V4SF (plus:SI (reg/v/f:SI 4 si [orig:71 a0 ] [71])
                (const_int 8 [0x8])) [0 S16 A8])) 509 {movv4sf_internal} (nil)
    (nil))

(insn:HI 30 80 35 0 (set (mem/s:V4SF (reg/v/f:SI 1 dx [orig:77 c ] [77]) [0 <variable>.v+0 S16 A128])
        (vec_merge:V4SF (mem/s:V4SF (reg/v/f:SI 1 dx [orig:77 c ] [77]) [0 <variable>.v+0 S16 A128])
            (reg:V4SF 21 xmm0)
            (const_int 3 [0x3]))) 541 {sse_movlps} (insn_list:REG_DEP_TRUE 12 (nil))
    (nil))
Unfortunately, insn 80 will crash, because it results in an unaligned 16-byte load:
...
movaps 8(%esi), %xmm0 <- crash here
movlps %xmm0, (%edx)
...
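The arithmetic behind the crash can be checked with a few lines of C. This is a sketch only: the helper names `is_aligned16` and `insn80_address` are invented for illustration, and it assumes a0 itself is 16-byte aligned, which the report does not state. movaps requires its memory operand to be a multiple of 16, so an address 8 bytes past a 16-byte boundary cannot satisfy it, while the 8-byte movlps load the source asked for has no such requirement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero when p is a multiple of 16 bytes, which is
   what an aligned 128-bit move such as movaps requires of its memory
   operand. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16u) == 0;
}

/* Hypothetical helper: the address insn 80 reads from, i.e. 8(%esi) with
   a0 held in %esi. */
static const void *insn80_address(const float *a0)
{
    assert(a0 != NULL);
    return (const char *)a0 + 8;
}
```

For any 16-byte-aligned a0, is_aligned16(insn80_address(a0)) is zero, which matches the fault seen on the movaps above.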
Uros.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12902