This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/12902] Invalid assembly generated when using SSE / xmmintrin.h


------- Additional Comments From uros at kss-loka dot si  2004-12-14 10:52 -------
The problem here is in the combiner, which, in combination with the reload pass,
produces a somewhat incorrect pattern.

The line that segfaults is:

  c->v = _mm_loadl_pi(c->v,((__m64 *)a0)+1); 
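
For context, a minimal sketch of code that exercises this line might look like
the following. The struct name vec_holder and the function name load_low_pair
are hypothetical and are not taken from the test case attached to the bug; only
the _mm_loadl_pi call mirrors the line above. It assumes an x86 target with SSE
enabled.

```c
#include <xmmintrin.h>

/* Hypothetical stand-in for the structure used in the test case:
   it only needs a 16-byte-aligned __m128 member named v. */
typedef struct {
    __m128 v;
} vec_holder;

/* Replace the two low floats of c->v with a0[2] and a0[3], keeping the
   two high floats.  The source address is a0 + 8 bytes, so it is in
   general only 8-byte aligned, which is all that movlps needs. */
static void load_low_pair(vec_holder *c, const float *a0)
{
    c->v = _mm_loadl_pi(c->v, ((const __m64 *)a0) + 1);
}
```

With a0 = {1, 2, 3, 4} and c->v holding {10, 20, 30, 40} (lowest element
first), a correctly compiled call leaves {3, 4, 30, 40} in c->v.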


This line is represented by the following RTL sequence (pr12902.c.00.expand):

(insn 26 24 27 1 (parallel [
            (set (reg:SI 80)
                (plus:SI (reg:SI 70 [ a0.26 ])
                    (const_int 8 [0x8])))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil)
    (nil))

(insn 27 26 28 1 (set (reg:SI 81)
        (reg:SI 80)) -1 (nil)
    (nil))

(insn 28 27 30 1 (set (reg:V4SF 60 [ D.3679 ])
        (vec_merge:V4SF (mem/s:V4SF (reg/v/f:SI 77 [ c ]) [0 <variable>.v+0 S16 A128])
            (mem:V4SF (reg:SI 81) [0 S16 A8])
            (const_int 3 [0x3]))) -1 (nil)
    (nil))

(insn 30 28 32 1 (set (mem/s:V4SF (reg/v/f:SI 77 [ c ]) [0 <variable>.v+0 S16 A128])
        (reg:V4SF 60 [ D.3679 ])) -1 (nil)
    (nil))


This whole sequence is combined into one RTL insn (pr12902.c.17.combine) that
satisfies the "sse_movlps" pattern constraints:

(insn 30 28 35 0 (set (mem/s:V4SF (reg/v/f:SI 77 [ c ]) [0 <variable>.v+0 S16 A128])
        (vec_merge:V4SF (mem/s:V4SF (reg/v/f:SI 77 [ c ]) [0 <variable>.v+0 S16 A128])
            (mem:V4SF (plus:SI (reg/v/f:SI 71 [ a0 ])
                    (const_int 8 [0x8])) [0 S16 A8])
            (const_int 3 [0x3]))) 541 {sse_movlps} (insn_list:REG_DEP_TRUE 12 (nil))
    (expr_list:REG_DEAD (reg/v/f:SI 71 [ a0 ])
        (nil)))


Following this, reload generates what it thinks is the best reg/mem combination
to satisfy the register constraints of the "sse_movlps" pattern
(pr12902.c.24.postreload):

(insn 80 28 30 0 (set (reg:V4SF 21 xmm0)
        (mem:V4SF (plus:SI (reg/v/f:SI 4 si [orig:71 a0 ] [71])
                (const_int 8 [0x8])) [0 S16 A8])) 509 {movv4sf_internal} (nil)
    (nil))

(insn:HI 30 80 35 0 (set (mem/s:V4SF (reg/v/f:SI 1 dx [orig:77 c ] [77]) [0 <variable>.v+0 S16 A128])
        (vec_merge:V4SF (mem/s:V4SF (reg/v/f:SI 1 dx [orig:77 c ] [77]) [0 <variable>.v+0 S16 A128])
            (reg:V4SF 21 xmm0)
            (const_int 3 [0x3]))) 541 {sse_movlps} (insn_list:REG_DEP_TRUE 12 (nil))
    (nil))


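Unfortunately, insn 80 will crash, because it results in an unaligned load:
movaps requires a 16-byte-aligned memory operand, but the address 8(%esi) is
only known to be 8-byte aligned (A8 in the dump above):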

	...
        movaps  8(%esi), %xmm0    <- crash here
        movlps  %xmm0, (%edx)
	...
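
The alignment problem can be illustrated with a small check. This is only a
sketch: the helper names is_aligned and movaps_load_is_safe are hypothetical,
and the code assumes a flat address space in which converting a pointer to
uintptr_t yields its byte address.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if p is aligned to `alignment` bytes; alignment must be a
   power of two.  Assumes uintptr_t holds the plain byte address. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Models the operand of "movaps 8(%esi), %xmm0": the load address is
   a0 + 8 bytes, and movaps needs that address to be 16-byte aligned. */
static int movaps_load_is_safe(const void *a0)
{
    return is_aligned((const unsigned char *)a0 + 8, 16);
}
```

Note that when a0 itself is 16-byte aligned, a0 + 8 is not, so the check fails
in exactly the case where the data is otherwise well aligned. movlps has no
such alignment requirement on its memory operand.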

Uros.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12902

