This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: How is generic SIMD support supposed to work?


> On Sun, Oct 20, 2002 at 02:46:28PM +0200, Jan Hubicka wrote:
> >   __v4si val = {1,2,3,4};
> >     return val;
> > }
> > and I hope this to be compiled into static initializer loaded at once.
> > Unfortunately this does not happen, and even worse, the compiler dies:
> 
> What happens on other architectures is that this value gets
> loaded into 4 integer registers, and the subregging works as
> expected.  That's going to be prohibitive on x86, so we either
> need to arrange for such pseudos to get allocated to the stack
> (via CLASS_CANNOT_CHANGE_MODE_P), and/or come up with another
> mechanism (via named patterns, I assume) for the code generator
> to ask to read or set a vector element.
Hi,
I've hit this problem in a different context.  We currently miscompile
gcc on P4 and K8 since we generate:

(insn:HI 178 177 179 3 0x2a95ce16c0 (set (subreg:DF (reg/v:DI 94) 0)
        (reg/v:DF 58)) 93 {*movdf_nointeger} (nil)
    (expr_list:REG_DEAD (reg/v:DF 58)
        (nil)))

(insn:HI 179 178 180 3 0x2a95ce16c0 (set (mem/f:SI (plus:SI (reg/f:SI 7 esp)
                (const_int 16 [0x10])) [0 S4 A32])
        (subreg:SI (reg/v:DI 94) 0)) 44 {*movsi_1} (insn_list 178 (nil))
    (nil))

(insn:HI 180 179 181 3 0x2a95ce16c0 (set (mem/f:SI (plus:SI (reg/f:SI 7 esp)
                (const_int 12 [0xc])) [0 S4 A32])
        (subreg:SI (reg/v:DI 94) 4)) 44 {*movsi_1} (nil)
    (expr_list:REG_DEAD (reg/v:DI 94)
        (nil)))

The register allocator decides to put pseudo 94 into an XMM register,
which is a fairly sane decision.  The sequence then comes out as two
movqs with xmm0 as the operand.
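
For readers less used to RTL dumps, the effect the three insns are meant
to have can be sketched in plain C.  This is only an illustration,
assuming a little-endian target with 64-bit IEEE doubles; the function
name split_double is made up here and appears nowhere in GCC.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative stand-in for insns 178-180: view a double as a 64-bit
   integer, then hand out its two 32-bit halves separately.  */
static void split_double(double d, uint32_t *lo, uint32_t *hi)
{
    uint64_t bits;                   /* plays the role of (reg:DI 94) */

    memcpy(&bits, &d, sizeof bits);  /* (set (subreg:DF (reg:DI 94) 0) (reg:DF 58)) */
    *lo = (uint32_t)bits;            /* (subreg:SI (reg:DI 94) 0): low word */
    *hi = (uint32_t)(bits >> 32);    /* (subreg:SI (reg:DI 94) 4): high word */
}
```

For d = 1.0 the low word is 0 and the high word is 0x3ff00000.  The
access at byte 4 is the one that presumably has no equivalent as a plain
move out of the same XMM register.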

I tried to use CLASS_CANNOT_CHANGE_MODE_P like this:
#define CANNOT_CHANGE_MODE_CLASS(FROM, TO) \
  ((FROM) != (TO) ? SSE_REGS : NO_REGS)
But this makes us reject subregs like (subreg:V2DF (reg:DF) 0), which
is valid and which we use to represent some scalar fp operations in
later optimization passes.

The problem is that I don't see how to distinguish it from
(subreg:V2DF (reg:DF) 8), which is invalid.
What would you think about adding a SUBREG_BYTE argument to the macro
and updating everything?  This is the only way out I see, and I think
this is an important regression we should fix in 3.3.
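
As a rough sketch of what such a macro could look like: the enums below
are simplified stand-ins rather than GCC's real definitions, and the
names CANNOT_CHANGE_MODE_CLASS_AT and sse_subreg_ok are hypothetical.
The policy shown (allow a mode change in SSE registers only at byte
offset 0) is just one assumption that fits the two examples above.

```c
#include <stdbool.h>

/* Simplified stand-ins, NOT the real GCC enums.  */
enum reg_class { NO_REGS, SSE_REGS };
enum machine_mode { SImode, DImode, DFmode, V2DFmode };

/* Hypothetical variant of the macro that also sees the subreg byte
   offset: a mode change is refused for SSE_REGS only when the subreg
   does not start at byte 0.  */
#define CANNOT_CHANGE_MODE_CLASS_AT(FROM, TO, SUBREG_BYTE) \
  ((FROM) != (TO) && (SUBREG_BYTE) != 0 ? SSE_REGS : NO_REGS)

/* True if the subreg may live in an SSE register under this policy.  */
static bool sse_subreg_ok(enum machine_mode from, enum machine_mode to,
                          unsigned int subreg_byte)
{
    return CANNOT_CHANGE_MODE_CLASS_AT(from, to, subreg_byte) == NO_REGS;
}
```

Under this policy (subreg:V2DF (reg:DF) 0) and (subreg:SI (reg:DI) 0)
stay acceptable for SSE registers, while (subreg:V2DF (reg:DF) 8) and
(subreg:SI (reg:DI) 4) are refused.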

Honza
> 
> 
> r~

