This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: PATCH: Add SSE4.1 support


> 
> 1. Some SSE4.1 instructions will take the fixed xmm0 as the 3rd arg.
> Register allocator has to know not to put xmm0 in the 1st/2nd args for
> those instructions. I added 2 register classes, XMM0REG for xmm0 and
> XMMN_REGS for xmm1-15. But I didn't change regclass_map to make xmm0
> as XMM0REG and xmm1-15 as XMMN_REGS. Will it be a problem?

Well, since xmm0 is completely symmetric to all the other xmm registers
from reload's point of view, I would expect it to make no difference. But
since reg_class is defined as the smallest class, I would prefer to change
it. Or does something break?
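
To illustrate the "smallest class" point, here is a minimal sketch in plain C.
It is a simplified model, not GCC code: it covers only xmm0-xmm15, and the
names class_contents and smallest_class_for are hypothetical stand-ins for
what regclass_map is expected to encode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model: only the 16 SSE registers, numbered 0-15. */
enum reg_class { NO_REGS, XMM0REG, XMMN_REGS, SSE_REGS, N_REG_CLASSES };

/* One bit per register: bit i set means xmm<i> belongs to the class. */
static const uint16_t class_contents[N_REG_CLASSES] = {
    [NO_REGS]   = 0x0000,
    [XMM0REG]   = 0x0001, /* xmm0 only */
    [XMMN_REGS] = 0xFFFE, /* xmm1-xmm15 */
    [SSE_REGS]  = 0xFFFF, /* xmm0-xmm15 */
};

/* Count the registers in a class. */
static int class_size(uint16_t set)
{
    int n = 0;
    while (set) {
        n += set & 1;
        set >>= 1;
    }
    return n;
}

/* Return the smallest class that contains register regno. */
enum reg_class smallest_class_for(unsigned regno)
{
    assert(regno < 16);
    enum reg_class best = NO_REGS;
    int best_size = 17; /* larger than any class in this model */
    for (int c = NO_REGS + 1; c < N_REG_CLASSES; c++) {
        uint16_t set = class_contents[c];
        if (((set >> regno) & 1) && class_size(set) < best_size) {
            best = (enum reg_class)c;
            best_size = class_size(set);
        }
    }
    return best;
}
```

In this model xmm0 maps to XMM0REG and xmm1-15 map to XMMN_REGS rather than
to the full SSE class, which is the mapping the definition would imply.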
> 
> 2.  SSE4.1 has new instructions to extract an SI/DI value from an XMM
> register and put it in an SI/DI register/memory, pextrd/pextrq. SSE4.1
> intrinsic may generate:
> 
> (insn:HI 9 8 22 2 (set (reg:SI 0 ax [62])
>         (vec_select:SI (reg:V4SI 21 xmm0 [ i ])
>             (parallel [
>                     (const_int 0 [0x0])
>                 ]))) 1160 {*sse4_1_pextrd} (nil)
>     (nil))
> 
> and optimizer will turn it into
> 
> (insn 28 8 22 2 (set (reg:SI 0 ax [62])
>         (reg:SI 21 xmm0)) 40 {*movsi_1} (nil)
>     (nil))

We probably don't need named sse4_1_pextr patterns at all.  If they are
just a new way to encode an XMM reg -> integer reg move, I would simply
keep them as such.

How do they differ from the regular inter-unit move?  Is the new
instruction faster or something?
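
The reason the optimizer can rewrite the vec_select above as a plain SImode
move is that selecting element 0 of a V4SI value is the same as reading its
low 32 bits. A rough sketch in plain C follows; v4si_model, vec_select_si and
low_si_move are hypothetical names used only for illustration, not GCC or
intrinsic APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a V4SI value: four 32-bit lanes, lane 0 first. */
typedef struct { int32_t lane[4]; } v4si_model;

/* Models (vec_select:SI (reg:V4SI ...) (parallel [(const_int n)])). */
int32_t vec_select_si(const v4si_model *v, unsigned n)
{
    assert(v != NULL && n < 4);
    return v->lane[n];
}

/* Models a plain SImode move out of the register: copy the first 4 bytes. */
int32_t low_si_move(const v4si_model *v)
{
    int32_t out;
    assert(v != NULL);
    memcpy(&out, v, sizeof out);
    return out;
}
```

For index 0 the two functions return the same value; for any other index
only the vec_select form applies.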
> 
> But *movsi_1 won't allow move from xmm0 to ax if inter-unit moves are
> disabled. *movdi_1_rex64 has the same issue. I added pextrd/pextrq
> support to *movsi_1/*movdi_1_rex64. They are enabled when inter-unit
> moves are disabled.

This seems sane.
> 
> 3.  I introduced new constraints:
> 
> 	a. Y0: For XMM0REG.
> 	b. Yn: For XMMN_REGS.
> 	c. Y4: For any SSE register, when SSE4.1 is enabled and
> 	inter-unit moves are disabled.
> 
> to deal with those issues.
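
As a sketch of how the three constraints described above could resolve to
register classes, assuming (this is my assumption, not taken from the patch)
that a conditional constraint falls back to NO_REGS when its condition does
not hold. The struct target_flags and constraint_class names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum reg_class { NO_REGS, XMM0REG, XMMN_REGS, SSE_REGS };

/* Hypothetical flags standing in for the target options involved. */
struct target_flags {
    bool sse4_1;           /* SSE4.1 is enabled */
    bool inter_unit_moves; /* inter-unit moves are enabled */
};

/* Map a constraint name to the register class it allows. */
enum reg_class constraint_class(const char *name, struct target_flags f)
{
    assert(name != NULL);
    if (strcmp(name, "Y0") == 0)
        return XMM0REG;   /* xmm0 only */
    if (strcmp(name, "Yn") == 0)
        return XMMN_REGS; /* xmm1-xmm15 */
    if (strcmp(name, "Y4") == 0)
        /* Any SSE register, but only with SSE4.1 and no inter-unit moves.
           Falling back to NO_REGS is an assumption of this sketch. */
        return (f.sse4_1 && !f.inter_unit_moves) ? SSE_REGS : NO_REGS;
    return NO_REGS;       /* unknown constraint in this model */
}
```

With SSE4.1 on and inter-unit moves off, Y4 allows every SSE register;
in any other configuration it allows none, so the pextrd/pextrq alternative
is simply never chosen.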
> 
> 4.  Also I had to rewrite the umaxv8hi3 pattern in order to generate
> SSE4.1 instruction, pmaxud. As a result, it is no longer available
> for SSE2.
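
For reference, the operation that a pattern named umaxv8hi3 stands for is a
lane-wise unsigned maximum over eight 16-bit elements. A scalar sketch in
plain C is below; umax_v8hi is a hypothetical helper name and the code makes
no claim about which instruction sequence the compiler emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define V8HI_LANES 8

/* Lane-wise unsigned maximum of two vectors of eight 16-bit elements. */
void umax_v8hi(const uint16_t *a, const uint16_t *b, uint16_t *out)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < V8HI_LANES; i++)
        out[i] = a[i] > b[i] ? a[i] : b[i];
}
```

The unsigned comparison matters: 0x8000 beats 0x7FFF here, whereas a signed
16-bit maximum would pick 0x7FFF.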
> 
> I will submit a patch for the SSE4.1 intrinsic testsuite later.
> 
> SSE4.2 support will come later.

I will check the patch shortly,
thanks
Honza

