Bug 32201 - Can not allocate %xmm0 register for variable blend insn
Status: RESOLVED FIXED
Product: gcc
Classification: Unclassified
Component: target
Version: 4.3.0
Importance: P3 normal
Target Milestone: 4.3.0
Assigned To: Not yet assigned to anyone
Keywords: ra, ssemmx
Depends on:
Blocks: 32189
Reported: 2007-06-04 06:51 UTC by Uroš Bizjak
Modified: 2007-06-14 19:11 UTC (History)
See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed:


Attachments
Local alloc RTL dump (1.75 KB, text/plain)
2007-06-04 08:06 UTC, Uroš Bizjak

Description Uroš Bizjak 2007-06-04 06:51:42 UTC
Currently, it is not possible to use SSE 4.1 variable blend instructions in asm statements. These instructions require the third argument to be in %xmm0, but gcc fails to allocate the correct register even when the (new) "z" constraint is used.

--cut here--
typedef float V4SFmode __attribute__((vector_size(16)));

V4SFmode t (V4SFmode a, V4SFmode b, V4SFmode c)
{
  V4SFmode ret;

  asm ("blendvps %0, %2, %3" : "=x" (ret) : "0" (a), "x" (b), "z" (c));
  return ret;
}
--cut here--

gcc -O1 -msse

prxxx.c: In function 't':
prxxx.c:7: error: can't find a register in class 'SSE_FIRST_REG' while reloading 'asm'
prxxx.c:7: error: 'asm' operand has impossible constraints
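
For readers unfamiliar with the instruction, the following is a minimal portable sketch of what a four-lane variable blend computes, assuming the documented behaviour that each result lane takes the source value when the sign bit of the corresponding mask lane is set and keeps the destination value otherwise. The names vec4f, blendv_model and blendv_model_selfcheck are hypothetical and only illustrate the semantics; this is a scalar model, not the testcase above and not how gcc implements anything. On real hardware the mask is the operand that must live in %xmm0.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-lane float vector, standing in for a V4SF value. */
typedef struct { float lane[4]; } vec4f;

/* Scalar model of a variable blend: for each lane, select src when the
   sign bit (bit 31) of the mask lane is set, otherwise keep dst. */
vec4f blendv_model(vec4f dst, vec4f src, vec4f mask)
{
    vec4f out;
    for (int i = 0; i < 4; i++) {
        uint32_t bits;
        /* Read the raw bit pattern of the mask lane without aliasing issues. */
        memcpy(&bits, &mask.lane[i], sizeof bits);
        out.lane[i] = (bits >> 31) ? src.lane[i] : dst.lane[i];
    }
    return out;
}

/* Small self-check: only the sign bit of each mask lane matters. */
void blendv_model_selfcheck(void)
{
    vec4f a = {{1.0f, 2.0f, 3.0f, 4.0f}};
    vec4f b = {{10.0f, 20.0f, 30.0f, 40.0f}};
    vec4f m = {{-1.0f, 1.0f, -0.0f, 0.0f}};
    vec4f r = blendv_model(a, b, m);
    assert(r.lane[0] == 10.0f && r.lane[1] == 2.0f);
    assert(r.lane[2] == 30.0f && r.lane[3] == 4.0f);
}
```

Note that the mask lane -0.0f selects the source lane as well, since only the sign bit is inspected.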
Comment 1 Andrew Pinski 2007-06-04 07:22:34 UTC
How can this be a regression if the constraint is new?
Also it seems like you could use register asm("xmm0") to get the correct register to be used.
Comment 2 Uroš Bizjak 2007-06-04 07:39:10 UTC
(In reply to comment #1)
> How can this be a regression if the constraint is new?

This is the same failure as PR32189, and that one is marked as a regression.

> Also it seems like you could use register asm("xmm0") to get the correct
> register to be used.

But please note that "c" argument is passed to the function via xmm2.
Comment 3 Andrew Pinski 2007-06-04 07:50:52 UTC
> > Also it seems like you could use register asm("xmm0") to get the correct
> > register to be used.
> But please note that "c" argument is passed to the function via xmm2.
This is why GCC has the register asm() extension: to get the correct register to be used.  The code would look like:
typedef float V4SFmode __attribute__((vector_size(16)));

V4SFmode t (V4SFmode a, V4SFmode b, V4SFmode c)
{
 V4SFmode ret;
 register V4SFmode c1 asm("xmm0");
 c1 = c;

 asm ("blendvps %0, %2, %3" : "=x" (ret) : "0" (a), "x" (b), "x" (c1));
 return ret;
}

Since c1 is already in xmm0, the register allocator knows it cannot put a in xmm0, so there is still a move before the other move and one after the asm.
Comment 4 Andrew Pinski 2007-06-04 07:58:25 UTC
Note that local alloc should be able to figure out that the "z" constraint allows only one register and assign that register to the pseudo-register.
Comment 5 Uroš Bizjak 2007-06-04 08:06:57 UTC
Created attachment 13655
Local alloc RTL dump

RTL dump of .c.163r.lreg
Comment 6 Uroš Bizjak 2007-06-14 19:11:58 UTC
Fixed in mainline.