This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: md description for instruction that modifies multiple operands


> > >      (set (match_operand:V16SI 4 "register_operand" "=v")
> > >           (unspec:V16SI [(match_dup 1) (match_dup 3) (match_dup 7)] 462))
> > >      (set (match_operand:V16SI 6 "register_operand" "=v")
> > >           (unspec:V16SI [(match_dup 1) (match_dup 3) (match_dup 7)] 463))]
> 
> BTW, after seeing the new RTL, I realized that the last "set" above
> should probably be:
> 
>    (set (match_operand:V16SI 6 "register_operand" "=v")
>         (unspec:V16SI [(match_dup 1) (match_dup 3) (match_dup 5)] 463))]
> 
> Note the change from 7 -> 5.  Hopefully this is correct.
> 
> > This looks like an expansion problem.  How are you calling 
> > gen_fm_block4()?  You need to pass 8 arguments to it now, something like
> > 
> > 	gen_fm_block4(t0, t0, t1, t1, t2, t2, t3, t3);
> 
> That was the problem.  I fixed it and the generated code for the example
> is now:
> 
>  foo:
>         j       $31
>         block4.m        $m0,$m1,$m2,$m3
> 
> which is completely optimal.  The function args are passed in m0
> through m3, the block4 is called with them in the right order, and the
> function returns with the result left in m0.
> 
> However, I'm not clear on whether or not the template guarantees that
> the register allocation will be sequential.  I suspect not.  So we may
> still have the problem of training the register allocator to ensure
> that the operands to the block4.m instruction are always some
> sequential set of four registers out of the possible 16 (m0-m15).

There's no way to tell the register allocator this, unfortunately.  ARM 
has a similar problem with the load-multiple operations.  In that case a 
bit mask of the registers to load is encoded in the instruction, and the 
marked registers are filled sequentially from memory, with the 
lowest-numbered register loaded from the lowest address.  We work around 
this by using specific hard registers for that pattern and then using 
peepholes to spot a few cases opportunistically.  Take a look at the 
movstrqi pattern in arm.md if you want some ideas.
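
To make the load-multiple ordering concrete, here is a rough C model of 
the behaviour described above.  It is only a sketch: the function name 
sim_load_multiple and the 16-entry register array are made up for 
illustration, and nothing here is taken from arm.md or the GCC sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of an ARM-style load-multiple (hypothetical helper, not GCC
   code).  Bit i of reg_mask selects register i.  The selected registers
   are filled from consecutive words of mem, with the lowest-numbered
   register taking the word at the lowest address.  Returns the number
   of words consumed.  */
static size_t
sim_load_multiple (uint32_t regs[16], uint16_t reg_mask, const uint32_t *mem)
{
  size_t next = 0;

  assert (regs != NULL && mem != NULL);

  /* Walk the register numbers in ascending order; the order in which the
     caller built the mask is irrelevant.  */
  for (unsigned r = 0; r < 16; r++)
    if (reg_mask & (1u << r))
      regs[r] = mem[next++];

  return next;
}
```

Because the mask carries no ordering information, the register-to-address 
mapping is fixed by the register numbers alone, which is why the pattern 
has to be tied to particular hard registers.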

> 
> I won't even try to think yet about the block4v instruction, which
> requires a set like {m0,m4,m8,m12} or {m1,m5,m9,m13}.  :-(
> 

Equally impossible for the same reasons, and maybe more ;-(
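
For what it's worth, the property itself is easy to test once hard 
registers have been assigned, which is the kind of check a peephole 
condition could make.  A minimal sketch, assuming registers m0-m15 are 
numbered 0-15 and that any starting register is acceptable (the real 
instructions may be stricter); regs_form_sequence and NUM_M_REGS are 
hypothetical names, not part of GCC:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_M_REGS 16   /* assumed: m0..m15 map to numbers 0..15 */

/* Return true if the four register numbers are all valid and each one is
   exactly `stride` above the previous one.  stride == 1 models the
   sequential block4.m case; stride == 4 models sets such as
   {m1,m5,m9,m13} for block4v.  */
static bool
regs_form_sequence (const unsigned regno[4], unsigned stride)
{
  assert (regno != NULL);

  for (int i = 0; i < 4; i++)
    {
      if (regno[i] >= NUM_M_REGS)
        return false;
      if (i > 0 && regno[i] != regno[i - 1] + stride)
        return false;
    }
  return true;
}
```

This only recognises a layout the allocator happened to produce; it does 
nothing to make the allocator produce one.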

R.


