This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: PATCH: Add XOP 128-bit and 256-bit support for upcoming AMD Orochi processor.
- From: Jan Hubicka <hubicka at ucw dot cz>
- To: "rajagopal, dwarak" <dwarak dot rajagopal at amd dot com>
- Cc: 'Jan Hubicka' <hubicka at ucw dot cz>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, "Harle, Christophe" <christophe dot harle at amd dot com>, "Jagasia, Harsha" <harsha dot jagasia at amd dot com>
- Date: Wed, 14 Oct 2009 01:06:33 +0200
- Subject: Re: PATCH: Add XOP 128-bit and 256-bit support for upcoming AMD Orochi processor.
- References: <1C8DE0332CB01445BF7ADEDE3DDD57071F74C4@sausexmbp02.amd.com>
> > > +(define_insn_and_split "*xop_mulv4si3"
> > > + [(set (match_operand:V4SI 0 "register_operand" "=&x")
> > > + (mult:V4SI (match_operand:V4SI 1 "register_operand" "%x")
> > > + (match_operand:V4SI 2 "nonimmediate_operand" "xm")))]
> > > + "TARGET_XOP"
> > > + "#"
> > > + "&& (reload_completed
> > > + || (!reg_mentioned_p (operands[0], operands[1])
> > > + && !reg_mentioned_p (operands[0], operands[2])))"
> >
> > What happens when regs are mentioned?
> > There are other cases 2 memory operand multiply-add splitting testing
> > these, are we somehow making sure this conditional will always hold and
> > we won't ICE not being able to satisfy the conditions?
>
> Actually I am not even sure this xop_mulv4si3 pattern is needed because XOP now implies SSE 4.2 and AVX and so we can just generate the mulv4si3 patterns for AVX or SSE 4.1 when -mxop is used. Can I just remove this xop_mulv4si3 pattern then?
Well, if they are always shadowed by AVX or SSE4.1 equivalents then yes.
Is there really any advantage in this mulv4si3 pattern over the other two
cases?
>
> As for your reference to "other cases 2 memory operand multiply-add splitting", I assume you are referring to the vpmac/d* define_splits.
>
> In XOP vpmac/d* instructions, there is no restriction any more for the destination reg to be same as the third src operand unlike SSE5. And only the second source can be memory. Also I don't see anything in the manual that the destination reg is enforced to be different from source 1, source 2 or source 3 operands individually either.
>
> Should I then remove below from the pmac/d* patterns?
>
> (!reg_mentioned_p (operands[0], operands[1])
> > > + && !reg_mentioned_p (operands[0], operands[2]))
Doesn't the splitter start with a mov instruction from operand 1 into
operand 0? If the 3-address form works here, I guess you need to both
remove this and update the splitter sequence to avoid the move.
> > Hmm, there is no unspec or something that would make it clear that we can
> > not ever somehow simplify into this form with operand 2 being something
> > different than parallel with const_ints. I think this needs new
> > predicate.
>
> I can define a new predicate for it in predicates.md, but I am not sure how exactly to represent the "parallel with const ints" part.
You can use simple C code there; just see how, e.g.,
x86_64_immediate_operand is defined.
Honza
>
> Any suggestions?
>
> Thanks,
> Harsha
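
For the "parallel with const_ints" check discussed above, a minimal
self-contained sketch of the logic in plain C might look like the
following. It is an illustration only: the toy_* types and the function
name are hypothetical stand-ins, not GCC's rtx representation and not
code from the patch. In a real predicate in predicates.md the same walk
would presumably be written with GET_CODE, XVECLEN, XVECEXP and INTVAL,
and the allowed range [lo, hi] here is an assumed parameter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes; hypothetical, not GCC's real rtx types.  */
enum toy_code { TOY_CONST_INT, TOY_REG, TOY_PARALLEL };

struct toy_rtx
{
  enum toy_code code;
  long value;                          /* Used when code == TOY_CONST_INT.  */
  const struct toy_rtx *const *elts;   /* Used when code == TOY_PARALLEL.  */
  size_t n_elts;                       /* Number of entries in elts.  */
};

/* Return true if OP is a non-empty parallel whose elements are all
   constant integers in the inclusive range [LO, HI].  */
bool
toy_const_int_parallel_p (const struct toy_rtx *op, long lo, long hi)
{
  if (op == NULL || op->code != TOY_PARALLEL || op->n_elts == 0)
    return false;

  for (size_t i = 0; i < op->n_elts; i++)
    {
      const struct toy_rtx *e = op->elts[i];

      /* Reject anything that is not a constant integer.  */
      if (e == NULL || e->code != TOY_CONST_INT)
        return false;

      /* Reject selector values outside the allowed range.  */
      if (e->value < lo || e->value > hi)
        return false;
    }

  return true;
}
```

The point of such a predicate is that the pattern only matches when
operand 2 already has the expected shape, so nothing can be simplified
into the pattern with some other kind of operand in that position.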