SSE5 patches round 3

Meissner, Michael michael.meissner@amd.com
Mon Sep 10 21:44:00 GMT 2007


I was just being conservative, since it would be possible to have the
register being stored to in operands[0] be used as an index/base
register in one of the memory operations.  If you load up op0 (here, with
zero) while a memory operand still uses op0 in its address, you will get a
segfault in the instruction when you use op0 as an index register.  Hence
checking for
reload_completed (reload won't use op0 due to the '&' constraint), or
the register not being mentioned (which I would anticipate happening in
just about every case, but without doing the tests, you can't verify
it).
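
To illustrate the hazard, here is a minimal C sketch.  It is not GCC code:
v4si, split_mul and safe_to_split are hypothetical names made up for this
example.  It models the two insns produced by the split (zero op0, then
multiply/add into op0) and only covers the simple case where op0 is the same
register as one of the inputs, not the index/base register case.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 4 x 32-bit vector register (V4SI). */
typedef struct { int32_t v[4]; } v4si;

/* Models the two insns the split emits, operating in place so that
   overlap between op0 and the inputs is visible:
     op0 = 0
     op0 = op1 * op2 + op0  */
static void split_mul(v4si *op0, const v4si *op1, const v4si *op2)
{
    memset(op0, 0, sizeof *op0);            /* first insn: clear op0 */
    for (int i = 0; i < 4; i++)             /* second insn: multiply/add */
        op0->v[i] = op1->v[i] * op2->v[i] + op0->v[i];
}

/* Rough analogue of the two !reg_mentioned_p tests in the split
   condition: splitting is only safe if op0 is not one of the inputs. */
static int safe_to_split(const v4si *op0, const v4si *op1, const v4si *op2)
{
    return op0 != op1 && op0 != op2;
}
```

With distinct registers split_mul gives the plain element-wise product.  If
op0 and op1 are the same register, the first insn wipes out the input before
it is read and the result is all zeros, which is what the split condition is
guarding against.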

--
Michael Meissner
AMD, MS 83-29
90 Central Street
Boxborough, MA 01719

> -----Original Message-----
> From: Uros Bizjak [mailto:ubizjak@gmail.com]
> Sent: Monday, September 10, 2007 4:45 PM
> To: Meissner, Michael; Uros Bizjak; GCC Patches; Harle, Christophe;
> rajagopal, dwarak
> Subject: Re: SSE5 patches round 3
> 
> Hello!
> 
> Just a quick suggestion:
> 
> > + ;; We don't have a straight 32-bit parallel multiply on SSE5, so fake it with a
> > + ;; multiply/add.
> > + (define_insn_and_split "*sse5_mulv4si3"
> > +   [(set (match_operand:V4SI 0 "register_operand" "=&x")
> > + 	(mult:V4SI (match_operand:V4SI 1 "register_operand" "%x")
> > + 		   (match_operand:V4SI 2 "nonimmediate_operand" "xm")))]
> > +   "TARGET_SSE5"
> > +   "#"
> > +   "TARGET_SSE5
> > +    && (reload_completed
> > +        || (!reg_mentioned_p (operands[0], operands[1])
> > + 	   && !reg_mentioned_p (operands[0], operands[2])))"
> > +   [(set (match_dup 0)
> > + 	(match_dup 3))
> > +    (set (match_dup 0)
> > + 	(plus:V4SI (mult:V4SI (match_dup 1)
> > + 			      (match_dup 2))
> > + 		   (match_dup 0)))]
> > + {
> > +   operands[3] = CONST0_RTX (V4SImode);
> > + }
> > +   [(set_attr "type" "ssemuladd")
> > +    (set_attr "mode" "TI")])
> > +
> 
> 
> This splitter does not need to be post-reload splitter. If you split
> this insn before register allocation, then allocator will solve all
> register conflicts for you:
> 
> 
> + (define_insn_and_split "*sse5_mulv4si3"
> +   [(set (match_operand:V4SI 0 "register_operand" "")
> + 	(mult:V4SI (match_operand:V4SI 1 "register_operand" "")
> + 		   (match_operand:V4SI 2 "nonimmediate_operand" "")))]
> +   "TARGET_SSE5
> +    && !(reload_completed || reload_in_progress)"
> +   "#"
> +   "&& 1"
> +   [(set (match_dup 0)
> + 	(match_dup 3))
> +    (set (match_dup 0)
> + 	(plus:V4SI (mult:V4SI (match_dup 1)
> + 			      (match_dup 2))
> + 		   (match_dup 0)))]
> + {
> +   operands[3] = CONST0_RTX (V4SImode);
> + }
> +   [(set_attr "type" "ssemuladd")
> +    (set_attr "mode" "TI")])
> 
> 
> Uros.
> 




