[PATCH, i386]: Add SSE4.2 support - pcmpstr part

Uros Bizjak ubizjak@gmail.com
Mon Jun 4 14:36:00 GMT 2007


On 6/4/07, Jan Hubicka <jh@suse.cz> wrote:
> > > No, it is also needed for new __cconly instructions to instantiate
> > > instruction that has free register to clobber. IMO there is no other way
> > > for allocator to choose between two alternative instructions.
> >
> > Ah, you are right here; I was looking for instructions having 'z' in
> > multiple alternatives, but I missed this one.
> > If we really need to use the xmm0 constraint in alternatives, we are
> > screwed and we need a class.
> Hi,
> looking into the patterns in sse.md more, pcmpistri and pcmpistrm really
> only differ in the ignored output operand.

Please note that xmm is used to return the _mask_ and ecx is used to
return the _index_. The values in xmm and ecx are not the same, because
pcmpistri and pcmpistrm have different functionality. The patterns
with two outputs (+ CC) are fake patterns that do not correspond to any
real instruction; they exist so that these insns can be CSEd as much as
possible. These patterns are split into real instruction patterns just
before reload.
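
To make the index/mask distinction concrete, here is a rough scalar sketch
in C. It is only a model under stated assumptions, not the real
instructions and not code from the patch: it assumes the "equal any"
aggregation on unsigned bytes with implicit (NUL-terminated) lengths, the
bit-mask output form for pcmpistrm, and the least-significant-index form
for pcmpistri. All function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar model; NOT the real instructions. */

/* Implicit length: number of bytes before the first NUL, at most 16. */
static int implicit_len(const uint8_t v[16])
{
    int n = 0;
    while (n < 16 && v[n] != 0)
        n++;
    return n;
}

/* Shared intermediate result for the assumed "equal any" mode:
   bit i is set when str[i] equals any valid byte of set. */
static uint16_t equal_any(const uint8_t set[16], const uint8_t str[16])
{
    int nset = implicit_len(set);
    int nstr = implicit_len(str);
    uint16_t res = 0;

    for (int i = 0; i < nstr; i++)
        for (int j = 0; j < nset; j++)
            if (str[i] == set[j]) {
                res |= (uint16_t)(1u << i);
                break;
            }
    return res;
}

/* Models the value pcmpistrm would leave in xmm0 (bit-mask form). */
uint16_t model_pcmpistrm_mask(const uint8_t set[16], const uint8_t str[16])
{
    return equal_any(set, str);
}

/* Models the value pcmpistri would leave in ecx: the index of the
   least significant set bit, or 16 when nothing matches. */
int model_pcmpistri_index(const uint8_t set[16], const uint8_t str[16])
{
    uint16_t m = equal_any(set, str);

    for (int i = 0; i < 16; i++)
        if ((m >> i) & 1)
            return i;
    return 16;
}
```

Both results derive from the same comparison, which is why a single fake
pattern with two outputs can be CSEd, but the values differ: for the set
"aeiou" and the string "gcc-patches" the model gives mask 0x220 and
index 5.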

> I wonder if the performance
> characteristics have already been published for the SSE4.2
> implementation. At least from AMD chip experience, the SSE instructions
> communicating output to integer registers are a lot more expensive than
> pure SSE instructions.
> That would nullify the need for multiple alternatives (we would want to
> always use the second alternative). It also suggests that reordering
> the pattern to make pcmpistrm come out with priority now might be a good
> step.

The ecx variant was chosen as the preferred alternative on the (shaky)
grounds that we have just used two XMM registers, so using one integer
register would spread out register usage. But if it helps any target, we
can reorder them without problems.

Uros.


