This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: recent troubles with float vectors & bitwise ops
- From: rridge at csclub dot uwaterloo dot ca (Ross Ridge)
- To: gcc at gcc dot gnu dot org
- Date: Wed, 22 Aug 2007 14:08:46 -0400 (EDT)
- Subject: Re: recent troubles with float vectors & bitwise ops
tbp writes:
>Apparently enough for a small vendor like Intel to propose such things
>as orps, andps, andnps, and xorps.
Paolo Bonzini writes:
>I think you're running too far with your sarcasm. SSE's instructions
>do not go so far as to specify integer vs. floating point. To me, "ps"
>means "32-bit SIMD", independent of integerness
The IA-32 instruction set does distinguish between integer and
floating-point bitwise operations. In addition to the single-precision
floating-point bitwise instructions that tbp mentioned (ORPS, ANDPS,
ANDNPS and XORPS), there are distinct double-precision floating-point
bitwise instructions (ORPD, ANDPD, ANDNPD and XORPD) and integer bitwise
instructions (POR, PAND, PANDN and PXOR). While these operations all
produce the same bits, they can differ in performance depending on the
context.
Intel's IA-32 Software Developer's Manual gives this warning:
In this example: XORPS or PXOR can be used in place of XORPD
and yield the same correct result. However, because of the type
mismatch between the operand data type and the instruction data
type, a latency penalty will be incurred due to implementations
of the instructions at the microarchitecture level.
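To illustrate the "same result" part of that warning, here is a minimal sketch using the standard SSE/SSE2 intrinsics from <emmintrin.h>. The function name xor_variants_agree and the test values are made up for this example. It only checks that the three XOR forms produce identical bits; it says nothing about latency, and an optimizing compiler may fold the constants instead of emitting the instructions named in the comments.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the SSE ones */
#include <string.h>

/* Hypothetical helper: returns 1 if the single-precision,
   double-precision and integer XOR intrinsics all yield the
   same 128-bit result for the same inputs. */
int xor_variants_agree(void)
{
    const float a[4] = { 1.0f, -2.0f, 3.5f, 0.0f };
    const float b[4] = { -0.0f, -0.0f, -0.0f, -0.0f };  /* sign-bit masks */

    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);

    /* Intrinsic corresponding to XORPS */
    __m128 r_ps = _mm_xor_ps(va, vb);

    /* Intrinsic corresponding to XORPD (casts only reinterpret the bits) */
    __m128d r_pd = _mm_xor_pd(_mm_castps_pd(va), _mm_castps_pd(vb));

    /* Intrinsic corresponding to PXOR */
    __m128i r_si = _mm_xor_si128(_mm_castps_si128(va), _mm_castps_si128(vb));

    /* Compare the raw 16 bytes of each result. */
    unsigned char x[16], y[16], z[16];
    memcpy(x, &r_ps, sizeof x);
    memcpy(y, &r_pd, sizeof y);
    memcpy(z, &r_si, sizeof z);

    return memcmp(x, y, 16) == 0 && memcmp(x, z, 16) == 0;
}
```

Calling xor_variants_agree() from a test program built for an SSE2-capable target should return 1.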
>>And now i guess the only sanctioned access to those ops is via
>>builtins/intrinsics.
>
>No, you can do so with casts.
tbp is correct. Using casts gets you the integer bitwise instructions,
not the single-precision bitwise instructions that are better suited to
flipping bits in single-precision vectors. If you want GCC to generate
better code using single-precision bitwise instructions, you're now forced
to use the intrinsics.
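As a rough sketch of the two approaches, the following negates a float vector first with casts through an integer vector type and then with the _mm_xor_ps intrinsic. The type names v4sf and v4su and the function names are assumptions made for this example, not anything from GCC's headers. Both functions return the same values; which instruction the compiler actually selects for each depends on the GCC version and flags, so inspect the output of -S rather than taking the comments as a guarantee.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_xor_ps, _mm_set1_ps */

/* Illustrative GCC vector-extension types (names are hypothetical). */
typedef float        v4sf __attribute__((vector_size(16)));
typedef unsigned int v4su __attribute__((vector_size(16)));

/* Cast approach: reinterpret the floats as integers, XOR the sign
   bits, and reinterpret back. The XOR is an integer vector operation. */
v4sf negate_with_casts(v4sf x)
{
    const v4su sign = { 0x80000000u, 0x80000000u, 0x80000000u, 0x80000000u };
    return (v4sf)((v4su)x ^ sign);
}

/* Intrinsic approach: explicitly request the single-precision XOR. */
v4sf negate_with_intrinsics(v4sf x)
{
    const __m128 sign = _mm_set1_ps(-0.0f);  /* only the sign bit set */
    return (v4sf)_mm_xor_ps((__m128)x, sign);
}

/* Hypothetical check: returns 1 if both approaches negate every lane. */
int negations_agree(void)
{
    const v4sf in = { 1.0f, -2.0f, 3.5f, -0.25f };
    v4sf a = negate_with_casts(in);
    v4sf b = negate_with_intrinsics(in);
    int i;

    for (i = 0; i < 4; i++) {
        if (a[i] != -in[i] || b[i] != -in[i])
            return 0;
    }
    return 1;
}
```

Compiling this with something like gcc -O2 -msse2 -S and comparing the two functions shows which bitwise instruction a given compiler picks for each form.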
Ross Ridge