On some targets, the instruction set contains SIMD vector instructions that operate on multiple values contained in one large register at the same time. For example, on the i386, the MMX, 3DNow! and SSE extensions can be used this way.
The first step in using these extensions is to provide the necessary data types. This should be done using an appropriate typedef:
typedef int v4si __attribute__ ((mode(V4SI)));
The base type int is effectively ignored by the compiler; the actual properties of the new type v4si are defined by the __attribute__, which specifies the machine mode to be used. For vector types, modes have the form VnB, where n is the number of elements in the vector and B is the base mode of the individual elements. The following can be used as base modes:
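QI: an integer as wide as the smallest addressable unit, usually 8 bits.
HI: an integer twice as wide as a QI mode integer, usually 16 bits.
SI: an integer four times as wide as a QI mode integer, usually 32 bits.
DI: an integer eight times as wide as a QI mode integer, usually 64 bits.
SF: a floating point value as wide as an SI mode integer, usually 32 bits.
DF: a floating point value as wide as a DI mode integer, usually 64 bits.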
Not all base types or combinations are always valid; which modes can be used is determined by the target machine. For example, when targeting the i386 MMX extensions, only V8QI, V4HI and V2SI are allowed modes.
There are no V1xx vector modes; they would be identical to the corresponding base mode.
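The VnB naming scheme can be checked mechanically. The following is a minimal sketch in plain C, not part of the compiler: the helper names base_mode_bytes and vector_mode_bytes are hypothetical, and the byte widths assume the usual sizes of the base modes (8-bit QI, 16-bit HI, 32-bit SI and SF, 64-bit DI and DF). It computes the total size of a vector mode from its name and rejects V1xx names and unknown base modes.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: width in bytes of a base mode, assuming the
   usual sizes (QI = 1, HI = 2, SI = 4, DI = 8, SF = 4, DF = 8).
   Returns -1 for an unknown base mode.  */
static int
base_mode_bytes (const char *base)
{
  if (strcmp (base, "QI") == 0) return 1;
  if (strcmp (base, "HI") == 0) return 2;
  if (strcmp (base, "SI") == 0) return 4;
  if (strcmp (base, "DI") == 0) return 8;
  if (strcmp (base, "SF") == 0) return 4;
  if (strcmp (base, "DF") == 0) return 8;
  return -1;
}

/* Hypothetical helper: total size in bytes of a vector mode name of
   the form VnB, or -1 if the name is malformed, n is less than 2
   (there are no V1xx modes), or B is not a known base mode.  */
int
vector_mode_bytes (const char *mode)
{
  char *end;
  long n;
  int width;

  assert (mode != NULL);
  if (mode[0] != 'V' || !isdigit ((unsigned char) mode[1]))
    return -1;

  n = strtol (mode + 1, &end, 10);
  /* The upper bound is an arbitrary sanity limit for this sketch.  */
  if (n < 2 || n > 1024)
    return -1;

  width = base_mode_bytes (end);
  return width < 0 ? -1 : (int) n * width;
}
```

Under these assumptions, vector_mode_bytes ("V8QI"), vector_mode_bytes ("V4HI") and vector_mode_bytes ("V2SI") all yield 8, which matches the 64-bit MMX registers, while vector_mode_bytes ("V4SI") yields 16.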
There is no distinction between signed and unsigned vector modes. This distinction is made by the operations performed on the vectors, not by the data type.
The types defined in this manner are somewhat special: they cannot be used with most normal C operations (for example, a vector addition cannot be represented by a normal addition of two vector type variables). You can only declare variables of these types and use them in function calls and returns, as well as in assignments and some casts. It is possible to cast from one vector type to another, provided they are of the same size (in fact, you can also cast vectors to and from other datatypes of the same size).
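A cast between two vector types of the same size reinterprets the bits without changing them. The sketch below illustrates this under one loudly stated assumption: so that it compiles with later GCC releases, it declares the types with the vector_size attribute rather than the mode attribute described in this section. The type v8hi and the functions as_v8hi and cast_preserves_bits are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <string.h>

/* Assumption: vector_size (16) is used here in place of mode(V4SI)
   and mode(V8HI) so that newer compilers accept the declarations.  */
typedef int v4si __attribute__ ((vector_size (16)));
typedef short v8hi __attribute__ ((vector_size (16)));

/* The cast is only valid because both types occupy 16 bytes.  */
static_assert (sizeof (v4si) == sizeof (v8hi),
               "vector casts require types of the same size");

/* Reinterpret four ints as eight shorts; no element is converted.  */
v8hi
as_v8hi (v4si x)
{
  return (v8hi) x;
}

/* Return 1 if the cast left every byte of the vector unchanged.  */
int
cast_preserves_bits (void)
{
  v4si x = { 1, 2, 3, 4 };
  v8hi y = as_v8hi (x);

  return memcmp (&x, &y, sizeof x) == 0;
}
```

The comparison uses memcmp on the raw bytes, so the check does not depend on the byte order of the target.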
A port that supports vector operations provides a set of built-in functions that can be used to operate on vectors. For example, a function to add two vectors and multiply the result by a third could look like this:
v4si
f (v4si a, v4si b, v4si c)
{
  v4si tmp = __builtin_addv4si (a, b);
  return __builtin_mulv4si (tmp, c);
}