This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Notes from tinkering with the autovectorizer (4.1.1)


I've been tinkering with the autovectorizer.  It's really cool.
I particularly like the realignment support.

I've noticed just a few things while tinkering with it (in 4.1.1):

0) The realignment code takes the floor of the unaligned pointer, and we
increment the unaligned pointer in the loop.  This is great for
architectures like Alpha that have floor addressing modes, because finding
the floor is free.  But for architectures like ARM, it's much better to
take the floor outside the loop and be able to postincrement by VECSIZE
inside the loop.
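
To make the two loop shapes concrete, here is a minimal sketch in C that
models only the address arithmetic.  VECSIZE = 16 and all function names
are assumptions made for illustration; this is not the code the
vectorizer actually emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VECSIZE 16  /* assumed vector size in bytes */

/* Round an address down to the previous VECSIZE boundary. */
static uintptr_t vec_floor(uintptr_t addr)
{
    return addr & ~(uintptr_t)(VECSIZE - 1);
}

/* Shape described above: the unaligned pointer is the induction
   variable and the floor is recomputed on every iteration. */
static uintptr_t last_block_floor_in_loop(uintptr_t p, size_t nvec)
{
    uintptr_t block = 0;
    assert(nvec > 0);
    for (size_t i = 0; i < nvec; i++) {
        block = vec_floor(p);  /* extra AND unless the load floors for free */
        p += VECSIZE;
    }
    return block;
}

/* Alternative: take the floor once before the loop, then step the
   aligned pointer by VECSIZE. */
static uintptr_t last_block_floor_hoisted(uintptr_t p, size_t nvec)
{
    uintptr_t aligned = vec_floor(p);
    uintptr_t block = 0;
    assert(nvec > 0);
    for (size_t i = 0; i < nvec; i++) {
        block = aligned;
        aligned += VECSIZE;    /* maps onto a post-increment addressing mode */
    }
    return block;
}
```

Both functions visit the same aligned blocks; the second simply moves
the masking out of the loop body.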


1) The definition of the realignment instruction doesn't match hardware for instruction sets like ARM WMMX, where an aligned address gives a shift of 0 bytes instead of VECSIZE bytes. This makes it useless for vector realignment, because in the case that the pointer happens to be aligned, we get the wrong vector. Looks like the SPARC realignment hook does the same thing... Indeed, it looks like Altivec is the only one to support it, and they do some trickery with shifting the wrong (against endianness) way based on the two's complement of the source (a very clever trick). No other machine (evidently) can easily meet the description of the current realignment mechanism.

Of course, for safety reasons I guess we don't always get the next vector
(the one at address floor(ptr+VECSIZE)), which would allow us to use the
shift-style instructions.
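
The aligned-pointer problem can be reproduced with a small byte-level
model in C.  This is only a sketch under stated assumptions: VECSIZE = 8,
the function names, and the loop shape (first vector loaded from
floor(ptr) before the loop, second vector loaded from
floor(ptr + VECSIZE - 1) and carried over to the next iteration) are my
own reading of the scheme, not code taken from GCC.  Offsets into `mem`
stand in for addresses, with offset 0 treated as aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VECSIZE 8   /* assumed vector size in bytes */

static size_t vfloor(size_t a) { return a & ~(size_t)(VECSIZE - 1); }

/* Model of a realignment instruction: take VECSIZE bytes starting at
   byte `shift` of the concatenation msq:lsq.  shift is in 0..VECSIZE. */
static void realign(const uint8_t *msq, const uint8_t *lsq,
                    unsigned shift, uint8_t *out)
{
    uint8_t both[2 * VECSIZE];
    assert(shift <= VECSIZE);
    memcpy(both, msq, VECSIZE);
    memcpy(both + VECSIZE, lsq, VECSIZE);
    memcpy(out, both + shift, VECSIZE);
}

/* Copy nvec vectors starting at mem + start using the assumed loop shape. */
static void realign_copy(const uint8_t *mem, size_t start, size_t nvec,
                         unsigned shift, uint8_t *out)
{
    uint8_t msq[VECSIZE], lsq[VECSIZE];
    memcpy(msq, mem + vfloor(start), VECSIZE);       /* load before the loop */
    for (size_t i = 0; i < nvec; i++) {
        size_t p = start + i * VECSIZE;
        /* Never reads past the block holding the last requested byte. */
        memcpy(lsq, mem + vfloor(p + VECSIZE - 1), VECSIZE);
        realign(msq, lsq, shift, out + i * VECSIZE);
        memcpy(msq, lsq, VECSIZE);                   /* carried to next iteration */
    }
}
```

With `start` misaligned by 3 and `shift` = 3 the copy is correct.  With
`start` aligned, `shift` = VECSIZE is correct, but `shift` = 0 (what a
shift-style instruction computes from the low address bits) returns the
previous vector from the second iteration onward.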

So, there may be a few options:

* Have a flag or hook where we can say it is always OK to read the next
       element.  This is probably a bad option; everyone who used the
       vectorizer would have to know that they may need to pad their
       arrays if they are in a protected memory environment.

* Conditionally fetch the next bundle, and don't do the fetch of the
       next data the last time around if it might not be safe.  Probably
       a bad idea for architectures without conditional execution.

* Currently we drop out of the loop when there are VEC_ELEMENTS - 1
       iterations or fewer.  We could drop out when there are VEC_ELEMENTS
       or fewer, and then we could always fetch the next aligned data.

* Some other clever trick I don't know about. :-)

* Or keep it the way it is, and leave out the machines that have the
       shift-by-zero instead of the shift-by-VECSIZE behavior for
       an aligned pointer.
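
The third option (dropping out one vector earlier) can be sketched as a
change in the loop exit test.  The C below only counts iterations;
VEC_ELEMENTS = 4 and the function name are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define VEC_ELEMENTS 4  /* assumed number of elements per vector */

/* Return how many scalar iterations are left for the epilogue after the
   vector loop has run over n elements.
   fetch_next_always == 0: exit when VEC_ELEMENTS - 1 or fewer remain.
   fetch_next_always != 0: exit when VEC_ELEMENTS or fewer remain, so
   every vector iteration is followed by at least one valid element and
   the next aligned vector always overlaps valid data. */
static size_t scalar_tail(size_t n, int fetch_next_always)
{
    size_t limit = fetch_next_always ? VEC_ELEMENTS : VEC_ELEMENTS - 1;
    size_t remaining = n;
    assert(VEC_ELEMENTS > 0);
    while (remaining > limit)
        remaining -= VEC_ELEMENTS;  /* one vector iteration */
    return remaining;
}
```

The cost shows up only when the trip count is a nonzero multiple of
VEC_ELEMENTS: for n = 8 the epilogue handles 4 elements instead of 0,
while for n = 10 both policies leave 2.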


2) It seems like there may be some hooks that aren't documented. For instance, there seems to be some kind of support for the "vcond" standard name, but I can't seem to find it in the documentation.


In general things work quite well, and it seems to play reasonably well with things like the modulo scheduler.

Cheers,

Erich

--
Why are ``tolerant'' people so intolerant of intolerant people?

