This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



"generic" vectorization?


Hi,

I was just wondering about vectorization for platforms that lack proper
vector instructions, such as Alpha. Sometimes, vectorization could still
be a noticeable win, for example transforming

char *s1, *s2, *d;
for (i = 0; i < 8; i++)
  d[i] = s1[i] + s2[i];

to

uint64_t x = load(s1), y = load(s2);
uint64_t signmask = 0x8080808080808080;
uint64_t signs = (x ^ y) & signmask;
x &= ~signmask;
y &= ~signmask;
x += y;
x ^= signs;
store(d, x);

which takes more instructions, but also exposes a lot of
instruction-level parallelism.
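
For concreteness, here is a minimal, self-contained C sketch of the
transformed sequence. The names add_bytes_swar and add8 are made up for
illustration, and load/store are assumed here to be plain 8-byte copies
done with memcpy; none of this is existing GCC code. The point is that
masking off the top bit of each byte keeps carries from crossing into
the neighbouring byte, and the final XOR puts the correct top bit back.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: add eight packed bytes lane by lane, so that no
   carry propagates from one byte into the next. */
static uint64_t add_bytes_swar(uint64_t x, uint64_t y)
{
    const uint64_t signmask = 0x8080808080808080ULL;
    /* Top bit of each lane's sum, not yet including the carry coming
       up from the low seven bits. */
    uint64_t signs = (x ^ y) & signmask;

    x &= ~signmask;   /* clear the top bit of every lane */
    y &= ~signmask;
    x += y;           /* 7-bit adds; a lane's carry lands in its own top bit */
    return x ^ signs; /* fold the original top bits back in */
}

/* Hypothetical wrapper: d[i] = s1[i] + s2[i] for i in 0..7.
   memcpy stands in for load()/store() and avoids alignment and
   aliasing problems; the result does not depend on byte order. */
static void add8(unsigned char *d, const unsigned char *s1,
                 const unsigned char *s2)
{
    uint64_t x, y;

    memcpy(&x, s1, sizeof x);
    memcpy(&y, s2, sizeof y);
    x = add_bytes_swar(x, y);
    memcpy(d, &x, sizeof x);
}
```

Each lane wraps modulo 256, which matches what the byte-wise loop
stores back into d[i].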

This would only seem worth it if we can handle at least 4 elements at a
time. A way to do this would be to add generic code for it to optabs, or
to do it in the machine description.

Does this seem like a good idea? Are there other targets which would
profit from it?

-- 
	Falk

