This is the mail archive of the
mailing list for the GCC project.
Re: gcc 3.2 altivec options and glibc
Daniel Egger wrote:
> On Sun, 2002-08-25 at 18:28, Jack Howarth wrote:
>> Okay. I had been looking over glibc/sysdeps/powerpc/fpu and those
>> routines, at first glance, seemed not to be specifically using assembly
>> for the fpu so I was curious if one could recompile them to run on
>> the altivec instead. Guess not...
> No. And actually I doubt that one could gain very much by adding
> altivecized routines because vectorized routines happen to be fast only
> for large(r) datasets and parallelizable algorithms, of which you'll
> find very few in glibc. That idea is not even suitable for
> parallelizable functions like memcpy because the alignment restrictions
> can hardly be enforced on the source and destination parameters, which
> means either more special cases or generally realigning the data, which
> wouldn't leave any speed improvement.
I know it's off-topic, but you are ignoring Altivec's merge instruction,
which allows one to write a compact memcpy loop (a 9-instruction loop to copy
32 bytes: 2 loads, 2 stores, 2 merges, 2 address bumps and one decrement and
branch) that takes care of the alignment by shuffling bytes around in
registers. Of course it's only worthwhile for fairly large copies, especially
since the head and tail of the copy are likely to have a non-negligible
icache footprint. With a suitable shuffle register parameter, Altivec's
merge instruction can be used for many other things, like endian
conversion, etc., but that's beside the point.
Besides that, all vector instruction sets can at least be used for memset.
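
To make the memcpy loop described above a bit more concrete, here is a rough
sketch in portable C. It is not AltiVec code and not anything from glibc: it
only models the idea with scalar stand-ins. The names sketch_memcpy, vec16,
load16, store16 and shuffle are all made up for this illustration. On real
hardware, load16/store16 would presumably correspond to aligned vector loads
and stores (vec_ld and vec_st in the C interface) and shuffle to a permute
(vec_perm) with a control vector derived from the source misalignment. The
head is copied bytewise until the destination is 16-byte aligned; the main
loop then reads only whole aligned source blocks and produces 32 bytes per
iteration with 2 loads, 2 shuffles and 2 stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VEC 16  /* vector width in bytes */

/* Hypothetical stand-in for a 16-byte vector register. */
typedef struct { unsigned char b[VEC]; } vec16;

/* Stand-ins for aligned vector load and store (assumed helpers). */
static vec16 load16(const unsigned char *p) { vec16 v; memcpy(v.b, p, VEC); return v; }
static void store16(unsigned char *p, vec16 v) { memcpy(p, v.b, VEC); }

/* Scalar model of the byte shuffle: select 16 consecutive bytes, starting
   at offset k, from the 32-byte concatenation lo:hi. */
static vec16 shuffle(vec16 lo, vec16 hi, unsigned k) {
    vec16 r;
    assert(k < VEC);
    for (unsigned i = 0; i < VEC; i++) {
        unsigned j = i + k;
        r.b[i] = (j < VEC) ? lo.b[j] : hi.b[j - VEC];
    }
    return r;
}

void *sketch_memcpy(void *dstv, const void *srcv, size_t n) {
    unsigned char *dst = dstv;
    const unsigned char *src = srcv;

    /* Head: copy bytewise until dst is 16-byte aligned. At least VEC bytes
       are copied so that the aligned block containing src lies entirely
       inside the source buffer when the main loop starts. */
    size_t head = VEC + ((VEC - ((uintptr_t)dst % VEC)) % VEC);
    if (head > n) head = n;
    for (size_t i = 0; i < head; i++) dst[i] = src[i];
    dst += head; src += head; n -= head;

    /* Main loop: only aligned 16-byte blocks are read from the source. */
    if (n >= 3 * VEC) {
        unsigned k = (unsigned)((uintptr_t)src % VEC);  /* source misalignment */
        const unsigned char *blk = src - k;             /* aligned block holding src */
        vec16 a = load16(blk);
        while (n >= 3 * VEC) {
            vec16 b = load16(blk + VEC);
            vec16 c = load16(blk + 2 * VEC);
            store16(dst, shuffle(a, b, k));
            store16(dst + VEC, shuffle(b, c, k));
            a = c;  /* reuse the last block in the next iteration */
            blk += 2 * VEC; dst += 2 * VEC; n -= 2 * VEC;
        }
        src = blk + k;
    }

    /* Tail: whatever is left is copied bytewise. */
    for (size_t i = 0; i < n; i++) dst[i] = src[i];
    return dstv;
}
```

The loop condition is deliberately conservative (it stops while up to 47
bytes remain) so that this model never reads outside the source buffer. The
amount of head and tail code needed around such a small loop is the overhead
mentioned above.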