[RFC/RFT] Tree-level lowering of generic vectors, part 2

David Edelsohn dje@watson.ibm.com
Sat Aug 7 16:31:00 GMT 2004

>>>>> Paolo Bonzini writes:

Paolo> I don't understand this.  Passing by reference ensures that you can use 
Paolo> vld or lwz to load parts of the vector into the vector registers.

Paolo> Do you want to devise different cases for the standard and Altivec ABIs? 

	The point is not whether one particular architecture or another has
instructions to load the value from memory; making that assumption is
flawed.  If the target can load the value in parts, why can't you pass the
value in parts and let the normal ABI take care of the details?

Paolo> If for V16SF you want to use 16 GP slots, partly on the stack of course, 
Paolo> or 4 AltiVec slots, then the non-AltiVec case seems very very expensive 
Paolo> to me; and even though people won't likely use generic 256-element 
Paolo> vectors, potentially the AltiVec case can become expensive too.  Generic 
Paolo> vectors can be arbitrarily wide, and in this their ABI requirements are 
Paolo> closer to those of arrays.

	What is the difference if the value is copied to the stack and a
pointer to the arg area is passed in a register versus copying the
argument to registers and letting it overflow so that the rest is copied
to the stack?

	I do not expect most people to use scalar architectures for
generic vector operations.  A v16sf fits nicely into four v4sf registers, and
AltiVec has enough argument registers to handle that.  If you decompose
the arguments into the size appropriate for the target, just like you
decompose the operations, it all works very nicely.  This is what you need
to do.
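
	To make the decomposition concrete, here is a minimal sketch using
the GCC vector_size extension.  It is only an illustration written at the
source level, not what the lowering pass or any ABI actually does; the
names v16sf_mem, sum_parts and sum_v16sf are hypothetical.  The 16-float
value is split into four v4sf pieces, each piece is passed by value, and
the target's normal calling convention decides where each piece goes.

```c
#include <string.h>

/* GCC generic vector extension: four floats in one 16-byte vector. */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical stand-in for a generic 16-float vector held in memory. */
typedef struct { float e[16]; } v16sf_mem;

/* The callee sees four target-sized pieces, each passed by value.
   Where each piece lands (vector register or stack) is left to the
   target's ordinary argument-passing rules. */
static v4sf sum_parts(v4sf a, v4sf b, v4sf c, v4sf d)
{
    return a + b + c + d;   /* lane-wise addition */
}

/* The caller decomposes the wide value into v4sf pieces, in the same
   way the operations on it would be decomposed. */
static v4sf sum_v16sf(const v16sf_mem *v)
{
    v4sf p[4];
    memcpy(p, v->e, sizeof p);   /* split 16 floats into 4 x v4sf */
    return sum_parts(p[0], p[1], p[2], p[3]);
}
```

	On a target without vector registers the same pieces would simply
be split further or spill to the stack, which is the overflow case
discussed above.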

