This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/26778] GCC4 moves the result of a conditional block through inadequate registers



------- Comment #2 from guillaume dot melquiond at ens-lyon dot fr  2006-03-21 15:27 -------
> But using the *ps variants on an SSE1 target is ok - the xmm
> registers are just used as temporary storage.

I can't really think of situations where it makes sense. If this is temporary
storage, it means this is the result of some computations and it will be used
in some other computations. In my example, the temporary "th" is the sum of two
doubles x[1] and x[2], and it will later be added to the double x[0]. I don't
really know the SSE instruction set, but I can't think of a situation where a
*ps instruction will give a double result or take a double value. So, in order
for an xmm register to act as a temporary, it means that the following pattern
has to be used:

1. do a fp computation, the result is in a fp register
2. store this fp register on the stack
3. load this stack memory in the xmm register
4. do other things
5. store this xmm register on the stack
6. load this stack memory in an fp register
7. do a fp computation with this fp register

Using an xmm register with double type on SSE1 requires steps 2, 3 and 5, 6. So
it would make a lot more sense to me to skip steps 3 and 5. Step 6 should
directly load the value stored at step 2; there is no point in going through an
xmm register and doing an additional load+store.
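
To make the pattern concrete, here is a minimal C sketch of the kind of code I
have in mind. It is a hypothetical reconstruction and not the testcase attached
to the bug report: the function name sum3 and the cond parameter are made up
for illustration, and the comments describe the register traffic I would
expect, not verified compiler output. The target is assumed to be SSE1-only
(for instance -m32 -msse -mno-sse2, which is my guess at the relevant flags).

```c
/* Hypothetical example: "th" is a double temporary set in a conditional
 * block and consumed later by another x87 computation. */
double sum3(const double *x, int cond)
{
    double th;

    if (cond)
        th = x[1] + x[2];   /* step 1: fp computation, result in an fp register */
    else
        th = x[1] - x[2];

    /* Steps 2-6 would happen here if "th" were parked in an xmm register:
     * SSE1 has no double arithmetic, so the value can only be copied
     * through the stack into xmm and back again. */

    return x[0] + th;       /* step 7: fp computation with an fp register */
}
```

If the temporary has to be spilled at all, a single store (step 2) followed by
a single load (step 6) should be enough for a function like this one.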

Sorry if I missed something.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26778

