This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/26778] GCC4 moves the result of a conditional block through inadequate registers
- From: "guillaume dot melquiond at ens-lyon dot fr" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 21 Mar 2006 15:27:16 -0000
- Subject: [Bug target/26778] GCC4 moves the result of a conditional block through inadequate registers
- References: <bug-26778-7904@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #2 from guillaume dot melquiond at ens-lyon dot fr 2006-03-21 15:27 -------
> But using the *ps variants on an SSE1 target is ok - the xmm
> registers are just used as temporary storage.
I can't really think of a situation where that makes sense. If this is temporary
storage, it holds the result of some computation and it will be used in some
later computation. In my example, the temporary "th" is the sum of two doubles,
x[1] and x[2], and it is later added to the double x[0]. I don't really know
the SSE instruction set, but I can't think of a situation where a *ps
instruction produces a double result or takes a double value. So, for an xmm
register to act as a temporary, the following pattern has to be used:
1. do an fp computation, the result is in an fp register
2. store this fp register on the stack
3. load this stack memory into the xmm register
4. do other things
5. store this xmm register on the stack
6. load this stack memory into an fp register
7. do an fp computation with this fp register
Using an xmm register for a double on SSE1 requires steps 2, 3 and 5, 6. So it
would make a lot more sense to me to skip steps 3 and 5: step 6 should directly
load the value stored at step 2. There is no point in going through an xmm
register and doing an additional load+store.
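To make the pattern concrete, here is a minimal sketch of the kind of code I
have in mind. It is not the test case attached to the bug: the function name
"sum3" and the parameter "flag" are made up for illustration, and only the
temporary "th" and the accesses to x[0], x[1], x[2] follow the description
above.

```c
/* Hypothetical illustration, not the original test case from the report.
 * "th" is a double temporary set inside a conditional block and then
 * used in a later fp computation. */
double sum3(const double x[3], int flag)
{
    double th;

    if (flag)
        th = x[1] + x[2];   /* step 1: fp computation, result in an fp register */
    else
        th = x[1] - x[2];

    /* Step 7: another fp computation using th. On an SSE1-only target the
     * value should ideally stay in an fp register (or be spilled once),
     * without taking a detour through an xmm register. */
    return x[0] + th;
}
```

Compiling something like this for a 32-bit SSE1-only target, for example with
"gcc -m32 -O2 -march=pentium3 -S", and reading the assembly should show
whether th takes the detour through an xmm register. Whether a given GCC
version actually emits the extra moves for this exact code is an assumption
on my part.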
Sorry if I missed something.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26778