Bug 15061 - [arm] c++ complex<double> arguments
Summary: [arm] c++ complex<double> arguments
Status: VERIFIED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 3.3.3
Importance: P2 normal
Target Milestone: 4.5.0
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2004-04-22 04:29 UTC by Jonathan Larmour
Modified: 2009-10-14 22:39 UTC
CC List: 3 users

See Also:
Host: i686-pc-linux-gnulibc2
Target: arm-elf
Build: i686-pc-linux-gnulibc2
Known to work:
Known to fail:
Last reconfirmed:


Attachments
Test case showing failure (217 bytes, text/plain)
2004-04-22 04:31 UTC, Jonathan Larmour

Description Jonathan Larmour 2004-04-22 04:29:11 UTC
This is a bit of code compiled with g++ and using <complex> from libstdc++, but
the issue is an optimization one. Something Bad(tm) happens when complex
arguments are passed. The function prolog misuses the registers.

The small test case in question (which I'll attach separately) when compiled
with "arm-elf-g++ -O2" will produce the output:
Comparing (1,0) with (0,0)
(1,0) != (0,0)!

After a lot of playing I found it could also be reproduced with "arm-elf-g++ -O1
-fcse-skip-blocks":
Comparing (1,0) with (0,0)
(1,0) != (0,0)!

If compiled with "arm-elf-g++ -O2 -fno-cse-skip-blocks", however, the problem
does not disappear, so I don't suspect -fcse-skip-blocks itself; it just changes
the optimizer state sufficiently. Curiously though, the output isn't quite the
same:
Comparing (1,0) with (1,0)
(1,0) != (0,0)!

Various other -fno-* options with -O2 can cause the problem to appear and
disappear as well, so beware of using a different gcc from 3.3.3 - just because
the problem doesn't appear per se doesn't mean it has been fixed!

I tried a native Linux gcc 3.3.3 on the test case and it works, which may well
mean it's ARM-specific.

Looking at the generated assembler (in the -O1 -fcse-skip-blocks example) I
thought at first glance there was a problem in main:
        adr     r2, .L20
        ldmia   r2, {r2-r3}
        adr     r4, .L20+8
        ldmia   r4, {r4-r5}
where .L20 is:
.L20:
        .word   1072693248
        .word   0
        .word   0
        .word   0

However, the -O1 code, which appears to run correctly, also has .L20 defined
like that. When compiled with -O0, it does indeed use the equivalent of the
first two words of .L20 to load both z1 and z2, which is more what I'd expect.
So maybe there are in fact two problems here? One where .L20 is defined as
above, and another being a code generation issue.

NB I haven't been able to play with 3.4 due to problems with it on my target.

Let me know if you want more info.
Comment 1 Jonathan Larmour 2004-04-22 04:31:11 UTC
Created attachment 6135
Test case showing failure
Comment 2 Jonathan Larmour 2004-04-22 04:41:01 UTC
Actually forget what I said about .L20 etc. I've just realised that the
assembler generated there is okay... it just slipped my mind that the complex
will be 4 words long, two for each double.
Comment 3 Giovanni Bajo 2004-10-07 09:24:46 UTC
Richard, Paul, this is an ARM problem. Would you please take a look?
Comment 4 Richard Earnshaw 2009-03-17 13:46:06 UTC
Cold case analysis time.  Have you seen this problem in recent releases of GCC?
Comment 5 Richard Earnshaw 2009-10-14 14:10:00 UTC
No feedback in 6 months.  Closing as presumed fixed.
Comment 6 Jonathan Larmour 2009-10-14 22:39:38 UTC
Sorry, yes: it was somewhat hard to replicate even with 3.3.3, and I was never able to reproduce it with more recent GCC (but never quite satisfied myself that it wasn't reproducible!). Time to draw a line.