This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/52705] Loop optimization failure with -O2 versus -O1
- From: "pinskia at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sun, 25 Mar 2012 05:12:44 +0000
- Subject: [Bug tree-optimization/52705] Loop optimization failure with -O2 versus -O1
- Auto-submitted: auto-generated
- References: <bug-52705-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52705
--- Comment #1 from pinskia at gmail dot com <pinskia at gmail dot com> 2012-03-25 05:12:44 UTC ---
You are violating C/C++ aliasing rules: the code accesses uint32_t arrays through uint64_t pointers. Use memcpy or -fno-strict-aliasing.
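As an illustration of the memcpy suggestion, here is a minimal sketch of how func() from the quoted bug.c might be rewritten so that the 64-bit copies no longer go through a cast pointer. The name func_memcpy and the index k are hypothetical and not part of the original report; the assert only documents the size assumption the copy relies on.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical variant of func() from bug.c: same arithmetic, but the
   8-byte copies use memcpy instead of (uint64_t *) casts.
   Assumes a and b do not overlap. */
void
func_memcpy(uint32_t a[8], uint32_t b[8]){
  uint32_t i;
  uint32_t k;
  uint32_t c;
  int64_t d;
  /* One 64-bit chunk must span exactly two 32-bit elements. */
  assert(sizeof(uint64_t) == 2 * sizeof(uint32_t));
  for(i=0;i<=1;i++){
    /* Copy the first 8 bytes of a (a[0] and a[1]) into each of the
       four 8-byte chunks of b. */
    for(k=0;k<4;k++){
      memcpy(&b[2*k], &a[0], sizeof(uint64_t));
    }
    c=1;
    d=b[0];
    d-=c;
    b[0]=d;
    c=b[0];
    d=b[1];
    d-=c<<1;
    b[1]=d;
  }
}
```

Called with a[8]={1,0,0,0,0,0,0,0}, this should leave b as {0,0,1,0,1,0,1,0} at any optimization level, since a fixed-size memcpy is a well-defined way to copy bytes between objects of different types. Compilers commonly expand small fixed-size memcpy calls inline, but that was not checked here for 4.6.1.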
On Mar 24, 2012 21:27, veiokej at gmail dot com
<gcc-bugzilla@gcc.gnu.org> wrote:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52705
Bug #: 52705
Summary: Loop optimization failure with -O2 versus -O1
Classification: Unclassified
Product: gcc
Version: 4.6.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: veiokej@gmail.com
Created attachment 26976
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26976
Intermediate of bug.c
When I compile with these different optimization levels, I get different
output. This isn't confusion over floats or uninitialized variables, as far as
I can tell. It appears to relate to memory accesses through cast pointers.
First of all, this relates to
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49938, which _might_ solve the
problem (but I don't know, because I'm unable to upgrade from 4.6.1 under
MinGW). So please try the latest GCC before you try to debug this.
Here's the command line:
gcc -save-temps -O2 -obug.exe bug.c
This bug is very easy to reproduce. Here's the entire source of bug.c:
----------------------------------------------------
#include <stdint.h>
#include <stdio.h>
void
func(uint32_t a[8],uint32_t b[8]){
  uint32_t i;
  uint32_t c;
  int64_t d;
  for(i=0;i<=1;i++){
    ((uint64_t *)(b))[0]=((uint64_t *)(a))[0];
    ((uint64_t *)(b))[1]=((uint64_t *)(a))[0];
    ((uint64_t *)(b))[2]=((uint64_t *)(a))[0];
    ((uint64_t *)(b))[3]=((uint64_t *)(a))[0];
    c=1;
    d=b[0];
    d-=c;
    b[0]=d;
    c=b[0];
    d=b[1];
    d-=c<<1;
    b[1]=d;
  }
  return;
}
int
main(int argc, char *argv[]){
  uint32_t a[8]={1,0,0,0,0,0,0,0};
  uint32_t b[8];
  func(a,b);
  printf("%08X%08X%08X%08X%08X%08X%08X%08X\n",b[0],b[1],b[2],b[3],b[4],b[5],b[6],b[7]);
  return 0;
}
----------------------------------------------------
As you will see, you get different outputs depending on whether you use -O1 or
-O2.
The relation to Bug 49930 is this:
Look at the above code. If you change:
----------------------------------------------------
d=b[1];
d-=c<<1;
b[1]=d;
----------------------------------------------------
to:
----------------------------------------------------
d=b[0];
d-=c<<1;
b[0]=d;
----------------------------------------------------
Then you will see bug 49930.
Note that b[] only appears to be half-initialized because I write to just
subscripts 0 through 3. That's not the case: I've cast the array of 8 32-bit
integers to 4 64-bit integers, so those four stores cover all of b[].
I notice that when I change the lines involving (uint64_t *) casts to normal
uint32_t memory accesses, i.e. when I get rid of the casts, it seems to
work better (but I didn't investigate at length). I don't want to do that, for
performance reasons. (bug.c is just a reduced adaptation of a "real"
function where performance matters.)
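For comparison, here is a minimal sketch of the cast-free variant described above, in which every access uses the arrays' own uint32_t type. The name func_u32 and the index k are hypothetical and not part of bug.c, and no claim is made here about how its speed compares with the 64-bit version.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cast-free variant of func(): each 64-bit store is
   replaced by two 32-bit stores, so no pointer casts are needed. */
void
func_u32(uint32_t a[8], uint32_t b[8]){
  uint32_t i;
  uint32_t k;
  uint32_t c;
  int64_t d;
  assert(a && b);
  for(i=0;i<=1;i++){
    /* Equivalent of copying the first 64 bits of a into each of the
       four 64-bit chunks of b. */
    for(k=0;k<8;k+=2){
      b[k]=a[0];
      b[k+1]=a[1];
    }
    c=1;
    d=b[0];
    d-=c;
    b[0]=d;
    c=b[0];
    d=b[1];
    d-=c<<1;
    b[1]=d;
  }
}
```

With a[8]={1,0,0,0,0,0,0,0} this should fill b with {0,0,1,0,1,0,1,0}, and because it never reads or writes a uint32_t through another type, the result should not depend on the optimization level.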