This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/52705] Loop optimization failure with -O2 versus -O1


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52705

--- Comment #1 from pinskia at gmail dot com <pinskia at gmail dot com> 2012-03-25 05:12:44 UTC ---
You are violating C/C++ aliasing rules. Use memcpy or -fno-strict-aliasing.
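
As an illustration of the memcpy suggestion, here is a minimal sketch of how func() from the report below could be rewritten, assuming the intent is to copy the first 64 bits of a[] into each 64-bit slot of b[]. The name func_memcpy and the temporary tmp are hypothetical and do not appear in the report. GCC enables -fstrict-aliasing at -O2, so the optimizer may assume that stores through a uint64_t pointer do not modify uint32_t objects; copying the bytes with memcpy avoids making that assumption false.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical rewrite of func() from bug.c: same arithmetic, but the
   64-bit copies go through memcpy instead of (uint64_t *) casts, so no
   uint32_t object is accessed through an incompatible lvalue type. */
void func_memcpy(const uint32_t a[8], uint32_t b[8]) {
  uint32_t i;
  uint32_t c;
  int64_t d;
  uint64_t tmp;

  for (i = 0; i <= 1; i++) {
    memcpy(&tmp, a, sizeof tmp);     /* tmp holds the bytes of a[0], a[1] */
    memcpy(&b[0], &tmp, sizeof tmp); /* replaces ((uint64_t *)b)[0] = ... */
    memcpy(&b[2], &tmp, sizeof tmp); /* replaces ((uint64_t *)b)[1] = ... */
    memcpy(&b[4], &tmp, sizeof tmp); /* replaces ((uint64_t *)b)[2] = ... */
    memcpy(&b[6], &tmp, sizeof tmp); /* replaces ((uint64_t *)b)[3] = ... */
    c = 1;
    d = b[0];
    d -= c;
    b[0] = (uint32_t)d;
    c = b[0];
    d = b[1];
    d -= c << 1;
    b[1] = (uint32_t)d;
  }
}
```

Called with a[] = {1,0,0,0,0,0,0,0} as in main() of the report, this version should leave b[] = {0,0,1,0,1,0,1,0} at any optimization level. Compilers usually turn fixed-size memcpy calls like these into plain loads and stores, so the performance cost is often small, but that is worth checking in the generated assembly for the real function.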




Sent from my Palm Pre on AT&T
On Mar 24, 2012 21:27, veiokej at gmail dot com
<gcc-bugzilla@gcc.gnu.org> wrote:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52705



             Bug #: 52705
           Summary: Loop optimization failure with -O2 versus -O1
    Classification: Unclassified
           Product: gcc
           Version: 4.6.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: veiokej@gmail.com





Created attachment 26976
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26976
Intermediate of bug.c



When I compile with these different optimization levels, I get different
output. This isn't confusion over floats or uninitialized variables, as far as
I can tell. It appears to relate to memory accesses through cast pointers.



First of all, this relates to
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49938, which _might_ solve the
problem (but I don't know, because I'm unable to upgrade from 4.6.1 under
MinGW). So please try the latest GCC before you try to debug this.



Here's the command line:



gcc -save-temps -O2 -obug.exe bug.c



This bug is very easy to reproduce. Here's the entire source of bug.c:



----------------------------------------------------

#include <stdint.h>
#include <stdio.h>

void
func(uint32_t a[8],uint32_t b[8]){
  uint32_t i;
  uint32_t c;
  int64_t d;

  for(i=0;i<=1;i++){
    ((uint64_t *)(b))[0]=((uint64_t *)(a))[0];
    ((uint64_t *)(b))[1]=((uint64_t *)(a))[0];
    ((uint64_t *)(b))[2]=((uint64_t *)(a))[0];
    ((uint64_t *)(b))[3]=((uint64_t *)(a))[0];
    c=1;
    d=b[0];
    d-=c;
    b[0]=d;
    c=b[0];
    d=b[1];
    d-=c<<1;
    b[1]=d;
  }
  return;
}

int
main(int argc, char *argv[]){
  uint32_t a[8]={1,0,0,0,0,0,0,0};
  uint32_t b[8];

  func(a,b);

  printf("%08X%08X%08X%08X%08X%08X%08X%08X\n",b[0],b[1],b[2],b[3],b[4],b[5],b[6],b[7]);
  return 0;
}

----------------------------------------------------



As you will see, you get different outputs depending on whether you use -O1 or
-O2.



The relation to Bug 49930 is this:



Look at the above code. If you change:

----------------------------------------------------
    d=b[1];
    d-=c<<1;
    b[1]=d;
----------------------------------------------------

to:

----------------------------------------------------
    d=b[0];
    d-=c<<1;
    b[0]=d;
----------------------------------------------------

Then you will see bug 49930.



Note that b[] appears to be only half-initialized because I only write to
subscripts 0 through 3. But that's not the case: those writes go through a
(uint64_t *) cast, so the 8 32-bit integers are written as 4 64-bit integers.



I notice that when I change the lines involving (uint64_t *) casts to normal
uint32_t memory accesses, i.e. when I get rid of the casts, it seems to
work better (but I didn't investigate at length). But I don't want to do that
for performance reasons. (bug.c is just an adaptation that's filtered from a
"real" function where performance matters.)

