This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/48764] New: wrong-code bug in gcc-4.5.x, related to __restrict
- From: "wouter.vermaelen at scarlet dot be" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 25 Apr 2011 18:47:16 +0000
- Subject: [Bug tree-optimization/48764] New: wrong-code bug in gcc-4.5.x, related to __restrict
- Auto-submitted: auto-generated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48764
Summary: wrong-code bug in gcc-4.5.x, related to __restrict
Product: gcc
Version: 4.5.3
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: wouter.vermaelen@scarlet.be
I originally posted this on gcc-help because I wasn't sure whether it was an
actual compiler bug or undefined behavior. Ian Lance Taylor replied that he
didn't see any undefined behavior, so I'm now reporting it as a bug.
Here's the original message:
http://gcc.gnu.org/ml/gcc-help/2011-04/msg00476.html
But I'll repeat it below:
--------------------
Hi all,
I believe I found a wrong-code bug. The problem triggers when using
gcc-4.5.1, 4.5.2 or 4.5.3, but not when using 4.4.5 or 4.7.0 (snapshot
20110419). It also only triggers with certain optimization levels/flags.
I wonder whether this is a known problem that is already fixed in 4.7.0, or
whether the problem still exists but for some reason doesn't trigger in 4.7.0
(I couldn't easily find anything in bugzilla).
Below is a reduced test-case that shows the problem. I tried, but I
couldn't get it smaller than these 4 files (combined about 60 lines).
While reducing this problem I realized that it *might* not be a compiler
bug, but undefined behavior caused by my use of __restrict in
Buffer::read(). What I wanted to express there is that the memory write
done by memcpy() can never overwrite the member variable 'p'. At the
moment I still believe it's a compiler bug, but I'm not 100% sure
anymore.
So is this a compiler bug or undefined behavior in my program? If it is
the latter, I would appreciate it if someone could explain what the
problem is and maybe suggest a way to fix it.
Thanks.
Wouter
BTW: The code generated by gcc-4.7.0 is correct but contains some useless extra
instructions (which I tried to avoid with __restrict). I'd also appreciate
hints on how to improve the generated code.
I do realize that the code in this reduced test-case may look a bit silly
and that suggestions to optimize the code may be hard because of this.
/// FooBar.hh /////
struct Loader;
struct FooBar {
    void load(Loader& l);
    char c1, c2;
};
/// Loader.hh /////
#include <cstring>
struct Buffer {
    Buffer(const char* data) : p(data) {}
    void read(void* __restrict out) __restrict {
        memcpy(out, p, 1);
        ++p;
    }
    const char* p;
};
template<typename Derived> struct Base {
    void load2(char& t) {
        Derived& self = static_cast<Derived&>(*this);
        self.load1(t);
    }
    int dummy;
};
struct Loader : Base<Loader> {
    Loader(const char* data) : buffer(data) {}
    void load1(char& t) { buffer.read(&t); }
    Buffer buffer;
};
/// FooBar.cc /////
#include "FooBar.hh"
#include "Loader.hh"
#include <cstdio>
void FooBar::load(Loader& l)
{
    l.load1(c1);
    //printf("This print hides the bug\n");
    l.load2(c2);
}
/// main.cc ///////
#include "FooBar.hh"
#include "Loader.hh"
#include <cstdio>
int main()
{
    char data[2] = { 3, 5 };
    Loader loader(data);
    FooBar fb;
    fb.load(loader);
    if ((fb.c1 == 3) && (fb.c2 == 5)) {
        printf("Ok\n");
    } else {
        printf("Wrong!\n");
    }
}
> g++ --version
g++ (GCC) 4.5.3 20110423 (prerelease)
> uname -a
Linux argon 2.6.35-28-generic #49-Ubuntu SMP Tue Mar 1 14:39:03 UTC 2011 x86_64 GNU/Linux
> g++ -O3 FooBar.cc -c
> g++ -O3 main.cc -c
> g++ -o bug FooBar.o main.o
> ./bug
Wrong!
> objdump -d FooBar.o (gcc-4.5.3 prerelease)
  mov    0x8(%rsi),%rdx
  lea    0x8(%rsi),%rax
  movzbl (%rdx),%edx
  mov    %dl,(%rdi)
  mov    0x8(%rsi),%rdx    <-- WRONG: still uses original value of Buffer::p
  addq   $0x1,(%rax)       <-- it is only increased here (for the 1st time)
  movzbl (%rdx),%edx
  mov    %dl,0x1(%rdi)
  addq   $0x1,(%rax)
  retq
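To make the annotated listing above easier to follow, here is a hand translation of the gcc-4.5.3 instruction sequence into C++. It is only a sketch: the names emulate_453 and second_read_is_stale are made up for illustration, the parameter 'p' stands in for the memory slot at 0x8(%rsi) (Buffer::p), and 'fb' stands in for the FooBar object at (%rdi).

```cpp
// Hypothetical hand translation of the gcc-4.5.3 listing; not compiler output.
// 'p' models the memory slot 0x8(%rsi), i.e. Buffer::p.
// 'fb' models the FooBar object at (%rdi): fb[0] is c1, fb[1] is c2.
const char* emulate_453(const char* p, char* fb) {
    const char* rdx = p;  // mov 0x8(%rsi),%rdx
    fb[0] = *rdx;         // movzbl (%rdx),%edx ; mov %dl,(%rdi)
    rdx = p;              // mov 0x8(%rsi),%rdx   (p has not been incremented yet)
    ++p;                  // addq $0x1,(%rax)     (first increment, too late)
    fb[1] = *rdx;         // movzbl (%rdx),%edx ; mov %dl,0x1(%rdi)
    ++p;                  // addq $0x1,(%rax)
    return p;             // value left in Buffer::p
}

// Runs the emulation on the same input as main.cc and reports whether the
// second field received the first byte again.
bool second_read_is_stale() {
    const char data[2] = { 3, 5 };
    char fb[2] = { 0, 0 };
    const char* end = emulate_453(data, fb);
    return fb[0] == 3 && fb[1] == 3 && end == data + 2;
}
```

With the input { 3, 5 } from main.cc this sequence stores 3 into both c1 and c2 while still advancing 'p' twice, which matches the "Wrong!" output above.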
> objdump -d FooBar.o (gcc-4.7.0 20110419)
  mov    0x8(%rsi),%rax
  movzbl (%rax),%edx
  mov    %dl,(%rdi)
  lea    0x1(%rax),%rdx    <-- correct, but I know this is not
  mov    %rdx,0x8(%rsi)    <-- required for my application
  movzbl 0x1(%rax),%edx
  add    $0x2,%rax
  mov    %dl,0x1(%rdi)
  mov    %rax,0x8(%rsi)
  retq
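
For comparison, the intent described earlier (the write done by memcpy() never changes 'p') can also be written without __restrict by copying 'p' into a local variable before the copy. This is a minimal sketch, not a fix for the compiler and not measured: BufferLocal and reads_in_order are hypothetical names, and whether any given compiler drops the intermediate store of 'p' with this form is an assumption that would need checking against the generated code.

```cpp
#include <cstring>

// Hypothetical variant of Buffer from Loader.hh: no __restrict, and the
// member 'p' is read into a local before the write through 'out'.
struct BufferLocal {
    explicit BufferLocal(const char* data) : p(data) {}
    void read(void* out) {
        const char* cur = p;       // read the member exactly once
        std::memcpy(out, cur, 1);  // this write cannot change the local 'cur'
        p = cur + 1;               // advance using the local copy
    }
    const char* p;
};

// Reads two consecutive bytes, as FooBar::load() does via load1()/load2(),
// and reports whether they arrive in order.
bool reads_in_order() {
    const char data[2] = { 3, 5 };
    BufferLocal buf(data);
    char c1 = 0, c2 = 0;
    buf.read(&c1);
    buf.read(&c2);
    return c1 == 3 && c2 == 5 && buf.p == data + 2;
}
```

This form does not depend on how the compiler interprets __restrict on 'this', so it can serve as a reference point when comparing the code generated by different gcc versions.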