The following code is derived from an efficient string hashing function and seems to indicate a GCC compiler bug. The code has been compiled with g++ 2.95.3, 3.3.4 and 3.4.2 and behaves the same way in each case, but it behaves differently (and the way we would expect) when compiled with the Sun Forte C++ compiler.

#include <iostream>

int main(int argc, char** argv)
{
    // Round address up to integral multiple
    const char* pc = argv[1];
    pc += 3;
    pc = reinterpret_cast<const char*>(reinterpret_cast<unsigned>(pc) & ~3);

    // Treat as pointer to unsigned, display address, and value
    const unsigned int* pui = reinterpret_cast<const unsigned int*>(pc);
    std::cout << " pui=" << (void*)pui << " *pui=" << std::hex << *pui << std::endl;

    // Treat as pointer to unsigned char, display address
    const unsigned char* puc = reinterpret_cast<const unsigned char*>(pc);
    std::cout << "puc (before)=" << (void*)puc << std::endl;

    // Obtain value by shift and OR
    unsigned int uint = *puc++ << 24 | *puc++ << 16 | *puc++ << 8 | *puc++;

    // Display address and value
    std::cout << "puc (after)=" << (void*)puc << " uint=" << uint << std::endl;

    return 0;
}

Output when compiled with Forte C++:

./testf abcdef
 pui=ffbeea14 *pui=61626364
puc (before)=ffbeea14
puc (after)=ffbeea18 uint=61626364

Output when compiled with GCC:

./testg abcdef
 pui=0xffbeea14 *pui=61626364
puc (before)=0xffbeea14
puc (after)=0xffbeea15 uint=61616161

The pointer value, puc, is only incremented once (instead of 4 times). We are trying to work around this problem, but other implementations produce slower code and speed is important here.
g++ -v
Reading specs from /ldatae/gnu/gcc-3.3.4/bin/../lib/gcc-lib/sparc-sun-solaris2.8/3.3.4/specs
Configured with: ../gcc-3.3.4/configure --prefix=/volws/pmd25/ldatae/gnu/gcc-3.3.4 --enable-shared --enable-threads --enable-cpp --enable-languages=c++ --with-gnu-as --with-as=/volws/pmd25/ldatae/gnu/gcc-3.3.4/bin/as --with-gnu-ld --with-ld=/volws/pmd25/ldatae/gnu/gcc-3.3.4/bin/ld --host=sparc-sun-solaris2.8
Thread model: posix
gcc version 3.3.4
I also tried this code on Red Hat Linux with GCC 3.2.2:

./a.out abcdef
 pui=0xbffffa28 *pui=66656463
puc (before)=0xbffffa28
puc (after)=0xbffffa2c uint=63636363

Here the address is incremented 4 times, but the result still consists only of the first byte (taking into account the endian differences) repeated 4 times.

g++ -v
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/3.2.2/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --host=i386-redhat-linux
Thread model: posix
gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5)
*puc++ << 24 | *puc++ << 16 | *puc++ << 8 | *puc++

That statement is undefined because puc++ can happen in any order.
Reopen ...
... to mark as (yet another) duplicate of PR 11751.

*** This bug has been marked as a duplicate of 11751 ***
(In reply to comment #2)
> *puc++ << 24 | *puc++ << 16 | *puc++ << 8 | *puc++
>
> That statement is undefined because puc++ can happen in any order.

OK, I guess the Standard allows for this, but there are still 2 points to be made:

1) Why is the value of puc only incremented once instead of 4 times? That still seems like a bug.

2) This works as I had expected:

--puc;
unsigned int uint = *++puc << 24 | *++puc << 16 | *++puc << 8 | *++puc;

output:

./testg2 abcdefgh
 pui=0xffbeea18 *pui=64656667
puc (before)=0xffbeea18
puc (after)=0xffbeea1b uint=64656667

which seems inconsistent since the preincrement can also happen in any order.
As for question 1) The standard says that the result is undefined, i.e. the compiler can do whatever it pleases. That includes doing one of the increments, four in the same order in which they appear left to right, four in the reverse order, etc. It isn't a bug, but an opportunity the standard leaves to the compiler writers to implement optimizations.

As for 2) the result is still undefined, and the fact that you get what you expect can be attributed to luck.

W.
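To illustrate the point about undefined behavior, here is a minimal sketch of one way to get a defined result: give each increment its own statement, so that puc is modified only once between sequence points. The helper name read_be32_stepwise is hypothetical and does not appear in the original report, and the assert is only an illustrative precondition check.

```cpp
#include <cassert>

// Hypothetical helper (not from the original report): builds a 32-bit value
// from four consecutive bytes, most significant byte first.
// Each statement modifies p exactly once, and the end of every statement is
// a sequence point, so both the order of the reads and the final pointer
// value are well defined.
unsigned int read_be32_stepwise(const unsigned char*& p)
{
    assert(p != nullptr);  // illustrative precondition, an assumption of this sketch

    unsigned int value = static_cast<unsigned int>(*p++) << 24;
    value |= static_cast<unsigned int>(*p++) << 16;
    value |= static_cast<unsigned int>(*p++) << 8;
    value |= static_cast<unsigned int>(*p++);
    return value;  // p has been advanced by exactly 4
}
```

With this form the pointer is always advanced by 4 and the bytes are always combined in the order written, regardless of compiler or optimization level.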
(In reply to comment #6)
> As for question 1) The standard says that the result is undefined, i.e.
> the compiler can do whatever it pleases. That includes doing one of
> the increments, four in the same order in which they appear left to
> right, four in the reverse order, etc. It isn't a bug, but an opportunity
> the standard leaves to the compiler writers to implement optimizations.

Regardless of the order that the increment is done in this statement, I would expect it to be done 4 times (not only once) by the time the statement ends.

> As for 2) the result is still undefined, and the fact that you get what
> you expect can be attributed to luck.

I can accept this, I guess, if nothing in the Standard says the undefined behavior must be consistent.
(In reply to comment #7)
> (In reply to comment #6)
> > As for question 1) The standard says that the result is undefined, i.e.
> > the compiler can do whatever it pleases. That includes doing one of
> > the increments, four in the same order in which they appear left to
> > right, four in the reverse order, etc. It isn't a bug, but an opportunity
> > the standard leaves to the compiler writers to implement optimizations.
>
> Regardless of the order that the increment is done in this statement, I would
> expect it to be done 4 times (not only once) by the time the statement ends.

To be very clear here I should have said "incremented by 4, not just by 1". Sorry.
Your expectations are at odds with what the standard allows compilers to do. You will have to modify your code to get what you want.

W.
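One possible modification, offered only as a sketch, is to index the bytes instead of incrementing the pointer inside the expression, then advance the pointer once afterwards. The expression then has no side effects, so evaluation order no longer matters. The helper name load_be32 is hypothetical, and whether this is as fast as the original on a given compiler and target is an assumption that would need to be measured.

```cpp
#include <cassert>

// Hypothetical helper (not from the original report): combines four bytes,
// most significant first, without modifying the pointer. Since nothing in
// the expression has a side effect, the result does not depend on the order
// in which the operands are evaluated.
inline unsigned int load_be32(const unsigned char* p)
{
    assert(p != nullptr);  // illustrative precondition, an assumption of this sketch

    return static_cast<unsigned int>(p[0]) << 24
         | static_cast<unsigned int>(p[1]) << 16
         | static_cast<unsigned int>(p[2]) << 8
         | static_cast<unsigned int>(p[3]);
}
```

In the test program from the report, the problematic line would then become two statements: unsigned int uint = load_be32(puc); followed by puc += 4;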