Bug 37631 - non-volatile asm passes volatile asm (-O3)
Summary: non-volatile asm passes volatile asm (-O3)
Status: RESOLVED DUPLICATE of bug 17884
Alias: None
Product: gcc
Classification: Unclassified
Component: c++
Version: unknown
Importance: P3 minor
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-09-23 22:23 UTC by pardo
Modified: 2008-09-29 17:48 UTC
CC List: 7 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description pardo 2008-09-23 22:23:36 UTC
I'm not 100% sure this is a bug; it seems like one.

A non-volatile asm is moved past a volatile asm.  Intuitively, it seems a volatile asm should be a pretty "heavy" barrier.

As a workaround, declaring both asms volatile does limit code motion, but it also constrains the optimizer more than is needed just to keep the two asms in their relative order.
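
The workaround can be sketched in a target-independent form. This is a minimal sketch, not the reporter's code: Mark(), Adjust() and Measure() are hypothetical stand-ins for Now(), XX() and one iteration of RunTest(), a counter replaces rdtsc, and the asm bodies are empty so that it builds with any GCC-compatible compiler rather than only on x86.

```cpp
// Hypothetical stand-in for Now(): a counter replaces rdtsc so the sketch is
// target-independent. The empty volatile asm plays the role of the rdtsc asm.
__attribute__((__noinline__)) unsigned Mark() {
  static unsigned ticks = 0;
  unsigned t = ++ticks;
  asm volatile("" : "+r"(t) : : "memory");
  return t;
}

// Hypothetical stand-in for XX(). In the report this asm is not volatile;
// the workaround is the added "volatile" keyword.
static inline int *Adjust(int *p) {
  asm volatile("" : "+r"(p));
  return p;
}

// Mirrors one iteration of the loop in RunTest(): marker, access, marker.
int Measure(int *p, unsigned *start, unsigned *stop) {
  *start = Mark();
  int val = *Adjust(p);
  *stop = Mark();
  return val;
}
```

The cost is the one noted above: a volatile asm can no longer be removed or combined by the optimizer even where that would have been safe.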

Happens with this compiler:

$ g++ -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,java,f95,objc,ada,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.0 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-java-awt=gtk-default --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-4.0-1.4.2.0/jre --enable-mpfr --disable-werror --with-tune=pentium4 --enable-checking=release i486-linux-gnu
Thread model: posix
gcc version 4.0.3 (Ubuntu 4.0.3-1ubuntu5)

also happens with (replacing "add" with "addq" may be needed):

$ g++ -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: /build/configure --prefix=/usr/local/build --target=x86_64-unknown-linux-gnu --disable-nls --enable-threads=posix --enable-symvers=gnu --enable-__cxa_atexit --enable-c99 --enable-long-long --build=i686-host_pc-linux-gnu --host=i686-host_pc-linux-gnu --disable-multilib --enable-shared=libgcc,libmudflap,libssp,libstdc++ --enable-languages=c,c++,fortran --with-sysroot=/usr/grte/v1 --with-root-prefix=/usr/grte/v1 --with-native-system-header-dir=/include --with-local-prefix=/
Thread model: posix
gcc version 4.2.2

Build using Makefile:

CXX = g++
CXXFLAGS = -Wall -O3

all:	bug.dis bug.E

bug:	bug.cc Makefile
	$(CXX) $(CXXFLAGS) -o bug bug.cc

bug.E:	bug.cc Makefile
	$(CXX) $(CXXFLAGS) -E -o bug.E bug.cc

bug.dis:	bug
	objdump --disassemble bug > bug.dis

Input program (this is not the CPP output, but the input program does not use CPP and this has a potentially-useful comment):

//
// The following program compiled with g++ -O3 produces the code shown below.
// Note the add of %fs appears in the source between calls to Now() but appears
// in the object code before the first call to Now().  The asm with %fs is
// not itself volatile so it may be moved with respect to other code (the
// original code is more complicated), but it seems surprising it passes the
// volatile asm in Now().
//
// 8048397:	64 03 05 00 00 00 00 	add    %fs:0x0,%eax
// 804839e:	89 45 f0             	mov    %eax,-0x10(%ebp)
// 80483a1:	e8 aa ff ff ff       	call   8048350 <_Z3Nowv>
// 80483a6:	89 c3                	mov    %eax,%ebx
// 80483a8:	89 d6                	mov    %edx,%esi
// 80483aa:	90                   	nop
// 80483ab:	8b 45 f0             	mov    -0x10(%ebp),%eax
// 80483ae:	8b 38                	mov    (%eax),%edi
// 80483b0:	90                   	nop
// 80483b1:	e8 9a ff ff ff       	call   8048350 <_Z3Nowv>
//

static inline int *XX() {
  long long int offset = 64;
  int *val;
  asm /*not volatile*/ ("add %%fs:0, %0" : "=r"(val) : "0"(offset));
  return val;
}

const int kCallsPerTrial = 30;

typedef long long Tsc;

__attribute__((__noinline__)) Tsc Now() {
  unsigned int eax_lo, edx_hi;
  asm volatile("rdtsc" : "=a" (eax_lo), "=d" (edx_hi));
  Tsc now = ((Tsc)eax_lo) | ((Tsc)(edx_hi) << 32);
  return now;
}

int g_sink;

bool RunTest(Tsc *tsc, int n) {
  int val;
  for (int i = 0; i < n; ++i) {
    Tsc start = Now();
    asm volatile("nop" ::: "memory");
    val = *XX();
    asm volatile("nop" ::: "memory");
    Tsc stop = Now();
    g_sink = val;
    *tsc++ = start;
    *tsc++ = stop;
  }
  return true;
}

int main(int argc, char **argv) {
    Tsc tsc[2 * kCallsPerTrial];
    RunTest(tsc, kCallsPerTrial);
}
Comment 1 Andrew Pinski 2008-09-23 23:50:51 UTC
A volatile asm is not a full barrier.
Please read http://gcc.gnu.org/onlinedocs/gcc-4.3.2/gcc/Extended-Asm.html.

"Note that even a volatile asm instruction can be moved relative to other code, including across jump instructions. For example, on many targets there is a system register which can be set to control the rounding mode of floating point operations."

This came about with the fix for PR 17884.

*** This bug has been marked as a duplicate of 17884 ***
Comment 2 pardo 2008-09-29 17:48:13 UTC
How can I prevent relative motion?  I tried adding a "memory" clobber to all asms, but they are still moved past each other.  I expected any shared constraint or clobber would keep them from crossing.

(Adding "volatile" to all asms does prevent relative motion but inhibits other optimizations so is undesirable.)
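
One approach that the GCC Extended Asm documentation describes for this situation is an artificial dependency: each asm reads and writes a shared variable, so the compiler has to keep the statements in data-flow order without every asm being volatile. The following is a minimal sketch under that assumption, not a tested fix for this report. Token, MarkOrdered(), AdjustOrdered() and MeasureOrdered() are hypothetical names, a counter replaces rdtsc, and the asm bodies are empty so that it builds on any GCC-compatible target.

```cpp
// Hypothetical ordering token; its value is never meaningful, it only
// creates a data dependency between the asm statements.
typedef unsigned long Token;

// Hypothetical stand-in for Now(). The volatile asm consumes and redefines
// the token, and the "memory" clobber keeps loads and stores from crossing.
__attribute__((__noinline__)) unsigned MarkOrdered(Token *token) {
  static unsigned ticks = 0;
  unsigned t = ++ticks;
  asm volatile("" : "+r"(t), "+r"(*token) : : "memory");
  return t;
}

// Hypothetical stand-in for XX(). This asm is NOT volatile; it is ordered
// only because it needs the token produced by the previous MarkOrdered()
// and produces the token needed by the next one.
static inline int *AdjustOrdered(int *p, Token *token) {
  asm("" : "+r"(p), "+r"(*token));
  return p;
}

// Mirrors one iteration of the loop in RunTest(), threading the token
// through all three asm statements.
int MeasureOrdered(int *p, unsigned *start, unsigned *stop) {
  Token token = 0;
  *start = MarkOrdered(&token);
  int val = *AdjustOrdered(p, &token);
  *stop = MarkOrdered(&token);
  return val;
}
```

Because the middle asm stays non-volatile, the optimizer is still free to treat it as an ordinary computation in other respects. The dependency only constrains the statements that share the token.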