Why does "volatile" improve Pentium code generation?

Steven Snyder ssnyder@indy.net
Thu Sep 17 11:14:00 GMT 1998


(I see this code generation under pgcc v1.1a, but am not sure if this is
an issue for pgcc specifically or the underlying egcs compiler.  Please
excuse me if this topic is not pertinent to this mailing list.)

This is an example of how a "volatile" pointer cast strangely causes the
generation of much better Pentium code.  The full files are attached, but
here is a synopsis:

  castbug1:
        movl myptr,%eax
        movw $-232,336(%eax)
        movl myptr,%edx
        movw $-7401,336(%edx)

  castbug2:
        movl myptr,%eax
        movl $-232,%edx
        movw %dx,336(%eax)
        movl $-7401,%edx
        movw %dx,336(%eax)

Note that castbug1 has 4 memory accesses, including 2 of the dreaded (for
performance reasons) constant-to-memory move.  In contrast, castbug2 has
only 3 memory accesses, all of which involve registers, not constant
values.  Also, castbug1 has additional AGI stalls due to the pointer being
loaded into the register and being dereferenced in the very next
instruction.

This code generation can be seen at any optimization level.  For clarity,
though, I stripped off the stack frame creation/destruction with -O6:

        gcc -c -S -mcpu=i586 -march=i586 -O6 -Wall castbug.c

I found this difference in code generation while investigating why the
pointer code generated for my program was so bad.  I had declared the
pointer to be "const" specifically to avoid having it be reloaded (this
ptr is used *often*), yet it was being reloaded on every use.  It turned
out that I had cast the pointer before using it but had neglected to carry
forward the "volatile" part of the original pointer declaration.  The
compiler silently accepted the cast that dropped the "volatile"
qualifier, resulting in the code generation shown in castbug1.

(My reason for the "volatile" designation is that the pointer refers to
memory-mapped I/O registers on a video card.  The values in these
registers change dynamically, so I don't want the compiler to optimize
away any references to them.)
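
The attached files are not reproduced here.  As a minimal sketch of the
kind of source that could show this pattern, assume a register block with
a 16-bit field at byte offset 336.  Only the offset and the two stored
values (0xFF18 is -232 and 0xE317 is -7401 as 16-bit quantities) come from
the listing above; "struct regs", "backing" and the field name "reg" are
made up for illustration, and the real castbug.c may differ.  The two
functions differ only in whether the cast keeps "volatile":

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* uint8_t, uint16_t */

/* Hypothetical register block: only the 16-bit field at byte offset 336
   is taken from the assembly listing; everything else is an assumption. */
struct regs {
    uint8_t  pad[336];
    uint16_t reg;
};

static_assert(offsetof(struct regs, reg) == 336,
              "reg must sit at byte offset 336");

/* Ordinary memory standing in for the memory-mapped video registers. */
struct regs backing;

/* const pointer to volatile data, as described in the post. */
volatile struct regs *const myptr = &backing;

/* The cast omits "volatile", so the qualifier is lost on these stores. */
void castbug1(void)
{
    ((struct regs *)myptr)->reg = 0xFF18;
    ((struct regs *)myptr)->reg = 0xE317;
}

/* The cast carries "volatile" forward, so both stores are volatile. */
void castbug2(void)
{
    ((volatile struct regs *)myptr)->reg = 0xFF18;
    ((volatile struct regs *)myptr)->reg = 0xE317;
}
```

Note that a compiler is free to drop the first store in castbug1 as
dead, since nothing marks it as volatile there, whereas both stores in
castbug2 must be emitted.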

What is going on with this optimization?

Thank you.

