This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.



Re: gcc 3.3 / i386 / -O2 question


Luca Benini wrote:

If the code is syntactic garbage ===> the compiler doesn't compile it
if the code is semantic garbage ===> the same compiler must generate the same output (in .s or in .o) ==> you must obtain the same result.


or not?

not!


Languages like C define an entire class of programs which are syntactically
correct, but whose semantics are undefined. Undefined means undefined. Your
assumption that such undefined behavior is deterministic, and should be
independent of optimization level is just wrong.

Let me give a simple example. It's the case we find most commonly
when people report a "bug" in the compiler after observing different
behavior when optimization is turned on.

If you have an uninitialized variable, and you access this uninitialized
value, the value that you get is undefined.

Now with optimization off, you are often lucky: the value is stored in
memory, which happens to contain zero or some other "ok" value.

When you optimize, the compiler may choose to place the value in a register,
and suddenly you get a garbage value that was whatever was previously in
that register (which might well be a non-deterministic value, e.g. if
you just read the time of day into that register).
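
To make the uninitialized-variable case concrete, here is a minimal
sketch in C. The function names (sum_buggy, sum_values) are
hypothetical, invented for illustration only. The buggy version is
kept under #if 0 so it is never compiled or called; the second
version shows the well-defined form, which gives the same answer at
any optimization level.

```c
#include <stddef.h>

#if 0
/* Buggy pattern (disabled): 'total' is read before it is ever written,
   so the result is undefined and may differ between -O0 and -O2. */
int sum_buggy(const int *v, size_t n) {
    int total;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}
#endif

/* Well-defined version: the accumulator is initialized before it is read. */
int sum_values(const int *v, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

GCC's -Wuninitialized warning (enabled by -Wall) often flags the
buggy pattern, though it is not guaranteed to catch every case.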

Actually things can get a whole lot worse. Compilers are entitled,
as David noted, to assume a program is correct. Consider:

int num_password_attempts; /* oops, forgot to initialize to zero */

/* ... read password ... */

        if (password == magic_value) {
           /* ... delete system disk ... */
        }
        else {
           num_password_attempts++;
           /* ... try again ... */
        }

A compiler can assume that password == magic_value without doing the
test. Why? Because the value of the variable is undefined, which means
that if it is incremented, it may overflow and anything might happen,
including doing whatever you do when the condition is true. So why not
just assume it is true?

Of course that's a pathology, but the general principle is that the
optimizer can assume your program is correct, and if it is not, all
sorts of horrible things can happen.
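
A smaller, less dramatic illustration of the same principle is the
signed overflow check. The sketch below is only an illustration, and
all three function names are hypothetical. The naive test relies on
overflow wrapping around, which C leaves undefined for signed
integers, so an optimizer may assume it never happens and reduce the
function to "return 0". The other two functions stay within defined
behavior.

```c
#include <limits.h>

/* Relies on signed overflow, which is undefined in C. An optimizer may
   assume x + 1 never overflows and fold this to "return 0". */
int next_wraps_naive(int x) {
    return x + 1 < x;
}

/* Well-defined alternative: compare against INT_MAX before adding. */
int next_wraps_checked(int x) {
    return x == INT_MAX;
}

/* Saturating increment for an attempt counter: it can never overflow. */
int bump_attempts(int attempts) {
    return attempts < INT_MAX ? attempts + 1 : INT_MAX;
}
```

For inputs that do not overflow, both checks agree. They can only
differ at INT_MAX, which is exactly where the naive version stops
being a defined program.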

To enforce the kind of restriction you have in mind (deterministic
behavior + no change in behavior on optimization) would significantly
degrade the efficiency that can be achieved in generating good code.

Note that this is not a potshot at C. Even Ada, designed to be a safe
language, has such cases, since Ada compilers are expected to be able
to generate highly efficient code. It is true that Ada is better than
C in this respect (e.g. an overflow causes well defined deterministic
behavior in raising an exception). But to go all the way has a real
cost. Java does go all the way here to defined deterministic semantics,
and indeed a significant price is paid for this decision.

Note that it would be possible to have a C or Ada compiler that did
make the decision to provide totally deterministic, well-defined
results at the expense of efficiency. You can get some of these
effects now. For instance, in GNAT, if you use the pragma
Initialize_Scalars, you can eliminate all non-determinism that
comes from uninitialized variables.

There is of course a trade-off in compiler implementation between
following the standard on the one hand, and minimizing surprises
on the other, and sometimes we do make the decision to degrade
the code to avoid surprises.

For example, IBM compilers for the Power architecture decide that
when you are optimizing single-precision floating-point code, they
should not keep values in registers and silently provide extra
precision, even though languages like Fortran allow enough
freedom in floating-point semantics to permit this. It is simply too
worrisome for people to get different results when they
optimize. Of course if this happens, and the differences are
significant, it means that the algorithms are suspect, but
people prefer repeatable results in this case, even if they
are wrong :-)
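
To show how extra precision can change an answer, here is a small
sketch in C rather than Fortran. The function names are hypothetical,
and the explicit conversion to double merely stands in for a wider
register; it is not a description of what any particular compiler
does. The inputs are chosen so that the exact sum, 2^24 + 1, does not
fit in single precision.

```c
/* Sum rounded to single precision. The volatile store forces the result
   out of any wider register, so it is rounded to float. */
float add_single(float a, float b) {
    volatile float r = a + b;
    return r;
}

/* The same sum carried out in double precision, standing in for a
   register that holds more bits than a float. */
double add_wide(float a, float b) {
    return (double)a + (double)b;
}
```

With a = 16777216.0f (2^24) and b = 1.0f, add_single returns
16777216, because the exact sum is not representable in a float and
rounds back down, while add_wide returns 16777217.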

