This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Ada front-end depends on signed overflow
- From: Paul Schlie <schlie at comcast dot net>
- To: Robert Dewar <dewar at adacore dot com>
- Cc: Andrew Pinski <pinskia at physics dot uc dot edu>, Florian Weimer <fw at deneb dot enyo dot de>, GCC List <gcc at gcc dot gnu dot org>, <bosch at gnat dot com>
- Date: Mon, 06 Jun 2005 21:53:27 -0400
- Subject: Re: Ada front-end depends on signed overflow
> From: Robert Dewar <dewar@adacore.com>
> Paul Schlie wrote:
>
>> Similar arguments have been given in support of an undefined order of
>> evaluation; which is absurd, as the specification of a semantic order
>> of evaluation only constrains the evaluation of expressions which would
>> otherwise be ambiguous, as expressions which are insensitive to their
>> order of evaluation may always be evaluated in any order regardless of
>> a specified semantic order of evaluation and yield the same result; so
>> in effect, defining an order of evaluation only disambiguates expression
>> evaluation, and does not constrain the optimization of otherwise
>> unambiguous expressions.
>
> I think perhaps you have not really looked at the issues that are raised
> in optimizing compilers. Let me address this misconception first; similar
> considerations apply to the overflow case.
>
> The point is that at compile time you cannot tell what expressions are
> "ambiguous". I use quotes here since of course there is no ambiguity
> involved:
>
> suppose you write (a * f(b)) * (c * g(d))
>
> where f(b) modifies c and g(d) modifies a. You would call this ambiguous
> but in fact the semantics is undefined, so it is not ambiguous at all,
> it has quite unambiguous semantics, namely undefined. The notion of
> non-deterministic behavior is of course quite familiar, and what
> undefined means here is that the behavior is non-deterministic from
> among the set of all possible behaviors.
- Agreed, I would classify any expression as being ambiguous if any of
its operand values (or side effects) were sensitive to the allowable
order of evaluation of its remaining operands, but not otherwise.
> Now you seem to suggest that the optimizer should simply avoid
> "optimizing" in such cases (where it matters).
- No, I simply assert that if an expression is unambiguous (assuming
my definition above for the sake of discussion), then the compiler
may choose to order the evaluation in any way it desires as long as
it does not introduce such an ambiguity by doing so.
> But the whole point
> is that of course at compile time in general you can't tell whether
> f(b) and g(d) have (horrible) side effects of modifying global
> variables a and c. If we don't allow undefined behavior, the optimizer
> would have to assume the worst, and refrain from optimizing the above
> expression just in case some idiot had written side effects. The decision
> of the standards committee is to make the side effect case undefined
> PRECISELY so that the optimizer can go ahead and choose the optimal
> order of evaluation without worrying about the side effect case.
- I fully agree that if a compiler does not maintain records of the
program state which a function may alter or be dependent on, as
would be required to determine whether any resulting operand/side-effect
interdependencies may exist upon its subsequent use as an operand
within an expression; then the compiler would have no choice
but to maintain its relative order of evaluation as hypothetically
specified, as it may otherwise introduce an ambiguity.
Although I believe I appreciate the relative complexity this introduces
to both the compiler, as well as the requirements imposed on "pre-compiled"
libraries, etc., I don't believe that it justifies a language definition
legitimizing the specification of otherwise non-deterministic programs.
> Going back to the overflow case, consider:
>
> a = (some expression the compiler can tell is positive)
> b = (some expression the compiler can tell is positive)
> c = a + b;
> d = c / 8;
>
> Here the compiler can do a right shift for the last statement. This
> would of course be wrong in the general case (assuming c and d are int),
> since the right shift does not work right if c is negative. But the
> compiler is allowed to assume that a+b does not overflow and hence
> the result is positive, and hence the shift is fine.
>
> So you see that guaranteeing two's complement wrap can hurt the quality
> of the code in a case like this.
- As you've specified the operations as distinct statements, I would argue
that such an optimization would only be legitimate if it were known to
produce the same result as if the statements were evaluated in
sequence as specified by the standard (which of course would be target
specific). Correspondingly I would assert that:
d = (a + b) / 8;
would be ambiguous if the compiler were able to restructure evaluation
of the expression in any way which may alter its effective result
for a given target, as a program which has non-deterministic behavior
doesn't seem very useful, regardless of whether or not it's "allowed"
by a standard. (Although I concede that some optimizations have more
benign worst-case effects than others, and may be reasonable if explicitly
enabled (aka unsafe fp math); as described above, this one seems
more likely dangerous than useful, as it may catastrophically diverge from
the otherwise computed result; and although that may be legitimate, it's not
clear how it could be more useful than error prone; unless, for
example, it were known that the largest precision of its true operands
were less than the precision specified for the operation, as it should
then yield equivalent results, but not otherwise.)
> Now it is legitimate to argue about how much quality is hurt, and
> whether the resulting non-determinism is worth the efficiency hit.
- Or rather, is non-determinism ever legitimately acceptable? (as I can't
honestly think of a single instance where it would be, except if it may
only result in the loss of a measurably few bits of fp precision, which
are imprecise representations to begin with; but not otherwise?)
> But to conduct this argument, you have to start off with an understanding
> that there really is a trade-off and it is not clear what the decision
> should be (C makes one decision, Java another, and Ada a third). The
> three languages also differ in the handling of a+b
- Agreed.
> C says that this is undefined if there are side effects that cause what you
> call ambiguity.
>
> Java specifies left to right evaluation (evaluate a before b always).
- I believe they may be evaluated in any order if it can be determined that
the resulting semantic effect will be logically equivalent.
> Ada specifies that the result is non-deterministic, you either evaluate
> a before b or b before a (not some mixture of the two), and there may
> be (at most) two possible results.
- Which is better, as at least the behavior would be consistent across the
whole program (assuming no potentially corrupting optimizations are
inconsistently applied :)
But overall I do agree with your earlier statement, that each language has
made a choice, for better or worse.