Optimisations and undefined behaviour

David Brown david.brown@hesbynett.no
Mon Nov 9 15:57:00 GMT 2015


On 09/11/15 16:22, Andrew Haley wrote:
> On 11/09/2015 03:05 PM, Richard Earnshaw wrote:
>> On 09/11/15 15:00, Andrew Haley wrote:
>>> On 11/09/2015 02:56 PM, Richard Earnshaw wrote:
>>>> On 09/11/15 14:29, Andrew Haley wrote:

>>> And besides, the UB might cause the computer to crash before the
>>> data has been written to stdout by the kernel; the same reasoning
>>> applies.
>>
>> UB that causes the machine to crash is, I think, outside of what we need
>> to think about.  Any machine that falls over in that case is not
>> interesting.
> 
> Well, in that case you'll have to define what you think we are
> supposed to be thinking about.  I don't think you are talking about
> the C language as it is specified, you're talking about some
> implementation on some machine, and of course I can't argue with you
> about what that machine does because I don't know anything about it.
> Such a discussion is IMO pointless because everyone can just Make
> Stuff Up.
> 
> I do know that once UB has happened in a program, the whole program is
> undefined and the language specification imposes no requirements on it
> at all.  I don't see any language in the specification which requires
> prior operations with side effects to have completed.
> 
> But should we treat all potential UB as having a side effect,
> (essentially treating it like a volatile) so that it cannot be moved
> past a sequence point?  I wouldn't have thought so, but perhaps we
> could.
> 

This is getting to the key issues, as far as I can see.  I don't know
how easy or hard it is to put specifications or restrictions on the
effects of different types of UB in different circumstances.  But I do
know this - the primary purpose of gcc is to help developers write
clear, reliable, bug-free, and efficient software.  Adhering to and
implementing the C (and C++, Ada, Fortran, etc.) language specifications
is a necessary part of that - but it is not sufficient.  gcc has always
gone well beyond the minimum requirements for the language
implementation, with extensions, warnings, and optimisations all
designed to give developers the best tool possible.

In this light, it is important that gcc does what it can to help
developers avoid UB, as well as minimising the damage if UB occurs -
while also generating efficient code on the assumption that UB does not
occur.  There are trade-offs here - there will be times when the
compiler could use knowledge of undefined behaviour to generate better
code, but doing so would make it harder for the developer to identify
the problem and correct it.  gcc should, IMHO, give a stronger weighting
to helping programmers fix bugs than to helping them generate faster
code - fast but incorrect code is of no use to anyone.

So if a developer writes this:

#include <stdio.h>

int foo(int x) {
	if (x == 0) printf("Dividing by zero!\n");
	return 10000 / x;	/* undefined behaviour when x == 0 */
}

Then the compiler could remove the conditional and the debugging message
- that is what the language standards allow, and it gives the smallest
and fastest code.  But it is not helpful to the developer - it would
make gcc a poorer tool.  Compile-time errors or warnings are best here,
where possible - they give the developer the fastest feedback.  But the
run-time message should stand, despite the UB - it is what gives the
developer the best chance of finding and fixing the bug.
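To make the permitted transformation concrete, here is roughly what a
compiler may legally emit for foo above (a sketch of the reasoning, not
necessarily what any particular gcc version actually produces):

int foo(int x) {
	/* 10000 / x is evaluated on every path, so the compiler may
	   assume x != 0; under that assumption the (x == 0) branch is
	   dead code and can be deleted - diagnostic message and all. */
	return 10000 / x;
}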

I started this thread because these issues are of real concern to
developers, at least in my field (small-systems embedded programming).
Reliability is critical in many such systems, and the opportunities for
debugging or error handling can be limited.  We typically cannot use
"sanatize" options, nor can we accept that a bug in one part of the
program causes undue and unnecessarily damaging side-effects in other parts.
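For code like ours, the practical defence today is to write the check so
that the undefined division is never reached when x is zero - then no
conforming optimisation can remove the diagnostic.  A minimal sketch
(foo_checked is just an illustrative name, and returning 0 on error is
an arbitrary choice):

#include <stdio.h>

int foo_checked(int x) {
	if (x == 0) {
		printf("Dividing by zero!\n");
		return 0;	/* skip the division entirely, so the
				   check cannot be optimised away */
	}
	return 10000 / x;
}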


David
(I'm merely a "customer" of gcc, not a developer - but the customer is
always right :-) )


