This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] C undefined behavior fix


On Sat, Jan 05, 2002 at 11:45:42AM +1100, Paul Mackerras wrote:
> I could understand it if you were saying that the *value* of "foo"[5]
> is undefined, that is, that you could get any value of type char at
> all, with no guarantees from the compiler about what value you will
> get.

Yes.

> But you (and others) are saying that the *operation* of
> evaluating "foo"[5] is undefined and that if you write that, the
> compiler has an excuse to generate code for system("rm -rf /") or
> anything else it likes.  That sounds like a cop-out to me.

No, not precisely.  If you write "foo"[5] the generated code will
dutifully read a value from some address (or segv, depending on
where the string was placed wrt unmapped pages).

What it will not do is consider the possibility that you are 
accessing anything other than the string literal "foo".  So if
"foo"+5 happens to overlap the variable A, then it's a crap shoot
whether or not you will read from A, or from something else (or
more realistically, whether you'll read from A before or after
it was initialized).

But it all stems from one source -- what is legal when it comes
to pointer arithmetic (or in pointer conversions if you take the
integer arithmetic tack).

If you stick to pointer arithmetic, then it is crystal clear that
the standard says that "foo"+4 must be representable (one past the
end of a 4 element array), but you cannot dereference it.  And it
also says that "foo"+5 may not even be representable.  This is the
point at which the strcpy example as written distinctly contravenes
the standard.
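
To make the pointer arithmetic rule concrete, here is a minimal sketch
that stays inside what the standard guarantees: it forms the
one-past-the-end pointer of the array, uses it only for comparison, and
never dereferences it.  The names bounded_length and foo_length are
made up for illustration and are not from the patch under discussion.

```c
#include <stddef.h>

/* Count the characters before the terminating NUL, never stepping
   outside [s, s + n], where n is the array size including the NUL. */
size_t bounded_length(const char *s, size_t n)
{
    const char *end = s + n;   /* one past the end: may be formed and compared */
    const char *p = s;

    while (p != end && *p != '\0')   /* end itself is never dereferenced */
        p++;
    return (size_t)(p - s);
}

/* The array holds 4 elements: 'f', 'o', 'o', '\0'. */
size_t foo_length(void)
{
    static const char foo[] = "foo";
    return bounded_length(foo, sizeof foo);
}
```

Anything that forms a pointer further out than `end` has left the range
the standard promises is representable.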

The major disagreement here is whether using integer arithmetic
as an escape hatch allows you to circumvent the rules for pointer
arithmetic.  Because if you allow it in one case, you have to
allow it in all cases, and suddenly there are significant
optimizations (as opposed to this minor strcpy thing) that start
running afoul of it.
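
For reference, the integer route looks roughly like the sketch below.
It assumes a C99 implementation that provides the optional uintptr_t
type in <stdint.h>; round_trip and round_trip_preserves are hypothetical
names.  The only thing shown is the unmodified round trip, which is
required to give back a pointer that compares equal to the original.
Changing the integer between the two conversions is exactly the
disputed part, so the sketch does not do it.

```c
#include <stdint.h>

/* Convert a pointer to an integer and back without altering the value. */
const char *round_trip(const char *p)
{
    uintptr_t bits = (uintptr_t)(void *)p;   /* implementation-defined integer */
    return (const char *)(void *)bits;       /* compares equal to p */
}

/* Returns 1 if in-bounds pointers survive the round trip unchanged. */
int round_trip_preserves(void)
{
    static const char foo[] = "foo";
    return round_trip(foo) == foo && round_trip(foo + 1) == foo + 1;
}
```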

> When you say "do the relocation in assembly", do you mean write a
> function in assembly to take an unrelocated address and return a
> relocated address, or do you mean something different?

Using -mrelocatable or -fpic to collect a set of addresses that
are adjusted before you enter C.  The added bonus here is that
you'd no longer need to remember to mark all referenced addresses.
They'll be handled all at once, right up front.
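
As a rough model of "adjusted all at once", the sketch below walks a
table of slots known to hold addresses and adds the load offset to each.
It is written in C only to show the idea; real startup code would do
this in assembly before any C runs, and the table layout here (an array
of word indices) is an assumption for illustration, not the format GCC
emits.  apply_fixups and demo_fixups are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Add delta to every word of image that the fixup table marks as an address. */
void apply_fixups(uintptr_t *image, const size_t *fixups, size_t count,
                  uintptr_t delta)
{
    for (size_t i = 0; i < count; i++)
        image[fixups[i]] += delta;
}

/* Returns 1 if only the marked words were adjusted. */
int demo_fixups(void)
{
    uintptr_t image[3] = { 0x1000, 42, 0x2000 };   /* word 1 is plain data */
    const size_t fixups[2] = { 0, 2 };             /* words 0 and 2 hold addresses */

    apply_fixups(image, fixups, 2, 0x100);
    return image[0] == 0x1100 && image[1] == 42 && image[2] == 0x2100;
}
```

Because the table is generated for you, no individual reference in the
C code has to be marked by hand.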


r~

