This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: PATCH: change splay_tree_key & splay_tree_value types to void*


Mark Mitchell <mark@codesourcery.com> writes:

> Another one was trailing arrays:
>    
>   struct S { int i [1]; };
>     
> It's specifically allowed by C/C++ to make a bigger array, and index
> through `i' with indices bigger than 1.

BPs already handle this case properly.  The flexible array member
"inherits" its high bound from the enclosing struct.

>   struct S { char c[3]; char d[4]; };
> 
> If you have a pointer into `c' you can't use it to index into `d' --
> but people can, and do, and sometimes that's OK.  

That depends on you mean by OK.  You could say it's OK if there is no
user-visible manifestation, but that makes it no less a bug, only a
low-priority bug.  That such a bug stands in the way of strict BPs
elevates its priority and makes it worthy of fixing.  For example,
bfd/archive.c uses sprintf to fill header fields, and in every case,
it blows one-past the field with the NUL-terminator.  Since it writes
fields in ascending-address order, and because there's an extra two
bytes of padding at the end of the header struct, the bugs have no
consequence.  IMO, these are still bugs in ar, and should be fixed,
rather than making GCC do back-flips to tolerate and therefore
inadvertently tolerate bugs elsewhere.

> Most programs don't exist a vacuum.  One of the key things is working
> with uninstrumented code, i.e., libraries that are not compiled with
> BP enabled.  So, the `char* -> int -> char*' example I gave above is
> just a short-hand for a call to an instrumented function:

I don't consider mixing unchecked code to be a key feature.  On the BP
project page, it's only a "maybe goal" to be able to mix checked and
unchecked code conveniently.  Why?  For one, it's considerably more
painful and time-consuming to implement in GCC and I'd rather invest
my time in working on 100% source-available programs and libraries.
For another, it's a much less important goal than it was ten years ago
before there were 100% source-available systems.  Ten years ago, in
most cases you needed a vendor-supplied binary-only libc, so mixing
unchecked code was required.

> Actually not -- pointer descriptors were stored in a hash table, and
> we only looked them up when we lost the bounds.  It took something
> like 50 cycles to look them up, I think.

What was the typical range for runtime overhead on real programs?

> There's nothing technically wrong with your patches -- but I think
> you're not quite approaching things correctly.  The point is that
> splay-tree.h isn't broken -- it just makes checking harder.

It might not be broken C (except for architectures where sizeof
(void*) > sizeof (long)), but it is broken for what we could call the
BP dialect of C.  If you're going to take a hard line that BP C must
be C, then you're going to significantly undermine BPs.  Yes, I know
it can all be optional, but I'm going to implement strict BP C first.

IMO, you seem too eager to undermine BPs in deference to code that is
somewhat dirty and questionably portable, but otherwise in line with
common practice.  I just can't agree with that philosophy.

> My point is that BPs should deal with this situation gracefully.  Even
> very clean code will make use of libraries and other code that are not
> necessarily as clean.

And so the libraries should then be fixed, rather than undermining
checks everywhere to accommodate the library.  In 100%
source-available systems this is possible and desirable.  Contribute
the fixes back to the community so that over time the world becomes BP
clean.  Again, I assert that this is a worthier goal than undermining
BPs in the compiler to tolerate varying shades of grey in C code.

> BPs should be lenient in this case -- or at the
> very least have a mode in which they are lenient, and my opinion is
> that this case should be the default.  

The motivation to be lenient should come in response to the extreme
difficulty of getting things to work with strict checks.  So far, I
don't have that motivation because it's been easy to get major
programs working with strict BPs.  However, I do know that pride comes
before the fall, and I might come to see things your way in time, but
I don't want to preemptively capitulate on this point.

> Note that I think `char*' could be handled differently from `struct X
> *'.  A pointer of type char * with unknown bounds probably points to a
> string; a `struct X *' with unknown bounds probably points to a single
> struct X.  (Here `probably' means statistically speaking.)

This is a tangled web I care not to weave yet, if ever.

> I really don't think what you suggest is true.  I think that even new
> code relies on old crufty code, and I think that new C/C++ code will
> always play some non-portable platform-dependent tricks.

Yes, and as those cases arise, GCC should be taught to handle them
without weakening checks elsewhere.  In cases where tricks are too
dirty to make GCC handle them, then the offending code should be made
BP clean.

> ...  On the
> other hand, I honestly think that the right way to build an
> error-checker is with a default mode that is lenient ...

I think we'll have to agree to disagree on this one, at least for now.
I might I'll come crawling back if I get beat-up to badly as my
package-testing efforts expands.  8^)

> ... users then gradually crank up the strictness to get to a less
> lenient mode as they fix the bugs found by the first mode.  (This is
> like compiler warning options: you might not start with -Wall if you
> had never compiled the code with GCC before.)

With `-fbounded-pointers' GCC emits warnings whenever it must create
bounds for an unbounded datum or it must discard bounds of a bounded
pointer.  An exception is that it doesn't warn if the unbounded datum
being converted to a bounded pointer is the constant zero, as for a
NULL pointer.  I plan to also silence such warnings when the datum is
a small integer constant (say, -10 <= i <= 10) since programs commonly
use these as equivalent-to-NULL error codes.

IMO, if one desires to gradually crank up checking, the proper way to
proceed with a pile of marginal code is to first make it compile
without warnings while passing `-Wmissing-prototypes', then with no
warnings about BP/non-BP casts while passing `-fbounded-pointers'.
Then, you're ready for dynamic checks.

>     Greg> ...  IMO, bringing all of GNU/Linux code up strict BP
>     Greg> correctness is a worthier effort than adding a pile of
>     Greg> tricky code and options to dumb BPs down for crufty legacy
>     Greg> systems.
> 
> I think you should do both.  I'm trying to emphasize how important it
> is to have the leniency.  

And I'm trying to emphasize that leniency is not so important for what
I consider to be my user base.  It seems that your experience as a
commercial debug tool builder for a commercial SW development market
where closed-source is the norm leads you to believe that leniency is
important.  However, I am designing and building BPs for my own use
and for the use of free/open source community where the landscape is
entirely different and leniency is not very important.

The target user-base is the fundamental assumption which seems to be
at the root of our disagreements.  If the GCC steering committee
disagrees with my choice of target user-base and therefore the design
choices that follow, then they should tell me so!  However, even if
that happens, it still makes sense to implement strict BPs first, and
only then go back and undermine them with leniency options.  I don't
see any reason to design or implement for leniency from the beginning.

I am not on a crusade to kill closed-source proprietary software, but
I do smile inwardly at the happy coincidence that strict BPs are
conceptually simple and elegant and work wonderfully for 100%
source-available systems, and that the unpleasant work of undermining
BPs for leniency is mostly of benefit to closed-source environments.
There is real danger that I might have so much fun playing with strict
BPs in the sunshine of the free/open source world that I entirely
forget to go back and stick boogers all over GCC so that I can labor
with BPs in the closed/proprietary source world.  8^)

At the risk of beating this dead horse to a pulp, I'd be interested in
hearing what other GCC steering committee members think regarding the
fundamental assumption of what the user-base BPs should serve first
and best.  I'm not saying that leniency is out of the question, only
that strictness gives the most bang for the least buck for the user
community that I expect is most important to us, and therefore we
shouldn't invest time and energy in leniency options during this early
phase.

Greg

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]