This is the mail archive of the mailing list for the GCC project.

Re: better -Wuninitialized (Re: Ada files now checked in)

On Sun, Oct 07, 2001 at 02:21:31PM -0400, Diego Novillo wrote:
> On Sun, 07 Oct 2001, Zack Weinberg wrote:
> > > - if its only reaching definition is the ghost def, the variable
> > >   *is* used uninitialized.
> > > 
> > > - if one of its reaching definitions is the ghost def, the
> > >   variable *may be* used uninitialized.
> > ...
> > 
> > I'm not too familiar with reaching definitions, do they take control
> > dependencies into account?
> > 
> Yes, that's what the SSA form is for:
> 1	  int a, b;
> 2	
> 3	  b = foo();
> 4	  if (b < 100)
> 5	    a = 10;
> 6	  b = b + a;

The question is what happens with

1	int a, b;
3	b = foo();
4	if (b < 100)
5	  a = 10;
6	bar();
7	if (b < 100)
8	  b = b + a;

which is the canonical case that the current code gets wrong.  (And
imagine that line 6 is actually several hundred lines of spaghetti
which do not touch A or B.)

Hmm... At line 6, the reaching set for A is {def(A, 5), def(A, 0)},
but at line 8 it ought to be just {def(A, 5)}.  Does it know that?
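
To make the question concrete, here is a minimal sketch of a classical,
path-insensitive reaching-definitions pass over the example above.  It is
not the tree SSA code; the block layout, the flag names and reaching_in()
are all hypothetical, and only definitions of A are tracked.

```c
#include <assert.h>
#include <stdbool.h>

/* Definitions of A, one bit each (hypothetical names). */
enum { GHOST_DEF = 1, DEF_LINE5 = 2 };
enum { N_BLOCKS = 5, MAX_PREDS = 2 };

struct block {
    unsigned gen;           /* definition of A made in this block, or 0 */
    int preds[MAX_PREDS];   /* predecessor block indices, -1 if unused */
};

/* Hand-built CFG for the seven-line example. */
static const struct block cfg[N_BLOCKS] = {
    { GHOST_DEF, { -1, -1 } },  /* B0: entry, ghost def; b = foo(); if (b < 100) */
    { DEF_LINE5, {  0, -1 } },  /* B1: a = 10 */
    { 0,         {  0,  1 } },  /* B2: bar(); if (b < 100) */
    { 0,         {  2, -1 } },  /* B3: b = b + a */
    { 0,         {  2,  3 } },  /* B4: exit */
};

/* Set of definitions of A that reach the entry of block `target`. */
unsigned reaching_in(int target)
{
    unsigned in[N_BLOCKS] = { 0 }, out[N_BLOCKS] = { 0 };
    bool changed = true;

    assert(target >= 0 && target < N_BLOCKS);

    /* Iterate to a fixed point: IN = union of predecessor OUTs;
       OUT = the block's own def if it has one (it kills the rest),
       otherwise IN passed through unchanged. */
    while (changed) {
        changed = false;
        for (int b = 0; b < N_BLOCKS; b++) {
            unsigned new_in = 0;
            for (int p = 0; p < MAX_PREDS; p++)
                if (cfg[b].preds[p] >= 0)
                    new_in |= out[cfg[b].preds[p]];
            unsigned new_out = cfg[b].gen ? cfg[b].gen : new_in;
            if (new_in != in[b] || new_out != out[b]) {
                in[b] = new_in;
                out[b] = new_out;
                changed = true;
            }
        }
    }
    return in[target];
}
```

Under these assumptions reaching_in(3), the block holding `b = b + a`,
still contains the ghost definition: the two arms are merged before bar()
and nothing correlates the second `b < 100` test with the first.  Getting
just {def(A, 5)} there would need some form of path or predicate
sensitivity on top of this.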

> > It would often be helpful if an uninitialized variable could be
> > automatically set to a "poison" value by the compiler.  This would
> > prevent one major cause of hard-to-find context-dependent bugs.  It
> > sounds like this can easily be implemented by emitting real code for
> > the ghost definitions; dead code elimination would then zap it in all
> > cases where there isn't a problem.  Have you considered this?
> > 
> Not really.  But it is definitely doable.  The only problem is
> what to consider a 'poison' value.

Something that will cause an immediate fault in the program if it gets
used.  More, you want a value that is *likely* to get used and
therefore to expose the bug.  (Zero, for instance, is a bad choice.)

For pointers, this is easy - pick a non-NULL value pointing into
unmapped memory.  Floats should probably get a signalling NaN.
Integers are harder, but on the theory that numbers in real life tend
to be small, use a big one.  For signed int, probably it should be
negative on the theory that the programmer may not have considered
negative numbers.  (But *not* -1.)

Booleans you are probably up a creek with.

Another consideration is that the bit pattern ought to be recognizable
as a poison value.  The garbage collector uses 0xA5A5A5A5... for this.

It's also an opportunity to make jokes with the hexadecimal constant.
Dead beef anyone?
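
As a rough illustration of the choices above, here is a sketch of what
such poison values could look like if written by hand.  Nothing here is
an existing compiler feature; POISON_BYTE and the poison_*() helpers are
hypothetical names, and the sketch assumes two's complement integers, a
32-bit IEEE 754 float, and a target where a NaN with the top mantissa
bit clear is a signalling one (true of x86 and ARM, not of every
architecture).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fill byte, borrowed from the 0xA5A5... pattern. */
#define POISON_BYTE 0xA5

/* Integer: every byte is 0xA5, so the sign bit is set and the value
   is a large negative number that is recognizable in a debugger. */
int poison_int(void)
{
    int v;
    memset(&v, POISON_BYTE, sizeof v);
    return v;
}

/* Pointer: non-NULL and filled with the same pattern.  Whether the
   address is really unmapped depends on the platform; on a 64-bit
   target with a restricted address space it usually is. */
void *poison_ptr(void)
{
    void *p;
    memset(&p, POISON_BYTE, sizeof p);
    return p;
}

/* Float: exponent all ones, quiet bit clear, non-zero mantissa.
   Some targets quiet the NaN when it is merely copied, so the trap
   may only fire on the first arithmetic use, if at all. */
float poison_float(void)
{
    const uint32_t bits = 0x7FA5A5A5u;
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Filling a float with the plain 0xA5 byte pattern would not do: under the
same assumptions it decodes to a tiny negative number rather than a NaN,
which is why the float case gets its own constant.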

> OTOH, if the compiler is already warning you that you're using the
> thing uninitialized, why would you also need this run-time trick?

Because you may incorrectly think you have inspected the code and
determined that there is no actual problem.  This is of course more
likely the more false positives there are.

> Hmm, I should've initialized p in the example.  But good point.
> This would've given you a warning for *p.  De-referencing a
> pointer is a use of the pointer and a def of every variable in
> its equivalence set.  In this case, we could empty the
> equivalence set if p is used uninitialized.

Sounds good.

> In tree SSA we call calculate_dominance_info and
> compute_dominance_frontiers directly.  Also, the code uses
> sbitmaps quite frequently.  The bitmaps are typically
> O(n_basic_blocks).  What problem are you referring to?

The bitmaps are probably sparse, and n_basic_blocks can blow up, at
which point your memory usage blows up too.  Brad Lucier has some good
examples of this problem.
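
A back-of-the-envelope sketch of the arithmetic, assuming one bitmap per
basic block with one bit per basic block (as per-block sets of blocks
would need), stored in 64-bit words.  The function names and the sparse
layout (one 32-bit block index per member) are hypothetical, chosen only
to compare orders of magnitude.

```c
#include <assert.h>
#include <stdint.h>

/* Total bytes for n_blocks dense bitmaps of n_blocks bits each. */
uint64_t dense_bytes(uint32_t n_blocks)
{
    uint64_t words_per_set = ((uint64_t)n_blocks + 63) / 64;
    return (uint64_t)n_blocks * words_per_set * sizeof(uint64_t);
}

/* Total bytes if each set instead stores only its members, as
   32-bit block indices, with avg_members entries per set. */
uint64_t sparse_bytes(uint32_t n_blocks, uint16_t avg_members)
{
    /* A set of blocks cannot have more members than there are blocks. */
    assert(avg_members <= n_blocks);
    return (uint64_t)n_blocks * avg_members * sizeof(uint32_t);
}
```

With 100000 basic blocks the dense form comes to about 1.25 GB, while a
sparse form averaging four members per set is about 1.6 MB.  The dense
cost grows quadratically with n_basic_blocks regardless of how few bits
are actually set.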
