Bug 21855 - array bounds checking elimination
array bounds checking elimination
Status: NEW
Product: gcc
Classification: Unclassified
Component: java
4.1.0
: P2 normal
: ---
Assigned To: Diego Novillo
: alias, missed-optimization
: 18178 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2005-06-01 01:54 UTC by Tom Tromey
Modified: 2012-01-10 17:21 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2006-02-20 18:33:39


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Tom Tromey 2005-06-01 01:54:35 UTC
public class q
{
  public static void main(String[] args)
  {
    for (int i = 0; i < args.length; ++i)
      System.out.println(args[i]);
  }
}

Right now you will get an unnecessary array bounds check at 'args[i]'.
BTW we implement this check with an unsigned comparison to also weed
out negative indices.

Also, an array's size is fixed at its creation.  We ought to be
able to constant propagate in code like this:

   int[] array = new int[256];
   for (int i = 0; i < array.length; ++i) { ... }

... but afaik there is no way for the front end of the value of
'array.length'
Comment 1 Andrew Pinski 2005-06-01 20:31:28 UTC
Confirmed.
Comment 2 Diego Novillo 2005-06-03 14:34:21 UTC
Aliasing is getting in the way of range propagation here.  We don't realize that
args.length does not change during the loop, which means that we don't eliminate
the redundant load and fail to see the equivalence between the ranges.

Analysis from an IRC discussion:

<dnovillo> tromey: so, this is how we get ourselves tied up in a knot.  This is
the IL inside the loop body (21855):
<dnovillo> <L0>:;
<dnovillo>   D.671_6 = args_5->length;
<dnovillo>   if (i_2 >= D.671_6) goto <L10>; else goto <L1>;
<dnovillo>   [ ... ]
<dnovillo>   i.4_13 = (unsigned int) i_12;
<dnovillo>   D.680_14 = args_5->length;
<dnovillo>   D.681_15 = (unsigned int) D.680_14;
<dnovillo>   if (i.4_13 < D.681_15) goto <L6>; else goto <L4>;
<dnovillo> <L4>:;
<dnovillo>   D.682_27 = _Jv_ThrowBadArrayIndex (i_12);
<dnovillo> the first if() controls the iteration of the loop.
<dnovillo> the second if () is the redundancy we want to remove.
<dnovillo> when we get to this point, i_12 has the range we want, namely [-INF,
D.671_6 - 1].
<dnovillo> Two things go wrong:
<dnovillo> 1- The cast to unsigned int is an euphemism for ABS_EXPR here.  VRP
doesn't see that and sets i.4_13's range to VARYING.  That's trivial to fix.
<dnovillo> 2- We load from 'args_5->length' inside the loop even though the
memory has not changed.  This is the aliasing problem.  If we realized that
args_5->length doesn't change, FRE would've removed the redundant load and the
inner if
<dnovillo> would look like: 'if (i.4_13 < D.671_6)'
<dnovillo> which we would immediately toss away
<DannyB> dnovillo: my proposal to solve #2 is to have multiple name tags per
pointer (one for each offset that is accessed off that pointer), so that we can
say that args_5->length only is modified by args_5->length
<DannyB> (or whatever else happens to alias that field)
<dnovillo> DannyB: not making args point-to call-clobbered vars would suffice here.
<dnovillo> DannyB: there are System. calls in the loop.
<DannyB> Isn't args an incoming pointer?
<dnovillo> nope.
<dnovillo> DannyB: bah. yes.
Comment 3 Diego Novillo 2005-06-04 17:00:35 UTC
*** Bug 18178 has been marked as a duplicate of this bug. ***
Comment 4 Richard Biener 2009-04-03 12:07:33 UTC
Huh.  C testcase please?  I think this may be fixed now.
Comment 5 Andrew Pinski 2009-04-03 19:18:34 UTC
There are multiple of issues here, first we have an issue that the java front-end is not telling the middle-end that args.length cannot be changed after a new has happened (an aliasing issue).  And then we had some branch issues but those might have been fixed already.
Comment 6 Tom Tromey 2009-04-08 16:37:00 UTC
How would the FE indicate that the length field is immutable?
Comment 7 Richard Biener 2009-04-08 20:33:27 UTC
You can't.
Comment 8 Andrew Haley 2009-04-23 16:15:33 UTC
2 reasons:

1.  Habit.
2.  The original test case is written in Java: no unsigned types!
Comment 9 Paolo Carlini 2009-04-23 16:17:11 UTC
Ah! Now however, I **must** know why Java doesn't have unsigned types!
Comment 10 Andrew Haley 2009-04-23 16:23:25 UTC
Officially, java doesn't have unsigned types for economy: believe it or not, Java was once intended to be a small language.  However, there are not many unused bytecodes left, and a full set of signed instructions would use them all. 
Comment 11 Paolo Carlini 2009-04-23 16:26:26 UTC
Interesting, thanks Andrew.
Comment 12 Steven Bosscher 2010-07-15 22:43:08 UTC
I would be surprised if this is not fixed now. Can someone with Java-fu check?
Comment 13 Richard Biener 2012-01-10 15:42:47 UTC
We can't optimize this because System.out.println can change args[].  Thus
the testcase is too simple to be useful (and no, we are not treating
program arguments special - I doubt that would be useful, nor would it be
easily possible).
Comment 14 Andrew Haley 2012-01-10 15:52:07 UTC
(In reply to comment #13)
> We can't optimize this because System.out.println can change args[].

That's the whole point: System.out.println cannot change args[], which is a java array, and the length of a Java array is constant.  It is not an invalid test case.
Comment 15 Richard Biener 2012-01-10 16:20:13 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > We can't optimize this because System.out.println can change args[].
> 
> That's the whole point: System.out.println cannot change args[], which is a
> java array, and the length of a Java array is constant.  It is not an invalid
> test case.

I suppose

  public static void main(String[] args)

is passing args by value (but the implementation detail uses reference
passing for efficiency?).  In this case the Java frontend should do
like the C++ frontend and tell this to the middle-end by properly
marking args as 1) DECL_BY_REFERENCE, 2) use a TYPE_RESTRICT qualified
pointer for the reference.  Then we would optimize this case.

Java frontend issue.
Comment 16 Andrew Haley 2012-01-10 16:26:51 UTC
(In reply to comment #15)
> (In reply to comment #14)
> > (In reply to comment #13)
> > > We can't optimize this because System.out.println can change args[].
> > 
> > That's the whole point: System.out.println cannot change args[], which is a
> > java array, and the length of a Java array is constant.  It is not an invalid
> > test case.
> 
> I suppose
> 
>   public static void main(String[] args)
> 
> is passing args by value (but the implementation detail uses reference
> passing for efficiency?).

args is indeed a reference to a Java array.  The length field of a Java
array is immutable.  The elements of an array are not immutable.

> In this case the Java frontend should do
> like the C++ frontend and tell this to the middle-end by properly
> marking args as 1) DECL_BY_REFERENCE, 2) use a TYPE_RESTRICT qualified
> pointer for the reference.  Then we would optimize this case.

If we could mark the length field as immutable that would fix it.  Is there any way to do that?
Comment 17 Richard Biener 2012-01-10 16:30:30 UTC
(In reply to comment #16)
> (In reply to comment #15)
> > (In reply to comment #14)
> > > (In reply to comment #13)
> > > > We can't optimize this because System.out.println can change args[].
> > > 
> > > That's the whole point: System.out.println cannot change args[], which is a
> > > java array, and the length of a Java array is constant.  It is not an invalid
> > > test case.
> > 
> > I suppose
> > 
> >   public static void main(String[] args)
> > 
> > is passing args by value (but the implementation detail uses reference
> > passing for efficiency?).
> 
> args is indeed a reference to a Java array.  The length field of a Java
> array is immutable.  The elements of an array are not immutable.

You mean that System.out.println could change the elements of the array
(well, it doesn't, but theoretically it could)?

> > In this case the Java frontend should do
> > like the C++ frontend and tell this to the middle-end by properly
> > marking args as 1) DECL_BY_REFERENCE, 2) use a TYPE_RESTRICT qualified
> > pointer for the reference.  Then we would optimize this case.
> 
> If we could mark the length field as immutable that would fix it.  Is there any
> way to do that?

No.  What you can do is, via the method I outlined, tell GCC that
args is to be treated similar to a local automatic variable - thus
it cannot be refered to from other functions (unless you pass them
its address of course).
Comment 18 Richard Biener 2012-01-10 16:35:46 UTC
(In reply to comment #17)
> (In reply to comment #16)
> > (In reply to comment #15)
> > > (In reply to comment #14)
> > > > (In reply to comment #13)
> > > > > We can't optimize this because System.out.println can change args[].
> > > > 
> > > > That's the whole point: System.out.println cannot change args[], which is a
> > > > java array, and the length of a Java array is constant.  It is not an invalid
> > > > test case.
> > > 
> > > I suppose
> > > 
> > >   public static void main(String[] args)
> > > 
> > > is passing args by value (but the implementation detail uses reference
> > > passing for efficiency?).
> > 
> > args is indeed a reference to a Java array.  The length field of a Java
> > array is immutable.  The elements of an array are not immutable.
> 
> You mean that System.out.println could change the elements of the array
> (well, it doesn't, but theoretically it could)?
> 
> > > In this case the Java frontend should do
> > > like the C++ frontend and tell this to the middle-end by properly
> > > marking args as 1) DECL_BY_REFERENCE, 2) use a TYPE_RESTRICT qualified
> > > pointer for the reference.  Then we would optimize this case.
> > 
> > If we could mark the length field as immutable that would fix it.  Is there any
> > way to do that?
> 
> No.  What you can do is, via the method I outlined, tell GCC that
> args is to be treated similar to a local automatic variable - thus
> it cannot be refered to from other functions (unless you pass them
> its address of course).

Thus, similar to the C++ case with

struct Array { int length; void data[]; }

void foo (Array args)
{
 ...
}

foo cannot change the callers args.length (only its own copy) but it
can change the callers args.data[] contents.  If the C++ frontend
decides to pass args by reference then it sets DECL_BY_REFERENCE
and uses a TYPE_RESTRICT qualified pointer.  This way the optimization
will be the same as if it was passed "really" by value.

Not sure if the Java situation is similar enough.
Comment 19 Andrew Haley 2012-01-10 16:44:16 UTC
(In reply to comment #17)
> (In reply to comment #16)
> > (In reply to comment #15)
> > > (In reply to comment #14)
> > > > (In reply to comment #13)
> > > > > We can't optimize this because System.out.println can change args[].
> > > > 
> > > > That's the whole point: System.out.println cannot change args[], which is a
> > > > java array, and the length of a Java array is constant.  It is not an invalid
> > > > test case.
> > > 
> > > I suppose
> > > 
> > >   public static void main(String[] args)
> > > 
> > > is passing args by value (but the implementation detail uses reference
> > > passing for efficiency?).
> > 
> > args is indeed a reference to a Java array.  The length field of a Java
> > array is immutable.  The elements of an array are not immutable.
> 
> You mean that System.out.println could change the elements of the array
> (well, it doesn't, but theoretically it could)?

In theory yes, it could.

> > > In this case the Java frontend should do
> > > like the C++ frontend and tell this to the middle-end by properly
> > > marking args as 1) DECL_BY_REFERENCE, 2) use a TYPE_RESTRICT qualified
> > > pointer for the reference.  Then we would optimize this case.
> > 
> > If we could mark the length field as immutable that would fix it.  Is there any
> > way to do that?
> 
> No.  What you can do is, via the method I outlined, tell GCC that
> args is to be treated similar to a local automatic variable - thus
> it cannot be refered to from other functions (unless you pass them
> its address of course).

But that doesn't help.  args *can* potentially be referred to by other functions.  The special property we need to make use of its that fact that once an array is created, its length can never change.
Comment 20 Richard Biener 2012-01-10 16:55:52 UTC
(In reply to comment #19)
>
> > No.  What you can do is, via the method I outlined, tell GCC that
> > args is to be treated similar to a local automatic variable - thus
> > it cannot be refered to from other functions (unless you pass them
> > its address of course).
> 
> But that doesn't help.  args *can* potentially be referred to by other
> functions.  The special property we need to make use of its that fact that once
> an array is created, its length can never change.

That is something we cannot express at the moment, and I think it would
be hard to implement reliably (thinking of the RTX_UNCHANGING saga - we
do have to exclude the stmts that initialize the length field).
Comment 21 Richard Biener 2012-01-10 16:57:36 UTC
The Java frontend could handle this by performing loads of the length field
via a SAVE_EXPR and sharing this across a function.  That way CSE would
happen automagically.
Comment 22 Andrew Haley 2012-01-10 17:08:30 UTC
(In reply to comment #21)
> The Java frontend could handle this by performing loads of the length field
> via a SAVE_EXPR and sharing this across a function.  That way CSE would
> happen automagically.

Now that's a nice idea.  

In this specific case it should be easy, because array.length is used in the control expression for the loop.  So. we can create the SAVE_EXPR when the loop is initialized.

The problem with doing this in general is that we don't have a CFG, so we don't always know when to create that SAVE_EXPR.  If we can find an initial use of an array that dominates all other uses we can create the SAVE_EXPR at that point.
Comment 23 Tom Tromey 2012-01-10 17:17:50 UTC
I thought I wrote a pass to do this optimization, but I can't find it now.
Anyway I think that would be the simplest approach by far.
Comment 24 Tom Tromey 2012-01-10 17:21:02 UTC
I found my code and it turns out I never finished it.
(I did write a java-specific devirtualization pass.)
Here is an introductory comment that explains my plans:

/* This pass implements a few simple gcj-specific optimizations
   related to handling of invariants.  In Java, certain fields of
   some objects are known to be invariant after the object has been
   created.  For instance, the vtable pointer of an object cannot
   change; neither can the length of an array.

   Also, this pass knows that 'new Type[n]' sets the length of an
   array to 'n'.

   Ideally GCC would recognize these cases and optimize for us.
   However, currently there is no way to express these properties to
   the optimizers.
 */