Bug 26797

Summary: [4.3 regression] ACATS cxh1001 fails
Product: gcc
Reporter: Laurent GUERBY <laurent>
Component: ada
Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED
Severity: normal
CC: baldrick, christian.joensson, danglin, ebotcazou, gcc-bugs, kenner, law, pinskia, rguenth
Priority: P5
Keywords: wrong-code
Version: 4.2.0
Target Milestone: 4.3.0
URL: http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01100.html
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2007-03-04 10:07:22
Bug Depends on:
Bug Blocks: 30911, 31108

Description Laurent GUERBY 2006-03-22 07:29:26 UTC
FAIL on both x86 and x86_64-linux. Was PASS on 20060314.

,.,. C35507M ACATS 2.5 06-03-22 02:43:29
---- C35507M CHECK THAT THE ATTRIBUTES 'POS' AND 'VAL' YIELD THE CORRECT
                RESULTS WHEN THE PREFIX IS A CHARACTER TYPE WITH AN
                ENUMERATION REPESENTATION CLAUSE.
   * C35507M NO EXCEPTION RAISED FOR CHAR'VAL (IDENT_INT(2)) - 2.
   * C35507M NO EXCEPTION RAISED FOR NEWCHAR'VAL (IDENT_INT (-1)) - 2.
**** C35507M FAILED ****************************.

,.,. CD2A23E ACATS 2.5 06-03-22 03:14:18
---- CD2A23E CHECK THAT WHEN A SIZE SPECIFICATION AND AN ENUMERATION
                REPRESENTATION CLAUSE ARE GIVEN FOR AN ENUMERATION TYPE,
                THEN SUCH A TYPE CAN BE PASSED AS AN ACTUAL PARAMETER TO
                A GENERIC PROCEDURE.
   * CD2A23E CONSTRAINT ERROR NOT RAISED FOR CHECK_TYPE'SUCC.
   * CD2A23E CONSTRAINT ERROR NOT RAISED FOR CHECK_TYPE'PRED.
**** CD2A23E FAILED ****************************.

,.,. CXH1001 ACATS 2.5 06-03-22 03:28:45
---- CXH1001 Check that the configuration pragma Normalize_Scalars
                causes uninitialized scalar objects to be set to a
                predictable value. Check that multiple compilation units
                are affected.  Check for uninitialized scalar objects
                that are subcomponents of composite objects, unassigned 
                out parameters, have been allocated without an initial
                value, and are stand alone..
   * CXH1001 Implicit initial value for local variable is a value of the
                type.
   * CXH1001 Implicit initial value for local variable is a value of the
                type.
**** CXH1001 FAILED ****************************.
Comment 2 Eric Botcazou 2006-03-22 18:48:39 UTC
The 3 failures have been introduced by the following change:

2006-03-21  Jeff Law  <law@redhat.com>

	* tree-vrp.c (extract_range_from_unary_expr): Derive ranges for
	type conversions of a VR_VARYING source to a wider type.
Comment 3 baldrick@free.fr 2006-03-23 13:19:54 UTC
I've had a look at c35507m.  I think it's a front-end bug.  The problem occurs
in this type support subprogram:

C35507M.CHARRP (A, F)
{
  if ((system__unsigned_types__unsigned) A - 4 <= 1)
    {
      return (integer) ((system__unsigned_types__unsigned) A - 4);
    }
  else
    {
      if ((boolean) F)
        {
          __gnat_rcheck_06 ("c35507m.adb", 39);
        }
      else
        {

        }
      return -1;
    }
}

The type of A is c35507m__char, which has [TYPE_MIN, TYPE_MAX] = [4,5].
Given this, VRP correctly eliminates the range check.  The code seems to
be generated by the front-end in exp_attr.adb, around line 3849.

Ciao,

Duncan.
Comment 4 Jeffrey A. Law 2006-03-27 17:09:15 UTC
It seems to me that the fundamental problem is that the testcase manages
to stuff the value "3" into an object which is supposed to only have the
range [4, 5].

ie, if you put a breakpoint in C35507M.NEWCHARBRP and examine the first
parameter (ie A) you'll find that on the first and second calls
"A" has the value 4 and 5 respectively.  However, on the 3rd call it
has the value 3, which is clearly out of "A"'s range [4, 5].

I suspect there are similar issues with C35507M.CHARRP since it has
similar looking code.

Someone with a better knowledge of the Ada front-end, the language and the
testsuite is going to need to chime in at this point.  From the data I see,
VRP is doing exactly what we want and expect it to do.
Comment 5 Jeffrey A. Law 2006-03-27 17:22:58 UTC
The cd2a23e test seems to have the same root cause as c35507m.  No clue
about cxh1001 -- I'd rather not spend a lot of time analyzing cxh1001 without
first getting some resolution on the known issues with c35507m and cd2a23e.

Jeff
Comment 6 Laurent GUERBY 2006-03-27 20:12:26 UTC
c35507m tests an Ada enumeration with a representation clause; here the enum has two values, one assigned the representation 4 ('a') and the other the representation 5 ('b').

At one point the test asks for the enum value associated with 2; in Ada an exception must be raised, so the front-end inserts a few conversions, as shown by gcc -c -gnatdg:

before FE expansion:
          IF CHAR'VAL (IDENT_INT(2)) = B THEN
               FAILED ( "NO EXCEPTION RAISED FOR " &
                        "CHAR'VAL (IDENT_INT(2)) - 1" );
          ELSE
               FAILED ( "NO EXCEPTION RAISED FOR " &
                        "CHAR'VAL (IDENT_INT(2)) - 2" );
          END IF;

after:
      function c35507m__charRP (A : c35507m__char; F : boolean) return
        integer is
      begin
         if system__unsigned_types__unsigned!(A) in 4 .. 5 then
            return integer(system__unsigned_types__unsigned!(A) - 4);
         else
            [constraint_error when F "invalid data"]
            return -1;
         end if;
      end c35507m__charRP;
   ]


   B_4 : begin
      if c35507m__char!(4 + c35507m__charRP (c35507m__char!(4 + {
        report__ident_int (2)}), true)) = b then
         report__failed (
           "NO EXCEPTION RAISED FOR CHAR'VAL (IDENT_INT(2)) - 1");
      else
         report__failed (
           "NO EXCEPTION RAISED FOR CHAR'VAL (IDENT_INT(2)) - 2");
      end if;

I don't know how it goes through gigi and the BE but the check

system__unsigned_types__unsigned!(A) in 4 .. 5

must not go away. Maybe the front-end is wrong in using type c35507m__char as the input type for the parameter, though.

Richard?
Comment 7 Jeffrey A. Law 2006-03-27 20:31:51 UTC
Subject: Re:  [4.2 Regression] ACATS c35507m
	cd2a23e cxh1001 failures

On Mon, 2006-03-27 at 20:12 +0000, laurent at guerby dot net wrote:

> 
> I don't know how it goes through gigi and the BE but the check
> 
> system__unsigned_types__unsigned!(A) in 4 .. 5
> 
> must not go away. May be the front-end is wrong in using function c35507m__char
> as input type for the parameter though.
And this is the heart of why I'm not a fan of Ada's type
representation.   We have an object A with a defined type
which only allows the values 4 & 5.   However, we have a
program which effectively demands that we put the value 3
into A.


VRP is doing exactly what it should be doing with the code it's
being presented.  Ada needs to either pass A in a different
type, or use some other kind of conversion mechanism which
doesn't allow VRP to propagate range information from the
parameter.  I've never looked closely, but VIEW_CONVERT_EXPR
may be what you want rather than a NOP_EXPR or CONVERT_EXPR.

jeff

Comment 8 Richard Kenner 2006-03-28 03:56:47 UTC
Subject: Re:   [4.2 Regression] ACATS c35507m cd2a23e cxh1001 failures

    I don't know how it goes through gigi and the BE but the check

    system__unsigned_types__unsigned!(A) in 4 .. 5

    must not go away. May be the front-end is wrong in using function
    c35507m__char as input type for the parameter though.

Looks like the same bug that Gigi doesn't currently use VIEW_CONVERT_EXPR
for this type of Unchecked_Conversion.  So already on my plate.
Comment 9 Jeffrey A. Law 2006-03-28 15:01:23 UTC
Subject: Re:  [4.2 Regression] ACATS c35507m
	cd2a23e cxh1001 failures

On Tue, 2006-03-28 at 03:56 +0000, kenner at vlsi1 dot ultra dot nyu dot
edu wrote:
> 
> ------- Comment #8 from kenner at vlsi1 dot ultra dot nyu dot edu  2006-03-28 03:56 -------
> Subject: Re:   [4.2 Regression] ACATS c35507m cd2a23e cxh1001 failures
> 
>     I don't know how it goes through gigi and the BE but the check
> 
>     system__unsigned_types__unsigned!(A) in 4 .. 5
> 
>     must not go away. May be the front-end is wrong in using function
>     c35507m__char as input type for the parameter though.
> 
> Looks like the same bug that Gigi doesn't currently use VIEW_CONVERT_EXPR
> for this type of Unchecked_Conversion.  So already on my plate.
OK.  When that's fixed c35507m and cd2a23e ought to be OK.  We'll
reevaluate cxh1001 at that time as well.

Jeff


Comment 10 Eric Botcazou 2006-03-30 07:58:36 UTC
> Looks like the same bug that Gigi doesn't currently use VIEW_CONVERT_EXPR
> for this type of Unchecked_Conversion.  So already on my plate.

I think it's worse; the problematic function reads:

      function c35507m__charRP (A : c35507m__char; F : boolean) return 
        integer is
      begin
         if system__unsigned_types__unsigned!(A) in 4 .. 5 then
            return integer(system__unsigned_types__unsigned!(A) - 4);
         else
            [constraint_error when F "invalid data"]
            return -1;
         end if;
      end c35507m__charRP;

I'm not sure a VIEW_CONVERT_EXPR will be sufficient because of the type of A.

 <function_decl 0x55722080 c35507m__charRP
    type <function_type 0x55729b24
        type <integer_type 0x557299b4 integer type <integer_type 0x556e5284 integer>
            sizes-gimplified public visited SI
            size <integer_cst 0x556d73d8 constant invariant 32>
            unit size <integer_cst 0x556d7168 constant invariant 4>
            user align 32 symtab 0 alias set 0 precision 32 min <integer_cst 0x556d7390 -2147483648> max <integer_cst 0x556d73a8 2147483647> RM size <integer_cst 0x556d73d8 32>>
        readonly QI
        size <integer_cst 0x556d71e0 constant invariant 8>
        unit size <integer_cst 0x556d71f8 constant invariant 1>
        align 8 symtab 0 alias set -1
        arg-types <tree_list 0x557280f0 value <enumeral_type 0x55729a10 c35507m__char>
            chain <tree_list 0x55728108 value <enumeral_type 0x55729a6c boolean>
                chain <tree_list 0x55728120 value <void_type 0x556e56d4 void>>>>
        pointer_to_this <pointer_type 0x5572e2e0>>
    readonly addressable static QI file c35507m.adb line 5 context <function_decl 0x55722000 c35507m> initial <block 0x556e31d4>
    arguments <parm_decl 0x557250f0 A
        type <enumeral_type 0x55729a10 c35507m__char readonly sizes-gimplified unsigned QI size <integer_cst 0x556d71e0 8> unit size <integer_cst 0x556d71f8 1>
            user align 8 symtab 0 alias set -1 precision 8 min <integer_cst 0x5571ae70 4> max <integer_cst 0x5571afd8 5>
            RM size <integer_cst 0x5571ad68 constant invariant 3>>
        readonly unsigned QI file c35507m.adb line 5 size <integer_cst 0x556d71e0 8> unit size <integer_cst 0x556d71f8 1>
        align 8 context <function_decl 0x55722080 c35507m__charRP> initial <integer_type 0x556e5284 integer>

So c35507m__charRP "converts" its argument to c35507m__char before the check that would allow it to do so!  Exactly the same issue as the one we have recently addressed in convert_with_check.

I think it's a front-end problem and the argument of c35507m__charRP should be in the base type.
Comment 11 Eric Botcazou 2006-03-30 08:25:46 UTC
> I think it's a front-end problem and the argument of c35507m__charRP should be
> in the base type.

Humpf... Freeze_Enumeration_Type has this big comment:

      --  Now we build the function that converts representation values to
      --  position values. This function has the form:

      --    function _Rep_To_Pos (A : etype; F : Boolean) return Integer is
      --    begin
      --       case ityp!(A) is
      --         when enum-lit'Enum_Rep => return posval;
      --         when enum-lit'Enum_Rep => return posval;
      --         ...
      --         when others   =>
      --           [raise Constraint_Error when F "invalid data"]
      --           return -1;
      --       end case;
      --    end;

      --  Note: the F parameter determines whether the others case (no valid
      --  representation) raises Constraint_Error or returns a unique value
      --  of minus one. The latter case is used, e.g. in 'Valid code.

      --  Note: the reason we use Enum_Rep values in the case here is to avoid
      --  the code generator making inappropriate assumptions about the range
      --  of the values in the case where the value is invalid. ityp is a
      --  signed or unsigned integer type of appropriate width.
Comment 12 Eric Botcazou 2006-03-30 08:32:05 UTC
So, in the end, the problem seems to be equivalent to the 'Valid stuff.

Richard, I know Robert and you have come up with a plan to address the 'Valid problem.  Is there an implementation sketch available somewhere?
Comment 13 Richard Kenner 2006-03-30 12:13:37 UTC
Subject: Re:   [4.2 Regression] ACATS c35507m cd2a23e cxh1001 failures

    Richard, I know Robert and you have come up with a plan to address the
    'Valid problem.  Is there an implementation sketch available
    somewhere?

No, because it's just a matter of having the unchecked conversion converted
to a VIEW_CONVERT_EXPR.
Comment 14 Andrew Pinski 2006-04-21 23:40:19 UTC
*** Bug 27242 has been marked as a duplicate of this bug. ***
Comment 15 Andrew Pinski 2006-04-21 23:40:55 UTC
*** Bug 27244 has been marked as a duplicate of this bug. ***
Comment 16 Andrew Pinski 2006-04-21 23:41:12 UTC
*** Bug 27245 has been marked as a duplicate of this bug. ***
Comment 17 Christian Joensson 2006-06-04 08:27:45 UTC
Just a ping here: is anyone able to say anything about the status of this bug?
Comment 18 Mark Mitchell 2006-06-04 18:17:44 UTC
Ada is not release-critical.
Comment 19 Laurent GUERBY 2006-06-26 23:33:51 UTC
Richard Kenner:
> > Looks like the same bug that Gigi doesn't currently use VIEW_CONVERT_EXPR
> > for this type of Unchecked_Conversion.  So already on my plate.

Was a patch written for that at AdaCore?
Comment 20 Richard Kenner 2006-06-27 03:03:48 UTC
Subject: Re:  [4.2 Regression] ACATS c35507m cd2a23e cxh1001 fail

> > > Looks like the same bug that Gigi doesn't currently use VIEW_CONVERT_EXPR
> > > for this type of Unchecked_Conversion.  So already on my plate.
> 
> Was a patch written for that at AdaCore?

Yes, but only on Friday and it hasn't been tested yet.
Comment 21 Eric Botcazou 2006-10-29 23:17:12 UTC
After the latest changes to VRP, only cxh1001 still fails on mainline.
Comment 22 Jeffrey A. Law 2006-10-30 09:04:47 UTC
Subject: Re:  [4.2/4.3 regression] ACATS c35507m cd2a23e
	cxh1001 fail

On Sun, 2006-10-29 at 23:17 +0000, ebotcazou at gcc dot gnu dot org
wrote:
> 
> ------- Comment #21 from ebotcazou at gcc dot gnu dot org  2006-10-29 23:17 -------
> After the latest changes to VRP, only cxh1001 still fails on mainline.
Maybe so, but that would most likely be by accident, not by design.  In
fact, that probably indicates an optimization regression in the recent
changes to VRP.


The Ada front-end was generating bogus trees for some of those tests
and unless the Ada front-end has been fixed we shouldn't really
consider those problems fixed.

jeff


Comment 23 Eric Botcazou 2006-10-30 09:21:49 UTC
> Maybe so, but that would most likely be by accident, not by design.  In
> fact, that probably indicates an optimization regression in the recent
> changes to VRP.

No disagreement; I was merely updating the status of the PR, whose summary can
now be slightly misleading, precisely to acknowledge that the summary should not
be changed at this point.
Comment 24 Eric Botcazou 2007-03-04 10:05:55 UTC
Subject: Bug 26797

Author: ebotcazou
Date: Sun Mar  4 10:05:44 2007
New Revision: 122524

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=122524
Log:
	PR ada/26797
	* doc/invoke.texi (-O2): Document that Tree-VRP is not turned on
	for the Ada compiler.
ada/
	* lang.opt: Accept -ftree-vrp for Ada.
	* misc.c (OPT_ftree_vrp): New case.
	(gnat_post_options): Disable Tree-VRP unless explicitly turned on.


Modified:
    branches/gcc-4_2-branch/gcc/ChangeLog
    branches/gcc-4_2-branch/gcc/ada/ChangeLog
    branches/gcc-4_2-branch/gcc/ada/lang.opt
    branches/gcc-4_2-branch/gcc/ada/misc.c
    branches/gcc-4_2-branch/gcc/doc/invoke.texi

Comment 25 Duncan Sands 2007-03-08 09:56:56 UTC
I can't help feeling that VIEW_CONVERT_EXPR is not the right tool
for implementing 'Valid.  I think an intrinsic would be better,
eg "int __builtin_nop(int)" which is defined to return its
argument unchanged.  Then 'Valid can be implemented as something
like:
x'Valid
->
y = __builtin_nop(x); valid = (y>=lower_bound && y <=upper_bound);
The point is that the intrinsic would be opaque to the optimizers,
and would only be lowered to the identity function *after* the tree
optimizers have run.  One annoyance is that presumably intrinsics
would be needed for all integer and float precisions, eg
__builtin_nop8, __builtin_nop16, etc.
Comment 26 Richard Kenner 2007-03-08 12:54:41 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

> y = __builtin_nop(x); valid = (y>=lower_bound && y <=upper_bound);
> The point is that the intrinsic would be opaque to the optimizers,
> and would only be lowered to the identity function *after* the tree
> optimizers have run.  One annoyance is that presumably intrinsics
> would be needed for all integer and float precisions, eg
> __builtin_nop8, __builtin_nop16, etc.

More than each precision.  The VIEW_CONVERT_EXPR is to the base type
and there can be an unlimited number of them for each precision.

Because it has to work with arbitrary types, a builtin won't do it.
We could certainly add a new tree expression that says "don't look through
this for VRP purposes", but we already have V_C_E.
Comment 27 baldrick@free.fr 2007-03-08 16:06:18 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

> > y = __builtin_nop(x); valid = (y>=lower_bound && y <=upper_bound);
> > The point is that the intrinsic would be opaque to the optimizers,
> > and would only be lowered to the identity function *after* the tree
> > optimizers have run.  One annoyance is that presumably intrinsics
> > would be needed for all integer and float precisions, eg
> > __builtin_nop8, __builtin_nop16, etc.
> 
> More than each precision.  The VIEW_CONVERT_EXPR is to the base type
> and there can be an unlimited number of them for each precision.

I don't see what the problem is - you don't have to convert to the base
type, you can always convert to some standard type of that precision,
eg int32, before calling the builtin.

> Because it has to work with arbitrary types, a builtin won't do it.

See above.

> We could certainly add a new tree expression that says "don't look through
> this for VRP purposes", but we already have V_C_E.

Sure, it's just that overloading V_C_E like this feels somehow wrong to me.
However I haven't been able to put my finger on any technical obstacle to
this use of V_C_E.

Ciao,

D.
Comment 28 Richard Kenner 2007-03-08 16:52:11 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

> I don't see what the problem is - you don't have to convert to the base
> type, you can always convert to some standard type of that precision,
> eg int32, before calling the builtin.

Sure, but it's extra tree nodes and more to do.  See below.

> Sure, it's just that overloading V_C_E like this feels somehow wrong to me.

Why?  It's not "overloading".  V_C_E of an expression E of type X to
type Y means "interpret the bits of E as if it were type Y and not type X".
If Y is X'Base, then interpreting E as being Y means that it can now have
all the values of Y.  In other words, we could only change a V_C_E to a
NOP_EXPR if we can prove that the value of E is in range of *both* X
and Y.

Of course, we still have a bit of a mess here in that the real point is
a confusion between what in Ada is called a Bounded Error and C's notion
of "undefined" (Ada's "erroneous").  But I think we can do better in this
area: we just haven't gotten to take a really good look at it.

> However I haven't been able to put my finger on any technical obstacle to
> this use of V_C_E.

Nor can I ...
Comment 29 baldrick@free.fr 2007-03-09 10:41:18 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

> > Sure, it's just that overloading V_C_E like this feels somehow wrong to me.
> 
> Why?  It's not "overloading".  V_C_E of an expression E of type X to
> type Y means "interpret the bits of E as if it were type Y and not type X".
> If Y is X'Base, then interpreting E as being Y means that it can now have
> all the values of Y.

For example, after the view conversion it is legal to place a value in it that
is in the range of Y but not of X.  This is logical.  What seems illogical to
me is that by doing the V_C_E you are supposed to forget anything you had
deduced about the bits before the conversion.  For example, suppose you had
deduced that the bits were all zero.  After the V_C_E the bits will still be
zero - are you really supposed to forget this information?  Another example:
suppose you had worked out somehow that the bits corresponded to an integer
in the range 4..5.  After the V_C_E this is still true - why is it logical
to forget this information?  You may well respond that you are only supposed
to forget information you deduced from the range of the type, not information
you worked out by some other means.  But these cannot really be distinguished,
cf. VRP.

In short, it is overloading because it means: consider this bunch of bits
to be of the new type AND forget anything you ever deduced about them.

> In other words, we could only change a V_C_E to a 
> NOP_EXPR if we can prove that the value of E is in range of *both* X
> and Y.

Surely you only have to prove that it is in the range of Y, since if it
is not in the range of X then you have a bounded error, and doing a NOP_EXPR
should be perfectly legal.

> Of course, we still have a bit of a mess here in that the real point is
> a confusion between what in Ada is called a Bounded Error and C's notion
> of "undefined" (Ada's "erroneous").  But I think we can do better in this
> area: we just haven't gotten to take a really good look at it.
> 
> > However I haven't been able to put my finger on any technical obstacle to
> > this use of V_C_E.
> 
> Nor can I ...

Actually I do have a technical objection, however it comes from my LLVM
port and not mainline gcc.  LLVM is lower level than gimple.  Supposing
types X and Y have eg 32 bits, then they both get mapped to LLVM's Int32
type.  A V_C_E gets turned into nothing at all.  There are no types with
restricted ranges.  Suppose x is a variable of type X.  When a value is
loaded into x I'm planning on generating the following LLVM code:
	x = value;
	assert(x >= lower_bound_for_gcc_type && x <= upper_bound_for_gcc_type);
I don't generate the assert when you do a V_C_E, so for example suppose there
is a y = V_C_E(Y, x).  This gets turned into:
	y = x;
Since the optimizers have range info for x, they instantly deduce the same
ranges for y as for x.  Thus if you test for validity of y using
	valid = (y >= lower_bound_for_gcc_type_X && y <= upper_bound_for_gcc_type_X);
then the result is always true.  How to get around this?  The obvious method is to
use an intrinsic, such as __builtin_nop.  For the V_C_E I can then do:
	y = __builtin_nop(x);
and all will be well.  However, I only really need to use the __builtin_nop if
I'm doing the V_C_E in order to implement 'Valid.  For other uses of V_C_E it
is unnecessary and gets in the way of optimization.  So I would much rather that
'Valid directly made use of a builtin.

Ciao,

Duncan.
Comment 30 Richard Biener 2007-03-09 11:11:35 UTC
Well, the only problem with V_C_E is that if you assert on the range of the
base type like

  if (V_C_E <X'Base> (y) > 5)
    abort();

that you still want VRP to infer that V_C_E <X'Base> (y) is <= 5 after this
check (to eliminate further checks).  I believe this will not automatically
happen at the moment because V_C_E <X'Base> (y) will not have the same
ssa name assigned for every occurrence.  But of course we will never know until
the Ada FE people finally fix their frontend.

All the mess would be way easier if the FE would not expose the subtypes to
the middle-end...
Comment 31 Eric Botcazou 2007-03-09 11:19:39 UTC
> All the mess would be way easier if the FE would not expose the subtypes to
> the middle-end...

I personally agree, but there is no clear consensus among the Ada maintainers.
Comment 32 baldrick@free.fr 2007-03-09 11:34:56 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

> Well, the only problem with V_C_E is that if you assert on the range of the
> base type like
> 
>   if (V_C_E <X'Base> (y) > 5)
>     abort();
> 
> that you still want VRP to infer that V_C_E <X'Base> (y) is <= 5 after this
> check (to eliminate further checks).

I don't think this is a very serious problem.  My understanding is that the checks
can be divided into two classes: normal checks and validity checks.  A normal check,
such as when you do a type conversion, does not use a V_C_E, it just does:

if (y < new_type_lb || y > new_type_ub)
	abort;
new_var = (new_type) y;

A validity check does:

if (V_C_E <X'Base> (y) < x_lb || V_C_E <X'Base> (y) > x_ub)
  abort();

Note how it checks that x is in the range of x's type, so this does not
give any new information about x.

It is true that multiple validity checks will not be eliminated (multiple normal
checks will be eliminated), but validity checks are not that common.  One place
that might hurt is in array accesses: IIRC the front-end does a validity check
on the index before accessing an array, rather than a normal check, because it
considers it too dangerous to do otherwise (eg: they want to catch the case
when the index has an out-of-range value because it is uninitialized).

My suggested use of a builtin would allow multiple redundant validity checks
to be safely eliminated, because the builtin would be a "pure" function.  The
validity check becomes:

y2 = __builtin_nop(y);
if (y2 < x_lb || y2 > x_ub)
  abort();

Now suppose you do another one:

y22 = __builtin_nop(y);
if (y22 < x_lb || y22 > x_ub)
  abort();

The compiler can deduce that y2 and y22 are the same, and eliminate the
second check.

> I believe this will not automatically 
> happen at the moment because V_C_E <X'Base> (y) will not have the same
> ssa name assigned for evey occurance.  But of course we will never know until
> the Ada FE people finally fix their frontend.
> 
> All the mess would be way easier if the FE would not expose the subtypes to
> the middle-end...

I agree.  The LLVM model seems more sensible in this regard.

D.
Comment 33 baldrick@free.fr 2007-03-09 11:50:28 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

> My suggested use of a builtin would allow multiple redundant validity checks
> to be safely eliminated, because the builtin would be a "pure" function.

This is presumably also the case with V_C_E.
Comment 34 Richard Kenner 2007-03-09 13:59:29 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

> You may well respond that you are only supposed to forget
> information you deduced from the range of the type, not information
> you worked out by some other means.  But these cannot really be
> distinguished, c.f. VRP.

Exactly.  That's indeed the problem.  It's similar to the whole
signed-overflow issue: what you're able to safely do with
VRP-derived information depends on how you derived it and there's
no way of tracking that.

> Surely you only have to prove that it is in the range of Y, since if it
> is not in the range of X then you have a bounded error, and doing a NOP_EXPR
> should be perfectly legal.

Think of a range check being done in a subscript reference in the LHS of
an assignment.  A bounded error is not allowed to cause a memory store
outside the bounds of an array.
Comment 35 Richard Kenner 2007-03-09 14:18:41 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

> All the mess would be way easier if the FE would not expose the subtypes to
> the middle-end...

I don't see how giving *additional* information could be bad: the middle-end
is certainly free to ignore the bounds if it wants to!

The point is that the bounds *do* say something and provide useful
information to optimizers.  The problem is that they don't say what the
middle-end *wants* them to say!

What they *do* mean is that it's a "bounded error" for a variable to have a
value outside of the given bounds.  What the middle-end *wants* them to mean
is that it's *erroneous* for a variable to have a value outside of the given
bounds.  We need to reconcile those two concepts.

From a pragmatic point of view, they aren't that different, but there are a
couple of important exceptions.  The array bound on the LHS is one of them.

A full solution to this is complex, but is indeed something that I'd like to
happen some day.  It involves work in various places.  One thing that would
help is that many variables can be proven to have the property that the only
way they can have an invalid value is if erroneous behavior has occurred.
For such variables, the meaning of the bounds is precisely what the
middle-end expects it to mean.

The most common case where you can't do this is for component references or
pointer dereferences.  In that case, we have to address this by
distinguishing between two types of tests, those that will just propagate the
bounded error condition and those that would convert a bounded error to
erroneous behavior.  We are allowed to use the bounds information in
reasoning about the former, but not the latter.

Note that there are similarities to the signed-overflow case here: there are
some tests that we can safely eliminate with knowledge that signed-overflow is
undefined, but others that we don't want to.

So I think the "final" solution here will be to have the Ada front end
provide the flag information on variables and tests mentioned above and
extend VRP to use them.  One of the two front-end issues (the latter) is
quite easy and I think the other is moderately-easy.  I don't know how much
work the required VRP changes would be.
Comment 36 Richard Kenner 2007-03-09 14:29:22 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

> I don't think this is a very serious problem.  My understanding is
> that the checks can be divided into two classes: normal checks and
> validity checks.  A normal check, such as when you do a type
> conversion, does not use a V_C_E, it just does:
> 
> if (y < new_type_lb || y > new_type_ub)
>         abort;
> new_var = (new_type) y;

Not clear.  Consider the range check for a subscript on the LHS, for
example. These issues are *very* subtle ...
Comment 37 baldrick@free.fr 2007-03-09 22:44:34 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

> Think of a range check being done in a subscript reference in the LHS of
> an assignment.  A bounded error is not allowed to cause a memory store
> outside the bounds of an array.

This is exactly a case when you should do what I called a "validity check"
rather than a "normal check".
Comment 38 baldrick@free.fr 2007-03-09 23:10:24 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

> > All the mess would be way easier if the FE would not expose the subtypes to
> > the middle-end...
> 
> I don't see how giving *additional* information could be bad: the middle-end
> is certainly free to ignore the bounds if it wants to!

Yes, it can ignore them entirely if it wants to, but it can't ignore them
partially.  For example, fold is happy to create values which lie outside
the range of the type (i.e. fold is ignoring the range information); as a
consequence any part of the ME that exploits range information will give
wrong results right now.

> The point is that the bounds *do* say something and provide useful
> information to optimizers.  The problem is that they don't say what the
> middle-end *wants* them to say!
> 
> What they *do* mean is that it's a "bounded error" for a variable to have a
> value outside of the given bounds.  What the middle-end *wants* them to mean
> is that it's *erroneous* for a variable to have a value outside of the given
> bounds.  We need to reconcile those two concepts.

I don't think the middle-end needs to know about this distinction.  In fact
it would be wise to keep such subtleties in the front-end.  That can be done
as follows: in situations that might lead to erroneous behavior if a variable
has a value outside its range, you add a validity check and raise Program_Error
if it fails (this is authorized by the Ada RM).  Thus in order to avoid
erroneousness you only need a reliable way of checking validity.

> From a pragmatic point of view, they aren't that different, but there are a
> couple of important exceptions.  The array bound on the LHS is one of them.

This is solved by adding a validity check before accessing the memory, see
above.

> A full solution to this is complex, but is indeed something that I'd like to
> happen some day.  It involves work in various places.  One thing that would
> help is that many variables can be proven to have the property that the only
> way they can have an invalid value is if erroneous behavior has occurred.
> For such variables, the meaning of the bounds is precisely what the
> middle-end expects it to mean.
> 
> The most common case where you can't do this is for component references or
> pointer dereferences.

I don't see that pointer dereferences are a problem - I'm pretty sure that
the RM places you firmly in the erroneous zone if you do something that can
give a pointer a bogus value.  That leaves array accesses.  If your index
is outside the range of the type, you can catch that and raise Program_Error.
Otherwise it is in range, so you load a value from the array (this value
may be uninitialized).  I believe this is OK for a bounded error.  In short
I don't see any problem here, or any need to track bounded/erroneous
behaviour around the program.  Do you have an example where you can't
prevent a bounded error from being (wrongly) turned into erroneous behaviour
by the addition of a local check?

> In that case, we have to address this by 
> distinguishing between two types of tests, those that will just propagate the
> bounded error condition and those that would convert a bounded error to
> erroneous behavior.  We are allowed to use the bounds information in
> reasoning about the former, but not the latter.

Exactly, and I'm pretty sure that all you need to handle this is a reliable
way of doing validity checking.

> Note that there are similarities to the signed-overflow case here: there are
> some tests that we can safely eliminate with knowledge that signed-overflow is
> undefined, but others that we don't want to.
> 
> So I think the "final" solution here will be to have the Ada front end
> provide the flag information on variables and tests mentioned above and
> extend VRP to use them.  One of the two front-end issues (the latter) is
> quite easy and I think the other is moderately-easy.  I don't know how much
> work the required VRP changes would be.

I don't see why any such flags are required.  Why does validity checking
(implemented by V_C_E or an intrinsic or whatever) not suffice?

Ciao,

Duncan.
Comment 39 Richard Kenner 2007-03-09 23:13:21 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

> I don't see that pointer dereferences are a problem - I'm pretty sure that
> the RM places you firmly in the erroneous zone if you do something that can
> give a pointer a bogus value.

No.  Referencing an uninitialized variable is a bounded error, not
erroneous.  That fact is basically what causes most of this trouble ...
Comment 40 baldrick@free.fr 2007-03-09 23:41:37 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

On Saturday 10 March 2007 00:13:27 kenner at vlsi1 dot ultra dot nyu dot edu wrote:
> > I don't see that pointer dereferences are a problem - I'm pretty sure that
> > the RM places you firmly in the erroneous zone if you do something that can
> > give a pointer a bogus value.
> 
> No.  Referencing an uninitialized variable is a bounded error, not
> erroneous.  That fact is basically what causes most of this trouble ...

It is not possible for a pointer value to be uninitialized.  The language
requires all pointers to be default initialized to null.
Comment 41 Richard Kenner 2007-03-10 00:17:38 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

> It is not possible for a pointer value to be uninitialized.  The language
> requires all pointers to be default initialized to null.

I mean the thing that pointer points to!
Comment 42 baldrick@free.fr 2007-03-13 10:30:13 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

> > It is not possible for a pointer value to be uninitialized.  The language
> > requires all pointers to be default initialized to null.
> 
> I mean the thing that pointer points to!

I think I now understand: I thought the problem we were discussing was
how to obtain correctness (which seems to be easy using local checks)
while in fact you were talking about how to optimize away unnecessary
validity checks (as compared to normal checks).  My current plan for
that in LLVM is to implement validity checks using a builtin_nop
intrinsic.  Multiple successive validity checks will be eliminated
as redundant because builtin_nop is a pure function.  That leaves
the question of how to teach the optimizers that an object is valid
in the various cases where the front-end for some reason knows this.
In such cases I'm thinking that you can issue an assert instruction
saying so, e.g. if type T has range 10..20 and at some point the front-end
knows that x (of type T) is valid, it can issue:
	assert(builtin_nop(x)>=10 && builtin_nop(x)<=20);
The dataflow and predicate analysis needed to exploit this can hopefully
be left to the optimizers, in particular to IPO (which LLVM is rather
good at).
Comment 43 Richard Kenner 2007-03-13 12:33:57 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

> I think I now understand: I thought the problem we were discussing was
> how to obtain correctness (which seems to be easy using local checks)
> while in fact you were talking about how to optimize away unnecessary
> validity checks (as compared to normal checks).  

Yes, I was talking about both.
Comment 44 Eric Botcazou 2007-09-12 15:53:12 UTC
Subject: Bug 26797

Author: ebotcazou
Date: Wed Sep 12 15:52:57 2007
New Revision: 128441

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=128441
Log:
	PR ada/26797
	PR ada/32407
	* utils.c (unchecked_convert): Use a subtype as the intermediate type
	in the special VIEW_CONVERT_EXPR case.


Modified:
    trunk/gcc/ada/ChangeLog
    trunk/gcc/ada/utils.c

Comment 45 Eric Botcazou 2007-09-12 15:55:19 UTC
At long last.
Comment 46 Jeffrey A. Law 2007-09-15 06:35:03 UTC
Subject: Re:  [4.3 regression] ACATS cxh1001 fails

On Wed, 2007-09-12 at 15:55 +0000, ebotcazou at gcc dot gnu dot org
wrote:
> At long last.
Woo hoo.

An oldie, but a goodie.

jeff