| Summary: | [4.3 regression] ACATS cxh1001 fails | | |
|---|---|---|---|
| Product: | gcc | Reporter: | Laurent GUERBY <laurent> |
| Component: | ada | Assignee: | Not yet assigned to anyone <unassigned> |
| Status: | RESOLVED FIXED | | |
| Severity: | normal | CC: | baldrick, christian.joensson, danglin, ebotcazou, gcc-bugs, kenner, law, pinskia, rguenth |
| Priority: | P5 | Keywords: | wrong-code |
| Version: | 4.2.0 | | |
| Target Milestone: | 4.3.0 | | |
| URL: | http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01100.html | | |
| Host: | | Target: | |
| Build: | | Known to work: | |
| Known to fail: | | Last reconfirmed: | 2007-03-04 10:07:22 |
| Bug Depends on: | | | |
| Bug Blocks: | 30911, 31108 | | |
Description
Laurent GUERBY
2006-03-22 07:29:26 UTC
Not present on 21/03: http://gcc.gnu.org/ml/gcc-testresults/2006-03/msg01395.html
Present on 22/03: http://gcc.gnu.org/ml/gcc-testresults/2006-03/msg01449.html

The 3 failures were introduced by the following change:

2006-03-21  Jeff Law  <law@redhat.com>

	* tree-vrp.c (extract_range_from_unary_expr): Derive ranges for
	type conversions of a VR_VARYING source to a wider type.

I've had a look at c35507m. I think it's a front-end bug. The problem occurs
in this type support subprogram:

C35507M.CHARRP (A, F)
{
  if ((system__unsigned_types__unsigned) A - 4 <= 1)
    {
      return (integer) ((system__unsigned_types__unsigned) A - 4);
    }
  else
    {
      if ((boolean) F)
        {
          __gnat_rcheck_06 ("c35507m.adb", 39);
        }
      else
        {
        }
      return -1;
    }
}

The type of A is c35507m__char, which has [TYPE_MIN, TYPE_MAX] = [4,5].
Given this, VRP correctly eliminates the range check. The code seems to be
generated by the front-end in exp_attr.adb, around line 3849.

Ciao, Duncan.

It seems to me that the fundamental problem is that the testcase manages to
stuff the value "3" into an object which is supposed to only have the range
[4, 5]. I.e., if you put a breakpoint in C35507M.NEWCHARBRP and examine the
first parameter (i.e. A), you'll find that on the first and second calls "A"
has the values 4 and 5 respectively. However, on the 3rd call it has the
value 3, which is clearly out of "A"'s range [4, 5]. I suspect there are
similar issues with C35507M.CHARRP since it has similar looking code.

Someone with a better knowledge of the Ada front-end, the language and the
testsuite is going to need to chime in at this point. From the data I see,
VRP is doing exactly what we want and expect it to do.

The cd2a23e test seems to have the same root cause as c35507m. No clue about
cxh1001 -- I'd rather not spend a lot of time analyzing cxh1001 without
first getting some resolution on the known issues with c35507m and cd2a23e.

Jeff

c35507m tests an enumeration with a representation clause in Ada; here the
enum has two values, one assigned the representation 4 ('A') and the other
the representation 5 (B). At one point the test asks what's the enum value
associated with 2; in Ada an exception must be raised, so the front-end
inserts a few conversions, as shown by gcc -c -gnatdg.

Before FE expansion:

     IF CHAR'VAL (IDENT_INT(2)) = B THEN
          FAILED ( "NO EXCEPTION RAISED FOR " &
                   "CHAR'VAL (IDENT_INT(2)) - 1" );
     ELSE
          FAILED ( "NO EXCEPTION RAISED FOR " &
                   "CHAR'VAL (IDENT_INT(2)) - 2" );
     END IF;

After:

   function c35507m__charRP (A : c35507m__char; F : boolean) return integer is
   begin
      if system__unsigned_types__unsigned!(A) in 4 .. 5 then
         return integer(system__unsigned_types__unsigned!(A) - 4);
      else
         [constraint_error when F "invalid data"]
         return -1;
      end if;
   end c35507m__charRP;

   B_4 : begin
      if c35507m__char!(4 + c35507m__charRP (c35507m__char!(4 + {
         report__ident_int (2)}), true)) = b then
         report__failed (
           "NO EXCEPTION RAISED FOR CHAR'VAL (IDENT_INT(2)) - 1");
      else
         report__failed (
           "NO EXCEPTION RAISED FOR CHAR'VAL (IDENT_INT(2)) - 2");
      end if;

I don't know how it goes through gigi and the BE, but the check

   system__unsigned_types__unsigned!(A) in 4 .. 5

must not go away. Maybe the front-end is wrong in using type c35507m__char
as the input type for the parameter, though. Richard?
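
For reference, a minimal sketch of the construct under test follows; the
declarations and names are illustrative, not the exact ACATS source:

   --  Hedged sketch: an enumeration with a representation clause, as in
   --  c35507m. Rep_Sketch, Char and N are made-up names.
   procedure Rep_Sketch is
      type Char is ('A', B);
      for Char use ('A' => 4, B => 5);  --  only representations 4 and 5 exist
      N : Integer := 2;                 --  stand-in for Report.Ident_Int (2)
      C : Char;
   begin
      --  Position 2 does not exist (Char has positions 0 and 1), so this
      --  must raise Constraint_Error; the check is implemented by the
      --  c35507m__charRP helper shown above.
      C := Char'Val (N);
   end Rep_Sketch;
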
Subject: Re: [4.2 Regression] ACATS c35507m cd2a23e cxh1001 failures
On Mon, 2006-03-27 at 20:12 +0000, laurent at guerby dot net wrote:
>
> I don't know how it goes through gigi and the BE, but the check
>
> system__unsigned_types__unsigned!(A) in 4 .. 5
>
> must not go away. Maybe the front-end is wrong in using type c35507m__char
> as the input type for the parameter, though.
And this is the heart of why I'm not a fan of Ada's type
representation. We have an object, A, with a defined type
which only allows the values 4 & 5. However, we have a
program which effectively demands that we put the value 3
into A.
VRP is doing exactly what it should be doing with the code it's
being presented. Ada needs to either pass A in a different
type, or use some other kind of conversion mechanism which
doesn't allow VRP to propagate range information from the
parameter. I've never looked closely, but VIEW_CONVERT_EXPR
may be what you want rather than a NOP_EXPR or CONVERT_EXPR.
jeff
Subject: Re: [4.2 Regression] ACATS c35507m cd2a23e cxh1001 failures

> I don't know how it goes through gigi and the BE, but the check
>
> system__unsigned_types__unsigned!(A) in 4 .. 5
>
> must not go away. Maybe the front-end is wrong in using type c35507m__char
> as the input type for the parameter, though.

Looks like the same bug that Gigi doesn't currently use VIEW_CONVERT_EXPR
for this type of Unchecked_Conversion. So already on my plate.

Subject: Re: [4.2 Regression] ACATS c35507m cd2a23e cxh1001 failures
On Tue, 2006-03-28 at 03:56 +0000, kenner at vlsi1 dot ultra dot nyu dot edu wrote:
>
> ------- Comment #8 from kenner at vlsi1 dot ultra dot nyu dot edu 2006-03-28 03:56 -------
> Subject: Re: [4.2 Regression] ACATS c35507m cd2a23e cxh1001 failures
>
> I don't know how it goes through gigi and the BE, but the check
>
> system__unsigned_types__unsigned!(A) in 4 .. 5
>
> must not go away. Maybe the front-end is wrong in using type
> c35507m__char as the input type for the parameter, though.
>
> Looks like the same bug that Gigi doesn't currently use VIEW_CONVERT_EXPR
> for this type of Unchecked_Conversion. So already on my plate.
OK. When that's fixed, c35507m and cd2a23e ought to be OK. We'll
reevaluate cxh1001 at that time as well.
Jeff
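
As an aside, here is a hedged Ada sketch (made-up names, not from the
testsuite) of the kind of Unchecked_Conversion Gigi has to keep opaque: a
foreign bit pattern is forced into the constrained enumeration, and the
subsequent 'Valid test must not be folded away from the subtype bounds:

   with Ada.Unchecked_Conversion;
   with Ada.Text_IO;

   procedure UC_Sketch is
      type Char is ('A', B);
      for Char use ('A' => 4, B => 5);
      for Char'Size use 8;               --  match U8'Size, avoids a warning
      type U8 is mod 2**8;
      function To_Char is new Ada.Unchecked_Conversion (U8, Char);
      X : constant Char := To_Char (3);  --  bit pattern 3 is invalid for Char
   begin
      --  If the conversion were translated as a plain NOP_EXPR, VRP could
      --  assume X is in [4, 5] from its type and fold X'Valid to True; with
      --  a VIEW_CONVERT_EXPR the bits stay opaque and the test survives.
      Ada.Text_IO.Put_Line (Boolean'Image (X'Valid));  --  must print FALSE
   end UC_Sketch;
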
> Looks like the same bug that Gigi doesn't currently use VIEW_CONVERT_EXPR
> for this type of Unchecked_Conversion. So already on my plate.
I think it's worse; the problematic function reads:

function c35507m__charRP (A : c35507m__char; F : boolean) return integer is
begin
if system__unsigned_types__unsigned!(A) in 4 .. 5 then
return integer(system__unsigned_types__unsigned!(A) - 4);
else
[constraint_error when F "invalid data"]
return -1;
end if;
end c35507m__charRP;
I'm not sure a VIEW_CONVERT_EXPR will be sufficient because of the type of A.
<function_decl 0x55722080 c35507m__charRP
type <function_type 0x55729b24
type <integer_type 0x557299b4 integer type <integer_type 0x556e5284 integer>
sizes-gimplified public visited SI
size <integer_cst 0x556d73d8 constant invariant 32>
unit size <integer_cst 0x556d7168 constant invariant 4>
user align 32 symtab 0 alias set 0 precision 32 min <integer_cst 0x556d7390 -2147483648> max <integer_cst 0x556d73a8 2147483647> RM size <integer_cst 0x556d73d8 32>>
readonly QI
size <integer_cst 0x556d71e0 constant invariant 8>
unit size <integer_cst 0x556d71f8 constant invariant 1>
align 8 symtab 0 alias set -1
arg-types <tree_list 0x557280f0 value <enumeral_type 0x55729a10 c35507m__char>
chain <tree_list 0x55728108 value <enumeral_type 0x55729a6c boolean>
chain <tree_list 0x55728120 value <void_type 0x556e56d4 void>>>>
pointer_to_this <pointer_type 0x5572e2e0>>
readonly addressable static QI file c35507m.adb line 5 context <function_decl 0x55722000 c35507m> initial <block 0x556e31d4>
arguments <parm_decl 0x557250f0 A
type <enumeral_type 0x55729a10 c35507m__char readonly sizes-gimplified unsigned QI size <integer_cst 0x556d71e0 8> unit size <integer_cst 0x556d71f8 1>
user align 8 symtab 0 alias set -1 precision 8 min <integer_cst 0x5571ae70 4> max <integer_cst 0x5571afd8 5>
RM size <integer_cst 0x5571ad68 constant invariant 3>>
readonly unsigned QI file c35507m.adb line 5 size <integer_cst 0x556d71e0 8> unit size <integer_cst 0x556d71f8 1>
align 8 context <function_decl 0x55722080 c35507m__charRP> initial <integer_type 0x556e5284 integer>
So c35507m__charRP "converts" its argument to c35507m__char before the check
that would allow it to do so! Exactly the same issue as the one we have
recently addressed in convert_with_check.
I think it's a front-end problem and the argument of c35507m__charRP should be in the base type.
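
To illustrate the suggestion, a hedged sketch (made-up unit and names, not
GNAT output) of what declaring the parameter in the base type would look
like at the source level:

   package Rep_Base_Sketch is
      type Char is ('A', B);
      for Char use ('A' => 4, B => 5);

      --  With A of subtype Char, its bounds promise [4, 5] and VRP may
      --  delete the "in 4 .. 5" membership test. With A in the base type,
      --  no such promise exists and the check has to be kept.
      function CharRP (A : Char'Base; F : Boolean) return Integer;
   end Rep_Base_Sketch;
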
> I think it's a front-end problem and the argument of c35507m__charRP should be
> in the base type.
Humpf... Freeze_Enumeration_Type has this big comment:
-- Now we build the function that converts representation values to
-- position values. This function has the form:
-- function _Rep_To_Pos (A : etype; F : Boolean) return Integer is
-- begin
-- case ityp!(A) is
-- when enum-lit'Enum_Rep => return posval;
-- when enum-lit'Enum_Rep => return posval;
-- ...
-- when others =>
-- [raise Constraint_Error when F "invalid data"]
-- return -1;
-- end case;
-- end;
-- Note: the F parameter determines whether the others case (no valid
-- representation) raises Constraint_Error or returns a unique value
-- of minus one. The latter case is used, e.g. in 'Valid code.
-- Note: the reason we use Enum_Rep values in the case here is to avoid
-- the code generator making inappropriate assumptions about the range
-- of the values in the case where the value is invalid. ityp is a
-- signed or unsigned integer type of appropriate width.
So, in the end, the problem seems to be equivalent to the 'Valid stuff.
Richard, I know Robert and you have come up with a plan to address the
'Valid problem. Is there an implementation sketch available somewhere?

Subject: Re: [4.2 Regression] ACATS c35507m cd2a23e cxh1001 failures

> Richard, I know Robert and you have come up with a plan to address the
> 'Valid problem. Is there an implementation sketch available somewhere?

No, because it's just a matter of having the unchecked conversion converted
to a VIEW_CONVERT_EXPR.

*** Bug 27242 has been marked as a duplicate of this bug. ***

*** Bug 27244 has been marked as a duplicate of this bug. ***

*** Bug 27245 has been marked as a duplicate of this bug. ***

Just a ping here: is anyone able to say anything about the status of this
bug?

Ada is not release-critical.

Richard Kenner:
> > Looks like the same bug that Gigi doesn't currently use VIEW_CONVERT_EXPR
> > for this type of Unchecked_Conversion. So already on my plate.
Was a patch written for that at AdaCore?
Subject: Re: [4.2 Regression] ACATS c35507m cd2a23e cxh1001 fail
> > > Looks like the same bug that Gigi doesn't currently use VIEW_CONVERT_EXPR
> > > for this type of Unchecked_Conversion. So already on my plate.
>
> Was a patch written for that at AdaCore?
Yes, but only on Friday and it hasn't been tested yet.
After the latest changes to VRP, only cxh1001 still fails on mainline.

Subject: Re: [4.2/4.3 regression] ACATS c35507m cd2a23e cxh1001 fail
On Sun, 2006-10-29 at 23:17 +0000, ebotcazou at gcc dot gnu dot org wrote:
>
> ------- Comment #21 from ebotcazou at gcc dot gnu dot org 2006-10-29 23:17 -------
> After the latest changes to VRP, only cxh1001 still fails on mainline.
Maybe so, but that would most likely be by accident, not by design. In
fact, that probably indicates an optimization regression in the recent
changes to VRP.
The Ada front-end was generating bogus trees for some of those tests, and
unless the Ada front-end has been fixed, we shouldn't really consider those
problems fixed.
jeff
> Maybe so, but that would most likely be by accident, not by design. In
> fact, that probably indicates an optimization regression in the recent
> changes to VRP.
No disagreement. I was merely updating the status of the PR, whose summary
may now be slightly misleading, precisely to acknowledge that it should not
be changed at this point.
Subject: Bug 26797

Author: ebotcazou
Date: Sun Mar 4 10:05:44 2007
New Revision: 122524

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=122524
Log:
	PR ada/26797
	* doc/invoke.texi (-O2): Document that Tree-VRP is not turned on
	for the Ada compiler.
ada/
	* lang.opt: Accept -ftree-vrp for Ada.
	* misc.c (OPT_ftree_vrp): New case.
	(gnat_post_options): Disable Tree-VRP unless explicitly turned on.

Modified:
    branches/gcc-4_2-branch/gcc/ChangeLog
    branches/gcc-4_2-branch/gcc/ada/ChangeLog
    branches/gcc-4_2-branch/gcc/ada/lang.opt
    branches/gcc-4_2-branch/gcc/ada/misc.c
    branches/gcc-4_2-branch/gcc/doc/invoke.texi

I can't help feeling that VIEW_CONVERT_EXPR is not the right tool for
implementing 'Valid. I think an intrinsic would be better, e.g.
"int __builtin_nop(int)", which is defined to return its argument unchanged.
Then 'Valid can be implemented as something like:

   x'Valid  ->  y = __builtin_nop(x);
                valid = (y >= lower_bound && y <= upper_bound);

The point is that the intrinsic would be opaque to the optimizers, and would
only be lowered to the identity function *after* the tree optimizers have
run. One annoyance is that presumably intrinsics would be needed for all
integer and float precisions, e.g. __builtin_nop8, __builtin_nop16, etc.

Subject: Re: [4.3 regression] ACATS cxh1001 fails
> y = __builtin_nop(x); valid = (y >= lower_bound && y <= upper_bound);
> The point is that the intrinsic would be opaque to the optimizers,
> and would only be lowered to the identity function *after* the tree
> optimizers have run. One annoyance is that presumably intrinsics
> would be needed for all integer and float precisions, eg
> __builtin_nop8, __builtin_nop16, etc.
More than each precision. The VIEW_CONVERT_EXPR is to the base type
and there can be an unlimited number of them for each precision.
Because it has to work with arbitrary types, a builtin won't do it.
We could certainly add a new tree expression that says "don't look through
this for VRP purposes", but we already have V_C_E.
Subject: Re: [4.3 regression] ACATS cxh1001 fails

> > y = __builtin_nop(x); valid = (y >= lower_bound && y <= upper_bound);
> > The point is that the intrinsic would be opaque to the optimizers,
> > and would only be lowered to the identity function *after* the tree
> > optimizers have run. One annoyance is that presumably intrinsics
> > would be needed for all integer and float precisions, e.g.
> > __builtin_nop8, __builtin_nop16, etc.
>
> More than each precision. The VIEW_CONVERT_EXPR is to the base type
> and there can be an unlimited number of them for each precision.

I don't see what the problem is - you don't have to convert to the base
type, you can always convert to some standard type of that precision,
e.g. int32, before calling the builtin.

> Because it has to work with arbitrary types, a builtin won't do it.

See above.

> We could certainly add a new tree expression that says "don't look through
> this for VRP purposes", but we already have V_C_E.

Sure, it's just that overloading V_C_E like this feels somehow wrong to me.
However, I haven't been able to put my finger on any technical obstacle to
this use of V_C_E.

Ciao, D.

Subject: Re: [4.3 regression] ACATS cxh1001 fails

> I don't see what the problem is - you don't have to convert to the base
> type, you can always convert to some standard type of that precision,
> e.g. int32, before calling the builtin.

Sure, but it's extra tree nodes and more to do. See below.

> Sure, it's just that overloading V_C_E like this feels somehow wrong to me.

Why? It's not "overloading". V_C_E of an expression E of type X to type Y
means "interpret the bits of E as if it were type Y and not type X". If Y is
X'Base, then interpreting E as being Y means that it can now have all the
values of Y. In other words, we could only change a V_C_E to a NOP_EXPR if
we can prove that the value of E is in range of *both* X and Y.

Of course, we still have a bit of a mess here in that the real point is a
confusion between what in Ada is called a Bounded Error and C's notion of
"undefined" (Ada's "erroneous"). But I think we can do better in this area:
we just haven't gotten to take a really good look at it.

> However I haven't been able to put my finger on any technical obstacle to
> this use of V_C_E.

Nor can I ...

Subject: Re: [4.3 regression] ACATS cxh1001 fails

> > Sure, it's just that overloading V_C_E like this feels somehow wrong to me.
>
> Why? It's not "overloading". V_C_E of an expression E of type X to
> type Y means "interpret the bits of E as if it were type Y and not type X".
> If Y is X'Base, then interpreting E as being Y means that it can now have
> all the values of Y.

For example, after the view conversion it is legal to place a value in it
that is in the range of Y but not of X. This is logical. What seems
illogical to me is that by doing the V_C_E you are supposed to forget
anything you had deduced about the bits before the conversion. For example,
suppose you had deduced that the bits were all zero. After the V_C_E the
bits will still be zero - are you really supposed to forget this
information? Another example: suppose you had worked out somehow that the
bits corresponded to an integer in the range 4..5. After the V_C_E this is
still true - why is it logical to forget this information?

You may well respond that you are only supposed to forget information you
deduced from the range of the type, not information you worked out by some
other means. But these cannot really be distinguished, cf. VRP.
In short, it is overloading because it means: consider this bunch of bits to
be of the new type AND forget anything you ever deduced about them.

> In other words, we could only change a V_C_E to a
> NOP_EXPR if we can prove that the value of E is in range of *both* X
> and Y.

Surely you only have to prove that it is in the range of Y, since if it is
not in the range of X then you have a bounded error, and doing a NOP_EXPR
should be perfectly legal.

> Of course, we still have a bit of a mess here in that the real point is
> a confusion between what in Ada is called a Bounded Error and C's notion
> of "undefined" (Ada's "erroneous"). But I think we can do better in this
> area: we just haven't gotten to take a really good look at it.
>
> > However I haven't been able to put my finger on any technical obstacle to
> > this use of V_C_E.
>
> Nor can I ...

Actually, I do have a technical objection; however, it comes from my LLVM
port and not mainline gcc. LLVM is lower level than gimple. Supposing types
X and Y have e.g. 32 bits, then they both get mapped to LLVM's Int32 type.
A V_C_E gets turned into nothing at all. There are no types with restricted
ranges. Suppose x is a variable of type X. When a value is loaded into x,
I'm planning on generating the following LLVM code:

   x = value;
   assert(x >= lower_bound_for_gcc_type && x <= upper_bound_for_gcc_type);

I don't generate the assert when you do a V_C_E, so for example suppose
there is a y = V_C_E(Y, x). This gets turned into:

   y = x;

Since the optimizers have range info for x, they instantly deduce the same
ranges for y as for x. Thus if you test for validity of y using

   valid = (y >= lower_bound_for_gcc_type_X &&
            y <= upper_bound_for_gcc_type_X);

then the result is always true. How to get around this? The obvious method
is to use an intrinsic, such as __builtin_nop. For the V_C_E I can then do:

   y = __builtin_nop(x);

and all will be well. However, I only really need to use the __builtin_nop
if I'm doing the V_C_E in order to implement 'Valid. For other uses of
V_C_E it is unnecessary and gets in the way of optimization. So I would
much rather that 'Valid directly made use of a builtin.

Ciao, Duncan.

Well, the only problem with V_C_E is that if you assert on the range of the
base type like

   if (V_C_E <X'Base> (y) > 5)
     abort();

then you still want VRP to infer that V_C_E <X'Base> (y) is <= 5 after this
check (to eliminate further checks). I believe this will not automatically
happen at the moment because V_C_E <X'Base> (y) will not have the same SSA
name assigned for every occurrence. But of course we will never know until
the Ada FE people finally fix their front end.

All the mess would be way easier if the FE would not expose the subtypes to
the middle-end...

> All the mess would be way easier if the FE would not expose the subtypes to
> the middle-end...
I personally agree, but there is no clear consensus among the Ada maintainers.
Subject: Re: [4.3 regression] ACATS cxh1001 fails

> Well, the only problem with V_C_E is that if you assert on the range of the
> base type like
>
>    if (V_C_E <X'Base> (y) > 5)
>      abort();
>
> then you still want VRP to infer that V_C_E <X'Base> (y) is <= 5 after this
> check (to eliminate further checks).

I don't think this is a very serious problem. My understanding is that the
checks can be divided into two classes: normal checks and validity checks.
A normal check, such as when you do a type conversion, does not use a V_C_E;
it just does:

   if (y < new_type_lb || y > new_type_ub)
     abort;
   new_var = (new_type) y;

A validity check does:

   if (V_C_E <X'Base> (y) < x_lb || V_C_E <X'Base> (y) > x_ub)
     abort();

Note how it checks that y is in the range of y's type, so this does not give
any new information about y. It is true that multiple validity checks will
not be eliminated (multiple normal checks will be eliminated), but validity
checks are not that common. One place that might hurt is in array accesses:
IIRC the front-end does a validity check on the index before accessing an
array, rather than a normal check, because it considers it too dangerous to
do otherwise (e.g. they want to catch the case when the index has an
out-of-range value because it is uninitialized).

My suggested use of a builtin would allow multiple redundant validity checks
to be safely eliminated, because the builtin would be a "pure" function. The
validity check becomes:

   y2 = __builtin_nop(y);
   if (y2 < x_lb || y2 > x_ub)
     abort();

Now suppose you do another one:

   y22 = __builtin_nop(y);
   if (y22 < x_lb || y22 > x_ub)
     abort();

The compiler can deduce that y2 and y22 are the same, and eliminate the
second check.

> I believe this will not automatically
> happen at the moment because V_C_E <X'Base> (y) will not have the same
> SSA name assigned for every occurrence. But of course we will never know
> until the Ada FE people finally fix their front end.
>
> All the mess would be way easier if the FE would not expose the subtypes to
> the middle-end...

I agree. The LLVM model seems more sensible in this regard.

D.

Subject: Re: [4.3 regression] ACATS cxh1001 fails
> My suggested use of a builtin would allow multiple redundant validity checks
> to be safely eliminated, because the builtin would be a "pure" function.
This is presumably also the case with V_C_E.
Subject: Re: [4.3 regression] ACATS cxh1001 fails

> You may well respond that you are only supposed to forget information you
> deduced from the range of the type, not information you worked out by some
> other means. But these cannot really be distinguished, cf. VRP.

Exactly. That's indeed the problem. It's similar to the whole
signed-overflow issue: what you're able to safely do with VRP-derived
information depends on how you derived it, and there's no way of tracking
that.

> Surely you only have to prove that it is in the range of Y, since if it is
> not in the range of X then you have a bounded error, and doing a NOP_EXPR
> should be perfectly legal.

Think of a range check being done in a subscript reference in the LHS of an
assignment. A bounded error is not allowed to cause a memory store outside
the bounds of an array.

Subject: Re: [4.3 regression] ACATS cxh1001 fails
> All the mess would be way easier if the FE would not expose the subtypes to
> the middle-end...
I don't see how giving *additional* information could be bad: the middle-end
is certainly free to ignore the bounds if it wants to!
The point is that the bounds *do* say something and provide useful
information to optimizers. The problem is that they don't say what the
middle-end *wants* them to say!
What they *do* mean is that it's a "bounded error" for a variable to have a
value outside of the given bounds. What the middle-end *wants* them to mean
is that it's *erroneous* for a variable to have a value outside of the given
bounds. We need to reconcile those two concepts.
From a pragmatic point of view, they aren't that different, but there are a
couple of important exceptions. The array bound on the LHS is one of them.
A full solution to this is complex, but is indeed something that I'd like to
happen some day. It involves work in various places. One thing that would
help is that many variables can be proven to have the property that the only
way they can have an invalid value is if erroneous behavior has occurred.
For such variables, the meaning of the bounds is precisely what the
middle-end expects it to mean.
The most common case where you can't do this is for component references or
pointer dereferences. In that case, we have to address this by
distinguishing between two types of tests, those that will just propagate the
bounded error condition and those that would convert a bounded error to
erroneous behavior. We are allowed to use the bounds information in
reasoning about the former, but not the latter.
Note that there are similarities to the signed-overflow case here: there are
some tests that we can safely eliminate with knowledge that signed overflow
is undefined, but others that we don't want to.
So I think the "final" solution here will be to have the Ada front end
provide the flag information on variables and tests mentioned above and
extend VRP to use them. One of the two front-end issues (the latter) is
quite easy and I think the other is moderately-easy. I don't know how much
work the required VRP changes would be.
Subject: Re: [4.3 regression] ACATS cxh1001 fails
> I don't think this is a very serious problem. My understanding is
> that the checks can be divided into two classes: normal checks and
> validity checks. A normal check, such as when you do a type
> conversion, does not use a V_C_E, it just does:
>
> if (y < new_type_lb || y > new_type_ub)
> abort;
> new_var = (new_type) y;
Not clear. Consider the range check for a subscript on the LHS, for
example. These issues are *very* subtle ...
Subject: Re: [4.3 regression] ACATS cxh1001 fails
> Think of a range check being done in a subscript reference in the LHS of
> an assignment. A bounded error is not allowed to cause a memory store
> outside the bounds of an array.
This is exactly a case when you should do what I called a "validity check"
rather than a "normal check".
Subject: Re: [4.3 regression] ACATS cxh1001 fails

> > All the mess would be way easier if the FE would not expose the subtypes to
> > the middle-end...
>
> I don't see how giving *additional* information could be bad: the middle-end
> is certainly free to ignore the bounds if it wants to!

Yes, it can ignore them entirely if it wants to, but it can't ignore them
partially. For example, fold is happy to create values which lie outside the
range of the type (i.e. fold is ignoring the range information); as a
consequence, any part of the ME that exploits range information will give
wrong results right now.

> The point is that the bounds *do* say something and provide useful
> information to optimizers. The problem is that they don't say what the
> middle-end *wants* them to say!
>
> What they *do* mean is that it's a "bounded error" for a variable to have a
> value outside of the given bounds. What the middle-end *wants* them to mean
> is that it's *erroneous* for a variable to have a value outside of the given
> bounds. We need to reconcile those two concepts.

I don't think the middle-end needs to know about this distinction. In fact,
it would be wise to keep such subtleties in the front-end. That can be done
as follows: in situations that might lead to erroneous behavior if a
variable has a value outside its range, you add a validity check and raise
Program_Error if it fails (this is authorized by the Ada RM). Thus, in order
to avoid erroneousness, you only need a reliable way of checking validity.

> From a pragmatic point of view, they aren't that different, but there are a
> couple of important exceptions. The array bound on the LHS is one of them.

This is solved by adding a validity check before accessing the memory, see
above.

> A full solution to this is complex, but is indeed something that I'd like to
> happen some day. It involves work in various places. One thing that would
> help is that many variables can be proven to have the property that the only
> way they can have an invalid value is if erroneous behavior has occurred.
> For such variables, the meaning of the bounds is precisely what the
> middle-end expects it to mean.
>
> The most common case where you can't do this is for component references or
> pointer dereferences.

I don't see that pointer dereferences are a problem - I'm pretty sure that
the RM places you firmly in the erroneous zone if you do something that can
give a pointer a bogus value. That leaves array accesses. If your index is
outside the range of the type, you can catch that and raise Program_Error.
Otherwise it is in range, so you load a value from the array (this value may
be uninitialized). I believe this is OK for a bounded error. In short, I
don't see any problem here, or any need to track bounded/erroneous behaviour
around the program. Do you have an example where you can't prevent a bounded
error being (wrongly) turned into erroneous behaviour by the addition of a
local check?

> In that case, we have to address this by
> distinguishing between two types of tests, those that will just propagate the
> bounded error condition and those that would convert a bounded error to
> erroneous behavior. We are allowed to use the bounds information in
> reasoning about the former, but not the latter.

Exactly, and I'm pretty sure that all you need to handle this is a reliable
way of doing validity checking.
> Note that there are similarities to the signed-overflow case here: there are
> some tests that we can safely eliminate with knowledge that signed overflow
> is undefined, but others that we don't want to.
>
> So I think the "final" solution here will be to have the Ada front end
> provide the flag information on variables and tests mentioned above and
> extend VRP to use them. One of the two front-end issues (the latter) is
> quite easy and I think the other is moderately-easy. I don't know how much
> work the required VRP changes would be.

I don't see why any such flags are required. Why does validity checking
(implemented by V_C_E or an intrinsic or whatever) not suffice?

Ciao, Duncan.

Subject: Re: [4.3 regression] ACATS cxh1001 fails
> I don't see that pointer dereferences are a problem - I'm pretty sure that
> the RM places you firmly in the erroneous zone if you do something that can
> give a pointer a bogus value.
No. Referencing an uninitialized variable is a bounded error, not
erroneous. That fact is basically what causes most of this trouble ...
Subject: Re: [4.3 regression] ACATS cxh1001 fails
On Saturday 10 March 2007 00:13:27 kenner at vlsi1 dot ultra dot nyu dot edu wrote:
>
> ------- Comment #39 from kenner at vlsi1 dot ultra dot nyu dot edu 2007-03-09 23:13 -------
> Subject: Re: [4.3 regression] ACATS cxh1001 fails
>
> > I don't see that pointer dereferences are a problem - I'm pretty sure that
> > the RM places you firmly in the erroneous zone if you do something that can
> > give a pointer a bogus value.
>
> No. Referencing an uninitialized variable is a bounded error, not
> erroneous. That fact is basically what causes most of this trouble ...
It is not possible for a pointer value to be uninitialized. The language
requires all pointers to be default initialized to null.
Subject: Re: [4.3 regression] ACATS cxh1001 fails
> It is not possible for a pointer value to be uninitialized. The language
> requires all pointers to be default initialized to null.
I mean the thing that pointer points to!
Subject: Re: [4.3 regression] ACATS cxh1001 fails
> > It is not possible for a pointer value to be uninitialized. The language
> > requires all pointers to be default initialized to null.
>
> I mean the thing that pointer points to!
I think I now understand: I thought the problem we were discussing was
how to obtain correctness (which seems to be easy using local checks)
while in fact you were talking about how to optimize away unnecessary
validity checks (as compared to normal checks). My current plan for
that in LLVM is to implement validity checks using a builtin_nop
intrinsic. Multiple successive validity checks will be eliminated
as redundant because builtin_nop is a pure function. That leaves
the question of how to teach the optimizers that an object is valid
in the various cases where the front-end for some reason knows this.
In such cases I'm thinking that you can issue an assert instruction
saying so, e.g. if type T has range 10..20 and at some point the front-end
knows that x (of type T) is valid, it can issue:

   assert(builtin_nop(x) >= 10 && builtin_nop(x) <= 20);
The dataflow and predicate analysis needed to exploit this can hopefully
be left to the optimizers, in particular to IPO (which LLVM is rather
good at).
Subject: Re: [4.3 regression] ACATS cxh1001 fails
> I think I now understand: I thought the problem we were discussing was
> how to obtain correctness (which seems to be easy using local checks)
> while in fact you were talking about how to optimize away unnecessary
> validity checks (as compared to normal checks).
Yes, I was talking about both.
Subject: Bug 26797

Author: ebotcazou
Date: Wed Sep 12 15:52:57 2007
New Revision: 128441

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=128441
Log:
	PR ada/26797
	PR ada/32407
	* utils.c (unchecked_convert): Use a subtype as the intermediate type
	in the special VIEW_CONVERT_EXPR case.

Modified:
    trunk/gcc/ada/ChangeLog
    trunk/gcc/ada/utils.c

At long last.

Subject: Re: [4.3 regression] ACATS cxh1001 fails
On Wed, 2007-09-12 at 15:55 +0000, ebotcazou at gcc dot gnu dot org
wrote:
>
> ------- Comment #45 from ebotcazou at gcc dot gnu dot org 2007-09-12 15:55 -------
> At long last.
Woo hoo.
An oldie, but a goodie.
jeff