This is the mail archive of the
mailing list for the GCC project.
Re: [range-ops] patch 01/04: types for VR_UNDEFINED and VR_VARYING
On Tue, Jul 16, 2019 at 8:37 PM Andrew MacLeod <email@example.com> wrote:
> On 7/9/19 5:56 AM, Richard Biener wrote:
> > On Tue, Jul 9, 2019 at 9:28 AM Aldy Hernandez <firstname.lastname@example.org> wrote:
> >> On 7/4/19 6:33 AM, Richard Biener wrote:
> >>> On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez <email@example.com> wrote:
> >>>> On 7/3/19 7:08 AM, Richard Biener wrote:
> >>>>> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez <firstname.lastname@example.org> wrote:
> >>>> How about we keep VARYING and UNDEFINED typeless until right before we
> >>>> call into the ranger, at which point we can populate min/max
> >>>> because we have the tree_code and the type handy. So right before we
> >>>> call into the ranger do:
> >>>> if (varying_p ())
> >>>> foo->set_varying(TYPE);
> >>>> This would avoid the type cache, and keep the ranger happy.
> >>> you cannot do set_varying on the static const range but instead you'd do
> >>> value_range tem (*foo);
> >>> if (varying_p ())
> >>> tem.set_full_range (TYPE);
> >>> which I think we already do in some places. Thus my question _where_
> >>> you actually need this.
> >> Basically, everywhere. By having a type for varying/undefined, we don't
> >> have to special case anything. Sure, we could, for example, special case
> >> the invert operation for undefined / varying. And we could special case
> >> everything dealing with ranges to handle varying and undefined, but why?
> >> We could also pass a type argument everywhere, but that's just ugly.
> >> However, I do understand your objection to the type cache.
> >> How about the attached approach? Set the type for varying/undefined
> >> when we know it, while avoiding touching the CONST varying. Then right
> >> before calling the ranger, pass down a new varying node with min/max for
> >> any varyings that were still typeless until that point.
> >> I have taken care of never adding a set_varying() that was not already
> >> there. Would this keep the const happy?
> >> Technically we don't need to set varying/undef types for every instance
> >> in VRP, but we need it at least for the code that will be shared with
> >> range-ops (extract_range_from_multiplicative_op, union, intersect, etc).
> >> I just figured if we have the information, might as well set it for
> >> consistency.
> >> If you like this approach, I can rebase the other patches that depend on
> >> this one.
> > OK, so I went and checked what you do for class irange, which has
> > a type but no kind member (but constructors with a kind). It also
> > uses wide_int members for storage. For a pure integer constant
> > range representation this represents somewhat odd choices; I'd
> > have elided the m_type member completely here, it seems fully
> > redundant. Only range operations need to be carried out in a
> > specific type (what I was suggesting above). Even the precision
> > encoded in the wide_int members is redundant then (I'd have
> > expected widest_int here and trailing-wide-ints for optimizing
> > storage).
> What irange contains internally is a bit arbitrary. It's really an API
> for managing a set of subranges. We could have used trees internally
> just as easily, and then we wouldn't need a type field. Since we were doing
> lots of operations, rather than going back and forth from trees all the
> time, we just used the underlying wide_int directly. We could have
> fiddled around with HOST_WIDE_INT or whatever, but wide_int is
> there, has all the operations, and it works fine. So that's what it
> currently is on the branch.
But a range has no type. Period. The fact that the in-tree implementation
has access to a type is artificial. All workers in the in-tree
implementation get a type for the operation they carry out.
But somehow irange doesn't get that.
In fact, an irange shouldn't be bound to any type, and the usual
"carry out multiplication of two integer typed vars with the result
in integer" should be: multiply the two iranges (with unbound
result), plus a "truncate to integer type" operation.
> We are treating a range object as a unique self contained object.
> Therefore, the range has a type so we know how to print it, and can
> confirm before any operation that the ranges being operated on are
> appropriately matched. We found and opened bugzillas over the past
> couple years for places where our code caught bugs because a range was
> created and then operated on in a way that was not compatible with
> another range. I think there is still an open one against Ada(?)
> where the switch and case are different precision.
I'm fine with sanity-checking where it makes sense but looking at
the irange code the fact that there is a type is a fundamental requirement.
IMHO that's bad design.
> From my point of view, a range object is similar to a tree node. A tree
> node has the bits to indicate what the value is, but also associates a
> type with those bits within the same object. This is less error prone
> than passing around the bits and the type separately. As ranges are
> starting to be used in many places outside of VRP, we should do the same
> thing with ranges. With value_range it would actually be free, since
> there is already a tree for the bounds which contains the type.
Not for VARYING or UNDEFINED.
> > to fold_range/op_range? The API also seems to be oddly
> > constrained to binary ops. Anyway, the way you build
> > the operator table requires an awful lot of global C++ ctor
> > invocations, sth we generally try to avoid. But I'm getting
> > into too many details here.
> Its "oddly constrained" because what you are looking at is just the
> standardized unary/binary ops code.. ie the equivalent of
> extract_range_from_binary_expr() and extract_range_from_unary_expr().
> The other ops we care about have specialized requirements, like PHIs
> and the arbitrary numbers of parameters in a call, or anything less
> common than one or two operands. You are just not seeing those parts.
> > So - to answer your question above, I'd like you to pass down
> > a type to operations. Because that's what is fundamentally
> > required - a range doesn't have a "type" and the current
> > value_range_base doesn't fall into the trap of needing one.
> > Richard.
> Why is having a type associated with the data a "trap"? Perhaps the
> old VRP lattice idea didn't need a type with the UNDEFINED and VARYING
> lattice values, but we're moving past lattice values and into a realm
> where we have ranges as useful things outside of VRP, and trying to
> shoehorn lattice values does not seem appropriate anymore.
> I looked at implementing range-ops without a type in the range, and we
> end up passing 2 parameters everywhere each time we do anything with a
> range. This doubled the number of parameters in most routines, and when
> we had chains of calls, we were simply passing the type along with the
> range. It seems archaic to be constantly passing information around
> instead of associating it with the range itself. It's just another place
> for an error to creep in. Aldy found a place where we were creating
> varying nodes for floats, IIRC; the type checking in the range
> operations caught it precisely because the range returned wasn't the
> type it was assumed to be.
Well, I don't agree. It's not archaic, it's the way GCC works everywhere
else. So it's consistent. Archaically consistent, in your view.
But [1, 2] doesn't have a "type". [1, 2] and [10000, 10004] can
be added just fine even if [1, 2] is from 'char' and [10000, 10004] is
from 'int'. If you want sanity-checking you can do
  irange_plus (tree in_type, irange op0, irange op1)
  {
    gcc_assert (range_fits_type (in_type, op0)
                && range_fits_type (in_type, op1));
    irange res = op0 + op1;
    return res.truncate_to (in_type);
  }
or so easily. See - even the checking part doesn't need the
type in the range.
I also do not have to "invent" a type when I want to work with irange
on widest_ints. Oh, another comment from reading the code: irange
storage should not be wide_int but trailing-widest-int,
because irange obviously has storage requirements. That makes
in-place modification of iranges impossible, but that's the way modern
C++ code works (and it would be consistent with how wide-int works).
I don't really like that you invented an API unlike any existing in GCC.
You could have gone after wide-int at least.
> That said. I believe we can do away with the need for a type with an
> 'UNDEFINED' range. That is not too onerous, and there don't really
> seem to be many ripple effects from having a typeless undefined range.
> I think we can contain that, and it prevents us from having to add a hack
> to value_range for that situation.
> VARYING is another situation completely. We adopted the term 'varying'
> for ease of compatibility with VRP, but varying is simply a range going
> from MIN to MAX for a type. Removing the type from that would then
> require we always pass a type in with every range which gets back to
> doubling the number of parameters everywhere for no good reason.
> If we standardize value_range so that MIN and MAX are set for varying,
> then everything works very smoothly, we can make value_range and irange
> interchangeable and facilitate getting the range ops code into trunk.
> It seems like the only reason we can't do this right now is the CONST
> varying nodes that are returned from get_value_range().
> Looking at that routine, it seems there are only 2 cases where that can
> be returned
> 1) we ask for an ssa-name beyond the end of the local ssa-name vector
> 2) once values_propagated is set to true and an ssa-name has no entry.
> Both seem pretty straightforward to fix...
> 1) if we ask for an ssa-name beyond the end, simply reallocate the
> vector to be big enough. I doubt this will trigger a lot, and if we
> initially allocate it with room for an extra 10% names it'll probably
> never trigger. Or pick whatever number seems appropriate.
> 2) if values_propagated is true, simply allocate a node for the
> ssa-name, set it to varying and return it. This accomplishes the same
But it's completely pointless to do this work! I refuse to make the
existing code slower just to make ranger easier to merge (and your
compile-time comparisons worse for the old code - not that all the
ranger-driven C++ification helped here).
> the number of additional nodes we allocate will be pretty minimal, and
> it will never exceed the number of ssa-names. Then we never have to
> worry about having CONST set for a varying node either. I see places
> where there is special processing to avoid calling set_varying because
> we don't know if the value_range in hand is in constant memory and would
> cause the compiler to trap. This seems like a really bad situation, and
> it would eliminate that issue. We can also then eliminate the places
> where VARYING is expanded to min/max for a given type.. instead we can
> just pick up min/max directly. It seems much cleaner overall.
> Something like the simple attached patch would resolve that issue, and
> remove any lurking concerns/bugs with the CONST code.
> Then we can associate a type with varying, canonicalize it consistently,
> and set MIN/MAX appropriately. This will give us full
> interchangeability between irange and value_range, establish a
> solid/consistent API, and we can move on to the next thing :-)
> Does this not seem reasonable?
I still do not agree with the design decision of the ranger API. You do
want that to be included in the end, no? Of course I have no veto power
or whatnot, but the API and implementation are awkward. You didn't
answer my concern about all the global ctors, did you?