This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [range-ops] patch 01/04: types for VR_UNDEFINED and VR_VARYING
On 7/16/19 12:37 PM, Andrew MacLeod wrote:
> On 7/9/19 5:56 AM, Richard Biener wrote:
>> On Tue, Jul 9, 2019 at 9:28 AM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>
>>>
>>> On 7/4/19 6:33 AM, Richard Biener wrote:
>>>> On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>>> On 7/3/19 7:08 AM, Richard Biener wrote:
>>>>>> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez <aldyh@redhat.com>
>>>>>> wrote:
>>>>> How about we keep VARYING and UNDEFINED typeless until right before we
>>>>> call into the ranger. At which point, we have can populate min/max
>>>>> because we have the tree_code and the type handy. So right before we
>>>>> call into the ranger do:
>>>>>
>>>>> if (varying_p ())
>>>>> foo->set_varying(TYPE);
>>>>>
>>>>> This would avoid the type cache, and keep the ranger happy.
>>>> you cannot do set_varying on the static const range but instead
>>>> you'd do
>>>>
>>>> value_range tem (*foo);
>>>> if (varying_p ())
>>>> tem->set_full_range (TYPE);
>>>>
>>>> which I think we already do in some places. Thus my question _where_
>>>> you actually need this.
>>> Basically, everywhere. By having a type for varying/undefined, we don't
>>> have to special case anything. Sure, we could for example, special case
>>> the invert operation for undefined / varying. And we could special case
>>> everything dealing with ranges to handle varying and undefined, but why?
>>> We could also pass a type argument everywhere, but that's just ugly.
>>> However, I do understand your objection to the type cache.
>>>
>>> How about the attached approach? Set the type for varying/undefined
>>> when we know it, while avoiding touching the CONST varying. Then right
>>> before calling the ranger, pass down a new varying node with min/max for
>>> any varyings that were still typeless until that point.
>>>
>>> I have taken care of never adding a set_varying() that was not already
>>> there. Would this keep the const happy?
>>>
>>> Technically we don't need to set varying/undef types for every instance
>>> in VRP, but we need it at least for the code that will be shared with
>>> range-ops (extract_range_from_multiplicative_op, union, intersect, etc).
>>> I just figured if we have the information, might as well set it for
>>> consistency.
>>>
>>> If you like this approach, I can rebase the other patches that depend on
>>> this one.
>> OK, so I went ant checked what you do for class irange which has
>> a type but no kind member (but constructors with a kind). It also
>> uses wide_int members for storage. For a pure integer constant
>> range representation this represents somewhat odd choices; I'd
>> have elided the m_type member completely here, it seems fully
>> redundant. Only range operations need to be carried out in a
>> specific type (what I was suggesting above). Even the precision
>> encoded in the wide_int members is redundant then (I'd have
>> expected widest_int here and trailing-wide-ints for optimizing
>> storage).
>
> What irange contains internally is a bit arbitrary. It's really an API
> for managing a set of subranges. We could have used trees internally
> just as easily, then we wouldnt need a type field. Since we were doing
> lots of operations, rather than going back and forth from trees all the
> time, we just used the underlying wide_int directly. we could have
> fiddle farted around with HOST_WIDE_INT or whatever, but wide_int is
> there, has all the operations, and it works fine. so thats what it
> currently is on the branch.
>
> We are treating a range object as a unique self contained object.
> Therefore, the range has a type so we know how to print it, and can
> confirm before any operation that the ranges being operated on are
> appropriately matched. We found and opened bugzillas over the past
> couple years for places where our code caught bugs because a range was
> created and then operated on in a way that was not compatible with
> another range. I think there is a still an open one against ada(?)
> where the switch and case are different precision.
>
> From my point of view, a range object is similar to a tree node. A tree
> node has the bits to indicate what the value is, but also associates a
> type with those bits within the same object. This is less error prone
> than passing around the bits and the type separately. As ranges are
> starting to be used in many places outside of VRP, we should do the same
> thing with ranges. WIth value_range it would actually be free since
> there is already a tree for the bounds already which contains the type.
>
>
>
>
>
>> to fold_range/op_range? The API also seems to be oddly
>> constrained to binary ops. Anyway, the way you build
>> the operator table requires an awful lot of global C++ ctor
>> invocations, sth we generally try to avoid. But I'm getting
>> into too many details here.
>
> Its "oddly constrained" because what you are looking at is just the
> standardized unary/binary ops code.. ie the equivalent of
> extract_range_from_binary_expr() and extract_range_from_unary_expr().
> The other ops we care about have specialized requirements, like PHIs
> and the arbitrary numbers of parameters in a call, or anything less
> common than one or two operands. You are just not seeing those parts.
>
>>
>> So - to answer your question above, I'd like you to pass down
>> a type to operations. Because that's what is fundamentally
>> required - a range doesn't have a "type" and the current
>> value_range_base doesn't fall into the trap of needing one.
>>
>> Richard.
>>
>
> Why is having a type associated with the data a "trap"? Perhaps the
> old VRP lattice idea didn't need a type with the UNDEFINED and VARYING
> lattice values, but we're moving past lattice values and into a realm
> where we have ranges as useful things outside of VRP, and trying to
> shoehorn lattice values does not seem appropriate anymore.
>
> I looked at implementing range-ops without a type in the range, and we
> end up passing 2 parameters everywhere each time we do anything with a
> range. This doubled the number of parameters in most routines, and when
> we had chains of calls, we were simply passing the type along with the
> range. It seems archaic to be constantly passing information around
> instead of associating it with the range itself. Its just another place
> for an error to creep in.. Aldy found a place where we were creating
> varying nodes for floats IIRC.. the type checking in the range
> operations caught it precisely because the range returned wasn't the
> type it was assumed to be.
>
> That said. I believe we can do away with the need for a type with an
> 'UNDEFINED' range. That is not too onerous, and there doesn't really
> seem to be too many ripple effect from have a typeless undefined range.
> I think we can contain that, and prevents us from having to add a hack
> to value_range for that situation.
>
> VARYING is another situation completely. We adopted the term 'varying'
> for ease of compatibility with VRP, but varying is simply a range going
> from MIN to MAX for a type. Removing the type from that would then
> require we always pass a type in with every range which gets back to
> doubling the number of of parameters everywhere for no good reason.
>
> If we standardize value_range so that MIN and MAX are set for varying,
> then everything works very smoothly, we can make value_range and irange
> interchangeable and facilitate getting the range ops code into trunk.
>
> It seems like the only reason we cant do this right now is the CONST
> varying nodes that are returned from get_value_range().
> Looking at that routine, it seems there are only 2 cases where that can
> be returned
> 1) we ask for an ssa-name beyond the end of the local ssa-name vector
> 2) once values_propagated is set to true and an ssa-name has no entry.
>
> Both seem pretty straightforward to fix...
> 1) if we ask for an ssa-Name beyond the end, simply reallocate the
> vector to be big enough. I doubt this will trigger a lot, and if we
> initially allocate it with room for an extra 10% names it'll probably
> never trigger. Or pick whatever number seems appropriate.
> 2) if values_propagated is true, simply allocate a node for the
> ssa-name, set it to varying and return it. THIs accomplishes the same
> thing.
>
> the number of additional nodes we allocate will be pretty minimal, and
> it will never exceed the number of ssa-names. Then we never have to
> worry about having CONST set for a varying node either. I see places
> where there is special processing to avoid calling set_varying because
> we dont know if the vbalue_range in hand is in constant memory and would
> cause the compiler to trap. This seems like a really bad situation, and
> it would eliminate that issue. We can also then eliminate the places
> where VARYING is expanded to min/max for a given type.. instead we can
> just pick up min/max directly. It seems much cleaner overall.
>
> Something like the simple attached patch would resolve that issue, and
> remove any lurking concerns/bugs with the CONST code.
>
> Then we can associate a type with varying, canonicalize it consistently,
> and set MIN/MAX appropriately. This will give us full
> interchangeability between irange and value_range, establish a
> solid/consistent API, and we can move on to the next thing :-)
>
> Does this not seem reasonable?
One might even claim that this patch in and of itself is a step forward
independent of all the other work going on.
THe first time I saw that code when I copied it into vr-values I thought
it was ripe for fixing, but never got to it. As it stands, it's a hack,
no more, no less and it's a hack that we can easily remove and do
something better.
Jeff