This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: SSA range class and removal of VR_ANTI_RANGEs
- From: Richard Sandiford <richard dot sandiford at linaro dot org>
- To: Andrew MacLeod <amacleod at redhat dot com>
- Cc: Jakub Jelinek <jakub at redhat dot com>, Aldy Hernandez <aldyh at redhat dot com>, Richard Biener <richard dot guenther at gmail dot com>, gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 23 May 2017 16:14:49 +0100
- Subject: Re: SSA range class and removal of VR_ANTI_RANGEs
- Authentication-results: sourceware.org; auth=none
- References: <dbad7708-d495-402c-c9d8-626a9c8202b9@redhat.com> <20170523121117.GU8499@tucnak> <8bb1e655-4d93-35d8-7bd5-be7ff188d8cd@redhat.com>
Andrew MacLeod <amacleod@redhat.com> writes:
> On 05/23/2017 08:11 AM, Jakub Jelinek wrote:
>> On Tue, May 23, 2017 at 06:48:15AM -0400, Aldy Hernandez wrote:
>>> --- a/gcc/tree-ssanames.h
>>> +++ b/gcc/tree-ssanames.h
>>> @@ -45,14 +45,12 @@ struct GTY(()) ptr_info_def
>>> unsigned int misalign;
>>> };
>>>
>>> -/* Value range information for SSA_NAMEs representing non-pointer variables. */
>>> -
>>> -struct GTY ((variable_size)) range_info_def {
>>> - /* Minimum, maximum and nonzero bits. */
>>> - TRAILING_WIDE_INT_ACCESSOR (min, ints, 0)
>>> - TRAILING_WIDE_INT_ACCESSOR (max, ints, 1)
>>> - TRAILING_WIDE_INT_ACCESSOR (nonzero_bits, ints, 2)
>>> - trailing_wide_ints <3> ints;
>>> +/* Used bits information for SSA_NAMEs representing non-pointer variables. */
>>> +
>>> +struct GTY ((variable_size)) nonzero_bits_def {
>>> + /* Mask representing which bits are known to be used in an integer. */
>>> + TRAILING_WIDE_INT_ACCESSOR (nonzero_bits, ints, 0)
>>> + trailing_wide_ints <1> ints;
>>> };
>>>
>>>
>>> --- a/gcc/tree-core.h
>>> +++ b/gcc/tree-core.h
>>> @@ -1416,11 +1413,15 @@ struct GTY(()) tree_ssa_name {
>>> union ssa_name_info_type {
>>> /* Pointer attributes used for alias analysis. */
>>> struct GTY ((tag ("0"))) ptr_info_def *ptr_info;
>>> - /* Value range attributes used for zero/sign extension elimination. */
>>> - struct GTY ((tag ("1"))) range_info_def *range_info;
>>> + /* Value range attributes. */
>>> + class GTY ((tag ("1"))) irange *range_info;
>>> } GTY ((desc ("%1.typed.type ?" \
>>> "!POINTER_TYPE_P (TREE_TYPE ((tree)&%1)) : 2"))) info;
>>>
>>> + /* For non-pointer types, this specifies a mask for the bits that
>>> + are known to be set. */
>>> + struct nonzero_bits_def *nonzero_bits;
>>> +
>>> /* Immediate uses list for this SSA_NAME. */
>>> struct ssa_use_operand_t imm_uses;
>>> };
>> I'm worried a lot here about compile time memory usage increase, especially
>> with EVRP and IPA-VRP and even more so with LTO.
>> The changes mean that for every SSA_NAME of any kind we now need 8 more
>> bytes, and for every SSA_NAME that has range info (typically both range info
>> and nonzero_bits; in the old implementation the two were tied together for a
>> good reason, many ranges also imply some non-zero bits and from non-zero
>> bits one can in many cases derive a range) much more than that (through
>
> As follow on work we have discussed an interface which would be able to
> calculate a bitmask (for either zeros or ones) from a range and vice
> versa.. Meaning we wouldn't to need to store both unless the ssa_name is
> used in such a way that it generates both. ie. when we create a range
> from a bit operation, we simply store the bit version and not a range,
> otherwise we create an irange but not a bitrange. When you query and ask
> the ssa_name for an irange, if there is no irange and there is a bit
> pattern, we can generate the irange from that on demand. Likewise, if
> there is an irange and no bit pattern, we can generate any known on or
> off bits from the irange. The only time there would be both would be
> when we can somehow find some real benefit from having both.. I doubt
> that would be very common.
>
> I think aldy simply copied the bitrange stuff wholesale in a separate
> array just to get it working.. converting between the two was follow on
> work to make things more efficient once it was fundamentally working.
>
>
>
>>
>> the nonzero_bits_def you get all the overhead of trailing_wide_ints -
>> the 3 fields it has, just save on the actual 2 lengths and 2 HWI sets,
>> but then you add 3x 8 bytes and 6x size of the whole wide_int which is
>> from what I understood not really meant as the memory storage of wide ints
>> in data structures, but as something not memory efficient you can work
>> quickly on (so ideal for automatic variables in functions that work with
>> those). So it is a 8 byte growth for SSA_NAMEs without ranges and
>> 8+3*8+6*32-2*4-2*8*x growth for SSA_NAMEs with ranges if x is the number
>> of HWIs needed to represent the integers of that type (so most commonly
>> x=1). For x=1 8+3*8+6*32-2*4-2*8*x is 200 bytes growth.
>> With say 10000000 SSA_NAMEs, 5000000 of them with ranges, that will be
>> already a 1GB difference, dunno how many SSA_NAMEs are there e.g. in firefox
>> LTO build.
> why don't we actually try measuring it and see if it is noticeable? and
> if it is, where the issues really are. You really think we have relevant
> range info for 50% of the ssa_names in a program? Id be surprised.
> Again, thats something we should probably measure and see whats
> typical. the range table should only be used when we can refine the
> global range to something other than range_for_type().
>
> We can also change the representation of the range to be 2 ranges like
> it is today.. we're just defaulting to 3 because it seems like it may
> carry more useful information some times. The long range plan is that
> this can be tweaked at runtime to only use as much as we need or ant to
> represent a range.. allowing an increase in the range precision for
> specific optimizations that might care, and decreasing it the rest of
> the time. None of the external API enforces a particular number of ranges.
>
> More importantly, when the on-demand range work comes online it will end
> the proliferation of SSA_NAMES which carry ASSERT_EXPR information. We
> won't see the trail of new ssa_names the ASSET_EXPR chains create. I
> suspect that will eliminate storing global ranges and bits for most SSA
> names, which is likely to make this class a win regardless.
> That we will have to wait and see until my code is more functional in a
> month or so
>> Can the nonzero_bits stuff be folded back into irange (and have code to
>> maintain nonzero_bits in addition to ranges as before (from setting a range
>> compute or update nonzero_bits and vice versa)? Can you use
> the two are completely different representations of information, it
> seems cleaner to have them separate but allow one to generate the other
> when required. I suspect most of the time we only need to save one or
> the other. having another class and the ability to convert from one to
> the other seems like a better solution.
>> trailing_wide_int for the irange storage of the values? Can you allocate
>> only as many HWIs as you really need, rather than always 6?
> probably.
>> Again, it can be different between a class you use for accessing the
>> information and manipulating it and for actual underlying storage on
>> SSA_NAMEs.
> Im not familiar with the details of how wide_int and host vs target
> integers really works. I don't think Aldy is either. If there is a more
> efficient way to store an integer fr this use case then by all means we
> can do that. to get things working we just used what was easily
> available. if there is a more efficient way to represent it, perhaps
> it makes sense to create a class for a"memory_efficient_wide_int" and
> allow it to convert back and forth from a wide_int as needed?
Like Jakub says, that's effectively what trailing_wide_int is.
You calculate exactly how many HWIs you need to represent the numbers,
then adjust the amount of allocated memory accordingly. So once you've
computed the range in a local irange, a version of irange with
trailing_wide_ints would be better for long-term storage.
It has an overhead of 64 bits for <= 5 integers, on top of the
HWIs themselves.
Thanks,
Richard