This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: SSA range class and removal of VR_ANTI_RANGEs


On 05/23/2017 08:11 AM, Jakub Jelinek wrote:
On Tue, May 23, 2017 at 06:48:15AM -0400, Aldy Hernandez wrote:
--- a/gcc/tree-ssanames.h
+++ b/gcc/tree-ssanames.h
@@ -45,14 +45,12 @@ struct GTY(()) ptr_info_def
    unsigned int misalign;
  };
-/* Value range information for SSA_NAMEs representing non-pointer variables. */
-
-struct GTY ((variable_size)) range_info_def {
-  /* Minimum, maximum and nonzero bits.  */
-  TRAILING_WIDE_INT_ACCESSOR (min, ints, 0)
-  TRAILING_WIDE_INT_ACCESSOR (max, ints, 1)
-  TRAILING_WIDE_INT_ACCESSOR (nonzero_bits, ints, 2)
-  trailing_wide_ints <3> ints;
+/* Used bits information for SSA_NAMEs representing non-pointer variables.  */
+
+struct GTY ((variable_size)) nonzero_bits_def {
+  /* Mask representing which bits are known to be used in an integer.  */
+  TRAILING_WIDE_INT_ACCESSOR (nonzero_bits, ints, 0)
+  trailing_wide_ints <1> ints;
  };
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1416,11 +1413,15 @@ struct GTY(()) tree_ssa_name {
    union ssa_name_info_type {
      /* Pointer attributes used for alias analysis.  */
      struct GTY ((tag ("0"))) ptr_info_def *ptr_info;
-    /* Value range attributes used for zero/sign extension elimination.  */
-    struct GTY ((tag ("1"))) range_info_def *range_info;
+    /* Value range attributes.  */
+    class GTY ((tag ("1"))) irange *range_info;
    } GTY ((desc ("%1.typed.type ?" \
  		"!POINTER_TYPE_P (TREE_TYPE ((tree)&%1)) : 2"))) info;
+ /* For non-pointer types, this specifies a mask for the bits that
+     are known to be set.  */
+  struct nonzero_bits_def *nonzero_bits;
+
    /* Immediate uses list for this SSA_NAME.  */
    struct ssa_use_operand_t imm_uses;
  };
I'm worried a lot here about compile time memory usage increase, especially
with EVRP and IPA-VRP and even more so with LTO.
The changes mean that for every SSA_NAME of any kind we now need 8 more
bytes, and for every SSA_NAME that has range info (typically both range info
and nonzero_bits; in the old implementation the two were tied together for a
good reason, many ranges also imply some non-zero bits and from non-zero
bits one can in many cases derive a range) much more than that (through

As follow on work we have discussed an interface which would be able to calculate a bitmask (for either zeros or ones) from a range and vice versa.. Meaning we wouldn't to need to store both unless the ssa_name is used in such a way that it generates both. ie. when we create a range from a bit operation, we simply store the bit version and not a range, otherwise we create an irange but not a bitrange. When you query and ask the ssa_name for an irange, if there is no irange and there is a bit pattern, we can generate the irange from that on demand. Likewise, if there is an irange and no bit pattern, we can generate any known on or off bits from the irange. The only time there would be both would be when we can somehow find some real benefit from having both.. I doubt that would be very common.

I think aldy simply copied the bitrange stuff wholesale in a separate array just to get it working.. converting between the two was follow on work to make things more efficient once it was fundamentally working.



the nonzero_bits_def you get all the overhead of trailing_wide_ints -
the 3 fields it has, just save on the actual 2 lengths and 2 HWI sets,
but then you add 3x 8 bytes and 6x size of the whole wide_int which is
from what I understood not really meant as the memory storage of wide ints
in data structures, but as something not memory efficient you can work
quickly on (so ideal for automatic variables in functions that work with
those).  So it is a 8 byte growth for SSA_NAMEs without ranges and
8+3*8+6*32-2*4-2*8*x growth for SSA_NAMEs with ranges if x is the number
of HWIs needed to represent the integers of that type (so most commonly
x=1).  For x=1 8+3*8+6*32-2*4-2*8*x is 200 bytes growth.
With say 10000000 SSA_NAMEs, 5000000 of them with ranges, that will be
already a 1GB difference, dunno how many SSA_NAMEs are there e.g. in firefox
LTO build.
why don't we actually try measuring it and see if it is noticeable? and if it is, where the issues really are. You really think we have relevant range info for 50% of the ssa_names in a program? Id be surprised. Again, thats something we should probably measure and see whats typical. the range table should only be used when we can refine the global range to something other than range_for_type().

We can also change the representation of the range to be 2 ranges like it is today.. we're just defaulting to 3 because it seems like it may carry more useful information some times. The long range plan is that this can be tweaked at runtime to only use as much as we need or ant to represent a range.. allowing an increase in the range precision for specific optimizations that might care, and decreasing it the rest of the time. None of the external API enforces a particular number of ranges.

More importantly, when the on-demand range work comes online it will end the proliferation of SSA_NAMES which carry ASSERT_EXPR information. We won't see the trail of new ssa_names the ASSET_EXPR chains create. I suspect that will eliminate storing global ranges and bits for most SSA names, which is likely to make this class a win regardless. That we will have to wait and see until my code is more functional in a month or so
Can the nonzero_bits stuff be folded back into irange (and have code to
maintain nonzero_bits in addition to ranges as before (from setting a range
compute or update nonzero_bits and vice versa)?  Can you use
the two are completely different representations of information, it seems cleaner to have them separate but allow one to generate the other when required. I suspect most of the time we only need to save one or the other. having another class and the ability to convert from one to the other seems like a better solution.
trailing_wide_int for the irange storage of the values?  Can you allocate
only as many HWIs as you really need, rather than always 6?
probably.
Again, it can be different between a class you use for accessing the
information and manipulating it and for actual underlying storage on
SSA_NAMEs.
Im not familiar with the details of how wide_int and host vs target integers really works. I don't think Aldy is either. If there is a more efficient way to store an integer fr this use case then by all means we can do that. to get things working we just used what was easily available. if there is a more efficient way to represent it, perhaps it makes sense to create a class for a"memory_efficient_wide_int" and allow it to convert back and forth from a wide_int as needed?

I had just assumed that if there was something better for storing it we'd have already be using it to save bytes in an ssa_name.

we also need to centralize accessing the range info for an ssa_name.... which hasn't been done yet. That would be a prerequisite for being able to convert back and forth between bits and ranges. so we could have something like:

bool ssa_get_irange (irange& i);
bool ssa_get_one_bits (bitmap **ones);
bool ssa_get_zero_bits (bitmap **zeros);

and whatever set_ routines are needed. those routines would manage the irange and bitrange information vectors and call any conversion routines as needed.


we'd just like to start pushing the range info out of a branch so it starts getting as much use as possible during stage 1. If later on in stage 1 we discover we aren't getting significant enough reduction from memory tweaks and the on-demand work, we can always change the underlying implementation of the irange class back to be the same implementation we have today... it'll just be through a central interface rather than everyone accessing the information directly themselves. So that would be our bail-out plan and still be a win in my books.

--- /dev/null
+++ b/gcc/range.h
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_RANGE_H
+#define GCC_RANGE_H
+#define MAX_RANGES	6
+
+typedef class irange *irange_p;
+enum irange_type { RANGE_PLAIN, RANGE_INVERT };
+
+class GTY(()) irange
+{
+ private:
+  bool overflow;
+  size_t n;
+  void prepend (wide_int x, wide_int y);
+  void append (wide_int x, wide_int y);
+  void remove (unsigned i, unsigned j);
+  void canonicalize ();
+  /* This is stupid.  These two should be private, but the GTY
+     machinery can't look inside an irange.  */
+ public:
+  tree type;
+  wide_int bounds[MAX_RANGES];
+
+public:
...
+  void Union (wide_int x, wide_int y);
+  bool Union (const irange &r);
+  bool Union (const irange &r1, const irange &r2);
Do we really want methods starting with capital letters?
I understand why you can't use union, but I don't think we use CamelCase
anywhere.

alternatives names are welcome.... it was just for convenience.. not intended to be CamelCase.. It is an artifact I introduced back when I was first coding up the API. We could change then to range_intersect, range_union, and range_not.. just a bit unwieldy for a method name within a range class, but certainly usable


Andrew


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]