This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: SSA range class and removal of VR_ANTI_RANGEs


On 05/23/2017 04:48 AM, Aldy Hernandez wrote:
Howdy!

For Andrew's upcoming on-demand range work I have created a range class
for use in his engine.  Currently, the range class is only for SSA
integers, but there is no reason why we couldn't adapt this for RTL or
non-integer types at a later time.

The class can live outside of his work, as can be demonstrated by the
attached patch.  With it, I was able to rewrite the post-VRP range
information to use this class and get rid of VR_ANTI_RANGE throughout
the compiler.  A VR_ANTI_RANGE of ~[5,10] becomes [-MIN,4][11,+MAX].

The internal structure of VRP is unchanged.  I have only changed the
outward facing structures.

A good example of how much cleaner using an actual range rather than
VR_ANTI_RANGEs is the change to get_size_range() in the attached patch.
We remove an assortment of contortions into:

+  /* Remove negative numbers from the range.  */
+  irange positives;
+  range_positives (&positives, exptype, allow_zero ? 0 : 1);
+  if (positives.Intersect (*ir))
+    {
+      /* Remove the unknown parts of a multi-range.
+        This will transform [5,10][20,MAX] into [5,10].  */
+      if (positives.num_ranges () > 1
+         && positives.upper_bound () == wide_int (TYPE_MAX_VALUE
(exptype)))
+       positives.remove_range (positives.num_ranges () - 1);
+
+      range[0] = wide_int_to_tree (exptype, positives.lower_bound ());
+      range[1] = wide_int_to_tree (exptype, positives.upper_bound ());
+    }
+  else
+    {
+      /* If removing the negative numbers didn't give us anything
+        back, the entire range was negative. Leave things as they
+        were, and let the caller sort it out.  */
+      range[0] = wide_int_to_tree (exptype, min);
+      range[1] = wide_int_to_tree (exptype, max);
     }

A few notes:

1. There are still two uses of VR_ANTI_RANGEs left after this patch:
ip-cp.c and ipa-prop.c.  I didn't look too hard at these two passes, as
they are using `struct value_range' which should only have been used
*before* VRP finishes.  I can work on this as a follow-up.

2. I have not included ChangeLog entries, as I would hate to write them,
and have the API or significant things change from under me as part of
the review.  I can certainly cobble the ChangeLog entries as soon as/if
we agree that this is an acceptable approach.

3. There are lots of unit tests to maintain sanity.  The cast ones in
particular will definitely need refinement as Andrew's work stresses the
class.

The attached patch passes bootstrap and tests.

I have also benchmarked an assortment of .ii files (from a previous
bootstrap) with no noticeable difference in -O2 compilation time.  So, I
would prefer to tackle any micro-optimizations either as a follow-up or
if it actually becomes noticeable--. Yeah, yeah, I know.  Wishful
thinking ;-).


Fire away!
Aldy

p.s. Please refer any comments as to why we need the class, or why we
need on-demand range information to Andrew :-P.

I'm pretty excited about this work (as you know :) so just a few
longish comments on/suggestions for the design of the class to
make it ever so slightly easier to work with.  Since YMMV, feel
free to disregard what you don't like.

+#define MAX_RANGES	6
+

This seems like an implementation detail that clients of the class
shouldn't know about or have access to.  Suggest to hide that from
the interface (e.g., by making a private member of irange).

+typedef class irange *irange_p;

FWIW, I find pointer typedefs more trouble than worth.  They
obscure the fact that they are pointers and cannot be made const
by adding the const qualifier.  E.g., this:

  void foo (const irange_p);

doesn't declare foo to take a pointer to a const range as one
might hope but rather a const pointer to a non-const range.

+enum irange_type { RANGE_PLAIN, RANGE_INVERT };

Consider making this a member and renaming it to something like
'kind' to avoid giving the impression that it's somehow a type
that's related to class irange.  Making it a member also avoids
the need to prefix the enumerators with RANGE_, and makes it
convenient to refer to PLAIN and INVERT using the dot and arrow
operators (in addition to irange::PLAIN).

+
+class GTY(()) irange
+{
+ private:
+  bool overflow;
+  size_t n;

Suggest to use a more descriptive name for the member.

+  void prepend (wide_int x, wide_int y);
+  void append (wide_int x, wide_int y);
+  void remove (unsigned i, unsigned j);
+  void canonicalize ();
+  /* This is stupid.  These two should be private, but the GTY
+     machinery can't look inside an irange.  */
+ public:
+  tree type;
+  wide_int bounds[MAX_RANGES];
+
+public:
+  irange () { type = NULL_TREE; n = 0; }
+  irange (tree t);

Should the range be implicitly convertible from any tree or would
it better if it required explicit conversion (i.e., should the ctor
be declared explicit)?  The usual rule of thumb is to err on the
side of safety to avoid accidentally creating objects of the class).

+  irange (tree t, wide_int lbound, wide_int ubound,
+	  irange_type rt = RANGE_PLAIN);
+  irange (const irange &r);
+
+  void set_range (tree t);
+  void set_range (tree t, wide_int lbound, wide_int ubound,
+		  irange_type rt = RANGE_PLAIN);
+
+  bool overflow_p () { return overflow && !TYPE_OVERFLOW_WRAPS (type);

Suggest to declare this function const.

}
+  void set_overflow () { overflow = true; }
+  void clear_overflow () { overflow = false; }
+
+  unsigned num_ranges () { return n / 2; }
+  wide_int lower_bound () const { return bounds[0]; }
+  wide_int lower_bound (unsigned index);

The implementation does this:

+  gcc_assert (n != 0 && index <= n/2);

Why is n required to be non-zero when overload just above has
well-defined behavior for the same index?  That seems needlessly
error prone (could cause ICEs), and will require callers to use
different algorithms for accessing the first lower bound versus
those with non-zero indices.  Suggest to define a single inline
form the function instead (taking an index, possibly with
a default argument).  That way the same algorithm can be used
to access any element.

+  wide_int upper_bound () const { return bounds[n - 1]; }
+  wide_int upper_bound (unsigned index);

Suggest to declare these five functions const.

+
+  void remove_range (unsigned i) { remove (i, i + 1); }

Since there is a remove() function, suggest to rename the one
argument form to remove() to avoid having to remember which is
which when calling it.

+  void clear () { n = 0; }
+  void clear (tree t) { type = t; n = 0; overflow = false; }
+  bool empty_p () const { return !n; }


+  bool range_for_type_p () const;
+  bool simple_range_p () const { return n == 2; }
+
+  void dump ();
+  void dump (pretty_printer *pp);
+  void dump (FILE *);

Suggest to declare these three const.

+
+  bool valid_p ();

Suggest to make it const.

+  void cast (tree type);
+  bool contains_p (wide_int element);
+  bool contains_p (tree);
+
+  tree get_type () { return type; }

Suggest to declare these three functions const (even if they call
functions that aren't const, as long as they don't modify any members
they should be const; it's better to cast the constness away in the
implementation than let "const-incorrect" implementation details leak
into the interface).

+
+  irange& operator= (const irange &r);
+  irange& operator= (tree t);
+
+  bool operator== (const irange &r) const;
+  bool operator!= (const irange &r) const { return !(*this == r); }

Suggest to define equality and inequality as non-members for
symmetry.  Since there is an implicit conversion from tree to
irange, this is valid:

  void f (irange ir, tree t)
  {
    ir == t;
  }

but with operator==() as a member, this isn't

  void f (irange ir, tree t)
  {
    t == ir;
  }

making the operator asymmetric (the opposite is usually expected).

+
+  void Union (wide_int x, wide_int y);
+  bool Union (const irange &r);
+  bool Union (const irange &r1, const irange &r2);
+
+  // THIS = THIS ^ [X,Y].  Return TRUE if result is non-empty.
+  bool Intersect (wide_int x, wide_int y, bool readonly = false);
+  // THIS = THIS ^ R.  Return TRUE if result is non-empty.
+  // THIS = R1 ^ R2.  Return TRUE if result is non-empty.
+ bool Intersect (const irange &r1, const irange &r2, bool readonly = false);
+  // Return TRUE if THIS ^ R will be non-empty.
+  bool Intersect_p (const irange &r)
+    { return Intersect (r, /*readonly=*/true); }

I would suggest the following changes to Union, Intersect, and Not:

1) Define all three members without the readonly argument and
   returning irange& (i.e., *this).  The return value can be
   used wherever irange& is expected, and the is_empty() member
   function can be called on it to obtain the same result.  E.g.,
   Intersect A with B, storing the result in A:

   irange A, B;
   if (A.Intersect (B).is_empty ()) { ... }

2) Add non-members like so:

  irange range_union (const irange &lhs, const irange &rhs)
  {
    return irange (lhs).Union (rhs);
  }

  and find out if the union of A or B is empty without modifying
  either argument:

  irange A, B;
  if (range_union (A, B).is_empty ()) { ... }

+
+  bool Not ();
+  bool Not (const irange& r);

Given that this function computes the inverted range of *this,
should it be called something like Invert instead?

Martin

PS For the names, have you considered using the bitwise operators?
I.e., operator| for union, operator& for intersection, and operator~
for not (aka invert).


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]