[PATCH] tree-optimization/101186 - extend FRE with "equivalence map" for condition prediction

Richard Biener richard.guenther@gmail.com
Thu Jun 24 12:29:20 GMT 2021

On Thu, Jun 24, 2021 at 11:55 AM Di Zhao via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
> This patch enhances FRE by recording equivalences generated by conditional
> statements, so that we can find more optimize opportunities in situations like
> following two cases:
> case 1:
>     void f (unsigned int a, unsigned int b)
>     {
>       if (a == b)
>         {
>           for (unsigned i = 0; i < a; i++)
>                       {
>                         if (i == b)
>                           foo (); /* Unreachable */
>                       }
>         }
>     }
> The "i == b" condition is always false, yet original FRE cannot predict this.
> Because VN only stores "i < a" and "a == b", so there won't be "i == b"'s
> result. (Moreover, VRP can't infer "i == b" is false either, as its boundary
> calculation is hindered by the "unsigned" modifier.)
> case 2:
> Consider GIMPLE code:
>               <bb 2> :
>               if (a != 0)
>               goto <bb 3>; [INV]
>               else
>               goto <bb 4>; [INV]
>               <bb 3> :
>               <bb 4> :
>               # c = PHI <y(2), x(3)>
>               if (a != 0)
>               goto <bb 5>; [INV]
>               else
>               goto <bb 7>; [INV]
>               <bb 5> :
>               if (c > x)
>               goto <bb 6>; [INV]
>               else
>               goto <bb 8>; [INV]
>               <bb 6> :
>               __builtin_puts (&"Unreachable!"[0]);
>               <bb 7> :
>               __builtin_puts (&"a"[0]);
>               <bb 8> :
>               ...
> The result of "if (c > x)" in bb4 is unknown. However bb2 and bb4 have the same
> conditional statement, meanwhile bb2 dominates bb4, so it is predictable that
> c==x at bb5 and c==y at bb7. Keeping record of this might bring further
> optimizations, such as removing the conditional in bb5.
> The basic idea is to use a hashmap to record additional equivalence information
> generated by conditional statements.
> Insert to the map:
>     1. When recording an EQ_EXPR=true predicate, e.g. "x==y is true", record
>     the equivalence of x and y at edge destiny in the map.
>     2. Consider case 2 above, when we fail to predict the result of a
>     conditional expression (at bb4), if following conditions be satisfied:
>               a. There is a previous corresponding predicate. In this case, it will
>               be "a != 0 is true" at bb3.
>               b. bb3's single predecessor bb2 dominates bb4.
>               c. bb3's conditional statement's predicate code (or inverted code) is
>               identical with that of bb4. (This condition can be relaxed.)
>     Then we can try to find a PHI operation along A's true and false edge, and
>     record the equivalence of PHI operation's result and arguments at bb4's
>     true and false destinations. Regarding this case, "c==x at bb5" and
>     "c==y at bb7" will be recorded.
> Use the map:
>     When we cannot find a prediction result for a conditional statement by
>     original process, replace conditional expression's operands with qualified
>     equivalence and try again.
> As searching predicates and the ssa names to record are based on
> value numbering, we need to "unwind" the equivalence map for iteration.
> Bootstrapped and tested on x86_64-pc-linux-gnu and aarch64-unknown-linux-gnu.

I have some reservations about extending the ad-hoc "predicated value" code.

Some comments on the patch:

+/* hashtable & helpers to record equivalences at given bb.  */
+typedef struct val_equiv_s
+  val_equiv_s *next;
+  val_equiv_s *unwind_to;
+  hashval_t hashcode;
+  /* SSA name this val_equiv_s is associated with.  */
+  tree name;
+  /* RESULT in a vn_pval entry is SSA name of a equivalence.  */
+  vn_pval *values;
+} * val_equiv_t;

all of this (and using a hashtable for recording) is IMHO a bit overkill.
Since you only ever record equivalences for values the more natural
place to hook those in is the vn_ssa_aux structure where we also
record the availability chain.

There's little commentary in the new code, in particular function-level
comments are missing everywhere.

There's complexity issues, like I see val_equiv_insert has a "recurse"
feature but also find_predicated_value_by_equiv is quadratic in
the number of equivalences of the lhs/rhs.  Without knowing what
the recursion on the former is for - nothing tells me - I suspect
either of both should be redundant.

You seem to record equivalences at possible use points which
looks odd at best - I'd expected equivalences being recorded
at the same point we record predicated values and for the
current condition, not the one determining some other predication.
What was the motivation to do it the way you do it?

Why is the code conditional on 'iterate'?

You've probably realized that there's no "nice" way to
handle temporary equivalences in a VN scheme using
hashing for expressions (unless you degrade hashes a lot).

You quote opportunities that are catched with this like

+  if (a != 0)
+    {
+      c = b;
+    }
+  for (unsigned i = 0; i < c; i++)
+    {
+      if (a != 0)
+       {
+         if (i >= b)
+           /* Should be eliminated.
+            */
+           foo ();

but say other "obvious" cases like

+  if (a != 0)
+    {
+      c = b;
+    }
+  for (unsigned i = 0; i < c; i++)
+    {
+      if (a != 0)
+       {
+           /* Should be zero.  */
             return b - c;

are not handled.  That's in line with the current "value predication"
which mainly aims at catching simple jump threading opportunities;
you only simplify conditions with the recorded equivalences.  But
then the complexity of handling equivalences does probably not
outweight the opportunities catched - can you share some numbers
on how many branches are newly known taken during VN with
this patch during say bootstrap or build of SPEC CPU?

I've hoped to ditch the current "value predication" code by eventually
using the relation oracle from ranger but did not yet have the chance
to look into that.  Now, the predicated equivalences are likely not
something that infrastructure can handle?

In the end I think we should research into maintaining an alternate
expression table for conditions (those we like to simplify with
equivalences) and use a data structure that more easily allows to
introduce (temporary) equivalences.  Like by maintaining
back-references of values we've recorded a condition for and a way
to quickly re-canonicalize conditions.  Well - it requires some research,
as said.


> Regards,
> Di Zhao
> Extend FRE with an "equivalence map" for condition prediction.
> 2021-06-24  Di Zhao  <dizhao@os.amperecomputing.com>
> gcc/ChangeLog:
>         PR tree-optimization/101186
>         * tree-ssa-sccvn.c (vn_tracking_edge): Extracted utility function.
>         (dominated_by_p_w_unex): Moved upward, no change.
>         (vn_nary_op_get_predicated_value): Moved upward, no change.
>         (struct val_equiv_hasher): Hasher for the "equivalence map".
>         (compute_single_op_hash): Compute hashcode from ssa name.
>         (is_vn_qualified_at_bb): Check if vn_pval is valid at BB.
>         (val_equiv_insert): Insert into "equivalence map".
>         (vn_lookup_cond_result): Lookup conditional expression's result by VN.
>         (find_predicated_value_by_equiv): Search for predicated value,
>         utilizing equivalences that we have recorded.
>         (record_equiv_from_previous_edge): Record equivalence relation from a
>         previouse edge on current edge.
>         (record_equiv_from_previous_cond): Search for equivalences generated by
>         previous conditional statements, and record them on current BB's
>         successors.
>         (vn_nary_op_insert_pieces_predicated): Extract utility function. Insert
>         into the "equivalence map" for predicate like "x==y is true".
>         (free_rpo_vn): Free the "equivalence map".
>         (process_bb): Insert into & lookup from the "equivalence map".
>         (struct unwind_state): Add "equivalence map" unwind state.
>         (do_unwind): Unwind the "equivalence map".
>         (do_rpo_vn): Update "equivalence map" unwind state.
> gcc/testsuite/ChangeLog:
>         PR tree-optimization/101186
>         * gcc.dg/tree-ssa/vrp03.c: Disable fre.
>         * gcc.dg/tree-ssa/ssa-fre-95.c: New test.
>         * gcc.dg/tree-ssa/ssa-fre-96.c: New test.

More information about the Gcc-patches mailing list