91789 – Value ranges determined from comparisons not used transitively

Bug 91789 - Value ranges determined from comparisons not used transitively

Summary: Value ranges determined from comparisons not used transitively

Status:	NEW

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	tree-optimization (show other bugs)
Version:	10.0

Importance:	P3 enhancement
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:	missed-optimization

Depends on:
Blocks:	VRP
	Show dependency tree / graph

Reported:	2019-09-17 02:49 UTC by Ulrich Drepper
Modified:	2021-12-16 03:04 UTC (History)
CC List:	4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:	2019-09-17 00:00:00

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Ulrich Drepper 2019-09-17 02:49:58 UTC

Take the following code:

int foo(int a, int b)
{
  if (b < a)
    __builtin_unreachable();
  if (a < 0)
    return -1;
  if (b < 0)
    return 0;
  return 1;
}

The compiler should be able to determine that the b < 0 can never be true.  At that point in the code a >= 0 and b >= a, therefore transitively b >= 0.


The problem is not tied to __builtin_unreachable as can be seen by changing the code slightly:

int foo(int a, int b)
{
  if (b < a)
    return 2;
  if (a < 0)
    return -1;
  if (b < 0)
    return 0;
  return 1;
}

After the initial test b < a is handled there is still a threeway comparison.

The problem can be seen with 9.2.1 as well as the current trunk version.  clang 8.0.0 generates pretty much the same code as gcc.

Comment 1 Eric Gallager 2019-09-17 03:36:35 UTC

If I understood the "Project Ranger" talk at Cauldron correctly, the new version of VRP that it'll provide will probably fix this.

Comment 2 Marc Glisse 2019-09-17 06:12:27 UTC

We do manage if you swap the order of the first 2 comparisons, because this way we don't need to remember symbolic ranges: a<0 yields a range [0,inf] for a, b<a yields [0,inf] for b, and b<0 folds to false.
If ranger "fixes" this by working backwards, please make sure it works with both orders.

Comment 3 Richard Biener 2019-09-17 07:14:53 UTC

So EVRP sees

Visiting controlling predicate if (a_3(D) < 0)
Adding assert for a_3(D) from a_3(D) >= 0
Intersecting
  int [0, +INF]  EQUIVALENCES: { a_3(D) } (1 elements)
and
  int [-INF, b_2(D)]  EQUIVALENCES: { a_3(D) } (1 elements)
to
  int [0, +INF]  EQUIVALENCES: { a_3(D) } (1 elements)
Intersecting
  int [-INF, b_2(D)]
and
  int [0, +INF]
to
  int [-INF, b_2(D)]

where the intersection could produce [0, b_2(D)] with the twist
that this range might be empty in case b_2(D) < 0.

The issue in the intersection routine is that we use predicates like

  else if ((maxeq || operand_less_p (vr1max, *vr0max) == 1)
           && (mineq || operand_less_p (*vr0min, vr1min) == 1))

when we mean less-or-equal but for symbolic values with
maxeq = vrp_operand_equal_p (*vr0max, vr1max) this isn't the same.

compare_values could in theory represent this (but it doesn't).

IMHO this has nothing to do with "backward" or "forward" processing
but simply with how combination of the predicates/ranges works.

Comment 4 Andrew Macleod 2019-09-18 13:09:03 UTC

The ranger isn't really about forwards vs backwards, its about an approach in which the basics are understood without special casing, and when applied properly, direction won't matter... eliminating this sort of ordering issue.

This is another example which we intend to just always "get", but requires multiple components to work together:

It requires 2 different things. 
1) ranges to know that after the second conditional a = [0, MAX]
2) relationals such that the property b >= a is known after the first conditional.

- For strictly ranges, the final conditional produces b = [MIN, -1] on the TRUE edge.
- If the known relations are queried, and the known relation b >= a is applied to a = [0, MAX], rangeops calculates b >= [0, MAX] and we get a range of [0, MAX] for b

Since both must be true simultaneously, the results [MIN, -1] and [0, MAX] will be intersected and produces an empty range on the TRUE edge, which in ranger world means the range is impossible. Thus we know the edge is not executable and can be removed. 

This range evaluation code is present in the ranger now and resolves the range parts, but wont make trunk this release.  

The relational code is being developed for next stage 1, so does not exist yet.
It is required to replace VRP and the symbolic representation currently used by value_range for equivalences and relations. This will be my focus over the next few months.

I would anticipate this being fixed in the next release when the ranger goes live. I will add this test to my testsuite to ensure it works as expected.

Comment 5 Andrew Macleod 2020-11-06 22:20:59 UTC

Ranger made it in, but relations have not been implemented for this release.
GCC 12 for sure!

Comment 6 Andrew Pinski 2021-12-16 00:09:20 UTC

We have in evrp:
 Registering value_relation (b_2(D) >= a_3(D)) on (2->4)

...

Relational : (b_2(D) >= a_3(D))
    <bb 4> :
    if (a_3(D) < 0)
      goto <bb 5>; [INV]
    else
      goto <bb 6>; [INV]

4->5  (T) a_3(D) : 	int [-INF, -1]
4->6  (F) a_3(D) : 	int [0, +INF]

But then we don't update the value relationship for what we had for b_2 for the branch where a_3 >= 0, we should also a range for b_2(D) of [0, +INF] for 4->6.

Comment 7 Andrew Macleod 2021-12-16 03:04:34 UTC

hmm. yeah. we have the knowledge... but how to apply it efficiently.



=========== BB 4 ============
Imports: a_3(D)
Exports: a_3(D)
b_2(D)  int VARYING
a_3(D)  int VARYING
Relational : (b_2(D) >= a_3(D))
    <bb 4> :
    if (a_3(D) < 0)
      goto <bb 5>; [INV]
    else
      goto <bb 6>; [INV]

4->5  (T) a_3(D) :      int [-INF, -1]
4->6  (F) a_3(D) :      int [0, +INF]

We know the relationship upon entry to BB 4, but with no reference to b_2, there is no reason to update b_2's range.  b_2 may never be reference again.

=========== BB 6 ============
Imports: b_2(D)
Exports: b_2(D)
b_2(D)  int VARYING
    <bb 6> :
    if (b_2(D) < 0)
      goto <bb 7>; [INV]
    else
      goto <bb 8>; [INV]

6->7  (T) b_2(D) :      int [-INF, -1]
6->8  (F) b_2(D) :      int [0, +INF]

And upon entry to BB6, we do know that a_3 is [0, +INF], but with no reference to a_3 in this block, we don't see any reason to look to see if there is a relationship.  And with no further references to a_3 in the IL, we wont propagate its value to here either.


THe problem is that unless we do an exhaustive search of every possible relation of b_2, there is nothing in the IL at each point to indicate we should look at the other name.

Perhaps it isnt as bad as I make it sound.

r perhaps I can add relations to the export list for a block and have gori figure it out.
IE, in bb4  a_3 is the only export, and we can calculate a range for it. 
If we note at this point that there is a relation between a_3 and b_2, perhaps I can add b_2 to the export list as well, and the GORI how to process the relation much like it process recalculations.

IF we have c_9 = a_3 - 10 somewhere in the program, we note the dependence of c_9 on a_3, and we "recalculate" c_9 as [0,+INF] - 10  == [-10, +INF-10] on that edge.

Perhaps I can likewise add relations to this recomputation and flag b_3 as a recomputation.. Since the relation is technically "true", it could be solved for b_2 as 
  [1, 1] = b_2(D) >= [1, +INF] and everything should just work... 
hmm.  
I shall give it a go after a bit more thought.