Bug 96397 - GCC Fails to exploit ranges from overflow tests
Summary: GCC Fails to exploit ranges from overflow tests
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 11.0
: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: VRP
  Show dependency treegraph
 
Reported: 2020-07-31 04:18 UTC by Jeffrey A. Law
Modified: 2021-10-02 22:25 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2021-10-02 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Jeffrey A. Law 2020-07-31 04:18:38 UTC
Compile with -O2.  We should be able to eliminate the x > p1 test if we were smart about back propagating equivalences to generate a range from the __builtin_add_overflow.

This was derived from a bogus warning in tpm2-pkcs11's testsuite.

#include <stddef.h>
#include <stdlib.h>
extern void frob (void);

void
foo(size_t p1)
{
  size_t x = p1 - 4;
  size_t y;
  if (__builtin_add_overflow (x, 8, &y))
    {
      frob ();
    }
  else
    {
      if (x > p1)
        abort ();
    }
}
Comment 1 Aldy Hernandez 2020-07-31 07:20:23 UTC
From a previous conversation with Jeff, I believe the corresponding IL is this:

;;   basic block 11, loop depth 0, count 487351684 (estimated locally), maybe hot
;;   Invalid sum of incoming counts 749887186 (estimated locally), should be 487351684 (estimated locally)
;;    prev block 10, next block 12, flags: (NEW, REACHABLE, VISITED)
;;    pred:       18 [82.6% (guessed)]  count:749887186 (estimated locally) (FALSE_VALUE,EXECUTABLE)
  _109 = .ADD_OVERFLOW (_106, 1);
  _110 = REALPART_EXPR <_109>;
  # DEBUG size => _110
  _111 = IMAGPART_EXPR <_109>;
  # DEBUG fail => (_Bool) _111
  if (_111 != 0)
    goto <bb 17>; [17.38%]
  else
    goto <bb 12>; [82.62%]

The ranger knows that _111 can only be [0,1], but we don't currently have a backwards solver for the IMAGPART_EXPR code.  Currently it's unimplemented as "op_unknown".  I suppose we could implement it, and the ranger should be then able to solve for _109 (and eventually _106).

It would take a little twiddling, since range-ops currently doesn't look at USE/DEF chains, and it would have to for the IMAGPART_EXPR since not all IMAGPART_EXPR's are created equal.  E.g. for DEF's of .ADD_OVERFLOW, it means something totally different than the IMAGPART_EXPR of a complex number.

Andrew, can you verify this is doable with minor tweaks?
Comment 2 Andrew Macleod 2020-07-31 14:05:25 UTC
if I read this right, the basic premise is:
 x = p1 - 4, so if x + 8 doesn't overflow, then p1 - 4 couldn't have underflowed, and therefore x must be > p1

which is a little more complicated than just back propagation.

Firstly, we have to understand that builtin_add_overflow is actually x + 8 and that the IMAG_PART is in fact the overflow flag. we currently understand that the IMAG_PART is a boolean, but we dont "understand" the add yet.   so we'll eventually start treating those built-ins like tree-codes and provide the appropriate understanding.

secondly, the x = p1 - 4 + 8 calculation is handle-able by rangeops, we just need slightly better range tracking of overflows on unsigned values.. which is already on my todo list.

and then the if (x > p1) would be handled by the upcoming relational code automatically once we get the calculation right.

so its not quite as simple as back propagation, but it is on the radar, possibly during this stage 1.
Comment 3 Andrew Pinski 2021-10-02 22:25:46 UTC
Confirmed.