Bug 96480

Summary: [10 Regression] missed optimisation: unnecessary compare in standard algorithms
Product: gcc Reporter: Dmitry Kurkin <kurkindmit>
Component: tree-optimizationAssignee: Jakub Jelinek <jakub>
Status: RESOLVED FIXED    
Severity: normal CC: jakub
Priority: P2 Keywords: missed-optimization
Version: 10.2.0   
Target Milestone: 11.0   
See Also: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88702
Host: Target:
Build: Known to work: 11.0
Known to fail: 10.5.0 Last reconfirmed: 2020-08-05 00:00:00
Attachments: gcc11-pr96480.patch

Description Dmitry Kurkin 2020-08-05 08:53:43 UTC
Consider the C++ code:

#include <algorithm>
#include <array>

enum ENUM { A, B, C, D, E, F, G, };

const std::array<ENUM, 4> foo{A, B, C, D};

bool is_foo(ENUM e) {
    return std::any_of(foo.begin(), foo.end(),
    [e] (auto ee) { return e == ee; });
}

GCC 7.4 optimizes if_foo to a single compare operation e <= 3,
but newer GCC versions do two comare operations e <= 2 || e == 3.

see https://godbolt.org/z/5a6aPa
Comment 1 Jakub Jelinek 2020-08-05 11:22:26 UTC
Started with r8-565-g7581ce9a1ad6df9c8998a3c74256837a1ff6f7cc
Comment 2 Jakub Jelinek 2020-08-05 11:26:47 UTC
That commit changes the pre-reassoc2 dump like:
--- pr96480.ii.172t.printf-return-value2_	2020-08-05 07:22:42.000000000 -0400
+++ pr96480.ii.172t.printf-return-value2	2020-08-05 07:23:32.000000000 -0400
@@ -31,13 +31,13 @@ bool is_foo(ENUM) (ENUM e)
 
   <bb 5> [12.87%]:
   if (e_2(D) == 3)
-    goto <bb 7> (<L4>); [3.00%]
+    goto <bb 6>; [3.00%]
   else
-    goto <bb 6>; [97.00%]
+    goto <bb 7> (<L4>); [97.00%]
 
-  <bb 6> [12.48%]:
+  <bb 6> [0.39%]:
 
-  # prephitmp_5 = PHI <1(2), 1(3), 1(4), 1(5), 0(6)>
+  # prephitmp_5 = PHI <1(2), 1(3), 1(4), 1(6), 0(5)>
 <L4> [14.13%]:
   return prephitmp_5;
 
So I guess I need to debug why reassoc doesn't like that form.
Comment 3 Jakub Jelinek 2020-08-05 11:44:34 UTC
Simplified testcase:
int v[4];

int
foo (int x)
{
  int *p;
  if (x == 0)
    p = &v[0];
  else if (x == 1)
    p = &v[1];
  else if (x == 2)
    p = &v[2];
  else if (x == 3)
    p = &v[3];
  else
    p = &v[4];
  return p != &v[4];
}
Comment 4 Jakub Jelinek 2020-08-05 14:29:58 UTC
Created attachment 49005 [details]
gcc11-pr96480.patch

Untested fix.
Comment 5 GCC Commits 2020-08-06 13:48:23 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:b3aa137212b1af0c824a9890eba2ca27a7271d56

commit r11-2593-gb3aa137212b1af0c824a9890eba2ca27a7271d56
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Aug 6 15:47:25 2020 +0200

    reassoc: Improve maybe_optimize_range_tests [PR96480]
    
    On the following testcase, if the IL before reassoc would be:
    ...
      <bb 4> [local count: 354334800]:
      if (x_3(D) == 2)
        goto <bb 7>; [34.00%]
      else
        goto <bb 5>; [66.00%]
    
      <bb 5> [local count: 233860967]:
      if (x_3(D) == 3)
        goto <bb 7>; [34.00%]
      else
        goto <bb 6>; [66.00%]
    
      <bb 6> [local count: 79512730]:
    
      <bb 7> [local count: 1073741824]:
      # prephitmp_7 = PHI <1(3), 1(4), 1(5), 1(2), 0(6)>
    then we'd optimize it properly, but as bb 5-7 is instead:
      <bb 5> [local count: 233860967]:
      if (x_3(D) == 3)
        goto <bb 6>; [34.00%]
      else
        goto <bb 7>; [66.00%]
    
      <bb 6> [local count: 79512730]:
    
      <bb 7> [local count: 1073741824]:
      # prephitmp_7 = PHI <1(3), 1(4), 0(5), 1(2), 1(6)>
    (i.e. the true/false edges on the last bb with condition swapped
    and ditto for the phi args), we don't recognize it.  If bb 6
    is empty, there should be no functional difference between the two IL
    representations.
    
    This patch handles those special cases.
    
    2020-08-06  Jakub Jelinek  <jakub@redhat.com>
    
            PR tree-optimization/96480
            * tree-ssa-reassoc.c (suitable_cond_bb): Add TEST_SWAPPED_P argument.
            If TEST_BB ends in cond and has one edge to *OTHER_BB and another
            through an empty bb to that block too, if PHI args don't match, retry
            them through the other path from TEST_BB.
            (maybe_optimize_range_tests): Adjust callers.  Handle such LAST_BB
            through inversion of the condition.
    
            * gcc.dg/tree-ssa/pr96480.c: New test.
Comment 6 Jakub Jelinek 2020-08-06 13:53:08 UTC
Should be fixed on the trunk.
As it is essentially a new optimization, not sure if it is a good idea to backport it, even when it has been a regression on this testcase.
Comment 7 Jakub Jelinek 2021-05-14 09:53:48 UTC
GCC 8 branch is being closed.
Comment 8 Richard Biener 2021-06-01 08:18:15 UTC
GCC 9.4 is being released, retargeting bugs to GCC 9.5.
Comment 9 Richard Biener 2022-05-27 09:43:15 UTC
GCC 9 branch is being closed
Comment 10 Jakub Jelinek 2022-06-28 10:41:34 UTC
GCC 10.4 is being released, retargeting bugs to GCC 10.5.
Comment 11 Richard Biener 2023-07-07 09:01:26 UTC
Fixed in GCC 11.