This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC][PR82479] missing popcount builtin detection
- From: Kugan Vivekanandarajah <kugan dot vivekanandarajah at linaro dot org>
- To: "Bin.Cheng" <amker dot cheng at gmail dot com>
- Cc: Richard Biener <richard dot guenther at gmail dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 31 May 2018 13:51:54 +1000
- Subject: Re: [RFC][PR82479] missing popcount builtin detection
- References: <CAELXzTNXvW1jo4hbn3LHgo5ne=u-0tFJB5Ri3-P=1z-_YHNnFA@mail.gmail.com> <CAFiYyc39pjEXHwzM-jJjgLZRSZcnaMDWBREKgEEws5W3p2SF9Q@mail.gmail.com> <CAELXzTOXPkz+Uy15PO=X8S8qvE2znQNXz8knqKHHUEU9wHRP2w@mail.gmail.com> <CAFiYyc15+U_WgDmNys=ep=+uKm9usLCz-2YHYYBGJGZBN8djKg@mail.gmail.com> <CAELXzTOztCE+KXr8wM49pqGKMsF6dTm-1gXixOikmNT5yEe62Q@mail.gmail.com> <CAFiYyc0fr=O5z2td-=X8LNTea13qR7AAyJ0=_VyMZ-4z3WkAOQ@mail.gmail.com> <CAELXzTPuVQWyLKQOie5vPtYsx=_06xCXeum4NG3mT2R+SMPWrw@mail.gmail.com> <CAFiYyc0E5RmjBw17MfDkr9eEhJfgL_=NhBSD5DmmRw4Gr=LRiw@mail.gmail.com> <CAELXzTNWmeULCBA8rgn5zq3UZHrNP0qK4dW_0gYHTqxyDy9CJA@mail.gmail.com> <CAHFci2_4J3MgyjNDUPkBknTORQ8Uq1nExnN=KzFLnU_pQ47BOg@mail.gmail.com>
Hi Bin,
Thanks for the review. Please find the revised patch based on the
review comments.
Thanks,
Kugan
On 17 May 2018 at 19:56, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Thu, May 17, 2018 at 2:39 AM, Kugan Vivekanandarajah
> <kugan.vivekanandarajah@linaro.org> wrote:
>> Hi Richard,
>>
>> On 6 March 2018 at 02:24, Richard Biener <richard.guenther@gmail.com> wrote:
>>> On Thu, Feb 8, 2018 at 1:41 AM, Kugan Vivekanandarajah
>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>> Hi Richard,
>>>>
>>>> On 1 February 2018 at 23:21, Richard Biener <richard.guenther@gmail.com> wrote:
>>>>> On Thu, Feb 1, 2018 at 5:07 AM, Kugan Vivekanandarajah
>>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>> Hi Richard,
>>>>>>
>>>>>> On 31 January 2018 at 21:39, Richard Biener <richard.guenther@gmail.com> wrote:
>>>>>>> On Wed, Jan 31, 2018 at 11:28 AM, Kugan Vivekanandarajah
>>>>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>>>> Hi Richard,
>>>>>>>>
>>>>>>>> Thanks for the review.
>>>>>>>> On 25 January 2018 at 20:04, Richard Biener <richard.guenther@gmail.com> wrote:
>>>>>>>>> On Wed, Jan 24, 2018 at 10:56 PM, Kugan Vivekanandarajah
>>>>>>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>>>>>> Hi All,
>>>>>>>>>>
>>>>>>>>>> Here is a patch for popcount builtin detection similar to LLVM. I
>>>>>>>>>> would like to queue this for review for next stage 1.
>>>>>>>>>>
>>>>>>>>>> 1. This is done part of loop-distribution and effective for -O3 and above.
>>>>>>>>>> 2. This does not distribute loop to detect popcount (like
>>>>>>>>>> memcpy/memmove). I dont think that happens in practice. Please correct
>>>>>>>>>> me if I am wrong.
>>>>>>>>>
>>>>>>>>> But then it has no business inside loop distribution but instead is
>>>>>>>>> doing final value
>>>>>>>>> replacement, right? You are pattern-matching the whole loop after all. I think
>>>>>>>>> final value replacement would already do the correct thing if you
>>>>>>>>> teached number of
>>>>>>>>> iteration analysis that niter for
>>>>>>>>>
>>>>>>>>> <bb 3> [local count: 955630224]:
>>>>>>>>> # b_11 = PHI <b_5(5), b_8(6)>
>>>>>>>>> _1 = b_11 + -1;
>>>>>>>>> b_8 = _1 & b_11;
>>>>>>>>> if (b_8 != 0)
>>>>>>>>> goto <bb 6>; [89.00%]
>>>>>>>>> else
>>>>>>>>> goto <bb 8>; [11.00%]
>>>>>>>>>
>>>>>>>>> <bb 6> [local count: 850510900]:
>>>>>>>>> goto <bb 3>; [100.00%]
>>>>>>>>
>>>>>>>> I am looking into this approach. What should be the scalar evolution
>>>>>>>> for b_8 (i.e. b & (b -1) in a loop) should be? This is not clear to me
>>>>>>>> and can this be represented with the scev?
>>>>>>>
>>>>>>> No, it's not affine and thus cannot be represented. You only need the
>>>>>>> scalar evolution of the counting IV which is already handled and
>>>>>>> the number of iteration analysis needs to handle the above IV - this
>>>>>>> is the missing part.
>>>>>> Thanks for the clarification. I am now matching this loop pattern in
>>>>>> number_of_iterations_exit when number_of_iterations_exit_assumptions
>>>>>> fails. If the pattern matches, I am inserting the _builtin_popcount in
>>>>>> the loop preheater and setting the loop niter with this. This will be
>>>>>> used by the final value replacement. Is this what you wanted?
>>>>>
>>>>> No, you shouldn't insert a popcount stmt but instead the niter
>>>>> GENERIC tree should be a CALL_EXPR to popcount with the
>>>>> appropriate argument.
>>>>
>>>> Thats what I tried earlier but ran into some ICEs. I wasn't sure if
>>>> niter in tree_niter_desc can take such.
>>>>
>>>> Attached patch now does this. Also had to add support for CALL_EXPR in
>>>> few places to handle niter with CALL_EXPR. Does this look OK?
>>>
>>> Overall this looks ok - the patch includes changes in places that I don't think
>>> need changes such as chrec_convert_1 or extract_ops_from_tree.
>>> The expression_expensive_p change should be more specific than making
>>> all calls inexpensive as well.
>>
>> Changed it.
>>
>>>
>>> The verify_ssa change looks bogus, you do
>>>
>>> + dest = gimple_phi_result (count_phi);
>>> + tree var = make_ssa_name (TREE_TYPE (dest), NULL);
>>> + tree fn = builtin_decl_implicit (BUILT_IN_POPCOUNT);
>>> +
>>> + var = build_call_expr (fn, 1, src);
>>> + *niter = fold_build2 (MINUS_EXPR, TREE_TYPE (dest), var,
>>> + build_int_cst (TREE_TYPE (dest), 1));
>>>
>>> why do you allocate a new SSA name here? It seems unused
>>> as you overwrive 'var' with the CALL_EXPR immediately.
>> Changed now.
>>
>>>
>>> I didn't review the pattern matching thoroughly nor the exact place you
>>> call it. But
>>>
>>> + if (check_popcount_pattern (loop, &count))
>>> + {
>>> + niter->assumptions = boolean_false_node;
>>> + niter->control.base = NULL_TREE;
>>> + niter->control.step = NULL_TREE;
>>> + niter->control.no_overflow = false;
>>> + niter->niter = count;
>>> + niter->assumptions = boolean_true_node;
>>> + niter->may_be_zero = boolean_false_node;
>>> + niter->max = -1;
>>> + niter->bound = NULL_TREE;
>>> + niter->cmp = ERROR_MARK;
>>> + return true;
>>> + }
>>>
>>> simply setting may_be_zero to false looks fishy.
>> Should I set this to (argument to popcount == zero)?
> No, I think that's unnecessary. The number of iterations is computed
> as: may_be_zero ? 0 : niter;
> Here niter is ZERO even when may_be_zero is set to false, and niters
> is computed correctly.
>
> I think the point is that may_be_zero is false doesn't imply that
> niters is non-zero.
>
>>
>>> Try with -fno-tree-loop-ch.
>> I changed the pattern matching to handle loop without header copying
>> too. Looks a bit complicated checking all the conditions. Wondering if
>> this can be done in a simpler and easier to read way.
>>
>>> Also max should not be negative,
>>> it should be the number of bits in the IV type?
>>
>> Changed this too.
>>>
>>> A related testcase could be that we can completely peel
>>> a loop like the following which iterates at most 8 times:
>>>
>>> int a[8];
>>> void foo (unsigned char ctrl)
>>> {
>>> int c = 0;
>>> while (ctrl)
>>> {
>>> ctrl = ctrl & (ctrl - 1);
>>> a[c++] = ctrl;
>>> }
>>> }
>>
>> Hmm, this is an interesting test case but as I am now trying to match
>> a loop which does popcount, this is not handled.
>>
>>
>> Attaching the current version for review.
> Here are some comments.
>
>> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
>> index 7a54c5f..f390321 100644
>> --- a/gcc/tree-ssa-loop-niter.c
>> +++ b/gcc/tree-ssa-loop-niter.c
>> @@ -2430,6 +2430,134 @@ number_of_iterations_exit_assumptions (struct loop *loop, edge exit,
>> return (!integer_zerop (niter->assumptions));
>> }
>>
>> +/* See if LOOP is a popcout implementation of the form
>> +
>> + int c = 0;
>> + while (b) {
>> + b = b & (b - 1);
>> + c++;
>> + }
>> +
>> + If so, Set NITER to __builtin_popcount (b) - 1
>> + return true if we did, false otherwise. */
>> +
>> +static bool
>> +check_popcount_pattern (loop_p loop, tree *niter, HOST_WIDE_INT *max)
>> +{
>> + tree lhs, rhs;
>> + tree dest;
>> + gimple *and_minus_one;
>> + gimple *phi;
>> + int count = 0;
>> + gimple *count_stmt = NULL;
>> + bool adjust = true;
>> +
>> + if (!single_exit (loop))
>> + return false;
>> +
>> + /* Check loop terminating branch is like
>> + if (b != 0). */
>> + gimple *stmt = last_stmt (loop->header);
>> + if (!stmt
>> + || gimple_code (stmt) != GIMPLE_COND
>> + || !zerop (gimple_cond_rhs (stmt)))
> The check doesn't fully match the comment. NE is not checked here.
> Also can move below "(TREE_CODE (lhs) != SSA_NAME)" check up to this
> point, making simple checks earlier.
>
>> + return false;
>> +
>> + /* Check the loop closed SSA definition for just the variable c defined in
>> + loop. */
>> + basic_block bb = single_exit (loop)->dest;
> single_exit is repeatedly called various times, call it once and use
> the returning edge instead? I am not sure GCC is smart enough
> removing repeated call instances. Same to loop_latch_edge.
>
>> + for (gphi_iterator gpi = gsi_start_phis (bb);
>> + !gsi_end_p (gpi); gsi_next (&gpi))
>> + {
>> + phi = gpi.phi ();
>> + count++;
>> + }
>> +
>> + if (count != 1)
>> + return false;
>> +
>> + rhs = gimple_phi_arg_def (phi, single_exit (loop)->dest_idx);
>> + if (TREE_CODE (rhs) != SSA_NAME)
>> + return false;
>> + count_stmt = SSA_NAME_DEF_STMT (rhs);
>> + lhs = gimple_cond_lhs (stmt);
>> + if (TREE_CODE (lhs) != SSA_NAME)
>> + return false;
>> + gimple *and_stmt = SSA_NAME_DEF_STMT (lhs);
>> +
>> + /* Depending on copy-header is performed, feeding PHI stmts might be in
>> + the loop header or loop exit, handle this. */
>> + if (gimple_code (count_stmt) == GIMPLE_PHI)
>> + {
>> + tree t;
>> + if (gimple_code (and_stmt) != GIMPLE_PHI
>> + || gimple_bb (and_stmt) != single_exit (loop)->src
>> + || gimple_bb (count_stmt) != single_exit (loop)->src)
>> + return false;
>> + t = gimple_phi_arg_def (count_stmt, loop_latch_edge (loop)->dest_idx);
>> + if (TREE_CODE (t) != SSA_NAME)
>> + return false;
>> + count_stmt = SSA_NAME_DEF_STMT (t);
>> + t = gimple_phi_arg_def (and_stmt, loop_latch_edge (loop)->dest_idx);
>> + if (TREE_CODE (t) != SSA_NAME)
>> + return false;
>> + and_stmt = SSA_NAME_DEF_STMT (t);
>> + adjust = false;
>> + }
>> +
>> + /* Make sure we have a count by one. */
>> + if (!is_gimple_assign (count_stmt)
>> + || (gimple_assign_rhs_code (count_stmt) != PLUS_EXPR)
>> + || !integer_onep (gimple_assign_rhs2 (count_stmt)))
>> + return false;
>> +
>> + /* Cheeck "b = b & (b - 1)" is calculated. */
> Typo.
>
>> + if (!is_gimple_assign (and_stmt)
>> + || gimple_assign_rhs_code (and_stmt) != BIT_AND_EXPR)
>> + return false;
>> +
>> + lhs = gimple_assign_rhs1 (and_stmt);
>> + rhs = gimple_assign_rhs2 (and_stmt);
>> + if (TREE_CODE (lhs) == SSA_NAME
>> + && (and_minus_one = SSA_NAME_DEF_STMT (lhs))
>> + && is_gimple_assign (and_minus_one)
>> + && (gimple_assign_rhs_code (and_minus_one) == PLUS_EXPR)
>> + && integer_minus_onep (gimple_assign_rhs2 (and_minus_one)))
>> + lhs = rhs;
>> + else if (TREE_CODE (rhs) == SSA_NAME
>> + && (and_minus_one = SSA_NAME_DEF_STMT (rhs))
>> + && is_gimple_assign (and_minus_one)
>> + && (gimple_assign_rhs_code (and_minus_one) == PLUS_EXPR)
>> + && integer_minus_onep (gimple_assign_rhs2 (and_minus_one)))
> Could you avoid duplication by factoring the condition into an inline
> function? They are exactly the same for lhs/rhs.
>
>> + ;
>> + else
>> + return false;
>> +
>> + if ((gimple_assign_rhs1 (and_stmt) != gimple_assign_rhs1 (and_minus_one))
>> + && (gimple_assign_rhs2 (and_stmt) != gimple_assign_rhs1 (and_minus_one)))
> Here you already got lhs correctly, so don't need to check on and_stmt
> rhs directly. You can even merge this check into above one.
>
>> + return false;
>> +
>> + /* Check the recurrence. */
>> + phi = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (and_minus_one));
>> + gimple *src_phi = SSA_NAME_DEF_STMT (lhs);
>> + if (gimple_code (phi) != GIMPLE_PHI
>> + || gimple_code (src_phi) != GIMPLE_PHI)
> I think this is redundant since you have lhs equals to
> gimple_assign_rhs1 (and_minus_one). So phi == src_phi is always true?
>
>> + return false;
>> +
>> + dest = gimple_assign_lhs (count_stmt);
>> + tree fn = builtin_decl_implicit (BUILT_IN_POPCOUNT);
>> + tree src = gimple_phi_arg_def (src_phi, loop_preheader_edge (loop)->dest_idx);
>> + if (adjust)
>> + *niter = fold_build2 (MINUS_EXPR, TREE_TYPE (dest),
>> + build_call_expr (fn, 1, src),
>> + build_int_cst (TREE_TYPE (dest), 1));
>> + else
>> + *niter = build_call_expr (fn, 1, src);
>> + *max = int_cst_value (TYPE_MAX_VALUE (TREE_TYPE (dest)));
>> + return true;
>> +}
>> +
>> +
>> /* Like number_of_iterations_exit_assumptions, but return TRUE only if
>> the niter information holds unconditionally. */
>>
>> @@ -2441,7 +2569,25 @@ number_of_iterations_exit (struct loop *loop, edge exit,
>> gcond *stmt;
>> if (!number_of_iterations_exit_assumptions (loop, exit, niter,
>> &stmt, every_iteration))
>> - return false;
>> + {
>> + tree count;
>> + HOST_WIDE_INT max;
>> + if (check_popcount_pattern (loop, &count, &max))
>> + {
>> + niter->assumptions = boolean_false_node;
>> + niter->control.base = NULL_TREE;
>> + niter->control.step = NULL_TREE;
>> + niter->control.no_overflow = false;
>> + niter->niter = count;
>> + niter->assumptions = boolean_true_node;
>> + niter->may_be_zero = boolean_false_node;
>> + niter->max = max;
>> + niter->bound = NULL_TREE;
>> + niter->cmp = ERROR_MARK;
>> + return true;
>> + }
> Better to merge these compound statement into check_popcount_pattern
> and rename it into something like number_of_iterations_popcount.
>
> I wondered if the more inefficient version popcount should be checked, like:
>
> int count = 0;
> while (x)
> {
> count += x & 1;
> x = x >> 1;
> }
>
> Thanks,
> bin
>
>> + return false;
>> + }
>>
>> if (integer_nonzerop (niter->assumptions))
>> return true;
>> --
>> 2.7.4
>>
From 413aafbb5d53812546c4af5f556f362aafdb32d4 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 10 May 2018 21:41:53 +1000
Subject: [PATCH] popcount
Change-Id: I2f796dd4518937495cc1855852b0dfa5c4ff1143
---
gcc/ipa-fnsummary.c | 4 +
gcc/testsuite/gcc.dg/tree-ssa/popcount.c | 41 ++++++++
gcc/testsuite/gcc.dg/tree-ssa/popcount2.c | 29 ++++++
gcc/testsuite/gcc.dg/tree-ssa/popcount3.c | 28 ++++++
gcc/tree-scalar-evolution.c | 7 ++
gcc/tree-ssa-loop-ivopts.c | 10 ++
gcc/tree-ssa-loop-niter.c | 156 +++++++++++++++++++++++++++++-
7 files changed, 274 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/popcount.c
create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/popcount3.c
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index bdf9ba1..4dc156a 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -1485,6 +1485,10 @@ will_be_nonconstant_expr_predicate (struct ipa_node_params *info,
nonconstant_names);
return p2.or_with (summary->conds, p1);
}
+ else if (TREE_CODE (expr) == CALL_EXPR)
+ {
+ return false;
+ }
else
{
debug_tree (expr);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount.c b/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
new file mode 100644
index 0000000..86a66cb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+
+extern int foo (int);
+
+int PopCount (long b) {
+ int c = 0;
+ b++;
+
+ while (b) {
+ b &= b - 1;
+ c++;
+ }
+ return c;
+}
+int PopCount2 (long b) {
+ int c = 0;
+
+ while (b) {
+ b &= b - 1;
+ c++;
+ }
+ foo (c);
+ return foo (c);
+}
+
+void PopCount3 (long b1) {
+
+ for (long i = 0; i < b1; ++i)
+ {
+ long b = i;
+ int c = 0;
+ while (b) {
+ b &= b - 1;
+ c++;
+ }
+ foo (c);
+ }
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_popcount" 3 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c b/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
new file mode 100644
index 0000000..52afc2d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
@@ -0,0 +1,29 @@
+/* { dg-do execute } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+__attribute__ ((noinline, noclone))
+foo (int i, long b)
+{
+ int c = i;
+
+ while (b) {
+ b &= b - 1;
+ c++;
+ }
+ return c;
+}
+
+int main()
+{
+
+ if (foo (1, 7) != 4)
+ __builtin_abort ();
+ if (foo (0, 0) != 0)
+ __builtin_abort ();
+ if (foo (8, 0xff) != 16)
+ __builtin_abort ();
+ return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_popcount" 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c b/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c
new file mode 100644
index 0000000..0c69d97
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c
@@ -0,0 +1,28 @@
+/* { dg-do execute } */
+/* { dg-options "-O2 -fno-tree-ch -fdump-tree-optimized" } */
+
+int
+__attribute__ ((noinline, noclone))
+foo (long b)
+{
+ int c = i;
+
+ while (b) {
+ b &= b - 1;
+ c++;
+ }
+ return c;
+}
+
+int main()
+{
+ if (foo (7) != 3)
+ __builtin_abort ();
+ if (foo (0) != 0)
+ __builtin_abort ();
+ if (foo (0xff) != 8)
+ __builtin_abort ();
+ return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_popcount" 1 "optimized" } } */
diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
index fefc9de..967b154 100644
--- a/gcc/tree-scalar-evolution.c
+++ b/gcc/tree-scalar-evolution.c
@@ -1984,6 +1984,7 @@ interpret_expr (struct loop *loop, gimple *at_stmt, tree expr)
return expr;
if (TREE_CODE (expr) == POLYNOMIAL_CHREC
+ || TREE_CODE (expr) == CALL_EXPR
|| get_gimple_rhs_class (TREE_CODE (expr)) == GIMPLE_TERNARY_RHS)
return chrec_dont_know;
@@ -3472,6 +3473,7 @@ bool
expression_expensive_p (tree expr)
{
enum tree_code code;
+ tree fndecl;
if (is_gimple_val (expr))
return false;
@@ -3492,6 +3494,11 @@ expression_expensive_p (tree expr)
if (!integer_pow2p (TREE_OPERAND (expr, 1)))
return true;
}
+ if (code == CALL_EXPR
+ && (fndecl = get_callee_fndecl (expr))
+ && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL
+ && (BUILT_IN_POPCOUNT == DECL_FUNCTION_CODE (fndecl)))
+ return false;
switch (TREE_CODE_CLASS (code))
{
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index b313571..519649a 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -985,6 +985,16 @@ contains_abnormal_ssa_name_p (tree expr)
code = TREE_CODE (expr);
codeclass = TREE_CODE_CLASS (code);
+ if (code == CALL_EXPR)
+ {
+ tree arg;
+ call_expr_arg_iterator iter;
+ FOR_EACH_CALL_EXPR_ARG (arg, iter, expr)
+ if (contains_abnormal_ssa_name_p (arg))
+ return true;
+ return false;
+ }
+
if (code == SSA_NAME)
return SSA_NAME_OCCURS_IN_ABNORMAL_PHI (expr) != 0;
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 7a54c5f..66f9b4f 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2430,6 +2430,156 @@ number_of_iterations_exit_assumptions (struct loop *loop, edge exit,
return (!integer_zerop (niter->assumptions));
}
+/* Utility function to check if OP is defined by a stmt
+ that is a val - 1. If that is the case, set it to STMT. */
+
+static bool
+ssa_defined_by_and_minus_one_stmt_p (tree op, tree val, gimple **stmt)
+{
+ if (TREE_CODE (op) == SSA_NAME
+ && (*stmt = SSA_NAME_DEF_STMT (op))
+ && is_gimple_assign (*stmt)
+ && (gimple_assign_rhs_code (*stmt) == PLUS_EXPR)
+ && val == gimple_assign_rhs1 (*stmt)
+ && integer_minus_onep (gimple_assign_rhs2 (*stmt)))
+ return true;
+ else
+ return false;
+}
+
+/* See if LOOP is a popcout implementation of the form
+
+ int c = 0;
+ while (b) {
+ b = b & (b - 1);
+ c++;
+ }
+
+ If popcount pattern, update NITER accordingly.
+ i.e., set NITER to __builtin_popcount (b)
+ return true if we did, false otherwise.
+
+ */
+
+static bool
+number_of_iterations_popcount (loop_p loop, struct tree_niter_desc *niter)
+{
+ tree rhs1, rhs2;
+ tree dest;
+ gimple *and_minus_one;
+ gimple *phi;
+ int count = 0;
+ gimple *count_stmt = NULL;
+ bool adjust = true;
+ edge exit;
+ tree iter;
+ HOST_WIDE_INT max;
+
+ if (!(exit = single_exit (loop)))
+ return false;
+
+ /* Check loop terminating branch is like
+ if (b != 0). */
+ gimple *stmt = last_stmt (loop->header);
+ if (!stmt
+ || gimple_code (stmt) != GIMPLE_COND
+ || gimple_cond_code (stmt) != NE_EXPR
+ || !zerop (gimple_cond_rhs (stmt))
+ || TREE_CODE (gimple_cond_lhs (stmt)) != SSA_NAME)
+ return false;
+
+ /* Check the loop closed SSA definition for just the variable c defined in
+ loop. */
+ basic_block bb = exit->dest;
+ for (gphi_iterator gpi = gsi_start_phis (bb);
+ !gsi_end_p (gpi); gsi_next (&gpi))
+ {
+ phi = gpi.phi ();
+ count++;
+ }
+
+ if (count != 1)
+ return false;
+
+ rhs1 = gimple_phi_arg_def (phi, exit->dest_idx);
+ if (TREE_CODE (rhs1) != SSA_NAME)
+ return false;
+ count_stmt = SSA_NAME_DEF_STMT (rhs1);
+ gimple *and_stmt = SSA_NAME_DEF_STMT (gimple_cond_lhs (stmt));
+
+ /* Depending on copy-header is performed, feeding PHI stmts might be in
+ the loop header or loop exit, handle this. */
+ if (gimple_code (count_stmt) == GIMPLE_PHI)
+ {
+ tree t;
+ if (gimple_code (and_stmt) != GIMPLE_PHI
+ || gimple_bb (and_stmt) != exit->src
+ || gimple_bb (count_stmt) != exit->src)
+ return false;
+ t = gimple_phi_arg_def (count_stmt, loop_latch_edge (loop)->dest_idx);
+ if (TREE_CODE (t) != SSA_NAME)
+ return false;
+ count_stmt = SSA_NAME_DEF_STMT (t);
+ t = gimple_phi_arg_def (and_stmt, loop_latch_edge (loop)->dest_idx);
+ if (TREE_CODE (t) != SSA_NAME)
+ return false;
+ and_stmt = SSA_NAME_DEF_STMT (t);
+ adjust = false;
+ }
+
+ /* Make sure we have a count by one. */
+ if (!is_gimple_assign (count_stmt)
+ || (gimple_assign_rhs_code (count_stmt) != PLUS_EXPR)
+ || !integer_onep (gimple_assign_rhs2 (count_stmt)))
+ return false;
+
+ /* Check "b = b & (b - 1)" is calculated. */
+ if (!is_gimple_assign (and_stmt)
+ || gimple_assign_rhs_code (and_stmt) != BIT_AND_EXPR)
+ return false;
+
+ rhs1 = gimple_assign_rhs1 (and_stmt);
+ rhs2 = gimple_assign_rhs2 (and_stmt);
+
+ if (ssa_defined_by_and_minus_one_stmt_p (rhs1, rhs2, &and_minus_one))
+ rhs1 = rhs2;
+ else if (ssa_defined_by_and_minus_one_stmt_p (rhs2, rhs1, &and_minus_one))
+ ;
+ else
+ return false;
+
+ /* Check the recurrence. */
+ phi = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (and_minus_one));
+ gimple *src_phi = SSA_NAME_DEF_STMT (rhs2);
+ if (gimple_code (phi) != GIMPLE_PHI
+ || gimple_code (src_phi) != GIMPLE_PHI)
+ return false;
+
+ dest = gimple_assign_lhs (count_stmt);
+ tree fn = builtin_decl_implicit (BUILT_IN_POPCOUNT);
+ tree src = gimple_phi_arg_def (src_phi, loop_preheader_edge (loop)->dest_idx);
+ if (adjust)
+ iter = fold_build2 (MINUS_EXPR, TREE_TYPE (dest),
+ build_call_expr (fn, 1, src),
+ build_int_cst (TREE_TYPE (dest), 1));
+ else
+ iter = build_call_expr (fn, 1, src);
+ max = int_cst_value (TYPE_MAX_VALUE (TREE_TYPE (dest)));
+
+ niter->assumptions = boolean_false_node;
+ niter->control.base = NULL_TREE;
+ niter->control.step = NULL_TREE;
+ niter->control.no_overflow = false;
+ niter->niter = iter;
+ niter->assumptions = boolean_true_node;
+ niter->may_be_zero = boolean_false_node;
+ niter->max = max;
+ niter->bound = NULL_TREE;
+ niter->cmp = ERROR_MARK;
+ return true;
+}
+
+
/* Like number_of_iterations_exit_assumptions, but return TRUE only if
the niter information holds unconditionally. */
@@ -2441,7 +2591,11 @@ number_of_iterations_exit (struct loop *loop, edge exit,
gcond *stmt;
if (!number_of_iterations_exit_assumptions (loop, exit, niter,
&stmt, every_iteration))
- return false;
+ {
+ if (number_of_iterations_popcount (loop, niter))
+ return true;
+ return false;
+ }
if (integer_nonzerop (niter->assumptions))
return true;
--
2.7.4