This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Improve detection of constant conditions during jump threading


On 04/20/2016 03:02 AM, Richard Biener wrote:
On Tue, Apr 19, 2016 at 7:50 PM, Patrick Palka <patrick@parcs.ath.cx> wrote:
This patch makes the jump threader look through the BIT_AND_EXPRs and
BIT_IOR_EXPRs within a condition so that we could find dominating
ASSERT_EXPRs that could help make the overall condition evaluate to a
constant.  For example, we currently don't perform any jump threading in
the following test case even though it's known that if the code calls
foo() then it can't possibly call bar() afterwards:
I'd always envisioned we'd do more simplifications than we're doing now and this fits well within how I expected to exploit ASSERT_EXPRs and DOM's available expressions/const/copies tables.

However, I do have some long term direction plans that may make how we do this change a bit. In the mean time I don't see a reason not to go forward with your change.




void
baz_1 (int a, int b, int c)
{
  if (a && b)
    foo ();
  if (!b && c)
    bar ();
}

   <bb 2>:
   _4 = a_3(D) != 0;
   _6 = b_5(D) != 0;
   _7 = _4 & _6;
   if (_7 != 0)
     goto <bb 3>;
   else
     goto <bb 4>;

   <bb 3>:
   b_15 = ASSERT_EXPR <b_5(D), b_5(D) != 0>;
   foo ();

   <bb 4>:
   _10 = b_5(D) == 0;
   _12 = c_11(D) != 0;
   _13 = _10 & _12;
   if (_13 != 0)
     goto <bb 5>;
   else
     goto <bb 6>;

   <bb 5>:
   bar ();

   <bb 6>:
   return;

So we here miss a jump threading opportunity that would have made bb 3 jump
straight to bb 6 instead of falling through to bb 4.

If we inspect the operands of the BIT_AND_EXPR of _13 we'll notice that
there is an ASSERT_EXPR that says its left operand b_5 is non-zero.  We
could use this ASSERT_EXPR to deduce that the condition (_13 != 0) is
always false.  This is what this patch does, basically by making
simplify_control_stmt_condition recurse into BIT_AND_EXPRs and
BIT_IOR_EXPRs.

Does this seem like a good idea/approach?
So the other approach I've been pondering for a while is backwards substitution.

So given _13 != 0, we expand that to

(_10 & _12) != 0

Which further expands into
((b_5 == 0) & (c_11 != 0)) != 0

And we follow b_5 back to the ASSERT_EXPR which allows us to start simplifying terms.


The glitch in that plan is there is no easy linkage between the use of b_5 in bb4 and the ASSERT_EXPR in bb3. That's something Aldy, Andrew and myself are looking at independently for some of Aldy's work.

But that's all future work...  Back to your patch...


Notes:

1. This patch introduces a "regression" in gcc.dg/tree-ssa/ssa-thread-11.c
in that we no longer perform FSM threading during vrp2 but instead we
detect two new jump threading opportunities during vrp1.  Not sure if
the new code is better but it is shorter.  I wonder how this should be
resolved...

Try adjusting the testcase so that it performs the FSM threading again
or adjust the expected outcome...
Right. We just need to look closely at the before/after dumps, make a decision about whether the result is better or worse. If it's better, then we adjust the output to the new better result (and I would claim that the same threading, but done earlier is better).

Shorter isn't generally a good indicator of whether or not something is better. The thing to look at is the number of conditional executed on the various paths through the CFG.

In this specific instance, there's a good chance your analysis is catching something earlier and allowing it to be better simplified. But let's do the analysis to make sure.


2. I haven't tested the performance impact of this patch.  What would be
a good way to do this?
I have ways to do that. Jump threading inherently is about reducing the number of conditionals we have to evaluate at runtime. Valgrind can give us that information. So what I do is...

Build a control compiler. Use that to build an older version of gcc (gcc-4.7.3 in particular) . I then use the just-built gcc-4.7.3 to build a bunch of .i files under valgrind control.

I then repeat, but the first compiler has whatever patch I want to evaluate installed.

I can then compare the output from valgrind to see which has better branching behaviour.

It takes a few hours, but it's not terrible and has provided good data through the years. I'll throw your patch into the tester and see what it spits out.

I'll also have a look at the details of the patch.

jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]