Bug 92233 - missed optimisation for multiplication when it's known that at least one of the arguments is 0
Summary: missed optimisation for multiplication when it's known that at least one of t...
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 9.2.1
: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2019-10-26 05:41 UTC by SztfG
Modified: 2020-04-18 21:41 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2019-10-28 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description SztfG 2019-10-26 05:41:08 UTC
testcase:

unsigned test_mult(unsigned a, unsigned b)
{
  if ((a == 0) || (b == 0))
  {
    return a*b; // here a*0 or 0*b or 0*0 - always 0
  }
  return 0;
}

So this function should always return 0 no matter what, but GCC generate comparisons and imul instruction, even with -O3
Comment 1 Marc Glisse 2019-10-26 09:24:05 UTC
(llvm doesn't do it either)
Would some kind of threading be the most natural way to handle this? If the compiler duplicates the code as

if (a==0) return a*b;
else if (b==0) return a*b;

then it becomes easy to optimize. There could be a heuristic to encourage the compiler to do that when the test is var == cst and var is used in an operation that greatly simplifies for cst.

On powerpc64le-linux-gnu (so the 2 tests aren't combined as bit_ior), if I test test_mul(a,b)==0 in another function, it does simplify to true, with DOM2 doing the interesting part.
Comment 2 Richard Biener 2019-10-28 08:25:15 UTC
It's kind-of tail-duplication that is required here.  The jump threading code
can likely be abused here but the important thing is of course the costing
where unlike with jump-threading, there's no branch that will go away.

In theory (and with --param logical-op-non-short-circuit=0) GVN PRE could
also see that the multiplication result is fully available on both
arms (but the VN part doesn't know about conditional equivalences [yet]).

That said, it's a value-numbering issue as soon as (like here) a value
is always known to have some specific value.

But yes, it might be easier to have another transform simplify the problem
for us.

Oh, and logical-op-non-short-circuit manifests itself too early.
Comment 3 Jeffrey A. Law 2020-04-18 21:41:55 UTC
Rather than jump threading what I think we want is the ability to note that the two object's values are related.  In this specific example, in the THEN arm we know that at least one of them must be zero and thus the result of the multiplication must be zero.

This kind of relational tracking is also helpful to jump threading and false positive elimination for the various middle end warnings.  There's one or more other BZs which show that, but I don't know their #s offhand.