Bug 45397 - [6 Regression] Issues with integer narrowing conversions
Summary: [6 Regression] Issues with integer narrowing conversions
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.6.0
: P2 normal
Target Milestone: 7.4
Assignee: Richard Biener
URL:
Keywords: missed-optimization
Depends on: 79740 79777
Blocks: 65964
  Show dependency treegraph
 
Reported: 2010-08-24 13:19 UTC by Jakub Jelinek
Modified: 2018-11-19 11:53 UTC (History)
8 users (show)

See Also:
Host:
Target:
Build:
Known to work: 7.0
Known to fail:
Last reconfirmed: 2011-12-05 14:13:05


Attachments
minmax.patch (980 bytes, patch)
2010-08-24 13:41 UTC, Jakub Jelinek
Details | Diff
patch doing this in SCCVN (2.15 KB, patch)
2017-02-24 09:38 UTC, Richard Biener
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jakub Jelinek 2010-08-24 13:19:21 UTC
In http://gcc.gnu.org/ml/gcc/2010-08/msg00326.html
Revital complained about MAX_EXPR no longer being recognized in:
int foo (const unsigned char *tmp, int i, int val)
{
  return (unsigned char)(((tmp[i] + val)>0xFF)?0xFF:(((tmp[i] + val)<0)?0:(tmp[i] + val)));
}
It is still recognized when using:
int bar (const unsigned char *tmp, int i, int val)
{
  int x = (((tmp[i] + val)>0xFF)?0xFF:(((tmp[i] + val)<0)?0:(tmp[i] + val)));
  return (unsigned char)x;
}
The regression is caused by folding being delayed in the C FE, while before
the inner COND_EXPR has been optimized by fold_ternary* into MAX_EXPR, now it isn't immediately, thus convert_to_integer on the narrowing conversion propagates the (unsigned char) narrowing casts down into the operands and when fold_ternary* is actually called, it is too late, as the operands aren't considered equal.

Perhaps it would be nice for -O2 to consider for e.g. int a and b
that (unsigned char) (a + b) is the same as (unsigned char) a + (unsigned char) b during VN.
Comment 1 Jakub Jelinek 2010-08-24 13:35:12 UTC
On the C FE side, perhaps these optimizations should be moved over from convert_to_integer to e.g. c_fully_fold_internal, where the operand would be folded first and only afterwards the narrowing conversion optimization would be applied.  That would solve this testcase too, but wouldn't help the case where the narrowing conversion is done by the user already.

int foo (const unsigned char *a, int b, int c)
{
  int x = (unsigned char) (a[b] + c);
  int y = a[b] + c;
  int z = (unsigned char) y;
  return x == z;
}

is another testcase which isn't currently optimized at the tree level (the RTL level can figure this out though).
Comment 2 Jakub Jelinek 2010-08-24 13:41:37 UTC
Created attachment 21556 [details]
minmax.patch

I've also noticed that the folder only recognizes the inner MAX_EXPR, but doesn't recognize the outer MIN_EXPR.  Apparently (when convert_to_integer doesn't harm it, i.e. the bar testcase instead of foo) phiopt1 later on figures this out.  The attached (untested) patch handles that case early, but not sure it is worth it.
Comment 3 Richard Biener 2010-08-24 14:13:05 UTC
Confirmed.
Comment 4 Richard Biener 2010-12-16 13:03:18 UTC
GCC 4.5.2 is being released, adjusting target milestone.
Comment 5 Richard Biener 2011-04-28 14:51:32 UTC
GCC 4.5.3 is being released, adjusting target milestone.
Comment 6 Kai Tietz 2012-01-02 13:09:09 UTC
Hmm, we could solve this via forward-propagation in gimple, too.
I have a patch which does this, but not sure if it is material for current stage.  The missing patterns in forward-propagation are:
  - X ==/!= X -> true/false,
  - (type) X ==/!= (type) X -> true/false
  - X code Y ==/!= X code Z -> Y ==/!= Z (with code +, -, or ^).
  - (type) X ==/!= Y & CST -> X ==/!= (type-of-X) (Y & CST) if type-of-X has smaller, or equal precision as type and is unsigned, and type-of-x and type are of integral kind,
  and CST fits into type-of-X.
  - (type) (X op CST) -> (type) X op CST' with CST' = (type) X; and type has smaller or equal precision to type-of-x and both types are of integral kind. (for op = +,-,&,|,^)
  - (type) ((type2) X op Y) -> X op (type) Y, if type has smaller or equal precision to type2 and type-of-x is compatible to type, all types are of integral kind, and op is a +,-,&,|,or ^ expression.
Comment 7 Jakub Jelinek 2012-01-04 20:13:11 UTC
Not a stage4 material.
Comment 8 Jakub Jelinek 2012-03-02 09:36:52 UTC
Another testcase (lightly based on PR50182), -O3 -mavx:

signed char a[1024], b[1024];

void
foo (void)
{
  int i, s, t;
  for (i = 0; i < 1024; i++)
    { s = a[i]; t = b[i]; s += t; a[i] = s; }
}

void
bar (void)
{
  int i;
  for (i = 0; i < 1024; i++)
    a[i] += b[i];
}

shows terrible code for the first loop and nice for the second one, eventhough both can be implemented as the second loop.
Comment 9 Jakub Jelinek 2012-03-02 09:39:16 UTC
Note that once can't use signed type in the narrowing + of course, it needs to be unsigned char addition to avoid overflows.
Comment 10 Jakub Jelinek 2012-03-02 09:48:17 UTC
Another testcase:

signed char a[1024], b[1024];

void
baz (void)
{
  int i, s, t;
  for (i = 0; i < 1024; i++)
    { s = a[i]; t = b[i]; s += t + 0x12345600; a[i] = s; }
}

The addition of constant we should be able to optimize away completely, but we don't.
Comment 11 Jakub Jelinek 2012-03-06 12:48:19 UTC
Related to PR47477.
Comment 12 Jakub Jelinek 2013-03-22 14:43:54 UTC
GCC 4.8.0 is being released, adjusting target milestone.
Comment 13 Jakub Jelinek 2013-05-31 10:58:16 UTC
GCC 4.8.1 has been released.
Comment 14 Jakub Jelinek 2013-10-16 09:50:18 UTC
GCC 4.8.2 has been released.
Comment 15 Richard Biener 2014-05-22 09:03:06 UTC
GCC 4.8.3 is being released, adjusting target milestone.
Comment 16 Jakub Jelinek 2014-12-19 13:29:03 UTC
GCC 4.8.4 has been released.
Comment 17 Richard Biener 2015-06-23 08:18:22 UTC
The gcc-4_8-branch is being closed, re-targeting regressions to 4.9.3.
Comment 18 Jakub Jelinek 2015-06-26 19:54:24 UTC
GCC 4.9.3 has been released.
Comment 19 Marek Polacek 2015-10-19 11:39:36 UTC
When this deficiency is finally resolved, we should be able to remove the code in convert_to_integer dealing with narrowing operands:

 738         case PLUS_EXPR:
 739         case MINUS_EXPR:
 740         case BIT_AND_EXPR:
 741         case BIT_IOR_EXPR:
 742         case BIT_XOR_EXPR:
 743         trunc1:
 744           {
 ...
 821                     return convert (type,
 822                                     fold_build2 (ex_form, typex,
 823                                                  convert (typex, arg0),
 824                                                  convert (typex, arg1)));
Comment 20 Richard Biener 2016-08-03 10:48:53 UTC
GCC 4.9 branch is being closed
Comment 21 Jeffrey A. Law 2017-02-14 16:14:24 UTC
Couldn't we solve this with a pattern like this in match.pd:

(simplify
  (convert (plus:c @0 (convert @1)))
  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
       && INTEGRAL_TYPE_P (TREE_TYPE (@0))
       && INTEGRAL_TYPE_P (type)
       && types_match (@1, type)
       && (TYPE_PRECISION (TREE_TYPE (@1)) > TYPE_PRECISION (TREE_TYPE (@0))))
    (bit_and (plus (convert @0) @1)
       { wide_int_to_tree
                (type, wi::mask (TYPE_PRECISION (TREE_TYPE (@0)), false, TYPE_PRECISION (type))); })))

Essentially this widens the arithmetic, then masks off undesirable bits in the result.  As written I think this doesn't reduce the number of expressions consistently even though it helps the testcase.

The reason it helps the testcase is the (convert @0) in the output pattern turns out to be a common subexpression.  If (convert @0) is CSE-able, then this does reduce the number of expressions.  Though we probably want to avoid widening beyond a target's wordsize (which I loathe to test for).

Thoughts anyone?
Comment 22 Richard Biener 2017-02-16 08:21:40 UTC
I think the pattern to identify is the *tmp + val saturating addition.  One of
the issues is that we have tmp[i] + val in two forms:

  _5 = (int)tmp[i] + val;

and

  _iftmp.1_13 = (int)(tmp[i] + (unsigned char)val);

where the latter is guarded with an overflow check (_5 <= 255 && _5 >= 0).

IMHO the unfortunate thing is that we arrive with this inconsistent
narrowing here in the first place before we had a chance to clean things up.
Of course as Jakub notices in comment#1 the user could have written this
in this way.  And _only_ if the no-overflow condition holds are the two
expressions the same (contrary to Jakubs example that compares both values
as unsigned char).

As suggested above we may want to do "saturated OP" pattern matching here
and then re-expand in more optimal form (using min/max as suggested or
using an IFN like we have for the overflow cases).

Another option is to teach VRP to detect the equivalence (but that looks hard
and even more ad-hoc).  As for using CSE in some way it would need to see
the range of _5 and conditionally add (int)(unsigned char)expr-of-_5 simplified
and hope things get detected that way... (but first of all it would need a
value-range lattice).

So the only "reasonable" suggestion is pattern matching the whole thing
which fits phiopt.

I do not think that having a 'widening' general pattern is the way to go.
That has the same issues as the current narrowing one.

Moving down the narrowing optimization from the FEs folding code is a step
in the right direction anyway (but it would need putting elsewhere to not "regress").

I think that all but maybe a saturating-OP replacement in phiopt is GCC 8 material.  Adjusting target milestone to GCC 7 meanwhile.
Comment 23 Jeffrey A. Law 2017-02-16 17:30:51 UTC
The model of shortening as much as possible for gimple, then widening to word mode at the gimple/RTL boundary is probably too simplistic.  We really need the ability to widen when doing so allows for simplification.

That's what the prototype pattern I posted was meant to show.  By widening we can expose the common subexpressions, and once the common subexpressions are exposed, min/max can do its job.  The problem is we don't have range data, so we have to do masking of the result, nor do we know if the widened operand is a common subexpression or not.  Thus profitability is unknown at transformation time.

I was about to write off VRP again because it lacks much of the CSE capability we want, but VRP has the information to know that the result is value preserving if done in the wider mode.  So at VRP time we could do the widening without needing to mask the result, at which point widening is known to be profitable.  As written we have two casts + addition.  VRP could turn that into a addition with a single cast (both of which are common subexpressions, but VRP doesn't need to detect that to ensure profitability).

DOM/FRE have CSE infrastructure, but the range data presented to them is not path sensitive and thus useless in this case.

I'm hesitant to try to do all this in phi-opt -- phi-opt would have to look through the casts, prove equivalences, etc.  ick.
Comment 24 Richard Biener 2017-02-17 09:34:52 UTC
(In reply to Jeffrey A. Law from comment #23)
> The model of shortening as much as possible for gimple, then widening to
> word mode at the gimple/RTL boundary is probably too simplistic.  We really
> need the ability to widen when doing so allows for simplification.

But the key point here is "when doing so allows for simplification" ...

> That's what the prototype pattern I posted was meant to show.  By widening
> we can expose the common subexpressions, and once the common subexpressions
> are exposed, min/max can do its job.  The problem is we don't have range
> data, so we have to do masking of the result, nor do we know if the widened
> operand is a common subexpression or not.  Thus profitability is unknown at
> transformation time.
> 
> I was about to write off VRP again because it lacks much of the CSE
> capability we want, but VRP has the information to know that the result is
> value preserving if done in the wider mode.  So at VRP time we could do the
> widening without needing to mask the result, at which point widening is
> known to be profitable.  As written we have two casts + addition.  VRP could
> turn that into a addition with a single cast (both of which are common
> subexpressions, but VRP doesn't need to detect that to ensure profitability).
> 
> DOM/FRE have CSE infrastructure, but the range data presented to them is not
> path sensitive and thus useless in this case.

Note that mid-term I'd like our value-numbering infrastructure (VRP is also
a kind of value-numbering) to improve to a point where we simply do VRP
alongside CSE, constant, copy and known-zero/one-bits propagation.  You
"simply" track multiple lattices (configurable) from a common iteration
(configurable as either non-iterating, pessimistically handling backedges,
or iterating, optimistically handling them).

> I'm hesitant to try to do all this in phi-opt -- phi-opt would have to look
> through the casts, prove equivalences, etc.  ick.

Yeah, ICK.  But still this case is very much about pattern matching
(and getting min/max in the end).  It _is_ a saturating operation which
might be common enough to pattern-match (heh, some targets even have
instruction support for them and the middle-end has saturating types!)

So short-term pattern matching in phiopt is the way to go (I'd consider
that even as a regression fix).
Comment 25 Jeffrey A. Law 2017-02-17 16:31:13 UTC
"When doing so allows for simplification" is actually a pretty low bar here.  We just have to prove the widening cast is a common subexpression.  Once we do that, we know widening is a win.  And that's relatively easy to discover.  You go back to the SSA_NAME_DEF_STMT of the object you want to widen, then walk over its immediate uses to see if one is a proper widening op that dominates the expression we're trying to simplify. 

That's it, nothing else needed to prove profitability.  We only get ~20 hits in a bootstrap, but I didn't expect this idiom to show up all that much.


--

Trying to do all the pattern matching in phi-opt and ultimately generate the min/max is going to be *hideous* because we're trying to do too  many things at once.  We have components of CSE, VRP and DCE that we'd have to bring into phi-opt, which seems wrong to me.

Contrast that to simplifying the IL like my match.pd pattern does.   We make a simple, profitable, change to the IL.  That one profitable change in the IL has ripple effects that allow subsequent passes to do their jobs with the final result that the minmax detection code is presented with exactly the IL it knows how to transform.
 
--

Alternately (and I haven't prototyped this) we could try again to look at this as a redundancy elimination problem.

Given X = A + B in DOM, where A comes from a narrowed operand (A').  Lookup a widened B in the expression table (B').  If found, then lookup A' + B'.  If found (lhs), then replace X = A + B with X = (typeof X) ((lhs) & mask).

That's essentially doing the same thing as the prototype match.pd pattern.  Except that we know the operand widening as well as the widened arithmetic are both CSEs.
Comment 26 rguenther@suse.de 2017-02-20 10:35:37 UTC
On Fri, 17 Feb 2017, law at redhat dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45397
> 
> --- Comment #25 from Jeffrey A. Law <law at redhat dot com> ---
> "When doing so allows for simplification" is actually a pretty low bar here. 
> We just have to prove the widening cast is a common subexpression.  Once we do
> that, we know widening is a win.  And that's relatively easy to discover.  You
> go back to the SSA_NAME_DEF_STMT of the object you want to widen, then walk
> over its immediate uses to see if one is a proper widening op that dominates
> the expression we're trying to simplify. 
> 
> That's it, nothing else needed to prove profitability.  We only get ~20 hits in
> a bootstrap, but I didn't expect this idiom to show up all that much.

I find this kind of "technique" of finding CSE opportunities quite ugly
and I'm not sure where you'd actually implement this...

> 
> --
> 
> Trying to do all the pattern matching in phi-opt and ultimately generate the
> min/max is going to be *hideous* because we're trying to do too  many things at
> once.  We have components of CSE, VRP and DCE that we'd have to bring into
> phi-opt, which seems wrong to me.

Well, we are pattern matching the overflow builtins as well.  I think
it would be quite natural to pattern match saturating operations with
the premise to generate better code for them (either by generic expansion
to min/max or by utilizing CPU instructions -- I think SSE has saturating
ops for example).  You don't really need CSE, VRP and DCE, you "simply"
pattern match the comparisons.  You'd then replace the PHI node by
the saturated op and let DCE do the rest for you.

> Contrast that to simplifying the IL like my match.pd pattern does.   We make a
> simple, profitable, change to the IL.  That one profitable change in the IL has
> ripple effects that allow subsequent passes to do their jobs with the final
> result that the minmax detection code is presented with exactly the IL it knows
> how to transform.
> 
> --
> 
> Alternately (and I haven't prototyped this) we could try again to look at this
> as a redundancy elimination problem.
> 
> Given X = A + B in DOM, where A comes from a narrowed operand (A').  Lookup a
> widened B in the expression table (B').  If found, then lookup A' + B'.  If
> found (lhs), then replace X = A + B with X = (typeof X) ((lhs) & mask).
> 
> That's essentially doing the same thing as the prototype match.pd pattern. 
> Except that we know the operand widening as well as the widened arithmetic are
> both CSEs.

CSE already does similar stuff for loads, transparently adding 
VIEW_CONVERT_EXPRs.  I've added generic helpers for this
(vn_nary_build_or_lookup).  The question is if it's worth it.
You'd basically add to the list of special patterns in visit_nary_op
where we already handle ops in different sign (you'd add "op in different
width" and possibly add two stmts instead of one - the conversion and
the masking).

Oh, happens to be a patch in my local tree only (no longer remember
what was the reason to invent it but surely the question is how
many of these special-cases are worth the extra lookups)

Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c        (revision 245417)
+++ gcc/tree-ssa-sccvn.c        (working copy)
@@ -3453,17 +3493,78 @@ visit_copy (tree lhs, tree rhs)
 static bool
 visit_nary_op (tree lhs, gimple *stmt)
 {
-  bool changed = false;
   tree result = vn_nary_op_lookup_stmt (stmt, NULL);
-
   if (result)
-    changed = set_ssa_val_to (lhs, result);
-  else
-    {
-      changed = set_ssa_val_to (lhs, lhs);
-      vn_nary_op_insert_stmt (stmt, lhs);
+    return set_ssa_val_to (lhs, result);
+  else if (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+          && (TYPE_PRECISION (TREE_TYPE (lhs))
+              == GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (lhs))))
+          && is_gimple_assign (stmt))
+    {
+      /* For operations on integers that have the same result independent
+         of the signedness of their operands try to lookup a result with
+        opposite sign.  */
+      enum tree_code code = gimple_assign_rhs_code (stmt);
+      switch (code)
+       {
+       case NEGATE_EXPR:
+         {
+           tree type = TREE_TYPE (lhs);
+           type = signed_or_unsigned_type_for (! TYPE_UNSIGNED (type), 
type);
+           tree ops[3];
+           ops[0] = gimple_assign_rhs1 (stmt);
+           ops[0] = vn_nary_op_lookup_pieces (1, NOP_EXPR, type, ops, 
NULL);
+           if (ops[0])
+             {
+               ops[0] = vn_nary_op_lookup_pieces (1, code, type,
+                                                  ops, NULL);
+               if (ops[0])
+                 {
+                   result = vn_nary_build_or_lookup (NOP_EXPR,
+                                                     TREE_TYPE (lhs), 
ops);
+                   if (result)
+                     return set_ssa_val_to (lhs, result);
+                 }
+             }
+           break;
+         }
+       case PLUS_EXPR:
+       case MINUS_EXPR:
+       case MULT_EXPR:
+       case LSHIFT_EXPR:
+         {
+           tree type = TREE_TYPE (lhs);
+           type = signed_or_unsigned_type_for (! TYPE_UNSIGNED (type), 
type);
+           tree ops[3];
+           ops[0] = gimple_assign_rhs1 (stmt);
+           ops[0] = vn_nary_op_lookup_pieces (1, NOP_EXPR, type, ops, 
NULL);
+           if (ops[0])
+             {
+               ops[1] = gimple_assign_rhs2 (stmt);
+               if (code != LSHIFT_EXPR)
+                 ops[1] = vn_nary_op_lookup_pieces (1, NOP_EXPR, type,
+                                                    &ops[1], NULL);
+               if (ops[1])
+                 {
+                   ops[0] = vn_nary_op_lookup_pieces (2, code, type,
+                                                      ops, NULL);
+                   if (ops[0])
+                     {
+                       result = vn_nary_build_or_lookup (NOP_EXPR,
+                                                         TREE_TYPE (lhs), 
ops);
+                       if (result)
+                         return set_ssa_val_to (lhs, result);
+                     }
+                 }
+             }
+           break;
+         }
+       default:;
+       }
     }
 
+  bool changed = set_ssa_val_to (lhs, lhs);
+  vn_nary_op_insert_stmt (stmt, lhs);
   return changed;
 }
Comment 27 Richard Biener 2017-02-20 12:12:09 UTC
patch misses some ops = {} and ops[1] = NULL_TREE; to avoid ICEing.
Comment 28 Jeffrey A. Law 2017-02-24 06:18:07 UTC
WRT c#26.  

Yes, I would agree that finding CSE's that way is rather gross, but significantly less so than what will be required to address this problem in phi-opt.

Pattern matching this is going to be significantly more complex than the ADD_OVERFLOW/SUB_OVERFLOW.  I've looked at that code via pr79095 and catching these saturating idioms is significantly more complex.

I prototyped the idea of having DOM do extra lookups for a widened version of X OP Y.  It's a bit ugly relative to the match.pd approach, but possibly manageable.

However, doing that in DOM exposes some lameness we'd have to address in VRP.

Prior to DOM2 we have:

;   basic block 4, loop depth 0, count 0, freq 4665, maybe hot
;;    prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED)
;;    pred:       3 [67.6%]  (TRUE_VALUE,EXECUTABLE)
  _6 = (unsigned char) val_12(D);
  _7 = _3 + _6;
  iftmp.1_13 = (int) _7;

We transform that into:

;;   basic block 4, loop depth 0, count 0, freq 4665, maybe hot
;;    prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED)
;;    pred:       3 [67.6%]  (TRUE_VALUE,EXECUTABLE)
  iftmp.1_13 = _15 & 255

That's value preserving and clearly an improvement.  Unfortunately we have to wait until vrp2 to discover that the masking is redundant and simplify the statement into:


iftmp.1_13 = _15

The problem is we do not propagate that copy into the PHI node at BB4's successor.  By the time we do finally propagate away the copy, there aren't any additional phi-opt passes to turn things into a MIN/MAX.

THe lack of copy propagation when VRP simplifies a statement like that is due to using op_with_constant_singleton_value_range as the callback for substitute_and_fold.  op_with_constant_singleton_value_range only returns exactly what you would think -- constant singleton ranges.  Thus we don't discover or exploit copy propagation opportunities created by VRP's statement simplification.

Enhancing the callback to return an SSA_NAME for cases were VRP simplifies arithmetic/logicals to copies allows the propagation step to propagate the copy into the PHI node in BB4's successor.

That in turn allows phi-opt to do its job and by the .optimized dump we have:

;;   basic block 2, loop depth 0, count 0, freq 10000, maybe hot
;;    prev block 0, next block 1, flags: (NEW, REACHABLE, VISITED)
;;    pred:       ENTRY [100.0%]  (FALLTHRU,EXECUTABLE)
  _1 = (sizetype) i_9(D);
  _2 = tmp_10(D) + _1;
  _3 = *_2;
  _4 = (int) _3;
  _5 = _4 + val_12(D);
  _16 = MAX_EXPR <_5, 0>;
  iftmp.0_7 = MIN_EXPR <_16, 255>;
  return iftmp.0_7;
;;    succ:       EXIT [100.0%]


Which is what we want at this stage.  Transforming something like that into saturating arithmetic for processors which support such insns is much easier (but IMHO out of the scope of this BZ).

Anyway, I'm offline the next few days and largely booked on non-technical stuff much of March.  I don't know if I'll be able to push this further over the next few weeks or not.
Comment 29 Richard Biener 2017-02-24 09:38:38 UTC
Created attachment 40822 [details]
patch doing this in SCCVN

This instead of in DOM implements it in SCCVN which makes the saturated ops recognized in phiopt1.

The other testcases in this bug involve other peculiarities like having
& 255 instead of a conversion to match or having A + B + CST.  This shows
that this kind of matching is really a hack... (or well, it show for another
time that handling CSE and larger expressions that are associatable is difficult).
Comment 30 Jeffrey A. Law 2017-02-24 15:45:26 UTC
Dropping the regression marker.  THe other cases are requests for additional optimizations, not regressions.
Comment 31 Richard Biener 2017-02-27 08:52:00 UTC
Author: rguenth
Date: Mon Feb 27 08:51:28 2017
New Revision: 245752

URL: https://gcc.gnu.org/viewcvs?rev=245752&root=gcc&view=rev
Log:
2017-02-27  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/45397
	* tree-ssa-pre.c (eliminate_insert): Handle BIT_AND_EXPR.
	* tree-ssa-sccvn.c (valueized_wider_op): New helper.
	(visit_nary_op): Add pattern matching for CSEing sign-changed
	or truncated operations with wider ones.

	* gcc.dg/tree-ssa/pr45397.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/pr45397.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-pre.c
    trunk/gcc/tree-ssa-sccvn.c
Comment 32 Richard Biener 2017-02-27 09:00:45 UTC
Regression fixed on trunk.
Comment 33 Jakub Jelinek 2017-05-02 15:56:46 UTC
GCC 7.1 has been released.
Comment 34 Richard Biener 2018-01-25 08:27:09 UTC
GCC 7.3 is being released, adjusting target milestone.
Comment 35 Martin Liška 2018-11-19 11:53:44 UTC
GCC 6 branch is closed, thus closing.