Bug 14844 - [tree-ssa] narrow types if wide result is not needed for unsigned types or when wrapping is true
Status: WAITING
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: tree-ssa
Importance: P2 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization, TREE
Depends on: 15459
Blocks: 19986 65964
 
Reported: 2004-04-04 16:15 UTC by Kazu Hirata
Modified: 2017-01-13 03:49 UTC
CC List: 2 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2011-07-18 00:00:00


Description Kazu Hirata 2004-04-04 16:15:57 UTC
Consider:

int
foo (int a, int b)
{
  long long aa = a;
  long long bb = b;
  return aa + bb;
}

The last tree looks like:

foo (a, b)
{
<bb 0>:
  return (int)(long long int)a + (int)(long long int)b;

}

We may be able to solve this problem by propagating narrowing casts backward.
Comment 1 Andrew Pinski 2004-04-04 16:20:45 UTC
Confirmed, but already fixed by my cast patch: <http://gcc.gnu.org/ml/gcc-patches/2004-04/msg00169.html>.
Comment 2 Andrew Pinski 2004-04-05 01:12:10 UTC
Another example, which is not fixed by my patch (though I should handle it):
int
foo (int a, int b)
{
  long long aa = a;
  long long bb = b;
  long long cc = aa + bb;
  int dd = cc;
  return dd;
}
Comment 3 Andrew Pinski 2004-04-06 00:06:23 UTC
It is a memory hog and a compile-time hog because too much RTL is produced.
Comment 4 Andrew Pinski 2004-04-06 18:08:55 UTC
Also, it does not get optimized until after register allocation, which is bad news for some targets like x86.
Comment 5 Andrew Pinski 2004-05-17 14:05:06 UTC
Once fold is able to turn (int)(a+b) into (int)a + (int)b, the combine pass (15459) should be able to fix this.
Comment 6 Andrew Pinski 2004-06-21 05:08:35 UTC
Well I am not going to fix this one yet.
Comment 7 Andrew Pinski 2005-02-14 02:23:43 UTC
Note that the code in build_binop from the C and C++ front-ends needs to be moved to fold; once that happens, my tree combiner will just work.
Comment 8 Paul Schlie 2005-02-14 05:47:39 UTC
(In reply to comment #7)
> Note that the code in build_binop from the C and C++ front-ends needs to be moved to fold; once that
> happens, my tree combiner will just work.

Sorry, but I am a little confused: isn't it perfectly correct to shorten these operands to the width
of the operation's assignment, and shouldn't that in fact be done? (So there is nothing to correct,
and arguably this should have been identified as such by the front-ends to begin with, it would seem.)
Comment 9 Andrew Pinski 2005-02-14 05:57:31 UTC
(In reply to comment #8)
> Sorry, but I am a little confused: isn't it perfectly correct to shorten these operands to the width
> of the operation's assignment, and shouldn't that in fact be done? (So there is nothing to correct,
> and arguably this should have been identified as such by the front-ends to begin with, it would seem.)

What I was trying to say is that this should not be done in the front-end, as the front-end has almost no business dealing with optimizations.
Comment 10 Andrew Pinski 2005-11-29 01:03:01 UTC
(In reply to comment #5)
> Once fold is able to turn (int)(a+b) into (int)a + (int)b, the combine pass
> (15459) should be able to fix this.

I should note that this is only valid for unsigned types or when -fwrapv is supplied. Take a being INT_MAX and b being 1 (both have the same type, which is larger than int): with (int)(a) + (int)(b) we get an overflow, which is undefined, but (int)(a+b) is defined: the wide sum is converted to int, which GCC wraps to INT_MIN. See PR 25125 for an example of where this currently goes wrong because of the front-end/convert.c mess.

We could define new tree codes for addition where overflow is defined as wrapping.
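
To make the hazard in this comment concrete, here is a minimal sketch in C. The function names are hypothetical and not part of GCC. It assumes that long long is wider than int and that both operands fit in int. It computes the reference result in the wide type and reports whether the naive rewrite (int)a + (int)b would overflow, without ever executing the overflowing addition.

```c
#include <assert.h>
#include <limits.h>

/* Reference semantics: add in the wide type, then convert to int.
   Converting an out-of-range value to int is implementation-defined;
   GCC documents it as reduction modulo 2^N. */
int add_wide_then_narrow(long long a, long long b)
{
  return (int)(a + b);
}

/* Hypothetical helper: returns 1 if the naive rewrite (int)a + (int)b
   would overflow int, 0 otherwise.  Assumes a and b each fit in int,
   so the wide sum below cannot itself overflow. */
int naive_narrow_would_overflow(long long a, long long b)
{
  long long sum = a + b;
  return sum > INT_MAX || sum < INT_MIN;
}
```

For a = INT_MAX and b = 1 the helper reports an overflow, so the rewrite is only safe when the narrow addition has wrapping semantics (unsigned arithmetic or -fwrapv).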
Comment 11 Richard Biener 2006-04-20 14:57:20 UTC
We now generate

  return (int) ((unsigned int) (long long int) b + (unsigned int) (long long int) a);

which is harder to optimize.  But with -fwrapv and GVN tree-combiner we get

  aa_2 = (long long int) a_1;
  bb_4 = (long long int) b_3;
  D.1527_5 = a_1;
  D.1528_6 = b_3;
  D.1526_7 = D.1527_5 + D.1528_6;
  return D.1526_7;
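
As a rough illustration, the unsigned form in the first dump above can be written at the source level and compared with the original function. This is a sketch, not GCC's internal representation: the names foo_wide, foo_unsigned and forms_agree are made up for this example, and it relies on GCC's documented modulo behaviour when converting an out-of-range value to int.

```c
#include <assert.h>
#include <limits.h>

/* The function from the description: widen, add, truncate on return. */
int foo_wide(int a, int b)
{
  long long aa = a;
  long long bb = b;
  return (int)(aa + bb);
}

/* Source-level analogue of the dumped expression: the addition is done
   in unsigned int, where wraparound is well defined. */
int foo_unsigned(int a, int b)
{
  return (int)((unsigned int)(long long)b + (unsigned int)(long long)a);
}

/* Returns 1 when both forms produce the same value. */
int forms_agree(int a, int b)
{
  return foo_wide(a, b) == foo_unsigned(a, b);
}
```

Because the narrow addition is unsigned, the two forms agree even for inputs such as INT_MAX and 1, where a signed int addition would overflow.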
Comment 12 Richard Biener 2006-05-04 13:57:07 UTC
Subject: Bug 14844

Author: rguenth
Date: Thu May  4 13:56:52 2006
New Revision: 113527

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=113527
Log:
2006-05-04  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/14287
	PR tree-optimization/14844
	PR tree-optimization/19792
	PR tree-optimization/21608
	PR tree-optimization/27090
	* tree-ssa-pre.c (try_combine_conversion): New function.
	(compute_avail): After constructing the value-handle
	expression, use try_combine_conversion to combine NOP_EXPRs
	with previous value-handle expressions and use the result if it
	is available.

	* gcc.dg/tree-ssa/ssa-fre-1.c: New testcase.
	* gcc.dg/tree-ssa/ssa-fre-2.c: Likewise.
	* gcc.dg/tree-ssa/ssa-fre-3.c: Likewise.
	* gcc.dg/tree-ssa/ssa-fre-4.c: Likewise.
	* gcc.dg/tree-ssa/ssa-fre-5.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-1.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-2.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-5.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-pre.c

Comment 13 Richard Biener 2006-05-04 15:05:58 UTC
We now optimize as outlined in comment #11.
Comment 14 Martin Sebor 2017-01-13 03:49:36 UTC
In light of comment #12, should this be resolved as fixed?