Bug 14287 - [tree-ssa] does not remove unnecessary extensions
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: tree-ssa
Importance: P2 enhancement
Target Milestone: 4.2.0
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on: 15459
Blocks:
Reported: 2004-02-25 04:11 UTC by Kazu Hirata
Modified: 2023-12-30 01:17 UTC (History)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2005-12-07 03:13:36


Attachments
patch (573 bytes, patch)
2004-02-25 18:44 UTC, Andrew Macleod

Description Kazu Hirata 2004-02-25 04:11:56 UTC
/* test.c */

short g, h;

void
foo (long a)
{
  short b = a & 3;
  long c = b;
  g = c;
  h = c;
}

test.c.t20.dom1 looks like so:

foo (a)
{
  long int c;
  short int b;
  short int T.1;
  short int T.0;

<bb 0>:
  T.0_2 = (short int)a_1;
  b_3 = T.0_2 & 3;
  c_4 = (long int)b_3;
  T.1_5 = (short int)c_4; <- Hey, T.1_5 == b_3!
  g = T.1_5;
  T.1_6 = T.1_5;
  h = T.1_5;
  return;
}

Here is the asm:

foo:
	movl	4(%esp), %eax
	andl	$3, %eax
	cwtl                   <- ugly
	movw	%ax, g
	movw	%ax, h
	ret

I inserted "g" and "h" to kill the combiner as it performs badly
when there are multiple uses of variables. :-)
Exactly the same problem appears on H8.
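
To illustrate why the second cast in the dump above is redundant, here is a minimal sketch. It is not part of the original report, and `round_trip_is_identity` is a hypothetical helper name. It checks every `short` value and confirms that widening to `long` and narrowing back reproduces the original, which is the property that lets T.1_5 be replaced by b_3.

```c
#include <limits.h>

/* Hypothetical helper (not from the report): returns 1 if widening a
   short to long and narrowing it back yields the original value for
   every representable short, 0 otherwise.  */
int
round_trip_is_identity (void)
{
  for (long v = SHRT_MIN; v <= SHRT_MAX; v++)
    {
      short b = (short) v;   /* v is always in range, so this is exact */
      long c = b;            /* widening, as in c_4 = (long int)b_3 */
      short t = (short) c;   /* narrowing, as in T.1_5 = (short int)c_4 */
      if (t != b)
        return 0;
    }
  return 1;
}
```

Because the intermediate type is at least as wide as `short`, no bits are lost on the way up, so the narrowing step cannot change the value.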
Comment 1 Andrew Pinski 2004-02-25 04:55:03 UTC
Confirmed, this is a performance regression from the mainline.
Comment 2 Andrew Pinski 2004-02-25 05:18:12 UTC
No, this is not a regression. I misread the asm: the compiler that Kazu used was an i386 
compiler while mine is for i686.  This happens on the mainline too.
Comment 3 Andrew Macleod 2004-02-25 18:44:51 UTC
Created attachment 5797 [details]
patch

I was thinking something more like the following.

It seems awfully restrictive however.

If I am storing into a volatile:

*vol_9 = T_2 + T_8

I should be able to substitute the expressions for T_2 and T_8 with no
problems.

However, this appears to fix the problem, I'll do a testrun on it and see if it
causes any other difficulties.

Andrew
Comment 4 Andrew Macleod 2004-02-25 18:51:13 UTC
Huh, how did that happen? This went to the wrong Bugzilla case.
Sorry, this patch is not for this case. I didn't even have this one open
anywhere.

Andrew

Comment 5 Kazu Hirata 2004-03-03 18:13:08 UTC
Mine.
Comment 6 Kazu Hirata 2004-03-04 04:08:08 UTC
I'm not sure if I can quickly implement sign/zero extension elimination
in an (almost always) profitable way.

It's more involved than I thought. :-(
Comment 7 Andrew Pinski 2004-03-16 21:50:39 UTC
Mine.
Comment 8 Andrew Pinski 2004-03-17 01:46:43 UTC
Actually it is not that involved at all.
Basically here is how my pass works:
For each block:
  For each statement in the block:
    Is the statement a modify expression whose RHS is a cast? Then:
      Is the definition of the cast's operand also a modify expression whose RHS is a cast? Then:
        If the inner and outer types match and the intermediate type's size is larger than or
        equal to the outer type's size, change the statement to refer directly to the inner cast's operand.

Let DCE do its work with respect to getting rid of the intermediate variable.
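
The steps above can be sketched on a toy IR. This is only an illustration under stated assumptions, not the actual pass: every `toy_*` name is hypothetical, types are reduced to an id plus a bit width, and a real implementation would walk GCC's own statement and expression trees rather than arrays.

```c
#include <stddef.h>

/* Toy stand-ins for compiler data structures; none of these names
   exist in GCC.  */
enum toy_kind { TOY_CAST, TOY_COPY, TOY_OTHER };

struct toy_stmt
{
  enum toy_kind kind;
  int lhs;              /* index of the variable being defined */
  int rhs;              /* index of the operand variable, or -1 */
};

struct toy_var
{
  int type_id;          /* equal ids mean equal types */
  unsigned bits;        /* size of the type in bits */
  int def;              /* index of the defining statement, or -1 */
};

/* For each cast whose operand is itself defined by a cast, rewrite the
   outer cast into a plain copy of the inner cast's operand when the
   inner and outer types match and the intermediate type is at least as
   wide as the outer type.  Returns the number of rewritten statements.
   The intermediate variable is left for a later dead-code pass.  */
int
toy_remove_redundant_casts (struct toy_stmt *stmts, size_t n,
                            const struct toy_var *vars)
{
  int changed = 0;

  for (size_t i = 0; i < n; i++)
    {
      struct toy_stmt *outer = &stmts[i];
      if (outer->kind != TOY_CAST || outer->rhs < 0)
        continue;

      int mid = outer->rhs;             /* intermediate variable */
      int def = vars[mid].def;
      if (def < 0 || stmts[def].kind != TOY_CAST || stmts[def].rhs < 0)
        continue;

      int inner = stmts[def].rhs;       /* operand of the inner cast */
      if (vars[inner].type_id == vars[outer->lhs].type_id
          && vars[mid].bits >= vars[outer->lhs].bits)
        {
          outer->kind = TOY_COPY;       /* lhs = inner, no cast needed */
          outer->rhs = inner;
          changed++;
        }
    }
  return changed;
}
```

Encoding the testcase from the description (a, T.0, b, c, T.1 as variables 0 to 4) and running the function turns `T.1 = (short int)c` into `T.1 = b`, after which `c` has no remaining uses.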
Comment 9 Andrew Pinski 2004-04-03 05:06:44 UTC
Patch here: <http://gcc.gnu.org/ml/gcc-patches/2004-04/msg00169.html>. I forgot to 
mention this PR in the patch, even though this was the PR which got me thinking about casts in the 
first place.
Comment 10 Andrew Pinski 2004-05-17 01:36:05 UTC
Actually I found out that fold can do the same more simply, and it can also be done using 
the combine pass I am proposing in PR 15459.
Comment 11 Andrew Pinski 2004-06-21 05:07:43 UTC
With the tree-combiner (which I am going to post soon), I get:

foo (a)
{
  short int T.1;

<bb 0>:
  T.1 = (short int)a & 3;
  g = T.1;
  h = T.1;
  return;

}
Comment 12 Andrew Pinski 2005-07-12 21:27:44 UTC
It might be a while before I rewrite the tree combiner, so unassigning for now.
Comment 13 Richard Biener 2006-05-04 13:57:13 UTC
Subject: Bug 14287

Author: rguenth
Date: Thu May  4 13:56:52 2006
New Revision: 113527

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=113527
Log:
2006-05-04  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/14287
	PR tree-optimization/14844
	PR tree-optimization/19792
	PR tree-optimization/21608
	PR tree-optimization/27090
	* tree-ssa-pre.c (try_combine_conversion): New function.
	(compute_avail): After constructing the value-handle
	expression, use try_combine_conversion to combine NOP_EXPRs
	with previous value-handle expressions and use the result if it
	is available.

	* gcc.dg/tree-ssa/ssa-fre-1.c: New testcase.
	* gcc.dg/tree-ssa/ssa-fre-2.c: Likewise.
	* gcc.dg/tree-ssa/ssa-fre-3.c: Likewise.
	* gcc.dg/tree-ssa/ssa-fre-4.c: Likewise.
	* gcc.dg/tree-ssa/ssa-fre-5.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-1.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-2.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-5.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-pre.c

Comment 14 Richard Biener 2006-05-04 15:13:34 UTC
Fixed.

after 034.t.fre:


;; Function foo (foo)

foo (a)
{
  long int c;
  short int b;
  short int D.1528;
  short int D.1527;

<bb 2>:
  D.1527_2 = (short int) a_1;
  b_3 = D.1527_2 & 3;
  c_4 = (long int) b_3;
  D.1528_5 = b_3;
  g = D.1528_5;
  D.1528_8 = b_3;
  h = D.1528_8;
  return;

}