/* test.c */
short g, h;

void
foo (long a)
{
  short b = a & 3;
  long c = b;
  g = c;
  h = c;
}

test.c.t20.dom1 looks like so:

foo (a)
{
  long int c;
  short int b;
  short int T.1;
  short int T.0;

<bb 0>:
  T.0_2 = (short int)a_1;
  b_3 = T.0_2 & 3;
  c_4 = (long int)b_3;
  T.1_5 = (short int)c_4;   <- Hey, T.1_5 == b_3!
  g = T.1_5;
  T.1_6 = T.1_5;
  h = T.1_5;
  return;

}

Here is the asm:

foo:
        movl    4(%esp), %eax
        andl    $3, %eax
        cwtl                    <- ugly
        movw    %ax, g
        movw    %ax, h
        ret

I inserted "g" and "h" to kill the combiner, as it performs badly when there are multiple uses of variables. :-)

The exact same problem appears on H8.
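The claim that T.1_5 equals b_3 can be checked by brute force. The following is a minimal sketch, not part of the testcase: the function name check_roundtrip is hypothetical, and the only assumption is that long is at least as wide as short, which the C standard guarantees.

```c
#include <assert.h>
#include <limits.h>

/* Walk every value a short can hold and verify that widening to long
   and narrowing back to short reproduces the original value.
   Returns the number of values checked.  */
long
check_roundtrip (void)
{
  long checked = 0;
  for (long v = SHRT_MIN; v <= SHRT_MAX; v++)
    {
      short b = (short) v;   /* v is in range, so this conversion is exact */
      long c = b;            /* sign extension, like c_4 in the dump */
      short t = (short) c;   /* truncation, like T.1_5 in the dump */
      assert (t == b);
      checked++;
    }
  return checked;
}
```

Since the round trip never changes the value, the second cast and the sign extension that feeds it (the cwtl in the asm) are redundant.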
Confirmed, this is a performance regression from the mainline.
No, this is not a regression. I was reading the asm wrong, because the compiler that Kazu used was an i386 compiler while mine is for i686. This happens on the mainline too.
Created attachment 5797
patch

I was thinking of something more like the following. It seems awfully restrictive, however. If I am storing into a volatile:

  *vol_9 = T_2 + T_8

I should be able to substitute the expressions for T_2 and T_8 with no problems. However, this appears to fix the problem. I'll do a test run on it and see if it causes any other difficulties.

Andrew
Huh, how did that happen? This went to the wrong bugzilla case. Sorry, this patch is not for this case. I didn't even have this one open anywhere.

Andrew
Mine.
I'm not sure if I can quickly implement sign/zero extension elimination in an (almost always) profitable way. It's more involved than I thought. :-(
Actually it is not that involved at all. Basically here is how my pass works:

For each block:
  For each statement in the block:
    If the statement is a modify expression and its RHS is a cast:
      If the definition of the cast's operand is also a modify expression whose RHS is a cast:
        If the inner and outer types match, and the intermediate type's size is larger than or equal to the outer type's size, change the statement to refer directly to the original variable (the inner cast's operand).

Let DCE do its work with respect to getting rid of the intermediate variable.
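The steps of the pass can be sketched on a toy IR. This is only an illustration under stated assumptions, not the actual patch: the types struct stmt and struct type and the function remove_redundant_casts are hypothetical, one basic block is modeled as an array, and each statement refers to its operand by the index of the defining statement. It also assumes that narrowing conversions wrap modulo the type width, as GCC defines them.

```c
#include <assert.h>

/* Toy IR, hypothetical: this is not GCC's tree representation.  */
enum stmt_kind { STMT_CAST, STMT_COPY, STMT_OTHER };

struct type
{
  int size;          /* size in bytes */
  int is_unsigned;
};

struct stmt
{
  enum stmt_kind kind;
  struct type type;  /* type of the value this statement defines */
  int operand;       /* index of the statement defining the operand, or -1 */
};

static int
same_type (struct type a, struct type b)
{
  return a.size == b.size && a.is_unsigned == b.is_unsigned;
}

/* Scan one block.  A cast of a cast becomes a plain copy of the inner
   operand when the inner and outer types match and the intermediate
   type is at least as large as the outer type.  Returns the number of
   statements rewritten; the dead intermediate cast is left for DCE.  */
int
remove_redundant_casts (struct stmt *stmts, int n)
{
  int changed = 0;
  for (int i = 0; i < n; i++)
    {
      struct stmt *outer = &stmts[i];
      if (outer->kind != STMT_CAST || outer->operand < 0)
        continue;
      assert (outer->operand < i);  /* definitions precede uses */

      struct stmt *mid = &stmts[outer->operand];
      if (mid->kind != STMT_CAST || mid->operand < 0)
        continue;

      struct stmt *inner = &stmts[mid->operand];
      if (same_type (inner->type, outer->type)
          && mid->type.size >= outer->type.size)
        {
          outer->kind = STMT_COPY;
          outer->operand = mid->operand;
          changed++;
        }
    }
  return changed;
}
```

Applied to the testcase, with b_3 at index 0, c_4 at index 1, and T.1_5 at index 2, the sketch rewrites T.1_5 into a copy of b_3 and leaves c_4 untouched. A narrowing intermediate, such as long to short to long, is left alone because the size check fails.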
Patch here: <http://gcc.gnu.org/ml/gcc-patches/2004-04/msg00169.html>. I forgot to mention this PR in the patch, even though this was the PR which got me thinking about casts in the first place.
Actually, I found out that fold can do the same thing more simply, and it can also be done using the combine pass I am proposing in PR 15459.
With the tree-combiner (which I am going to post soon), I get:

foo (a)
{
  short int T.1;

<bb 0>:
  T.1 = (short int)a & 3;
  g = T.1;
  h = T.1;
  return;

}
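As a sanity check that the combined statement computes the same value as the dom1 sequence, both forms can be written out as C and compared over a range of inputs. This is a sketch only: the names foo_dom1, foo_combined, and check_range are hypothetical, and it assumes that converting an out-of-range long to short wraps modulo the type width, which is how GCC defines that conversion.

```c
#include <assert.h>

/* The dom1 sequence, one C statement per IR statement.  */
short
foo_dom1 (long a)
{
  short t0 = (short) a;   /* T.0_2 */
  short b = t0 & 3;       /* b_3 */
  long c = b;             /* c_4 */
  short t1 = (short) c;   /* T.1_5 */
  return t1;
}

/* The single statement left by the tree combiner.  */
short
foo_combined (long a)
{
  return (short) a & 3;
}

/* Compare both forms for every a in [lo, hi].
   Returns the number of inputs checked.  */
long
check_range (long lo, long hi)
{
  long checked = 0;
  for (long a = lo; a <= hi; a++)
    {
      assert (foo_dom1 (a) == foo_combined (a));
      checked++;
    }
  return checked;
}
```

A call such as check_range (-100000, 100000) covers inputs both inside and outside the range of short.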
It might be a while for me to rewrite the tree combiner so unassigning for now.
Subject: Bug 14287

Author: rguenth
Date: Thu May  4 13:56:52 2006
New Revision: 113527

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=113527
Log:
2006-05-04  Richard Guenther  <rguenther@suse.de>

        PR tree-optimization/14287
        PR tree-optimization/14844
        PR tree-optimization/19792
        PR tree-optimization/21608
        PR tree-optimization/27090
        * tree-ssa-pre.c (try_combine_conversion): New function.
        (compute_avail): After constructing the value-handle
        expression, use try_combine_conversion to combine NOP_EXPRs
        with previous value-handle expressions and use the result if it
        is available.

        * gcc.dg/tree-ssa/ssa-fre-1.c: New testcase.
        * gcc.dg/tree-ssa/ssa-fre-2.c: Likewise.
        * gcc.dg/tree-ssa/ssa-fre-3.c: Likewise.
        * gcc.dg/tree-ssa/ssa-fre-4.c: Likewise.
        * gcc.dg/tree-ssa/ssa-fre-5.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-1.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-2.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-4.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-5.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-pre.c
Fixed.

after 034.t.fre:

;; Function foo (foo)

foo (a)
{
  long int c;
  short int b;
  short int D.1528;
  short int D.1527;

<bb 2>:
  D.1527_2 = (short int) a_1;
  b_3 = D.1527_2 & 3;
  c_4 = (long int) b_3;
  D.1528_5 = b_3;
  g = D.1528_5;
  D.1528_8 = b_3;
  h = D.1528_8;
  return;

}