GCC Bugzilla – Bug 21137
Convert (a >> 2) & 1 != 0 into a & 4 != 0
Last modified: 2007-11-09 11:16:00 UTC
At tree level, we would like to canonicalize (a >> 2) & 1 != 0 into a & 4 != 0. Currently, void bar (void); void foo (int a) { if ((a >> 2) & 1) bar (); } turns into foo (a) { int D.1235; int D.1236; _Bool D.1237; D.1235 = a >> 2; D.1236 = D.1235 & 1; D.1237 = (_Bool) D.1236; if (D.1237) { bar (); } else { } }
Confirmed.
This should actually be "Convert (a >> c1) & 2**c2" to a & (c2 << c1)".
Created attachment 9242 [details] Patch
Created attachment 9243 [details] testcase Passes with make check-gcc RUNTESTFLAGS=gcc.dg/dg.exp
Hi James, Unfortunately, there are a few mistakes in your proposed patch for PR21137. Firstly Kazu's proposed transformation is only valid when the results of the bitwise-AND are being tested for equality or inequality with zero. i.e. its safe to transform "((a >> 2) & 1) != 0" into "(a & 4) != 0" but not "x = (a >> 2) & 1;" into "x = (a & 4)". Your patch is in the general fold path for BIT_AND_EXPR, so you'll transform both. It's surprising there are no testsuite checks for the second example above; it might be worth adding them to prevent anyone making a similar mistake in future. Secondly, this transformation is only valid is c1 + c2 < TYPE_PRECISION(type). Consider the following code: signed char c; if ((c >> 6) & 64) ... this is not equivalent to if (c & (char)(64<<6)) ... i.e. if (c & (char)4096) ... i.e. if (c & 0) ... i.e. if (0) Of course, when c1+c2 >= TYPE_PRECISION(type), there are two additional optimizations that can be performed. If TYPE_UNSIGNED(type) the result is always false, and if !TYPE_UNSIGNED(type), the condition is equivalent to "a < 0". So in the example of mine above, optimization should produce: if (c < 0) ... Finally, in your patch, you use "break", if the transformation is invalid. This isn't really the correct "idiom/style" for fold, where if the guard for a transformation fails, you shouldn't drop out of the switch, but instead continue onto the following/next transformation "in the list". So instead of "if (!guard) break; return transform();", this optimization should be written as "if (guard) return transform();". I haven't looked for other examples for "break" in fold_unary/fold_binary/fold_ternary, but if there are any, they're probably (latent) missed-optimization bugs. Other than that the patch looks good. Thanks for looking into this.
Actually, doing (a >> c1) & 2**c2 -> (a & (1<<c1+c2)) >> c1 looks worse, but can be transformed into (a & (1<<c1+c2)) CMP c3 << c1. However, I haven't figure out the details of the second transformation. There is a bug about it though.
Subject: Bug 21137 Author: sayle Date: Sun Feb 26 15:36:52 2006 New Revision: 111453 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=111453 Log: 2006-02-26 Roger Sayle <roger@eyesopen.com> James A. Morrison <phython@gcc.gnu.org> PR middle-end/21137 * fold-const.c (fold_binary) <EQ_EXPR>: Fold ((X>>C1)&C2) eq/ne 0, when C2 is a power of two, as either (X&(C2<<C1)) eq/ne 0 if the new constant C2<<C1, or as (X<0) or (X,false) depending upon the signedness of the shift operation. * gcc.dg/fold-eqandshift-1.c: New test case. Added: trunk/gcc/testsuite/gcc.dg/fold-eqandshift-1.c Modified: trunk/gcc/ChangeLog trunk/gcc/fold-const.c trunk/gcc/testsuite/ChangeLog
Fixed on mainline. I'm still investiating some related optimizations.
This is not fully fixed yet. Compile these two functions with -O2 -fdump-tree-original: void test5_1(int e) { if ((e >> 31) & 64) foo(); } typedef int myint; void test5_2(myint e) { if ((e >> 31) & 64) foo(); } On x86_64-unknown-linux-gnu, I get (4.2.0, 4.2.1 and mainline): ;; Function test5_1 (test5_1) ;; enabled by -tree-original { if (e < 0) { foo (); } } ;; Function test5_2 (test5_2) ;; enabled by -tree-original { if (((int) (e >> 31) & 64) != 0) { foo (); } } We should get identical dumps for the two functions. I noticed this problem while trying to fix the original testcase to work targets where an int is not exactly 32 bits wide. The attached patch to fix the testcase revealed the problem.
Created attachment 13959 [details] Patch to fix testcase when int isn't exactly 32 bits
Reopening since this was only partially fixed.