This patch adds value range propagation for (a&b), (a|b) operations. I am not familiar with gcc internals, so please review carefully before applying.
Created attachment 12035 [details] The patch relative to 4.1.1
Created attachment 12036 [details] Test program which demonstrates how VRP generates simpler code
Differences in *.vrp (-fdump-tree-vrp output) and assembly generated with stock 4.1.1 and with patched one out of test program test_vrp.c --- test_vrp.c.t36-411org.vrp 2006-08-06 22:40:04.000000000 +0200 +++ test_vrp.c.t36-411new.vrp 2006-08-06 22:40:20.000000000 +0200 @@ -26,12 +26,15 @@ Value ranges after VRP: n_1: VARYING -D.1286_2: VARYING -D.1287_3: VARYING +D.1286_2: [256, 256] EQUIVALENCES: { } (0 elements) +D.1287_3: [0, 1] EQUIVALENCES: { } (0 elements) n_5: [65793, 65809] EQUIVALENCES: { n_1 n_6 } (2 elements) n_6: [0, 65809] EQUIVALENCES: { n_1 } (1 elements) +Folding predicate D.1286_2 != 0 to 1 +Merging blocks 2 and 3 +Merging blocks 2 and 4 u4 (n) { _Bool D.1288; @@ -46,12 +49,7 @@ <L1>:; D.1286_2 = n_1 & 256; - if (D.1286_2 != 0) goto <L2>; else goto <L3>; - -<L2>:; f (); - -<L3>:; D.1287_3 = n_1 & 1; if (D.1287_3 != 0) goto <L4>; else goto <L5>; @@ -95,16 +93,18 @@ a_1: VARYING b_2: VARYING -D.1282_3: VARYING -D.1282_4: [D.1282_3, D.1282_3] EQUIVALENCES: { D.1282_3 } (1 elements) +D.1282_3: [4096, +INF] EQUIVALENCES: { } (0 elements) +D.1282_4: [4096, +INF] EQUIVALENCES: { D.1282_3 } (1 elements) a_5: [4096, 4096] EQUIVALENCES: { a_1 a_6 } (2 elements) a_6: [4096, +INF] EQUIVALENCES: { a_1 } (1 elements) b_7: [272, +INF] EQUIVALENCES: { b_2 } (1 elements) -D.1282_8: [0, 4095] EQUIVALENCES: { D.1282_3 } (1 elements) Simplified relational a_6 > 4096 into a_6 != 4096 +Folding predicate D.1282_3 > 4095 to 1 Removing basic block 8 +Merging blocks 3 and 4 +Merging blocks 3 and 5 v4 (a, b) { unsigned int D.1282; @@ -120,12 +120,7 @@ <L2>:; D.1282_3 = b_2 | 4096; - if (D.1282_3 > 4095) goto <L3>; else goto <L4>; - -<L3>:; f (); - -<L4>:; D.1282_4 = D.1282_3; if (D.1282_3 > 65535) goto <L5>; else goto <L6>; --- test_vrp-411org.s 2006-08-06 22:40:04.000000000 +0200 +++ test_vrp-411new.s 2006-08-06 22:40:20.000000000 +0200 @@ -7,24 +7,20 @@ subl $8, %esp movl 16(%esp), %ebx cmpl $65809, %ebx - ja .L8 + ja .L6 cmpl $65792, %ebx - jbe .L8 - testb $1, %bh - jne .L10 -.L5: - andl $1, %ebx - je .L8 + ja .L8 +.L6: addl $8, %esp popl %ebx - jmp g + ret .L8: + call f + andl $1, %ebx + je .L6 addl $8, %esp popl %ebx - ret -.L10: - call f - jmp .L5 + jmp g .size u4, .-u4 .globl v4 .type v4, @function @@ -32,31 +28,25 @@ pushl %ebx subl $8, %esp movl 16(%esp), %eax - movl 20(%esp), %edx + movl 20(%esp), %ebx cmpl $4095, %eax - jbe .L19 + jbe .L15 cmpl $4096, %eax - je .L20 -.L19: + je .L16 +.L15: addl $8, %esp popl %ebx ret -.L20: - cmpl $271, %edx - jbe .L19 - movl %edx, %ebx - orb $16, %bh - cmpl $4095, %ebx - ja .L21 .L16: + cmpl $271, %ebx + jbe .L15 + call f + orb $16, %bh cmpl $65535, %ebx - jbe .L19 + jbe .L15 addl $8, %esp popl %ebx jmp g -.L21: - call f - jmp .L16 .size v4, .-v4 .ident "GCC: (GNU) 4.1.1" .section .note.GNU-stack,"",@progbits
First patches should be off of the mainline. Second Patchs should be sent to gcc-patches@gcc.gnu.org. Third and this should be able to fix PR 18031 which I added a patch already (though not officially sent it because I had not tested it).
Confirmed.
Created attachment 12067 [details] New version of the patch. 4.1.1 bootstraps with it.
Created attachment 16009 [details] Updated patch against current svn This patch is bootstrapping successfully on current gcc svn.
Created attachment 16010 [details] Example program which is compiled differently with this patch. The difference is here: movl (%edx), %eax #, call bb_simple_perror_msg # .L40: - orl $1, 12(%esp) #, errs + movl $1, 12(%esp) #, errs .L19: addl $4, 24(%esp) #, argv.78 movl 24(%esp), %eax # argv.78, With improved VRP on (a | b), gcc now can deduce that in this program, errs |= 1 is always equivalent to errs = 1.
Created attachment 16011 [details] The instrumented version of the patch Set $VDA_DEBUG to the name of a file and gcc will append debug printouts to it showing how it deduced value ranges for (a | b) and (a & b). // extract_range_from_binary_expr(OR)[u32] // a integer_nonnegative_range(unsigned int __bid_IDEC_glbflags.40_536)=0 // b integer_nonnegative_range(_IDEC_flags[u32],[16,16])=1 if (a | b) < (16) || (a | b) > (4294967295)) err(); [gcc inferred that since b = 16, (a | b) is always >= 16, despite the fact we do not know value range of a] // extract_range_from_binary_expr(AND)[u32] // a integer_nonnegative_range(unsigned int[u32],[0,4294967295])=1 // b integer_nonnegative_range(_IDEC_round[u32],[1,1])=1 // bits_always_set(0,4294967295)=[u32]0 // bits_always_set(1,1)=[u32]1 // bits_maybe_set(0,4294967295)=[u32]4294967295 // bits_maybe_set(1,1)=[u32]1 if (a & b) < 0 || (a & b) > 1 err(); [a case where both operands had known positive range]
Created attachment 16012 [details] Testcase to be added to testsuite This program is artificially made to not compile if VRP fails to predict a range: if (a < 0x1000) return; if (a > 0x1000) return; if (b < 0x0110) return; if (!__builtin_constant_p ((a|b) >= 0x01000)) asm("the.bug.is.here"); Before this patch, gcc will fail to see that (a|b) >= 0x01000 is known at compile time, after it it will see that. I don't know how to conditionally check for -O (not -O2 or -Os, just -O). #if defined __OPTIMIZE__ means "-O<anything>", I need to check for "-O<anything> but not bare -O". Help. Currently gcc -O -c pr28632.c fails (-O level is not high enough to trigger VRP).
Created attachment 16050 [details] Updated patch. Uses double_int calculations instead of trees. On Mon, 2008-08-04 at 14:26 +0200, Richard Guenther wrote: > In extract_range_from_binary_expr it looks like the cases for > BIT_AND_EXPR and BIT_IOR_EXPR can share most (if not all) of > the code if you use the expression code instead of constant codes. > > In bits_always_set and bits_maybe_set it is better to use double_ints > (see double_int.h) for intermediate calculations, as they do not involve > allocating new tree constants. The updated patch is attached (with instrumentation code present but #defined out).
Bootstrap with -O2 on current svn fails. Bootstrap with -Os works. Bootstrap with -O2 on 4.3.1 works. Instrumented patch emits C code which can be used to test for incorrect VRP predictions. I ran such tests and found no such incorrect predictions. Either patch is buggy in some subtle way or it exposes a latent bug elsewhere :(
Created attachment 16113 [details] Updated doubleint-based patch. DOES NOT PASS TESTSUITE.
Created attachment 16114 [details] Tree based patch. Passes bootstrap.
(In reply to comment #13) > Created an attachment (id=16113) [edit] > Updated doubleint-based patch. DOES NOT PASS TESTSUITE. I meant "does not bootstrap". Anyway. Something strange is going on. Last two patches: bug28632_doubleint_based.patch bug28632_tree_based.patch are identical in their basic algorithm, but one uses trees and other uses double_ints for intermediate calculations. I did a bootstrap with them and on 4.3.1 both work, whereas on current-ish svn doubleint-based patch fails. Bootstrap with both patches was done with debugging output enabled and OUTPUT IS THE SAME! Entire ~7Mb output has the same ranges predicted by both patches, up to the point where gcc stage 2 is reached. Then newly built compiles mispredicts a range and compile fails (again only for bug28632_doubleint_based.patch). This practically rules out that I have some silly bug in bug28632_doubleint_based.patch which miscalculates ranges - that would make bug28632_tree_based.patch to fail as well. I also ran a test program which tested predictions for random (a | b) and (a & b) and it didn't find any errors. Looks like this problem is more difficult than I can handle. Putting it on hold for now.
(In reply to comment #15) > (In reply to comment #13) > > Created an attachment (id=16113) [edit] > > Updated doubleint-based patch. DOES NOT PASS TESTSUITE. > > I meant "does not bootstrap". > > Anyway. Something strange is going on. Last two patches: I have a reimplementation (for BIT_AND_EXPR only, BIT_IOR_EXPR seem to work ok on trunk) in testing that i intend to submit for inclusion.
Created attachment 18044 [details] Bootstraps and regtests bootstrapped and regtested all default languages with no new regressions. TODO: - split out to helper function - unite the two (unrelated) gcc_asserts and send them separately - drop debugging stuff
Re. comment #17: Any plans to finish this for GCC 4.6 stage 1?
Created attachment 21156 [details] gcc/testsuite/gcc.dg/tree-ssa/vrp50-PR28632.c
Created attachment 21162 [details] gcc46-pr28632.patch Sorry, I haven't been aware of this PR. The attached patch brings in some of the ideas from these patches and even goes beyond that. Tested so far just on tree-ssa.exp=vrp*.c though, going to bootstrap/regtest it now.
Subject: Bug 28632 Author: jakub Date: Fri Jul 9 19:40:03 2010 New Revision: 162009 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162009 Log: PR tree-optimization/28632 * tree-vrp.c (zero_nonzero_bits_from_vr): New function. (extract_range_from_binary_expr): Further optimize BIT_AND_EXPR and BIT_IOR_EXPR. * gcc.dg/tree-ssa/vrp51.c: New test. * gcc.dg/tree-ssa/vrp52.c: New test. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/vrp51.c trunk/gcc/testsuite/gcc.dg/tree-ssa/vrp52.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-vrp.c
Fixed.