Bug 28632 - VRP should understand bitwise OR and AND
Summary: VRP should understand bitwise OR and AND
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.1.1
: P3 enhancement
Target Milestone: ---
Assignee: Jakub Jelinek
URL:
Keywords: missed-optimization
Depends on:
Blocks: 16996 18031
  Show dependency treegraph
 
Reported: 2006-08-07 11:12 UTC by Denis Vlasenko
Modified: 2010-09-24 11:30 UTC (History)
4 users (show)

See Also:
Host: i386-pc-linux-gnu
Target: i386-pc-linux-gnu
Build: i386-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2010-01-02 01:00:42


Attachments
The patch relative to 4.1.1 (1.41 KB, patch)
2006-08-07 11:13 UTC, Denis Vlasenko
Details | Diff
Test program which demonstrates how VRP generates simpler code (264 bytes, text/plain)
2006-08-07 11:14 UTC, Denis Vlasenko
Details
New version of the patch. 4.1.1 bootstraps with it. (1.60 KB, patch)
2006-08-11 14:17 UTC, Denis Vlasenko
Details | Diff
Updated patch against current svn (2.22 KB, patch)
2008-08-04 11:25 UTC, Denis Vlasenko
Details | Diff
Example program which is compiled differently with this patch. (1.59 KB, text/x-csrc)
2008-08-04 11:28 UTC, Denis Vlasenko
Details
The instrumented version of the patch (3.05 KB, patch)
2008-08-04 11:34 UTC, Denis Vlasenko
Details | Diff
Testcase to be added to testsuite (331 bytes, text/x-csrc)
2008-08-04 11:55 UTC, Denis Vlasenko
Details
Updated patch. Uses double_int calculations instead of trees. (3.43 KB, patch)
2008-08-11 12:46 UTC, Denis Vlasenko
Details | Diff
Updated doubleint-based patch. DOES NOT PASS TESTSUITE. (3.19 KB, patch)
2008-08-20 14:57 UTC, Denis Vlasenko
Details | Diff
Tree based patch. Passes bootstrap. (3.12 KB, patch)
2008-08-20 14:58 UTC, Denis Vlasenko
Details | Diff
Bootstraps and regtests (2.53 KB, patch)
2009-06-22 08:28 UTC, Bernhard Reutner-Fischer
Details | Diff
gcc/testsuite/gcc.dg/tree-ssa/vrp50-PR28632.c (469 bytes, text/x-csrc)
2010-07-09 07:53 UTC, Bernhard Reutner-Fischer
Details
gcc46-pr28632.patch (2.53 KB, patch)
2010-07-09 17:12 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Denis Vlasenko 2006-08-07 11:12:03 UTC
This patch adds value range propagation for (a&b), (a|b) operations.

I am not familiar with gcc internals, so please review carefully before applying.
Comment 1 Denis Vlasenko 2006-08-07 11:13:15 UTC
Created attachment 12035 [details]
The patch relative to 4.1.1
Comment 2 Denis Vlasenko 2006-08-07 11:14:23 UTC
Created attachment 12036 [details]
Test program which demonstrates how VRP generates simpler code
Comment 3 Denis Vlasenko 2006-08-07 11:26:25 UTC
Differences in *.vrp (-fdump-tree-vrp output) and assembly generated
with stock 4.1.1 and with patched one out of test program test_vrp.c


--- test_vrp.c.t36-411org.vrp	2006-08-06 22:40:04.000000000 +0200
+++ test_vrp.c.t36-411new.vrp	2006-08-06 22:40:20.000000000 +0200
@@ -26,12 +26,15 @@
 Value ranges after VRP:
 
 n_1: VARYING
-D.1286_2: VARYING
-D.1287_3: VARYING
+D.1286_2: [256, 256]  EQUIVALENCES: { } (0 elements)
+D.1287_3: [0, 1]  EQUIVALENCES: { } (0 elements)
 n_5: [65793, 65809]  EQUIVALENCES: { n_1 n_6 } (2 elements)
 n_6: [0, 65809]  EQUIVALENCES: { n_1 } (1 elements)
 
 
+Folding predicate D.1286_2 != 0 to 1
+Merging blocks 2 and 3
+Merging blocks 2 and 4
 u4 (n)
 {
   _Bool D.1288;
@@ -46,12 +49,7 @@
 
 <L1>:;
   D.1286_2 = n_1 & 256;
-  if (D.1286_2 != 0) goto <L2>; else goto <L3>;
-
-<L2>:;
   f ();
-
-<L3>:;
   D.1287_3 = n_1 & 1;
   if (D.1287_3 != 0) goto <L4>; else goto <L5>;
 
@@ -95,16 +93,18 @@
 
 a_1: VARYING
 b_2: VARYING
-D.1282_3: VARYING
-D.1282_4: [D.1282_3, D.1282_3]  EQUIVALENCES: { D.1282_3 } (1 elements)
+D.1282_3: [4096, +INF]  EQUIVALENCES: { } (0 elements)
+D.1282_4: [4096, +INF]  EQUIVALENCES: { D.1282_3 } (1 elements)
 a_5: [4096, 4096]  EQUIVALENCES: { a_1 a_6 } (2 elements)
 a_6: [4096, +INF]  EQUIVALENCES: { a_1 } (1 elements)
 b_7: [272, +INF]  EQUIVALENCES: { b_2 } (1 elements)
-D.1282_8: [0, 4095]  EQUIVALENCES: { D.1282_3 } (1 elements)
 
 
 Simplified relational a_6 > 4096 into a_6 != 4096
+Folding predicate D.1282_3 > 4095 to 1
 Removing basic block 8
+Merging blocks 3 and 4
+Merging blocks 3 and 5
 v4 (a, b)
 {
   unsigned int D.1282;
@@ -120,12 +120,7 @@
 
 <L2>:;
   D.1282_3 = b_2 | 4096;
-  if (D.1282_3 > 4095) goto <L3>; else goto <L4>;
-
-<L3>:;
   f ();
-
-<L4>:;
   D.1282_4 = D.1282_3;
   if (D.1282_3 > 65535) goto <L5>; else goto <L6>;
 
--- test_vrp-411org.s	2006-08-06 22:40:04.000000000 +0200
+++ test_vrp-411new.s	2006-08-06 22:40:20.000000000 +0200
@@ -7,24 +7,20 @@
 	subl	$8, %esp
 	movl	16(%esp), %ebx
 	cmpl	$65809, %ebx
-	ja	.L8
+	ja	.L6
 	cmpl	$65792, %ebx
-	jbe	.L8
-	testb	$1, %bh
-	jne	.L10
-.L5:
-	andl	$1, %ebx
-	je	.L8
+	ja	.L8
+.L6:
 	addl	$8, %esp
 	popl	%ebx
-	jmp	g
+	ret
 .L8:
+	call	f
+	andl	$1, %ebx
+	je	.L6
 	addl	$8, %esp
 	popl	%ebx
-	ret
-.L10:
-	call	f
-	jmp	.L5
+	jmp	g
 	.size	u4, .-u4
 .globl v4
 	.type	v4, @function
@@ -32,31 +28,25 @@
 	pushl	%ebx
 	subl	$8, %esp
 	movl	16(%esp), %eax
-	movl	20(%esp), %edx
+	movl	20(%esp), %ebx
 	cmpl	$4095, %eax
-	jbe	.L19
+	jbe	.L15
 	cmpl	$4096, %eax
-	je	.L20
-.L19:
+	je	.L16
+.L15:
 	addl	$8, %esp
 	popl	%ebx
 	ret
-.L20:
-	cmpl	$271, %edx
-	jbe	.L19
-	movl	%edx, %ebx
-	orb	$16, %bh
-	cmpl	$4095, %ebx
-	ja	.L21
 .L16:
+	cmpl	$271, %ebx
+	jbe	.L15
+	call	f
+	orb	$16, %bh
 	cmpl	$65535, %ebx
-	jbe	.L19
+	jbe	.L15
 	addl	$8, %esp
 	popl	%ebx
 	jmp	g
-.L21:
-	call	f
-	jmp	.L16
 	.size	v4, .-v4
 	.ident	"GCC: (GNU) 4.1.1"
 	.section	.note.GNU-stack,"",@progbits
Comment 4 Andrew Pinski 2006-08-07 14:54:47 UTC
First patches should be off of the mainline.  Second Patchs should be sent to gcc-patches@gcc.gnu.org.

Third and this should be able to fix PR 18031 which I added a patch already (though not officially sent it because I had not tested it).
Comment 5 Andrew Pinski 2006-08-08 00:37:24 UTC
Confirmed.
Comment 6 Denis Vlasenko 2006-08-11 14:17:42 UTC
Created attachment 12067 [details]
New version of the patch. 4.1.1 bootstraps with it.
Comment 7 Denis Vlasenko 2008-08-04 11:25:40 UTC
Created attachment 16009 [details]
Updated patch against current svn

This patch is bootstrapping successfully on current gcc svn.
Comment 8 Denis Vlasenko 2008-08-04 11:28:27 UTC
Created attachment 16010 [details]
Example program which is compiled differently with this patch.

The difference is here:

        movl    (%edx), %eax    #,
        call    bb_simple_perror_msg    #
 .L40:
-       orl     $1, 12(%esp)    #, errs
+       movl    $1, 12(%esp)    #, errs
 .L19:
        addl    $4, 24(%esp)    #, argv.78
        movl    24(%esp), %eax  # argv.78,

With improved VRP on (a | b), gcc now can deduce that in this program, errs |= 1 is always equivalent to errs = 1.
Comment 9 Denis Vlasenko 2008-08-04 11:34:53 UTC
Created attachment 16011 [details]
The instrumented version of the patch

Set $VDA_DEBUG to the name of a file and gcc will append debug printouts to it showing how it deduced value ranges for (a | b) and (a & b).

// extract_range_from_binary_expr(OR)[u32]
// a integer_nonnegative_range(unsigned int __bid_IDEC_glbflags.40_536)=0
// b integer_nonnegative_range(_IDEC_flags[u32],[16,16])=1
 if (a | b) < (16) || (a | b) > (4294967295)) err();

[gcc inferred that since b = 16, (a | b) is always >= 16,
despite the fact we do not know value range of a]

// extract_range_from_binary_expr(AND)[u32]
// a integer_nonnegative_range(unsigned int[u32],[0,4294967295])=1
// b integer_nonnegative_range(_IDEC_round[u32],[1,1])=1
// bits_always_set(0,4294967295)=[u32]0
// bits_always_set(1,1)=[u32]1
// bits_maybe_set(0,4294967295)=[u32]4294967295
// bits_maybe_set(1,1)=[u32]1
 if (a & b) < 0 || (a & b) > 1 err();

[a case where both operands had known positive range]
Comment 10 Denis Vlasenko 2008-08-04 11:55:17 UTC
Created attachment 16012 [details]
Testcase to be added to testsuite

This program is artificially made to not compile if VRP fails
to predict a range:

  if (a < 0x1000) return;
  if (a > 0x1000) return;
  if (b < 0x0110) return;
  if (!__builtin_constant_p ((a|b) >= 0x01000))
    asm("the.bug.is.here");

Before this patch, gcc will fail to see that (a|b) >= 0x01000 is known at compile time, after it it will see that.

I don't know how to conditionally check for -O (not -O2 or -Os, just -O).
#if defined __OPTIMIZE__ means "-O<anything>", I need to check for
"-O<anything> but not bare -O". Help.
Currently gcc -O -c pr28632.c fails (-O level is not high enough to trigger VRP).
Comment 11 Denis Vlasenko 2008-08-11 12:46:18 UTC
Created attachment 16050 [details]
Updated patch. Uses double_int calculations instead of trees.

On Mon, 2008-08-04 at 14:26 +0200, Richard Guenther wrote:
> In extract_range_from_binary_expr it looks like the cases for
> BIT_AND_EXPR and BIT_IOR_EXPR can share most (if not all) of
> the code if you use the expression code instead of constant codes.
> 
> In bits_always_set and bits_maybe_set it is better to use double_ints
> (see double_int.h) for intermediate calculations, as they do not involve
> allocating new tree constants.

The updated patch is attached (with instrumentation code present but
#defined out).
Comment 12 Denis Vlasenko 2008-08-18 13:02:46 UTC
Bootstrap with -O2 on current svn fails. Bootstrap with -Os works.
Bootstrap with -O2 on 4.3.1 works.

Instrumented patch emits C code which can be used to test for incorrect VRP predictions.

I ran such tests and found no such incorrect predictions. Either patch is buggy in some subtle way or it exposes a latent bug elsewhere :(
Comment 13 Denis Vlasenko 2008-08-20 14:57:31 UTC
Created attachment 16113 [details]
Updated doubleint-based patch. DOES NOT PASS TESTSUITE.
Comment 14 Denis Vlasenko 2008-08-20 14:58:09 UTC
Created attachment 16114 [details]
Tree based patch. Passes bootstrap.
Comment 15 Denis Vlasenko 2008-08-20 15:07:01 UTC
(In reply to comment #13)
> Created an attachment (id=16113) [edit]
> Updated doubleint-based patch. DOES NOT PASS TESTSUITE.

I meant "does not bootstrap".

Anyway. Something strange is going on. Last two patches:

bug28632_doubleint_based.patch
bug28632_tree_based.patch

are identical in their basic algorithm, but one uses trees and other uses double_ints for intermediate calculations. I did a bootstrap with them
and on 4.3.1 both work, whereas on current-ish svn doubleint-based patch
fails.

Bootstrap with both patches was done with debugging output enabled
and OUTPUT IS THE SAME! Entire ~7Mb output has the same ranges predicted
by both patches, up to the point where gcc stage 2 is reached.
Then newly built compiles mispredicts a range and compile fails
(again only for bug28632_doubleint_based.patch).

This practically rules out that I have some silly bug
in bug28632_doubleint_based.patch which miscalculates ranges - that would make
bug28632_tree_based.patch to fail as well.

I also ran a test program which tested predictions for random (a | b)
and (a & b) and it didn't find any errors.

Looks like this problem is more difficult than I can handle.
Putting it on hold for now.
Comment 16 Bernhard Reutner-Fischer 2009-06-05 16:27:12 UTC
(In reply to comment #15)
> (In reply to comment #13)
> > Created an attachment (id=16113) [edit]
> > Updated doubleint-based patch. DOES NOT PASS TESTSUITE.
> 
> I meant "does not bootstrap".
> 
> Anyway. Something strange is going on. Last two patches:

I have a reimplementation (for BIT_AND_EXPR only, BIT_IOR_EXPR seem to work ok on trunk) in testing that i intend to submit for inclusion.
Comment 17 Bernhard Reutner-Fischer 2009-06-22 08:28:48 UTC
Created attachment 18044 [details]
Bootstraps and regtests

bootstrapped and regtested all default languages with no new regressions.
TODO:
- split out to helper function
- unite the two (unrelated) gcc_asserts and send them separately
- drop debugging stuff
Comment 18 Steven Bosscher 2010-01-02 01:00:42 UTC
Re. comment #17: Any plans to finish this for GCC 4.6 stage 1?
Comment 19 Bernhard Reutner-Fischer 2010-07-09 07:53:11 UTC
Created attachment 21156 [details]
gcc/testsuite/gcc.dg/tree-ssa/vrp50-PR28632.c
Comment 20 Jakub Jelinek 2010-07-09 17:12:13 UTC
Created attachment 21162 [details]
gcc46-pr28632.patch

Sorry, I haven't been aware of this PR.  The attached patch brings in some of the ideas from these patches and even goes beyond that.  Tested so far just on tree-ssa.exp=vrp*.c though, going to bootstrap/regtest it now.
Comment 21 Jakub Jelinek 2010-07-09 19:40:19 UTC
Subject: Bug 28632

Author: jakub
Date: Fri Jul  9 19:40:03 2010
New Revision: 162009

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162009
Log:
	PR tree-optimization/28632
	* tree-vrp.c (zero_nonzero_bits_from_vr): New function.
	(extract_range_from_binary_expr): Further optimize
	BIT_AND_EXPR and BIT_IOR_EXPR.

	* gcc.dg/tree-ssa/vrp51.c: New test.
	* gcc.dg/tree-ssa/vrp52.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/vrp51.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/vrp52.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vrp.c

Comment 22 Jakub Jelinek 2010-09-24 11:30:52 UTC
Fixed.