28632 – VRP should understand bitwise OR and AND

Bug 28632 - VRP should understand bitwise OR and AND

Summary: VRP should understand bitwise OR and AND

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	tree-optimization (show other bugs)
Version:	4.1.1

Importance:	P3 enhancement
Target Milestone:	---
Assignee:	Jakub Jelinek

URL:
Keywords:	missed-optimization

Depends on:
Blocks:	16996 18031
	Show dependency tree / graph

Reported:	2006-08-07 11:12 UTC by Denis Vlasenko
Modified:	2010-09-24 11:30 UTC (History)
CC List:	4 users (show)

See Also:
Host:	i386-pc-linux-gnu
Target:	i386-pc-linux-gnu
Build:	i386-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed:	2010-01-02 01:00:42

Attachments
The patch relative to 4.1.1 (1.41 KB, patch) 2006-08-07 11:13 UTC, Denis Vlasenko	Details \| Diff
Test program which demonstrates how VRP generates simpler code (264 bytes, text/plain) 2006-08-07 11:14 UTC, Denis Vlasenko	Details
New version of the patch. 4.1.1 bootstraps with it. (1.60 KB, patch) 2006-08-11 14:17 UTC, Denis Vlasenko	Details \| Diff
Updated patch against current svn (2.22 KB, patch) 2008-08-04 11:25 UTC, Denis Vlasenko	Details \| Diff
Example program which is compiled differently with this patch. (1.59 KB, text/x-csrc) 2008-08-04 11:28 UTC, Denis Vlasenko	Details
The instrumented version of the patch (3.05 KB, patch) 2008-08-04 11:34 UTC, Denis Vlasenko	Details \| Diff
Testcase to be added to testsuite (331 bytes, text/x-csrc) 2008-08-04 11:55 UTC, Denis Vlasenko	Details
Updated patch. Uses double_int calculations instead of trees. (3.43 KB, patch) 2008-08-11 12:46 UTC, Denis Vlasenko	Details \| Diff
Updated doubleint-based patch. DOES NOT PASS TESTSUITE. (3.19 KB, patch) 2008-08-20 14:57 UTC, Denis Vlasenko	Details \| Diff
Tree based patch. Passes bootstrap. (3.12 KB, patch) 2008-08-20 14:58 UTC, Denis Vlasenko	Details \| Diff
Bootstraps and regtests (2.53 KB, patch) 2009-06-22 08:28 UTC, Bernhard Reutner-Fischer	Details \| Diff
gcc/testsuite/gcc.dg/tree-ssa/vrp50-PR28632.c (469 bytes, text/x-csrc) 2010-07-09 07:53 UTC, Bernhard Reutner-Fischer	Details
gcc46-pr28632.patch (2.53 KB, patch) 2010-07-09 17:12 UTC, Jakub Jelinek	Details \| Diff
Show Obsolete (9) View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Denis Vlasenko 2006-08-07 11:12:03 UTC

This patch adds value range propagation for (a&b), (a|b) operations.

I am not familiar with gcc internals, so please review carefully before applying.

Comment 1 Denis Vlasenko 2006-08-07 11:13:15 UTC

Created attachment 12035 [details]
The patch relative to 4.1.1

Comment 2 Denis Vlasenko 2006-08-07 11:14:23 UTC

Created attachment 12036 [details]
Test program which demonstrates how VRP generates simpler code

Comment 3 Denis Vlasenko 2006-08-07 11:26:25 UTC

Differences in *.vrp (-fdump-tree-vrp output) and assembly generated
with stock 4.1.1 and with patched one out of test program test_vrp.c


--- test_vrp.c.t36-411org.vrp	2006-08-06 22:40:04.000000000 +0200
+++ test_vrp.c.t36-411new.vrp	2006-08-06 22:40:20.000000000 +0200
@@ -26,12 +26,15 @@
 Value ranges after VRP:
 
 n_1: VARYING
-D.1286_2: VARYING
-D.1287_3: VARYING
+D.1286_2: [256, 256]  EQUIVALENCES: { } (0 elements)
+D.1287_3: [0, 1]  EQUIVALENCES: { } (0 elements)
 n_5: [65793, 65809]  EQUIVALENCES: { n_1 n_6 } (2 elements)
 n_6: [0, 65809]  EQUIVALENCES: { n_1 } (1 elements)
 
 
+Folding predicate D.1286_2 != 0 to 1
+Merging blocks 2 and 3
+Merging blocks 2 and 4
 u4 (n)
 {
   _Bool D.1288;
@@ -46,12 +49,7 @@
 
 <L1>:;
   D.1286_2 = n_1 & 256;
-  if (D.1286_2 != 0) goto <L2>; else goto <L3>;
-
-<L2>:;
   f ();
-
-<L3>:;
   D.1287_3 = n_1 & 1;
   if (D.1287_3 != 0) goto <L4>; else goto <L5>;
 
@@ -95,16 +93,18 @@
 
 a_1: VARYING
 b_2: VARYING
-D.1282_3: VARYING
-D.1282_4: [D.1282_3, D.1282_3]  EQUIVALENCES: { D.1282_3 } (1 elements)
+D.1282_3: [4096, +INF]  EQUIVALENCES: { } (0 elements)
+D.1282_4: [4096, +INF]  EQUIVALENCES: { D.1282_3 } (1 elements)
 a_5: [4096, 4096]  EQUIVALENCES: { a_1 a_6 } (2 elements)
 a_6: [4096, +INF]  EQUIVALENCES: { a_1 } (1 elements)
 b_7: [272, +INF]  EQUIVALENCES: { b_2 } (1 elements)
-D.1282_8: [0, 4095]  EQUIVALENCES: { D.1282_3 } (1 elements)
 
 
 Simplified relational a_6 > 4096 into a_6 != 4096
+Folding predicate D.1282_3 > 4095 to 1
 Removing basic block 8
+Merging blocks 3 and 4
+Merging blocks 3 and 5
 v4 (a, b)
 {
   unsigned int D.1282;
@@ -120,12 +120,7 @@
 
 <L2>:;
   D.1282_3 = b_2 | 4096;
-  if (D.1282_3 > 4095) goto <L3>; else goto <L4>;
-
-<L3>:;
   f ();
-
-<L4>:;
   D.1282_4 = D.1282_3;
   if (D.1282_3 > 65535) goto <L5>; else goto <L6>;
 
--- test_vrp-411org.s	2006-08-06 22:40:04.000000000 +0200
+++ test_vrp-411new.s	2006-08-06 22:40:20.000000000 +0200
@@ -7,24 +7,20 @@
 	subl	$8, %esp
 	movl	16(%esp), %ebx
 	cmpl	$65809, %ebx
-	ja	.L8
+	ja	.L6
 	cmpl	$65792, %ebx
-	jbe	.L8
-	testb	$1, %bh
-	jne	.L10
-.L5:
-	andl	$1, %ebx
-	je	.L8
+	ja	.L8
+.L6:
 	addl	$8, %esp
 	popl	%ebx
-	jmp	g
+	ret
 .L8:
+	call	f
+	andl	$1, %ebx
+	je	.L6
 	addl	$8, %esp
 	popl	%ebx
-	ret
-.L10:
-	call	f
-	jmp	.L5
+	jmp	g
 	.size	u4, .-u4
 .globl v4
 	.type	v4, @function
@@ -32,31 +28,25 @@
 	pushl	%ebx
 	subl	$8, %esp
 	movl	16(%esp), %eax
-	movl	20(%esp), %edx
+	movl	20(%esp), %ebx
 	cmpl	$4095, %eax
-	jbe	.L19
+	jbe	.L15
 	cmpl	$4096, %eax
-	je	.L20
-.L19:
+	je	.L16
+.L15:
 	addl	$8, %esp
 	popl	%ebx
 	ret
-.L20:
-	cmpl	$271, %edx
-	jbe	.L19
-	movl	%edx, %ebx
-	orb	$16, %bh
-	cmpl	$4095, %ebx
-	ja	.L21
 .L16:
+	cmpl	$271, %ebx
+	jbe	.L15
+	call	f
+	orb	$16, %bh
 	cmpl	$65535, %ebx
-	jbe	.L19
+	jbe	.L15
 	addl	$8, %esp
 	popl	%ebx
 	jmp	g
-.L21:
-	call	f
-	jmp	.L16
 	.size	v4, .-v4
 	.ident	"GCC: (GNU) 4.1.1"
 	.section	.note.GNU-stack,"",@progbits

Comment 4 Andrew Pinski 2006-08-07 14:54:47 UTC

First patches should be off of the mainline.  Second Patchs should be sent to gcc-patches@gcc.gnu.org.

Third and this should be able to fix PR 18031 which I added a patch already (though not officially sent it because I had not tested it).

Comment 5 Andrew Pinski 2006-08-08 00:37:24 UTC

Confirmed.

Comment 6 Denis Vlasenko 2006-08-11 14:17:42 UTC

Created attachment 12067 [details]
New version of the patch. 4.1.1 bootstraps with it.

Comment 7 Denis Vlasenko 2008-08-04 11:25:40 UTC

Created attachment 16009 [details]
Updated patch against current svn

This patch is bootstrapping successfully on current gcc svn.

Comment 8 Denis Vlasenko 2008-08-04 11:28:27 UTC

Created attachment 16010 [details]
Example program which is compiled differently with this patch.

The difference is here:

        movl    (%edx), %eax    #,
        call    bb_simple_perror_msg    #
 .L40:
-       orl     $1, 12(%esp)    #, errs
+       movl    $1, 12(%esp)    #, errs
 .L19:
        addl    $4, 24(%esp)    #, argv.78
        movl    24(%esp), %eax  # argv.78,

With improved VRP on (a | b), gcc now can deduce that in this program, errs |= 1 is always equivalent to errs = 1.

Comment 9 Denis Vlasenko 2008-08-04 11:34:53 UTC

Created attachment 16011 [details]
The instrumented version of the patch

Set $VDA_DEBUG to the name of a file and gcc will append debug printouts to it showing how it deduced value ranges for (a | b) and (a & b).

// extract_range_from_binary_expr(OR)[u32]
// a integer_nonnegative_range(unsigned int __bid_IDEC_glbflags.40_536)=0
// b integer_nonnegative_range(_IDEC_flags[u32],[16,16])=1
 if (a | b) < (16) || (a | b) > (4294967295)) err();

[gcc inferred that since b = 16, (a | b) is always >= 16,
despite the fact we do not know value range of a]

// extract_range_from_binary_expr(AND)[u32]
// a integer_nonnegative_range(unsigned int[u32],[0,4294967295])=1
// b integer_nonnegative_range(_IDEC_round[u32],[1,1])=1
// bits_always_set(0,4294967295)=[u32]0
// bits_always_set(1,1)=[u32]1
// bits_maybe_set(0,4294967295)=[u32]4294967295
// bits_maybe_set(1,1)=[u32]1
 if (a & b) < 0 || (a & b) > 1 err();

[a case where both operands had known positive range]

Comment 10 Denis Vlasenko 2008-08-04 11:55:17 UTC

Created attachment 16012 [details]
Testcase to be added to testsuite

This program is artificially made to not compile if VRP fails
to predict a range:

  if (a < 0x1000) return;
  if (a > 0x1000) return;
  if (b < 0x0110) return;
  if (!__builtin_constant_p ((a|b) >= 0x01000))
    asm("the.bug.is.here");

Before this patch, gcc will fail to see that (a|b) >= 0x01000 is known at compile time, after it it will see that.

I don't know how to conditionally check for -O (not -O2 or -Os, just -O).
#if defined __OPTIMIZE__ means "-O<anything>", I need to check for
"-O<anything> but not bare -O". Help.
Currently gcc -O -c pr28632.c fails (-O level is not high enough to trigger VRP).

Comment 11 Denis Vlasenko 2008-08-11 12:46:18 UTC

Created attachment 16050 [details]
Updated patch. Uses double_int calculations instead of trees.

On Mon, 2008-08-04 at 14:26 +0200, Richard Guenther wrote:
> In extract_range_from_binary_expr it looks like the cases for
> BIT_AND_EXPR and BIT_IOR_EXPR can share most (if not all) of
> the code if you use the expression code instead of constant codes.
> 
> In bits_always_set and bits_maybe_set it is better to use double_ints
> (see double_int.h) for intermediate calculations, as they do not involve
> allocating new tree constants.

The updated patch is attached (with instrumentation code present but
#defined out).

Comment 12 Denis Vlasenko 2008-08-18 13:02:46 UTC

Bootstrap with -O2 on current svn fails. Bootstrap with -Os works.
Bootstrap with -O2 on 4.3.1 works.

Instrumented patch emits C code which can be used to test for incorrect VRP predictions.

I ran such tests and found no such incorrect predictions. Either patch is buggy in some subtle way or it exposes a latent bug elsewhere :(

Comment 13 Denis Vlasenko 2008-08-20 14:57:31 UTC

Created attachment 16113 [details]
Updated doubleint-based patch. DOES NOT PASS TESTSUITE.

Comment 14 Denis Vlasenko 2008-08-20 14:58:09 UTC

Created attachment 16114 [details]
Tree based patch. Passes bootstrap.

Comment 15 Denis Vlasenko 2008-08-20 15:07:01 UTC

(In reply to comment #13)
> Created an attachment (id=16113) [edit]
> Updated doubleint-based patch. DOES NOT PASS TESTSUITE.

I meant "does not bootstrap".

Anyway. Something strange is going on. Last two patches:

bug28632_doubleint_based.patch
bug28632_tree_based.patch

are identical in their basic algorithm, but one uses trees and other uses double_ints for intermediate calculations. I did a bootstrap with them
and on 4.3.1 both work, whereas on current-ish svn doubleint-based patch
fails.

Bootstrap with both patches was done with debugging output enabled
and OUTPUT IS THE SAME! Entire ~7Mb output has the same ranges predicted
by both patches, up to the point where gcc stage 2 is reached.
Then newly built compiles mispredicts a range and compile fails
(again only for bug28632_doubleint_based.patch).

This practically rules out that I have some silly bug
in bug28632_doubleint_based.patch which miscalculates ranges - that would make
bug28632_tree_based.patch to fail as well.

I also ran a test program which tested predictions for random (a | b)
and (a & b) and it didn't find any errors.

Looks like this problem is more difficult than I can handle.
Putting it on hold for now.

Comment 16 Bernhard Reutner-Fischer 2009-06-05 16:27:12 UTC

(In reply to comment #15)
> (In reply to comment #13)
> > Created an attachment (id=16113) [edit]
> > Updated doubleint-based patch. DOES NOT PASS TESTSUITE.
> 
> I meant "does not bootstrap".
> 
> Anyway. Something strange is going on. Last two patches:

I have a reimplementation (for BIT_AND_EXPR only, BIT_IOR_EXPR seem to work ok on trunk) in testing that i intend to submit for inclusion.

Comment 17 Bernhard Reutner-Fischer 2009-06-22 08:28:48 UTC

Created attachment 18044 [details]
Bootstraps and regtests

bootstrapped and regtested all default languages with no new regressions.
TODO:
- split out to helper function
- unite the two (unrelated) gcc_asserts and send them separately
- drop debugging stuff

Comment 18 Steven Bosscher 2010-01-02 01:00:42 UTC

Re. comment #17: Any plans to finish this for GCC 4.6 stage 1?

Comment 19 Bernhard Reutner-Fischer 2010-07-09 07:53:11 UTC

Created attachment 21156 [details]
gcc/testsuite/gcc.dg/tree-ssa/vrp50-PR28632.c

Comment 20 Jakub Jelinek 2010-07-09 17:12:13 UTC

Created attachment 21162 [details]
gcc46-pr28632.patch

Sorry, I haven't been aware of this PR.  The attached patch brings in some of the ideas from these patches and even goes beyond that.  Tested so far just on tree-ssa.exp=vrp*.c though, going to bootstrap/regtest it now.

Comment 21 Jakub Jelinek 2010-07-09 19:40:19 UTC

Subject: Bug 28632

Author: jakub
Date: Fri Jul  9 19:40:03 2010
New Revision: 162009

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162009
Log:
	PR tree-optimization/28632
	* tree-vrp.c (zero_nonzero_bits_from_vr): New function.
	(extract_range_from_binary_expr): Further optimize
	BIT_AND_EXPR and BIT_IOR_EXPR.

	* gcc.dg/tree-ssa/vrp51.c: New test.
	* gcc.dg/tree-ssa/vrp52.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/vrp51.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/vrp52.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vrp.c

Comment 22 Jakub Jelinek 2010-09-24 11:30:52 UTC

Fixed.