31136 – [4.2 Regression] FRE ignores bit-field truncation (C and C++ front-end don't produce bit-field truncation

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	tree-optimization
Version:	4.2.0

Importance:	P1 normal
Target Milestone:	4.2.0
Assignee:	Richard Biener

URL:
Keywords:	wrong-code

Depends on:
Blocks:	32687

Reported:	2007-03-11 18:48 UTC by Charles J. Tabony
Modified:	2007-07-09 07:17 UTC
CC List:	5 users (show)

See Also:
Host:	x86_64-unknown-linux-gnu
Target:	x86_64-unknown-linux-gnu
Build:	x86_64-unknown-linux-gnu
Known to work:	4.1.2 4.3.0
Known to fail:	4.2.0
Last reconfirmed:	2007-04-21 16:56:26

Description Charles J. Tabony 2007-03-11 18:48:02 UTC

With the tip of the 4.2 branch, the following program returns 1.  Mainline returns 0.  Is this defined behavior?  I could not find anything on the subject.

struct S {
  unsigned b4:4;
  unsigned b6:6;
} s;

int main(void){
  s.b6 = 31;
  s.b4 = s.b6;
  s.b6 = s.b4;
  return s.b6 == 15 ? 0 : 1;
}


before FRE (-fdump-tree-ccp):

;; Function main (main)

main ()
{
  short unsigned int D.1882;
  short unsigned int D.1881;
  int D.1880;
  <unnamed type> D.1879;
  <unnamed type> D.1878;
  <unnamed type> D.1877;
  <unnamed type> D.1876;

<bb 2>:
  s.b6 = 31;
  D.1876_3 = s.b6;
  D.1877_4 = (<unnamed type>) D.1876_3;
  s.b4 = D.1877_4;
  D.1878_7 = s.b4;
  D.1879_8 = (<unnamed type>) D.1878_7;
  s.b6 = D.1879_8;
  D.1881_10 = BIT_FIELD_REF <s, 16, 0>;
  D.1882_11 = D.1881_10 & 1008;
  D.1880_12 = D.1882_11 != 240;
  return D.1880_12;

}


after FRE (-fdump-tree-fre):

;; Function main (main)

main ()
{
  short unsigned int D.1882;
  short unsigned int D.1881;
  int D.1880;
  <unnamed type> D.1879;
  <unnamed type> D.1878;
  <unnamed type> D.1877;
  <unnamed type> D.1876;

<bb 2>:
  s.b6 = 31;
  D.1876_3 = 31;
  D.1877_4 = (<unnamed type>) D.1876_3;
  s.b4 = D.1877_4;
  D.1878_7 = D.1877_4;
  D.1879_8 = 31;
  s.b6 = D.1879_8;
  D.1881_10 = BIT_FIELD_REF <s, 16, 0>;
  D.1882_11 = D.1881_10 & 1008;
  D.1880_12 = D.1882_11 != 240;
  return D.1880_12;

}


D.1879_8 was replaced by 31, ignoring the fact that the value should have been truncated to 15 when assigned to s.b4.
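
For reference, the arithmetic behind the constants in the dumps can be sketched as below. This is only an illustration, assuming the layout implied by the BIT_FIELD_REF mask (b4 in bits 0-3 and b6 in bits 4-9 of the 16-bit unit); the helper names are hypothetical and are not part of GCC.

```c
#include <assert.h>

/* Hypothetical helper: keep only the low `bits` bits of `value`, which is
   what a store to an unsigned bit-field of that width does.
   Assumes bits is smaller than the width of unsigned. */
static unsigned truncate_to_width(unsigned value, unsigned bits)
{
    return value & ((1u << bits) - 1u);
}

/* Hypothetical helper: mask selecting a field of `bits` bits that starts
   at bit `offset`. */
static unsigned field_mask(unsigned bits, unsigned offset)
{
    return ((1u << bits) - 1u) << offset;
}

/* Checks the constants from the dumps under the assumed layout
   (b4 in bits 0-3, b6 in bits 4-9). */
static void check_dump_constants(void)
{
    assert(truncate_to_width(31u, 4) == 15u); /* s.b4 = s.b6 must yield 15 */
    assert(truncate_to_width(15u, 6) == 15u); /* s.b6 = s.b4 keeps 15 */
    assert(field_mask(6, 4) == 1008u);        /* the "& 1008" in the dump */
    assert((15u << 4) == 240u);               /* the "!= 240" comparison */
}
```

Calling check_dump_constants() from a test program passes silently, which is consistent with the comparison against 240 being a check that s.b6 holds 15.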

Comment 1 Charles J. Tabony 2007-03-19 23:19:49 UTC

GCC 4.1.2 returns 0.

Comment 2 Andrew Pinski 2007-03-20 00:24:25 UTC

> D.1879_8 was replaced by 31, ignoring the fact that the value should have been
> truncated to 15 when assigned to s.b4.

The front-end should have truncated that to 15.  Which front-end are you using, C or C++?

Comment 3 Charles J. Tabony 2007-03-20 00:39:22 UTC

C.

Comment 4 Andrew Pinski 2007-03-20 06:40:59 UTC

Related to PR 26534.

Anyway, the bug is in the front-end. We most likely have a type mismatch as well.

Comment 5 Richard Biener 2007-03-20 10:00:19 UTC

Confirmed.

Comment 6 Joseph S. Myers 2007-03-23 03:45:10 UTC

Analysis: http://gcc.gnu.org/ml/gcc/2007-03/msg00867.html

Comment 7 Seongbae Park 2007-03-23 05:00:57 UTC

Follow up on Joseph's analysis:

The problematic STRIP_SIGN_NOPS() call is from fold_unary()
which is called from try_combine_conversion() in tree-ssa-pre.c.

STRIP_SIGN_NOPS() is called with the expression:

 <nop_expr 0x866f220
    type <integer_type 0xf7e30678 public unsigned QI
        size <integer_cst 0xf7d781e0 constant invariant 8>
        unit size <integer_cst 0xf7d781f8 constant invariant 1>
        align 8 symtab 0 alias set -1 precision 4 min <integer_cst 0xf7e32618 0> max <integer_cst 0xf7e32630 15>>

    arg 0 <integer_cst 0xf7e327b0 type <integer_type 0xf7e306d4> constant invariant 31>>

and it strips away the conversion,
leaving only integer constant 31.
This is clearly wrong as it removes the downconversion of precision.

The following patch (against the 4.2 branch) seems to fix the problem:

Index: tree.h
===================================================================
--- tree.h      (revision 123088)
+++ tree.h      (working copy)
@@ -912,7 +912,9 @@ extern void omp_clause_range_check_faile
         && (TYPE_MODE (TREE_TYPE (EXP))                        \
             == TYPE_MODE (TREE_TYPE (TREE_OPERAND (EXP, 0))))  \
         && (TYPE_UNSIGNED (TREE_TYPE (EXP))                    \
-            == TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (EXP, 0))))) \
+            == TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (EXP, 0))))\
+        && (TYPE_PRECISION (TREE_TYPE (EXP))                   \
+            >= TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (EXP, 0)))))\
     (EXP) = TREE_OPERAND (EXP, 0)

 /* Like STRIP_NOPS, but don't alter the TREE_TYPE either.  */
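
The effect of the extra condition in this patch can be sketched with a toy model. This is not GCC code: struct toy_type and both function names are hypothetical, only the type checks visible in the diff are modelled, and the property values used for the two bit-field types are assumptions.

```c
#include <stdbool.h>

/* Toy stand-in for the type properties the macro inspects.
   Not GCC's real data structure. */
struct toy_type {
    int mode;            /* stands in for TYPE_MODE */
    bool is_unsigned;    /* stands in for TYPE_UNSIGNED */
    unsigned precision;  /* stands in for TYPE_PRECISION */
};

/* Type checks visible in the diff context before the patch:
   same mode and same signedness. */
static bool may_strip_before(struct toy_type outer, struct toy_type inner)
{
    return outer.mode == inner.mode
        && outer.is_unsigned == inner.is_unsigned;
}

/* With the proposed patch: additionally require that the outer type is
   at least as precise as the inner one, so a narrowing conversion is kept. */
static bool may_strip_after(struct toy_type outer, struct toy_type inner)
{
    return may_strip_before(outer, inner)
        && outer.precision >= inner.precision;
}
```

With an outer type of precision 4 and an inner type of precision 6, both unsigned and assumed to share a mode, may_strip_before returns true while may_strip_after returns false, so the narrowing conversion would survive.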

Comment 8 pinskia@gmail.com 2007-03-23 07:57:22 UTC

Subject: Re:  [4.2 Regression] FRE ignores bit-field truncation (C and C++ front-end don't produce bit-field truncation

On 23 Mar 2007 05:01:00 -0000, spark at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
> The problematic STRIP_SIGN_NOPS() call is from fold_unary()
> which is called from try_combine_conversion() in tree-ssa-pre.c.
>
> STRIP_SIGN_NOPS() is called with the expression:

No, STRIP_SIGN_NOPS is correct, just fold_unary is incorrect in its
folding.  It should have called fold_convert on the expression if the
types are different and it is a constant.

-- Pinski

Comment 9 pinskia@gmail.com 2007-03-23 08:01:31 UTC

Subject: Re:  [4.2 Regression] FRE ignores bit-field truncation (C and C++ front-end don't produce bit-field truncation

On 3/23/07, Andrew Pinski <pinskia@gmail.com> wrote:
> On 23 Mar 2007 05:01:00 -0000, spark at gcc dot gnu dot org
> <gcc-bugzilla@gcc.gnu.org> wrote:
> > The problematic STRIP_SIGN_NOPS() call is from fold_unary()
> > which is called from try_combine_conversion() in tree-ssa-pre.c.
> >
> > STRIP_SIGN_NOPS() is called with the expression:
>
> No, STRIP_SIGN_NOPS is correct, just fold_unary is incorrect in its
> folding.  It should have called fold_convert on the expression if the
> types are different and it is a constant.

Ok, the real issue is that we call fold with
NOP_EXPR<NOP_EXPR<INTEGER_CST>> instead of just NOP_EXPR<INTEGER_CST>
so you have to figure out where we should fold the first NOP_EXPR
instead of that patch.

-- Pinski

Comment 10 Andrew Pinski 2007-03-23 08:17:58 UTC

A good question is why FRE does not do anything on the trunk:
  s.b6 = 31;
  D.1597_1 = s.b6;
that really should be optimized at the FRE level.

Comment 11 jsm-csl@polyomino.org.uk 2007-03-23 13:41:41 UTC

Subject: Re:  [4.2 Regression] FRE ignores
 bit-field truncation (C and C++ front-end don't produce bit-field truncation

On Fri, 23 Mar 2007, pinskia at gmail dot com wrote:

> No, STRIP_SIGN_NOPS is correct, just fold_unary is incorrect in its

That depends on an analysis of every caller of STRIP_SIGN_NOPS to work out 
what semantics they require and whether removing conversions changing the 
value is correct in that case.  Only then can you determine whether 
STRIP_SIGN_NOPS should have the present semantics and some subset of 
callers should be changed to work with those semantics, or whether the 
semantics of STRIP_SIGN_NOPS would better be changed.

On the whole I think that references to the mode in STRIP_NOPS and 
STRIP_SIGN_NOPS are rather doubtful - mode should not be of relevance at 
this level of tree optimizations - and mode is probably being used as a 
proxy for precision.  The general sequence of integer type conversions can 
be represented in the form "truncate to M bits, sign-extend to N bits and 
then zero-extend to the width of the outer type", maybe this should be 
represented somehow; then it would be defined exactly what such 
conversions can be removed by these macros.
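
The "truncate to M bits, sign-extend to N bits, then zero-extend" form can be sketched as follows. This is a rough illustration only: the function names are hypothetical, values are held in a 64-bit integer, and the sketch assumes 1 <= m <= n <= outer <= 64.

```c
#include <stdint.h>

/* Mask with the low `bits` bits set (bits may be up to 64). */
static uint64_t low_mask(unsigned bits)
{
    return bits >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
}

/* Hypothetical model of a chain of integer conversions:
   truncate to m bits, sign-extend to n bits, zero-extend to `outer` bits.
   Assumes 1 <= m <= n <= outer <= 64. */
static uint64_t convert_chain(uint64_t value, unsigned m, unsigned n,
                              unsigned outer)
{
    uint64_t v = value & low_mask(m);        /* truncate to m bits */

    /* Sign-extend: if bit m-1 is set, fill bits m..n-1 with ones. */
    if (m < n && ((v >> (m - 1)) & 1))
        v |= low_mask(n) & ~low_mask(m);

    /* Zero-extend: bits n..outer-1 are already zero. */
    return v & low_mask(outer);
}
```

In this model the conversion from the testcase is convert_chain(31, 4, 4, 6), which gives 15. A conversion can be dropped only when the chain leaves every possible input unchanged, which is not the case once m is smaller than the precision of the operand.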

Comment 12 Mark Mitchell 2007-03-26 05:43:38 UTC

I agree with Joseph that STRIP_SIGN_NOPS should not be removing changes in precision that may change the value and that, indeed, mode is probably being used as an inaccurate proxy for precision.

Comment 13 Richard Biener 2007-04-21 16:37:28 UTC

The interesting thing is that we get

Created value VH.0 for (<unnamed-unsigned:4>) 31

The bug (compared to the trunk) is that tree-ssa-pre.c:try_look_through_load
on the 4.2 branch manages to propagate the 31 while trunk does not (surprisingly).

On 4.2 we have for the def_stmt

#   SFT.0D.1539_2 = V_MUST_DEF <SFT.0D.1539_1>;
sD.1526.b6D.1525 = 31

while on the trunk

# SFT.0_10 = VDEF <SFT.0_9(D)> { SFT.0 }
s.b6 = 31

and the predicate !ZERO_SSA_OPERANDS (def_stmt, SSA_OP_VIRTUAL_USES) evaluates
differently on them.  *sigh*

This leaves us with the unfolded expression created by
create_value_expr_from, which the double-conversion folding code then
folds incorrectly.

One fix is to fold the expression we generate, with something like

Index: tree-ssa-pre.c
===================================================================
--- tree-ssa-pre.c      (revision 124018)
+++ tree-ssa-pre.c      (working copy)
@@ -2973,6 +2973,9 @@ create_value_expr_from (tree expr, basic
       TREE_OPERAND (vexpr, i) = val;
     }

+  if (UNARY_CLASS_P (vexpr))
+    vexpr = fold (vexpr);
+
   return vexpr;
 }

which then results in the correct

main ()
{
  short unsigned int D.1536;
  short unsigned int D.1535;
  int D.1534;
  <unnamed-unsigned:6> D.1533;
  <unnamed-unsigned:4> D.1532;
  <unnamed-unsigned:4> D.1531;
  <unnamed-unsigned:6> D.1530;

<bb 2>:
  s.b6 = 31;
  D.1530_3 = 31;
  D.1531_4 = 15;
  s.b4 = D.1531_4;
  D.1532_7 = 15;
  D.1533_8 = 15;
  s.b6 = D.1533_8;
  D.1535_10 = BIT_FIELD_REF <s, 16, 0>;
  D.1536_11 = D.1535_10 & 1008;
  D.1534_12 = D.1536_11 != 240;
  return D.1534_12;

}

Now another question is why we "regressed" here on the mainline.  Danny?
(I guess we might get more unfolded trees from constants propagated by
the look-through-load code, such as an addition.)

Comment 14 Richard Biener 2007-04-21 16:56:26 UTC

Indeed.

int main(void){
  s.b6 = 31;
  s.b4 = s.b6 + s.b6;
  s.b6 = s.b4;
  return s.b6 == 15 ? 0 : 1;
}

Created value VH.0 for 31 + 31
...

<bb 2>:
  s.b6 = 31;
  D.1530_3 = 31;
  D.1531_4 = 31;
  D.1530_5 = 31;
  D.1531_6 = 31;
  D.1532_7 = D.1531_6 + D.1531_6;
  D.1533_8 = (<unnamed-unsigned:4>) D.1532_7;

But luckily we don't fold (<unnamed-unsigned:4>) (31 + 31) wrongly.  (Note
that we also don't constant-fold it.)

Still, for folding (<unnamed-unsigned:6>)(<unnamed-unsigned:4>) 31:6
there is a bug in fold_unary, as we are calling
fold_convert_const (code, type, arg0) where arg0 is 31:6 and type is
(<unnamed-unsigned:6>), which is obviously a no-op.  We should call it
on op0 instead.

I'm going to test this (it's broken on the mainline as well) and commit if
it succeeds.
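
The difference between the two calls can be sketched with a toy model of constant conversion. This is not GCC's fold_unary or fold_convert_const: all names below are hypothetical, and types are reduced to an unsigned precision in bits.

```c
#include <stdint.h>

/* Toy constant conversion: reduce `value` to an unsigned type of the
   given precision.  Precisions of 32 or more leave the value unchanged. */
static uint32_t convert_const(uint32_t value, unsigned precision)
{
    if (precision >= 32)
        return value;
    return value & ((UINT32_C(1) << precision) - 1);
}

/* Models the faulty path: the inner conversion has been stripped, so the
   constant is converted straight to the outer type. */
static uint32_t fold_stripped(uint32_t cst, unsigned inner_prec,
                              unsigned outer_prec)
{
    (void)inner_prec;  /* the inner conversion is ignored, which is the bug */
    return convert_const(cst, outer_prec);
}

/* Models the intended behaviour: the inner conversion is applied first,
   then the outer one. */
static uint32_t fold_nested(uint32_t cst, unsigned inner_prec,
                            unsigned outer_prec)
{
    return convert_const(convert_const(cst, inner_prec), outer_prec);
}
```

For the constant 31 with an inner precision of 4 and an outer precision of 6, fold_stripped returns 31 and fold_nested returns 15, matching the wrong and the correct FRE dumps respectively.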

Comment 15 Richard Biener 2007-04-21 16:58:02 UTC

See comment #13.

Comment 16 Richard Biener 2007-04-21 18:44:10 UTC

Subject: Bug 31136

Author: rguenth
Date: Sat Apr 21 18:43:57 2007
New Revision: 124019

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=124019
Log:
2007-04-21  Richard Guenther  <rguenther@suse.de>

	PR middle-end/31136
	* fold-const.c (fold_unary): Call fold_convert_const on the
	original tree.

	* gcc.c-torture/execute/pr31136.c: New testcase.

Added:
    branches/gcc-4_2-branch/gcc/testsuite/gcc.c-torture/execute/pr31136.c
Modified:
    branches/gcc-4_2-branch/gcc/ChangeLog
    branches/gcc-4_2-branch/gcc/fold-const.c
    branches/gcc-4_2-branch/gcc/testsuite/ChangeLog

Comment 17 Richard Biener 2007-04-21 18:47:24 UTC

Subject: Bug 31136

Author: rguenth
Date: Sat Apr 21 18:47:13 2007
New Revision: 124020

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=124020
Log:
2007-04-21  Richard Guenther  <rguenther@suse.de>

	PR middle-end/31136
	* fold-const.c (fold_unary): Call fold_convert_const on the
	original tree.

	* gcc.c-torture/execute/pr31136.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/pr31136.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/fold-const.c
    trunk/gcc/testsuite/ChangeLog

Comment 18 Richard Biener 2007-04-21 18:53:02 UTC

Fixed.  I split the remaining FRE problems to a new PR31651.