Bug 18031 - OR of a bitfield and a constant is not optimized at tree level
Summary: OR of a bitfield and a constant is not optimized at tree level
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: unknown
: P2 enhancement
Target Milestone: 4.4.0
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on: 28632
Blocks: 19466
  Show dependency treegraph
 
Reported: 2004-10-16 05:06 UTC by Kazu Hirata
Modified: 2009-06-08 15:40 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work: 4.4.0
Known to fail: 4.3.3
Last reconfirmed: 2006-04-27 20:46:04


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Kazu Hirata 2004-10-16 05:06:03 UTC
struct B {
  unsigned bit : 1;
};

void
ior (struct B *b)
{
  b->bit |= 1;
}

I get:

ior (b)
{
<bb 0>:
  b->bit = (<unnamed type>) (unsigned char) ((signed char) b->bit | 1);
  return;

}

Note that we could just say b->bit = 1;
Comment 1 Andrew Pinski 2004-10-16 13:57:26 UTC
Confirmed, this needs a tree combiner.
Comment 2 Andrew Pinski 2005-01-15 05:31:01 UTC
Well or just fold being fixed.
Comment 3 Andrew Pinski 2005-02-07 05:55:43 UTC
For the RTL level (at least on PPC), we could combine the following instructions and simplify them:
(insn 15 13 16 0 (set (reg:SI 124)
        (const_int 1 [0x1])) 293 {*movsi_internal1} (nil)
    (nil))
(insn 17 16 18 0 (set (zero_extract:SI (reg:SI 125)
            (const_int 1 [0x1])
            (const_int 0 [0x0]))
        (reg:SI 124)) 112 {insvsi} (insn_list:REG_DEP_TRUE 15 (insn_list:REG_DEP_TRUE 16 (nil)))
    (expr_list:REG_DEAD (reg:SI 124)
        (nil)))
into:
(set (reg:SI 121)
       (ior:SI (reg:SI 119)
                 (const_int -2147483648 [0xffffffff80000000])))


For x86 we need to combine and simplify the following code (which is not simplified at the tree level or 
RTL level yet, well is is optimizated with my tree combiner to the correct thing):
unsigned f(unsigned i)
{
 unsigned t = i&1;
 t|=1;
 t|=(i&~1);
 return t;
}
Comment 4 Andrew Pinski 2005-06-24 20:16:59 UTC
I think this is now fixed on the mainline at the RTL level:
ior:
        movl    4(%esp), %eax
        orb     $1, (%eax)
        ret

Comment 5 Steven Bosscher 2006-04-27 20:46:04 UTC
So I asked myself, why are we not catching this in vrp?  I know we can derive ranges from types, so why don't we derive a [0,1] range from the bitfield load?

It turns out that we make _all_ loads VARYING right away, so we end up with:


Value ranges after VRP:

b_1: ~[0B, 0B]  EQUIVALENCES: { b_2 } (1 elements)
b_2: VARYING
D.1882_3: VARYING
D.1883_4: [0, 1]  EQUIVALENCES: { } (0 elements)
D.1884_5: [0, +INF]  EQUIVALENCES: { } (0 elements)
D.1885_6: [0, 127]  EQUIVALENCES: { } (0 elements)
D.1886_7: [0, +INF]  EQUIVALENCES: { } (0 elements)


ior (b)
{
  <unnamed type> D.1886;
  unsigned char D.1885;
  signed char D.1884;
  signed char D.1883;
  <unnamed type> D.1882;

<bb 2>:
  D.1882_3 = b_2->bit;
  D.1883_4 = (signed char) D.1882_3;
  D.1884_5 = D.1883_4 | 1;
  D.1885_6 = (unsigned char) D.1884_5;
  D.1886_7 = (<unnamed type>) D.1885_6;
  b_2->bit = D.1886_7;
  return;

}


where, at least so it seems to me, we could find something like,
D.1882_3: [0, 1] (etc.)

Comment 6 Andrew Pinski 2006-04-27 21:14:26 UTC
(In reply to comment #5)
> So I asked myself, why are we not catching this in vrp?  I know we can derive
> ranges from types, so why don't we derive a [0,1] range from the bitfield load?
> D.1883_4: [0, 1]  EQUIVALENCES: { } (0 elements)
> D.1884_5: [0, +INF]  EQUIVALENCES: { } (0 elements)

>   D.1884_5 = D.1883_4 | 1;

Shouldn't we find that D.1884_5 is 1 at this point because _4 was [0,1]
and [0, 1] | 1 is just 1?
Comment 7 Steven Bosscher 2006-04-27 21:32:52 UTC
Yes, BIT_IOR_EXPR is also not handled.
Comment 8 Andrew Pinski 2006-05-05 09:13:13 UTC
Here is a quick prototype addition for VRP, it does not handle the generic case only [0, 1] | 1:
Index: tree-vrp.c
===================================================================
--- tree-vrp.c  (revision 113452)
+++ tree-vrp.c  (working copy)
@@ -1276,6 +1276,7 @@ extract_range_from_binary_expr (value_ra
       && code != MIN_EXPR
       && code != MAX_EXPR
       && code != BIT_AND_EXPR
+      && code != BIT_IOR_EXPR
       && code != TRUTH_ANDIF_EXPR
       && code != TRUTH_ORIF_EXPR
       && code != TRUTH_AND_EXPR
@@ -1319,6 +1320,7 @@ extract_range_from_binary_expr (value_ra
      the operands is VR_VARYING or symbolic range.  TODO, we may be
      able to derive anti-ranges in some cases.  */
   if (code != BIT_AND_EXPR
+      && code != BIT_IOR_EXPR
       && code != TRUTH_AND_EXPR
       && code != TRUTH_OR_EXPR
       && (vr0.type == VR_VARYING
@@ -1568,6 +1570,27 @@ extract_range_from_binary_expr (value_ra
          return;
        }
     }
+  else if (code == BIT_IOR_EXPR)
+    {
+      /* FIXME.  Handle only: [0, 1] | [1, 1], this should be expanded to handle:
+         [ a, b ] | [b | b]. */
+      if (vr1.type == VR_RANGE
+         && vr1.min == vr1.max
+         && integer_onep (vr1.max)
+         && vr0.type == VR_RANGE
+         && integer_zerop (vr0.min)
+         && integer_onep (vr0.max))
+       {
+         type = VR_RANGE;
+         min = build_int_cst (TREE_TYPE (expr), 1);
+         max = build_int_cst (TREE_TYPE (expr), 1);
+       }
+      else
+       {
+         set_value_range_to_varying (vr);
+         return;
+       }
+    }
   else
     gcc_unreachable ();
 
Comment 9 Richard Biener 2006-10-24 13:07:40 UTC
We should at least fold

  b->bit = (<unnamed type>) (unsigned char) ((signed char) b->bit | 1);

to

  b->bit = (<unnamed type>) ((signed char) b->bit | 1);
Comment 10 Bernhard Reutner-Fischer 2009-06-07 16:54:05 UTC
4.3.3 does not simplify it, 4.4.0 onward do.

for reference
4.3.3:
Substituing values and folding statements

Constants propagated:      0
Copies propagated:         0
Predicates folded:         0

ior (bD.1193)
{
  <unnamed-unsigned:1> D.1200;
  unsigned charD.10 D.1199;
  signed charD.9 D.1198;
  signed charD.9 D.1197;
  <unnamed-unsigned:1> D.1196;

  # BLOCK 2 freq:10000
  # PRED: ENTRY [100.0%]  (fallthru,exec)
  # VUSE <SMT.4D.1206_7(D)> { SMT.4D.1206 }
  D.1196_2 = bD.1193_1(D)->bitD.1192;
  D.1197_3 = (signed charD.9) D.1196_2;
  D.1198_4 = D.1197_3 | 1;
  D.1199_5 = (unsigned charD.10) D.1198_4;
  D.1200_6 = (<unnamed-unsigned:1>) D.1199_5;
  # SMT.4D.1206_8 = VDEF <SMT.4D.1206_7(D)> { SMT.4D.1206 }
  bD.1193_1(D)->bitD.1192 = D.1200_6;
  return;
  # SUCC: EXIT [100.0%] 

}


4.4.0:
Pass statistics:
----------------
Constants propagated: 1
Number of ASSERT_EXPR expressions inserted: 1
Statements deleted: 5

ior (struct B * bD.1247)
{
  <unnamed-unsigned:1> D.1254;
  unsigned charD.10 D.1253;
  signed charD.9 D.1252;
  signed charD.9 D.1251;
  <unnamed-unsigned:1> D.1250;

  # BLOCK 2 freq:10000
  # PRED: ENTRY [100.0%]  (fallthru,exec)
  # SMT.10D.1266_8 = VDEF <SMT.10D.1266_7(D)> { SMT.10D.1266 }
  bD.1247_1(D)->bitD.1246 = 1;
  return;
  # SUCC: EXIT [100.0%] 

}
Comment 11 Richard Biener 2009-06-08 15:40:30 UTC
Thus fixed.