Bug 45214

Summary: Poor initial RTL for bitfield operations
Product: gcc
Reporter: Bernd Schmidt <bernds>
Component: middle-end
Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED
Severity: enhancement
CC: gcc-bugs, rguenth
Priority: P3
Keywords: missed-optimization
Version: 4.6.0
Target Milestone: ---
Host:
Target: i686-pc-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed: 2010-08-06 21:48:55
Bug Depends on:
Bug Blocks: 19466
Attachments: A testcase which shows the problem.

Description Bernd Schmidt 2010-08-06 21:21:07 UTC
The attached testcase, from gcc's own gimplify.c, is optimized poorly at the tree stage.  Initial RTL has

;; t_1->gsbase.plf = D.2014_8;

(insn 8 6 9 (set (reg:QI 65)
        (mem/s:QI (plus:SI (reg/v/f:SI 58 [ t ])
                (const_int 1 [0x1])) [0+1 S1 A8])) gimplify.i:48 -1
     (nil))

(insn 9 8 10 (parallel [
            (set (reg:QI 64)
                (lshiftrt:QI (reg:QI 65)
                    (const_int 3 [0x3])))
            (clobber (reg:CC 17 flags))
        ]) gimplify.i:48 -1
     (expr_list:REG_EQUAL (lshiftrt:QI (mem/s:QI (plus:SI (reg/v/f:SI 58 [ t ])
                    (const_int 1 [0x1])) [0+1 S1 A8])
            (const_int 3 [0x3]))
        (nil)))

(insn 10 9 11 (parallel [
            (set (reg:QI 66)
                (and:QI (reg:QI 64)
                    (const_int 3 [0x3])))
            (clobber (reg:CC 17 flags))
        ]) gimplify.i:48 -1
     (nil))

(insn 11 10 13 (parallel [
            (set (reg:QI 67)
                (ior:QI (reg:QI 66)
                    (const_int 1 [0x1])))
            (clobber (reg:CC 17 flags))
        ]) gimplify.i:48 -1
     (nil))

(insn 13 11 14 (parallel [
            (set (reg:QI 69)
                (and:QI (reg:QI 67)
                    (const_int 3 [0x3])))
            (clobber (reg:CC 17 flags))
        ]) gimplify.i:48 -1
     (nil))

(insn 14 13 15 (parallel [
            (set (reg:QI 70)
                (ashift:QI (reg:QI 69)
                    (const_int 3 [0x3])))
            (clobber (reg:CC 17 flags))
        ]) gimplify.i:48 -1
     (nil))

(insn 15 14 16 (set (reg:QI 71)
        (mem/s/j:QI (plus:SI (reg/v/f:SI 58 [ t ])
                (const_int 1 [0x1])) [0+1 S1 A8])) gimplify.i:48 -1
     (nil))

(insn 16 15 17 (parallel [
            (set (reg:QI 72)
                (and:QI (reg:QI 71)
                    (const_int -25 [0xffffffe7])))
            (clobber (reg:CC 17 flags))
        ]) gimplify.i:48 -1
     (nil))

(insn 17 16 18 (parallel [
            (set (reg:QI 73)
                (ior:QI (reg:QI 72)
                    (reg:QI 70)))
            (clobber (reg:CC 17 flags))
        ]) gimplify.i:48 -1
     (nil))

(insn 18 17 0 (set (mem/s/j:QI (plus:SI (reg/v/f:SI 58 [ t ])
                (const_int 1 [0x1])) [0+1 S1 A8])
        (reg:QI 73)) gimplify.i:48 -1
     (nil))

Nothing optimizes this sequence unless the combiner is extended to handle four insns.  This PR should stay open, even if the combiner is improved, until the tree optimizers handle this case better.
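
As a rough illustration (not taken from the attached testcase), the insn sequence above can be modelled in C on a single byte.  The 2-bit width and the bit offset of 3 are read off the shift and mask constants in the dump; the function and constant names are made up for this sketch.

```c
#include <stdint.h>

/* Field layout assumed from the RTL constants: 2 bits wide,
   starting at bit 3 of the byte at offset 1. */
enum { PLF_SHIFT = 3, PLF_MASK = 0x3 };

/* Mirrors insns 8-18: extract the field, OR in 1, re-mask,
   shift back into place, and merge with the other bits. */
static uint8_t set_plf_bit_long(uint8_t byte)
{
    uint8_t field = (uint8_t)((byte >> PLF_SHIFT) & PLF_MASK);    /* insns 9-10 */
    field = (uint8_t)(field | 1);                                 /* insn 11 */
    field = (uint8_t)(field & PLF_MASK);                          /* insn 13 */
    uint8_t shifted = (uint8_t)(field << PLF_SHIFT);              /* insn 14 */
    uint8_t cleared = (uint8_t)(byte & ~(PLF_MASK << PLF_SHIFT)); /* insn 16: and with 0xe7 */
    return (uint8_t)(cleared | shifted);                          /* insn 17 */
}

/* What the whole sequence reduces to: one OR with 1 << 3. */
static uint8_t set_plf_bit_short(uint8_t byte)
{
    return (uint8_t)(byte | (1u << PLF_SHIFT));
}
```

Both functions return the same value for every possible input byte, which is why a single OR on the memory byte would be enough here.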
Comment 1 Bernd Schmidt 2010-08-06 21:21:56 UTC
Created attachment 21427
A testcase which shows the problem.
Comment 2 Richard Biener 2010-08-06 21:48:55 UTC
Confirmed.
Comment 3 Andrew Pinski 2010-10-28 19:20:39 UTC
  D.2047_5 = t_1->gsbase.plf;
  D.2048_6 = (unsigned char) D.2047_5;
  D.2049_7 = D.2048_6 | 1;
  D.2050_8 = (<unnamed-unsigned:2>) D.2049_7;
  t_1->gsbase.plf = D.2050_8;


It could be optimized to just:
  D.2047_5 = t_1->gsbase.plf;
  D.2050_6 = D.2047_5 | 1;
  t_1->gsbase.plf = D.2050_6;
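
For reference, a source-level sketch of the two forms in C.  The struct is a hypothetical stand-in, not the one from the testcase: only the 2-bit width of plf is taken from the dump, and the neighbouring fields and function names are invented for illustration.

```c
/* Hypothetical stand-in for the structure behind t_1->gsbase. */
struct stmt_base {
    unsigned int code : 8;  /* invented neighbouring field */
    unsigned int pad : 3;   /* invented neighbouring field */
    unsigned int plf : 2;   /* 2-bit field, as in <unnamed-unsigned:2> */
};

/* Mirrors the first GIMPLE sequence: widen to unsigned char,
   OR in 1, then narrow back to the 2-bit field on the store. */
static void set_plf_widened(struct stmt_base *t)
{
    unsigned char wide = (unsigned char) t->plf;
    wide = wide | 1;
    t->plf = wide;  /* implicit truncation to 2 bits */
}

/* Mirrors the suggested sequence: OR directly on the field value. */
static void set_plf_direct(struct stmt_base *t)
{
    t->plf = t->plf | 1;
}
```

The widening is harmless for the result because OR with 1 only touches bit 0, which survives the truncation back to 2 bits, so both functions leave the same value in plf.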

But I will note that on mips64-linux-gnu the initial RTL is already pretty good, thanks to zero_extract:

(insn 9 8 10 t.c:48 (set (reg:SI 201)
        (mem/s:SI (reg/v/f:SI 193 [ t ]) [0+0 S4 A32])) -1 (nil))

(insn 10 9 11 t.c:48 (set (reg:DI 203)
        (zero_extract:DI (subreg:DI (reg:SI 201) 0)
            (const_int 2 [0x2])
            (const_int 19 [0x13]))) -1 (nil))

(insn 11 10 12 t.c:48 (set (reg:QI 204)
        (truncate:QI (reg:DI 203))) -1 (nil))

(insn 12 11 13 t.c:48 (set (reg:SI 205)
        (ior:SI (subreg:SI (reg:QI 204) 0)
            (const_int 1 [0x1]))) -1 (nil))

(insn 13 12 14 t.c:48 (set (reg:SI 206)
        (mem/s/j:SI (reg/v/f:SI 193 [ t ]) [0+0 S4 A32])) -1 (nil))

(insn 14 13 15 t.c:48 (set (reg:DI 207)
        (subreg:DI (reg:SI 206) 0)) -1 (nil))

(insn 15 14 16 t.c:48 (set (zero_extract:DI (reg:DI 207)
            (const_int 2 [0x2])
            (const_int 19 [0x13]))
        (subreg:DI (reg:SI 205) 0)) -1 (nil))

(insn 16 15 17 t.c:48 (set (reg:SI 206)
        (truncate:SI (reg:DI 207))) -1 (nil))

(insn 17 16 0 t.c:48 (set (mem/s/j:SI (reg/v/f:SI 193 [ t ]) [0+0 S4 A32])
        (reg:SI 206)) -1 (nil))
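
A minimal C sketch of what the two zero_extract uses above compute, assuming the operands mean a 2-bit field at bit position 19 counted from the least significant bit.  The helper names are hypothetical and this models only the arithmetic, not the DI/SI mode changes.

```c
#include <stdint.h>

/* Assumed from the RTL operands: width 2, position 19 (from the LSB). */
#define FIELD_WIDTH 2u
#define FIELD_POS   19u
#define FIELD_MASK  ((1u << FIELD_WIDTH) - 1u)

/* Read side, as in insn 10: extract the field and zero-extend it. */
static uint32_t field_extract(uint32_t word)
{
    return (word >> FIELD_POS) & FIELD_MASK;
}

/* Write side, as in insn 15: replace the field, keep all other bits. */
static uint32_t field_insert(uint32_t word, uint32_t value)
{
    uint32_t cleared = word & ~(FIELD_MASK << FIELD_POS);
    return cleared | ((value & FIELD_MASK) << FIELD_POS);
}

/* The whole statement: extract the field, OR in 1, insert it back. */
static uint32_t set_low_field_bit(uint32_t word)
{
    return field_insert(word, field_extract(word) | 1u);
}
```

With extraction and insertion each expressed as a single operation, the explicit shift-and-mask chain seen in the i686 dump does not appear.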
Comment 4 Andrew Pinski 2011-07-19 22:52:27 UTC
;; t_1->gsbase.plf = D.2722_6;

(insn 7 6 8 (set (reg:QI 63)
        (const_int 1 [0x1])) t.c:48 -1
     (nil))

(insn 8 7 9 (parallel [
            (set (reg:QI 62)
                (ashift:QI (reg:QI 63)
                    (const_int 3 [0x3])))
            (clobber (reg:CC 17 flags))
        ]) t.c:48 -1
     (nil))

(insn 9 8 0 (parallel [
            (set (mem/s/j:QI (plus:SI (reg/v/f:SI 59 [ t ])
                        (const_int 1 [0x1])) [0+1 S1 A8])
                (ior:QI (mem/s/j:QI (plus:SI (reg/v/f:SI 59 [ t ])
                            (const_int 1 [0x1])) [0+1 S1 A8])
                    (reg:QI 62)))
            (clobber (reg:CC 17 flags))
        ]) t.c:48 -1
     (nil))