Bug 45214 - Poor initial RTL for bitfield operations
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.6.0
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: bitfield
Reported: 2010-08-06 21:21 UTC by Bernd Schmidt
Modified: 2011-07-19 22:52 UTC

See Also:
Host:
Target: i686-pc-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed: 2010-08-06 21:48:55


Attachments
A testcase which shows the problem. (427 bytes, text/plain)
2010-08-06 21:21 UTC, Bernd Schmidt

Description Bernd Schmidt 2010-08-06 21:21:07 UTC
The attached testcase, taken from GCC's own gimplify.c, is optimized poorly at the tree level.  For the bitfield store below, the initial RTL is:

;; t_1->gsbase.plf = D.2014_8;

(insn 8 6 9 (set (reg:QI 65)
        (mem/s:QI (plus:SI (reg/v/f:SI 58 [ t ])
                (const_int 1 [0x1])) [0+1 S1 A8])) gimplify.i:48 -1
     (nil))

(insn 9 8 10 (parallel [
            (set (reg:QI 64)
                (lshiftrt:QI (reg:QI 65)
                    (const_int 3 [0x3])))
            (clobber (reg:CC 17 flags))
        ]) gimplify.i:48 -1
     (expr_list:REG_EQUAL (lshiftrt:QI (mem/s:QI (plus:SI (reg/v/f:SI 58 [ t ])
                    (const_int 1 [0x1])) [0+1 S1 A8])
            (const_int 3 [0x3]))
        (nil)))

(insn 10 9 11 (parallel [
            (set (reg:QI 66)
                (and:QI (reg:QI 64)
                    (const_int 3 [0x3])))
            (clobber (reg:CC 17 flags))
        ]) gimplify.i:48 -1
     (nil))

(insn 11 10 13 (parallel [
            (set (reg:QI 67)
                (ior:QI (reg:QI 66)
                    (const_int 1 [0x1])))
            (clobber (reg:CC 17 flags))
        ]) gimplify.i:48 -1
     (nil))

(insn 13 11 14 (parallel [
            (set (reg:QI 69)
                (and:QI (reg:QI 67)
                    (const_int 3 [0x3])))
            (clobber (reg:CC 17 flags))
        ]) gimplify.i:48 -1
     (nil))

(insn 14 13 15 (parallel [
            (set (reg:QI 70)
                (ashift:QI (reg:QI 69)
                    (const_int 3 [0x3])))
            (clobber (reg:CC 17 flags))
        ]) gimplify.i:48 -1
     (nil))

(insn 15 14 16 (set (reg:QI 71)
        (mem/s/j:QI (plus:SI (reg/v/f:SI 58 [ t ])
                (const_int 1 [0x1])) [0+1 S1 A8])) gimplify.i:48 -1
     (nil))

(insn 16 15 17 (parallel [
            (set (reg:QI 72)
                (and:QI (reg:QI 71)
                    (const_int -25 [0xffffffe7])))
            (clobber (reg:CC 17 flags))
        ]) gimplify.i:48 -1
     (nil))

(insn 17 16 18 (parallel [
            (set (reg:QI 73)
                (ior:QI (reg:QI 72)
                    (reg:QI 70)))
            (clobber (reg:CC 17 flags))
        ]) gimplify.i:48 -1
     (nil))

(insn 18 17 0 (set (mem/s/j:QI (plus:SI (reg/v/f:SI 58 [ t ])
                (const_int 1 [0x1])) [0+1 S1 A8])
        (reg:QI 73)) gimplify.i:48 -1
     (nil))

Nothing optimizes this sequence unless the combiner is extended to handle four insns.  This PR should stay open even if the combiner is improved, until the tree optimizers handle this case better.
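
For reference, the ten insns above can be modelled on a single byte. This is only a sketch and not the attached testcase: it assumes the field occupies bits 3..4 of the byte at offset 1 (which is what the shift count 3 and the masks 3 and -25 suggest), and the function names are made up for illustration. It checks exhaustively that the whole sequence amounts to one OR with 1 << 3.

```c
#include <stdint.h>

/* Byte-level model of insns 8..18 from the dump above.
   Assumption: plf is a 2-bit field in bits 3..4 of the byte. */
uint8_t rtl_sequence(uint8_t byte)
{
    uint8_t r65 = byte;                 /* insn 8:  load the byte          */
    uint8_t r64 = (uint8_t)(r65 >> 3);  /* insn 9:  lshiftrt by 3          */
    uint8_t r66 = r64 & 3;              /* insn 10: keep the 2-bit field   */
    uint8_t r67 = r66 | 1;              /* insn 11: the actual "|= 1"      */
    uint8_t r69 = r67 & 3;              /* insn 13: truncate to 2 bits     */
    uint8_t r70 = (uint8_t)(r69 << 3);  /* insn 14: move back into place   */
    uint8_t r71 = byte;                 /* insn 15: load the byte again    */
    uint8_t r72 = r71 & (uint8_t)-25;   /* insn 16: clear bits 3..4 (0xe7) */
    uint8_t r73 = r72 | r70;            /* insn 17: merge the new field    */
    return r73;                         /* insn 18: value stored back      */
}

/* What the sequence reduces to: one OR with a shifted constant. */
uint8_t single_or(uint8_t byte)
{
    return byte | (uint8_t)(1u << 3);
}

/* Returns 1 if both functions agree for every possible byte value. */
int sequences_agree(void)
{
    for (unsigned b = 0; b < 256; b++)
        if (rtl_sequence((uint8_t)b) != single_or((uint8_t)b))
            return 0;
    return 1;
}
```

Calling sequences_agree() from a small driver is enough to confirm the equivalence; the two intermediate masks with 3 are redundant once the OR constant is known to fit in the field.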
Comment 1 Bernd Schmidt 2010-08-06 21:21:56 UTC
Created attachment 21427
A testcase which shows the problem.
Comment 2 Richard Biener 2010-08-06 21:48:55 UTC
Confirmed.
Comment 3 Andrew Pinski 2010-10-28 19:20:39 UTC
  D.2047_5 = t_1->gsbase.plf;
  D.2048_6 = (unsigned char) D.2047_5;
  D.2049_7 = D.2048_6 | 1;
  D.2050_8 = (<unnamed-unsigned:2>) D.2049_7;
  t_1->gsbase.plf = D.2050_8;


It could be optimized to just:
  D.2047_5 = t_1->gsbase.plf;
  D.2047_6 = D.2047_5 | 1;
  t_1->gsbase.plf = D.2047_6;
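
The simplification above rests on the fact that widening the 2-bit value to unsigned char, ORing, and truncating back gives the same result as ORing in the narrow type. A minimal sketch of that argument in plain C follows; the helper names are hypothetical, and the 2-bit type is modelled by masking with 3 rather than by a real bitfield.

```c
/* Model of the five-statement GIMPLE: widen, OR, truncate.
   The SSA names in the comments refer to the dump above. */
unsigned via_widening(unsigned plf, unsigned char mask)
{
    unsigned char wide = (unsigned char)(plf & 3u);   /* D.2048_6 */
    unsigned char ored = (unsigned char)(wide | mask); /* D.2049_7 */
    return ored & 3u;                                  /* D.2050_8: cast to 2 bits */
}

/* Model of the proposed three-statement form: OR in the 2-bit type. */
unsigned direct_or(unsigned plf, unsigned char mask)
{
    return ((plf & 3u) | mask) & 3u;
}

/* Returns 1 if both forms agree for all field values and all byte masks. */
int narrowing_is_safe(void)
{
    for (unsigned plf = 0; plf < 4; plf++)
        for (unsigned mask = 0; mask < 256; mask++)
            if (via_widening(plf, (unsigned char)mask)
                != direct_or(plf, (unsigned char)mask))
                return 0;
    return 1;
}
```

Because OR works on each bit independently, the discarded high bits never influence the two bits that are kept, so the intermediate conversions can be dropped.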

On mips64-linux-gnu, by contrast, the initial RTL is already fairly good, thanks to zero_extract:

(insn 9 8 10 t.c:48 (set (reg:SI 201)
        (mem/s:SI (reg/v/f:SI 193 [ t ]) [0+0 S4 A32])) -1 (nil))

(insn 10 9 11 t.c:48 (set (reg:DI 203)
        (zero_extract:DI (subreg:DI (reg:SI 201) 0)
            (const_int 2 [0x2])
            (const_int 19 [0x13]))) -1 (nil))

(insn 11 10 12 t.c:48 (set (reg:QI 204)
        (truncate:QI (reg:DI 203))) -1 (nil))

(insn 12 11 13 t.c:48 (set (reg:SI 205)
        (ior:SI (subreg:SI (reg:QI 204) 0)
            (const_int 1 [0x1]))) -1 (nil))

(insn 13 12 14 t.c:48 (set (reg:SI 206)
        (mem/s/j:SI (reg/v/f:SI 193 [ t ]) [0+0 S4 A32])) -1 (nil))

(insn 14 13 15 t.c:48 (set (reg:DI 207)
        (subreg:DI (reg:SI 206) 0)) -1 (nil))

(insn 15 14 16 t.c:48 (set (zero_extract:DI (reg:DI 207)
            (const_int 2 [0x2])
            (const_int 19 [0x13]))
        (subreg:DI (reg:SI 205) 0)) -1 (nil))

(insn 16 15 17 t.c:48 (set (reg:SI 206)
        (truncate:SI (reg:DI 207))) -1 (nil))

(insn 17 16 0 t.c:48 (set (mem/s/j:SI (reg/v/f:SI 193 [ t ]) [0+0 S4 A32])
        (reg:SI 206)) -1 (nil))
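
To make the role of zero_extract in the dump above concrete, here is a rough C model of it as a source (field read) and as a destination (field write). It is a sketch under stated assumptions: bit positions are counted from the least significant bit, pos + len stays within 64, and the helper names zero_extract_read and zero_extract_write are invented for this example.

```c
#include <stdint.h>

/* Mask with the low `len` bits set (len in 1..64). */
static uint64_t low_mask(unsigned len)
{
    return len >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << len) - 1;
}

/* zero_extract used as a source: read `len` bits starting at bit `pos`. */
uint64_t zero_extract_read(uint64_t word, unsigned len, unsigned pos)
{
    return (word >> pos) & low_mask(len);
}

/* zero_extract used as a destination: replace `len` bits at `pos`
   with the low bits of `value`, leaving the rest of `word` intact. */
uint64_t zero_extract_write(uint64_t word, unsigned len, unsigned pos,
                            uint64_t value)
{
    uint64_t mask = low_mask(len);
    return (word & ~(mask << pos)) | ((value & mask) << pos);
}

/* Model of insns 9..17: the 2-bit field sits at bit 19 of the word. */
uint32_t mips_sequence(uint32_t word)
{
    uint64_t field  = zero_extract_read(word, 2, 19);         /* insn 10      */
    uint8_t  low    = (uint8_t)field;                          /* insn 11      */
    uint32_t ored   = low | 1u;                                /* insn 12      */
    uint64_t merged = zero_extract_write(word, 2, 19, ored);   /* insns 13..15 */
    return (uint32_t)merged;                                   /* insn 16      */
}
```

With a single extract and a single insert, the shift-and-mask pairs that appear in the i686 dump are expressed by one RTL operation each.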
Comment 4 Andrew Pinski 2011-07-19 22:52:27 UTC
;; t_1->gsbase.plf = D.2722_6;

(insn 7 6 8 (set (reg:QI 63)
        (const_int 1 [0x1])) t.c:48 -1
     (nil))

(insn 8 7 9 (parallel [
            (set (reg:QI 62)
                (ashift:QI (reg:QI 63)
                    (const_int 3 [0x3])))
            (clobber (reg:CC 17 flags))
        ]) t.c:48 -1
     (nil))

(insn 9 8 0 (parallel [
            (set (mem/s/j:QI (plus:SI (reg/v/f:SI 59 [ t ])
                        (const_int 1 [0x1])) [0+1 S1 A8])
                (ior:QI (mem/s/j:QI (plus:SI (reg/v/f:SI 59 [ t ])
                            (const_int 1 [0x1])) [0+1 S1 A8])
                    (reg:QI 62)))
            (clobber (reg:CC 17 flags))
        ]) t.c:48 -1
     (nil))