This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Code optimization with GCSE


The subreg pass has this :

(insn 5 2 6 2 ex1b.c:8 (set (reg/f:DI 74)
        (const:DI (plus:DI (symbol_ref:DI ("data") <var_decl 0xb7d35058 data>)
                (const_int 8 [0x8])))) 71 {movdi_internal} (nil))

(insn 6 5 7 2 ex1b.c:8 (set (reg/f:DI 75)
        (symbol_ref:DI ("data") <var_decl 0xb7d35058 data>)) 71
{movdi_internal} (nil))

...

(insn 10 9 11 2 ex1b.c:8 (set (reg/f:DI 79)
        (const:DI (plus:DI (symbol_ref:DI ("data") <var_decl 0xb7d35058 data>)
                (const_int 16 [0x10])))) 71 {movdi_internal} (nil))


As we can see, all three are using the symbol_ref data before adding
their offset. But after cse, we get this:

(insn 5 2 6 2 ex1b.c:8 (set (reg/f:DI 74)
        (const:DI (plus:DI (symbol_ref:DI ("data") <var_decl 0xb7d35058 data>)
                (const_int 8 [0x8])))) 71 {movdi_internal} (nil))

(insn 6 5 7 2 ex1b.c:8 (set (reg/f:DI 75)
        (symbol_ref:DI ("data") <var_decl 0xb7d35058 data>)) 71
{movdi_internal} (nil))

...

(insn 10 9 11 2 ex1b.c:8 (set (reg/f:DI 79)
        (plus:DI (reg/f:DI 75)
            (const_int 16 [0x10]))) 2 {adddi3_port}
(expr_list:REG_EQUAL (const:DI (plus:DI (symbol_ref:DI ("data")
<var_decl 0
                (const_int 16 [0x10])))
        (nil)))


As we can see, the CSE pass, instead of putting the three in function
of 74, puts only the last one in function of 75.

I put the whole dump of cse at the end of this email, I didn't want to
make this one too long...

Thanks again,
Jean Christophe Beyler

------------------ Dump of cse1 ------------------

;; Function foo (foo)


3 basic blocks, 2 edges.

Basic block 0 , next 2, loop_depth 0, count 0, freq 10000, maybe hot.
Predecessors:
Successors:  2 [100.0%]  (fallthru)

Basic block 2 , prev 0, next 1, loop_depth 0, count 0, freq 10000, maybe hot.
Predecessors:  ENTRY [100.0%]  (fallthru)
Successors:  EXIT [100.0%]  (fallthru)

Basic block 1 , prev 2, loop_depth 0, count 0, freq 10000, maybe hot.
Predecessors:  2 [100.0%]  (fallthru)
Successors:

starting the processing of deferred insns
ending the processing of deferred insns
df_analyze called
df_worklist_dataflow_overeager:n_basic_blocks 3 n_edges 2 count 3 (    1)


foo

Dataflow summary:
def_info->table_size = 0, use_info->table_size = 0
;;  invalidated by call 	 2 [r2] 4 [r4] 5 [r5] 6 [r6] 7 [r7] 8 [r8] 9
[r9] 10 [r10] 11 [r11] 12 [r12] 13 [r13] 14 [r14] 15 [r15] 16 [r16] 17
[r17] 18 [r18] 19 [r19] 20 [r20] 21 [r21] 22 [r22] 23 [r23] 24 [r24]
25 [r25] 26 [r26] 27 [r27] 28 [r28] 29 [r29] 30 [r30] 31 [r31] 32
[r32] 33 [r33] 34 [r34] 35 [r35] 36 [r36] 37 [r37] 38 [r38] 39 [r39]
40 [r40] 41 [r41] 42 [r42] 43 [r43] 44 [r44] 45 [r45] 46 [r46] 47
[r47] 63 [r63] 64 [$rap] 65 [cc] 66 [acc]
;;  hardware regs used 	 0 [r0] 1 [r1] 3 [r3]
;;  regular block artificial uses 	 0 [r0] 1 [r1] 3 [r3] 62 [r62]
;;  eh block artificial uses 	 0 [r0] 1 [r1] 3 [r3] 62 [r62]
;;  entry block defs 	 0 [r0] 1 [r1] 3 [r3] 6 [r6] 8 [r8] 9 [r9] 10
[r10] 11 [r11] 12 [r12] 13 [r13] 14 [r14] 15 [r15] 62 [r62] 63 [r63]
;;  exit block uses 	 1 [r1] 3 [r3] 6 [r6] 62 [r62]
;;  regs ever live 	 6[r6]

( )->[0]->( 2 )
;; bb 0 artificial_defs: { d-1(0){ }d-1(1){ }d-1(3){ }d-1(6){ }d-1(8){
}d-1(9){ }d-1(10){ }d-1(11){ }d-1(12){ }d-1(13){ }d-1(14){ }d-1(15){
}d-1(62){ }d-1(63){ }}
;; bb 0 artificial_uses: { }

( 0 )->[2]->( 1 )
;; bb 2 artificial_defs: { }
;; bb 2 artificial_uses: { u-1(0){ }u-1(1){ }u-1(3){ }u-1(62){ }}

( 2 )->[1]->( )
;; bb 1 artificial_defs: { }
;; bb 1 artificial_uses: { u-1(1){ }u-1(3){ }u-1(6){ }u-1(62){ }}

Finding needed instructions:
  Adding insn 23 to worklist
Finished finding needed instructions:
processing block 2 live out =  0 [r0] 1 [r1] 3 [r3] 6 [r6] 62 [r62]
  Adding insn 17 to worklist
  Adding insn 13 to worklist
  Adding insn 12 to worklist
  Adding insn 11 to worklist
  Adding insn 10 to worklist
  Adding insn 9 to worklist
  Adding insn 8 to worklist
  Adding insn 7 to worklist
  Adding insn 6 to worklist
  Adding insn 5 to worklist
df_worklist_dataflow_overeager:n_basic_blocks 3 n_edges 2 count 3 (    1)
;; Following path with 11 sets: 2
deferring rescan insn with uid = 10.
deferring rescan insn with uid = 17.


try_optimize_cfg iteration 1

(note 3 0 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)

(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)

(insn 5 2 6 2 ex1b.c:8 (set (reg/f:DI 74)
        (const:DI (plus:DI (symbol_ref:DI ("data") <var_decl 0xb7d35058 data>)
                (const_int 8 [0x8])))) 71 {movdi_internal} (nil))

(insn 6 5 7 2 ex1b.c:8 (set (reg/f:DI 75)
        (symbol_ref:DI ("data") <var_decl 0xb7d35058 data>)) 71
{movdi_internal} (nil))

(insn 7 6 8 2 ex1b.c:8 (set (reg:DI 77 [ data+8 ])
        (mem/s:DI (reg/f:DI 74) [2 data+8 S8 A64])) 71 {movdi_internal} (nil))

(insn 8 7 9 2 ex1b.c:8 (set (reg:DI 78 [ data ])
        (mem/s:DI (reg/f:DI 75) [2 data+0 S8 A64])) 71 {movdi_internal} (nil))

(insn 9 8 10 2 ex1b.c:8 (set (reg:DI 76)
        (plus:DI (reg:DI 77 [ data+8 ])
            (reg:DI 78 [ data ]))) 2 {adddi3_port} (nil))

(insn 10 9 11 2 ex1b.c:8 (set (reg/f:DI 79)
        (plus:DI (reg/f:DI 75)
            (const_int 16 [0x10]))) 2 {adddi3_port}
(expr_list:REG_EQUAL (const:DI (plus:DI (symbol_ref:DI ("data")
<var_decl 0xb7d35058 data>)
                (const_int 16 [0x10])))
        (nil)))

(insn 11 10 12 2 ex1b.c:8 (set (reg:DI 80 [ data+16 ])
        (mem/s:DI (reg/f:DI 79) [2 data+16 S8 A64])) 71 {movdi_internal} (nil))

(insn 12 11 13 2 ex1b.c:8 (set (reg:DI 73)
        (plus:DI (reg:DI 76)
            (reg:DI 80 [ data+16 ]))) 2 {adddi3_port} (nil))

(insn 13 12 17 2 ex1b.c:8 (set (reg:DI 72 [ <result> ])
        (reg:DI 73)) 71 {movdi_internal} (nil))

(insn 17 13 23 2 ex1b.c:10 (set (reg/i:DI 6 r6)
        (reg:DI 73)) 71 {movdi_internal} (nil))

(insn 23 17 0 2 ex1b.c:10 (use (reg/i:DI 6 r6)) -1 (nil))
starting the processing of deferred insns
rescanning insn with uid = 10.
deleting insn with uid = 10.
rescanning insn with uid = 17.
deleting insn with uid = 17.
ending the processing of deferred insns




On Wed, Jul 15, 2009 at 12:25 PM, Adam Nemet<anemet@caviumnetworks.com> wrote:
> Jean Christophe Beyler <jean.christophe.beyler@gmail.com> writes:
>> uint64_t foo (void)
>> {
>> ? ? return data[0] + data[1] + data[2];
>> }
>>
>> And this generates :
>>
>> ? ? la ?r9,data
>> ? ? la ?r7,data+8
>> ? ? ldd r6,0(r7)
>> ? ? ldd r8,0(r9)
>> ? ? ldd r7,16(r9)
>>
>> I'm trying to see if there is a problem with my rtx costs function
>> because again, I don't understand why it would generate 2 la instead
>> of using an offset of 8 and 16.
>
> You probably want to look at the RTL dumps. ?This code should have been
> expanded with the correct offsets (at least that is what happens on
> MIPS). ?I don't see how later passes would modify the code other than
> removing 2 of the 3 "la rX, data" insns.
>
> Adam
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]