This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug rtl-optimization/69052] [6 Regression] Performance regression after r229402.


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69052

--- Comment #5 from amker at gcc dot gnu.org ---
(In reply to Yuri Rumyantsev from comment #0)
> In loop_invariant phase additional function inv_can_prop_to_addr_use which
> tried to determine if forward propagation for cheap address is possible
> through call of verify_changes which is very poor in comparison with combine
> phase.
> For example, for attached test-case it tries
> (gdb) call debug_rtx(def_insn)
> (insn 69 67 70 9 (set (reg/f:SI 149)
>         (plus:SI (reg:SI 87)
>             (const:SI (unspec:SI [
>                         (symbol_ref:SI ("ind") [flags 0x2] <var_decl
> 0x7ffff7ffbe10 ind>)
>                     ] UNSPEC_GOTOFF)))) t1.c:40 212 {*leasi}
>      (expr_list:REG_DEAD (reg:SI 87)
>         (nil)))
> (gdb) call debug_rtx(use_insn)
> (insn 70 69 71 9 (set (reg:SI 150)
>         (mem/u:SI (plus:SI (mult:SI (reg/v:SI 90 [ k ])
>                     (const_int 4 [0x4]))
>                 (reg/f:SI 149)) [1 ind S4 A32])) t1.c:40 86 {*movsi_internal}
>      (expr_list:REG_DEAD (reg/f:SI 149)
>         (nil)))
> and determines that propagation is not possible:
> (gdb) p ok
> $1 = false
> but combine can do such substitution.
> 
> This leads to undesired code motion and performance lost:
> for stmt out[ind[k]] = result
> before r229402
> 	movl	ind@GOTOFF(%ebx,%esi,4), %eax
> 	movl	12(%esp), %edi
> 	movl	%ebp, (%edi,%eax,4)
> after r229402
> 	movl	28(%esp), %eax
> 	movl	24(%esp), %ebx
> 	movl	(%eax,%esi,4), %eax
> 	movl	%edi, (%ebx,%eax,4)
> 
> redundant fill has been generated by LRA.
> 
> Since emulation combine phase is not so simple I assume that additional hook
> should be added to turn off such transformation for x86 in PIE mode.

I agree it's hard to mimic combine's behavior here.  It would be nice if we
could abstract an interface out of combine so that any pass can test whether
instructions can be combined, but that sounds even more difficult...
Also, is it possible to give the green light to GOT-related address
computation?  On other targets, I found it's better to keep these instructions
inside one basic block; otherwise redundant computations or CSE opportunities
might be missed across different basic blocks.
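
To make the "green light" idea concrete, here is a toy sketch of the decision,
assuming it means "leave GOT-related address computations next to their uses
instead of hoisting them".  It is not GCC code: struct inv_candidate,
its fields and keep_in_loop_p are hypothetical names used only for
illustration, and the real check would have to work on RTL.

```c
#include <stdbool.h>

/* Hypothetical, simplified description of a loop-invariant candidate.
   None of these names exist in GCC; they only model the discussion.  */
struct inv_candidate
{
  bool cheap_address;       /* Invariant is a cheap address computation.  */
  bool foldable_into_uses;  /* Every use can absorb it into its address.  */
  bool got_related;         /* Computation involves a GOT-relative symbol.  */
};

/* Return true if the candidate should stay inside the loop body instead
   of being hoisted.  Expensive invariants are always hoisted; cheap
   address computations stay if they can be folded into their uses or,
   under the assumed "green light" rule, if they are GOT-related.  */
bool
keep_in_loop_p (const struct inv_candidate *inv)
{
  if (!inv->cheap_address)
    return false;
  return inv->foldable_into_uses || inv->got_related;
}
```

With such a rule, a candidate shaped like insn 69 in comment #0 (cheap,
GOT-related, rejected by the simple propagation check) would stay in the loop
even though verify_changes says no.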

Not sure if stage4 is a good time for a new hook either.

Any ideas?
