[PATCH] rtl-optimization/80960 - avoid creating garbage RTL in DSE
Richard Biener
rguenther@suse.de
Tue Feb 2 07:51:07 GMT 2021
On Mon, 1 Feb 2021, Jakub Jelinek wrote:
> On Mon, Feb 01, 2021 at 12:54:50PM -0700, Jeff Law wrote:
> > >>> So I see no difference for stage2-gcc/*.o dse1/dse2 with/without the
> > >>> patch but counts are _extremely_ small. Statistics:
> > >>>
> > >>> 70148 dse: local deletions = 0, global deletions = 0
> > >>> 32 dse: local deletions = 0, global deletions = 1
> > >>> 9 dse: local deletions = 0, global deletions = 2
> > >>> 7 dse: local deletions = 0, global deletions = 3
> > >>> 2 dse: local deletions = 0, global deletions = 4
> > >>> 2 dse: local deletions = 0, global deletions = 5
> > >>> 3 dse: local deletions = 0, global deletions = 7
> > >>> 67 dse: local deletions = 1, global deletions = 0
> > >>> 1 dse: local deletions = 1, global deletions = 2
> > >>> 12 dse: local deletions = 2, global deletions = 0
> > >>> 1 dse: local deletions = 24, global deletions = 1
> > >>> 2 dse: local deletions = 3, global deletions = 0
> > >>> 4 dse: local deletions = 4, global deletions = 0
> > >>> 4 dse: local deletions = 6, global deletions = 0
> > >>> 1 dse: local deletions = 7, global deletions = 0
> > >>> 1 dse: local deletions = 8, global deletions = 0
> > >>>
> > >>> so not sure how much confidence this brings over the analytical
> > >>> reasoning that it shouldn't make a difference ...
> > >>>
> > >>> stats on just dse2 are even more depressing (given its cost)
> > >>>
> > >>> 35123 dse: local deletions = 0, global deletions = 0
> > >>> 2 dse: local deletions = 0, global deletions = 1
> > >>> 20 dse: local deletions = 1, global deletions = 0
> > >>> 1 dse: local deletions = 2, global deletions = 0
> > >>> 1 dse: local deletions = 3, global deletions = 0
> > >>> 1 dse: local deletions = 4, global deletions = 0
> > >> Based on that, I'd argue that DSE2 should go away and DSE1 should be
> > >> evaluated for the chopping block. While RTL DSE was marginally
> > >> important in 1999 when it was first submitted, the tree-ssa pipeline as
> > >> a whole has probably made RTL DSE largely pointless.
> > > True. Though I'd argue that DSE2 might be the conceptually more useful
> > > pass since it sees spill slots.
> > True in concept, but I bet that the SSA pipeline has made this much less
> > common in RTL DSE than it was 20+ years ago. Our allocator and reloader
> > are much improved as well which would further decrease the number of
> > opportunities.
> >
> > I'd hazard a guess that what's left are locals that need to be
> > addressable and some optimization in the RTL pipeline exposed a dead
> > store that wasn't otherwise visible in the SSA pipeline. But the only
> > way to be sure would be to dig into them.
>
> Shouldn't we gather statistics from a larger codebase first and perhaps
> compare against tree-ssa-dse statistics? I mean, in many functions there
> are no DSE opportunities at all.
Of course. Some DSE will definitely be required because we expose
ABI details only on RTL and expand is sometimes quite stupid. ISTR
that either DCE or CSE performs some limited amount of DSE as well?

The most needed and interesting work will be to disentangle RTL expansion
into the "complex" bits to be done on (lowered) GIMPLE and the
mechanical detail of translating GIMPLE to RTL one instruction at a time.
I guess we'll only learn what we need in lowered GIMPLE during this work.
Richard.
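
For anyone who wants to repeat the comparison on a larger codebase, the
histograms quoted above can be reduced to a few totals with a short
script. The following is only a sketch. It assumes the leading number on
each line is a `uniq -c` style count of dump entries that reported that
pair of values, which the thread does not state explicitly. The names
`summarize`, `Summary` and `DSE2_HISTOGRAM` are made up for illustration
and are not part of GCC.

```python
import re
from typing import Iterable, NamedTuple

# Matches histogram lines such as:
#   "     20 dse: local deletions = 1, global deletions = 0"
# Assumption: the leading number is how many dump entries had this pair.
LINE_RE = re.compile(
    r"^\s*(\d+)\s+dse: local deletions = (\d+), global deletions = (\d+)\s*$"
)


class Summary(NamedTuple):
    runs: int                 # total number of DSE runs counted
    runs_with_deletions: int  # runs that deleted at least one store
    local_deleted: int        # sum of local deletions over all runs
    global_deleted: int       # sum of global deletions over all runs


def summarize(lines: Iterable[str]) -> Summary:
    """Reduce a 'count dse: ...' histogram to overall totals."""
    runs = hits = local_total = global_total = 0
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip anything that is not a histogram line
        count, local, glob = map(int, m.groups())
        runs += count
        if local or glob:
            hits += count
        local_total += count * local
        global_total += count * glob
    return Summary(runs, hits, local_total, global_total)


# The dse2-only histogram quoted earlier in the thread.
DSE2_HISTOGRAM = """\
35123 dse: local deletions = 0, global deletions = 0
2 dse: local deletions = 0, global deletions = 1
20 dse: local deletions = 1, global deletions = 0
1 dse: local deletions = 2, global deletions = 0
1 dse: local deletions = 3, global deletions = 0
1 dse: local deletions = 4, global deletions = 0
"""

if __name__ == "__main__":
    s = summarize(DSE2_HISTOGRAM.splitlines())
    print(f"{s.runs_with_deletions} of {s.runs} runs deleted anything "
          f"({s.local_deleted} local, {s.global_deleted} global)")
```

Under that reading, the dse2 numbers above come to 25 of 35148 runs
deleting anything, with 29 local and 2 global deletions in total. The
same function can be fed the combined dse1/dse2 histogram, or a similar
histogram from another codebase, to put the runs side by side.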
More information about the Gcc-patches
mailing list