[Bug rtl-optimization/98863] [11 Regression] WRF with LTO consumes a lot of memory in REE, FWPROP and x86 specific passes

rguenther at suse dot de gcc-bugzilla@gcc.gnu.org
Mon Feb 8 09:01:44 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863

--- Comment #40 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 8 Feb 2021, rsandifo at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863
> 
> --- Comment #39 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
> Just to give an update on this: I have a patch that reduces the
> amount of memory consumed by fwprop so that it no longer seems
> to be outlier.  However, it involves doing more bitmap operations.
> In this testcase we have a larger number of registers that
> seem to be live but unused across a large region of code,
> so bitmap ANDs with the live in sets are expensive and hit
> the worst-case O(nblocks×nregisters).  I'm still trying to find
> a way of reducing the effect of that.

But isn't this what the RD problem does as well (yeah, DF shows
up as quite compile-time expensive here), and thus all UD/DU chain
users suffer from the same issue?

What I didn't explore further is re-doing the way RD numbers defs
in the bitmaps with the idea that all defs just used inside a
single BB are not necessary to be represented (the local problems
take care of them).  But that of course only helps if there are
a significant number of such defs (shadowed by later defs of the same
reg in the same BB) - which usually should be the case.  There's
extra overhead for re-numbering things of course (but my hope was
to make the RD problem fit in the cache for a nice speedup...)


More information about the Gcc-bugs mailing list