This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: RFC: LRA for x86/x86-64 [0/9]


On Tue, Oct 2, 2012 at 10:29 AM, Paolo Bonzini <bonzini@gnu.org> wrote:
> On 02/10/2012 09:28, Steven Bosscher wrote:
>>> My experience shows that these lists usually have 1-2 elements. In this
>>> case, though, there are pseudos with a huge number of elements (hundreds).
>>> I tried -fweb for this test because it can decrease the number of elements,
>>> but GCC (I don't know which pass) scales even worse: after 20 minutes of
>>> waiting, when virtual memory reached 20GB, I stopped it.
>> Ouch :-)
>>
>> The webizer itself never even runs, the compiler blows up somewhere
>> during the df_analyze call from web_main. The issue here is probably
>> in the DF_UD_CHAIN problem or in the DF_RD problem.
>
> /me is glad to have fixed fwprop when his GCC contribution time was more
> than 1-2 days per year...

I thought you spent more time on GCC nowadays, working for Red Hat?
Who's your manager, perhaps we can coerce him/her into letting you
spend more time on GCC :-P


> Unfortunately, the fwprop solution (actually a rewrite) was very
> specific to the problem and cannot be reused in other parts of the compiler.

That'd be too bad... But is this really true? I thought you had done
something that builds chains only for USEs reached by multiple DEFs?
That's the only interesting kind for web, too.
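
To make "chains only for USEs reached by multiple DEFs" concrete, here is
a toy sketch. It is not fwprop's code and not the DF API: the function
name, the block/instruction encoding and the diamond CFG are all made up
for illustration. It runs a plain iterative reaching-definitions pass and
then keeps only the uses that see more than one def.

```python
from collections import defaultdict

def multi_def_uses(blocks, succs):
    """Hypothetical helper, not a GCC interface.

    blocks: {name: [(kind, reg), ...]} with kind "def" or "use", in order.
    succs:  {name: [successor block names]}.
    Returns {(block, index) of a use: frozenset of def sites reaching it},
    but only for uses reached by more than one def.
    """
    preds = {b: [] for b in blocks}
    for b, ss in succs.items():
        for s in ss:
            preds[s].append(b)

    # gen[b]: the last def of each register in b is the one visible at exit.
    gen = {}
    for b, insns in blocks.items():
        last = {}
        for i, (kind, reg) in enumerate(insns):
            if kind == "def":
                last[reg] = (b, i)
        gen[b] = last

    # Iterate to a fixed point over sets of (reg, def_site) pairs.
    rd_in = {b: set() for b in blocks}
    rd_out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            new_in = set()
            for p in preds[b]:
                new_in |= rd_out[p]
            # A def in b kills every incoming def of the same register.
            new_out = {(r, d) for (r, d) in new_in if r not in gen[b]}
            new_out |= set(gen[b].items())
            if new_in != rd_in[b] or new_out != rd_out[b]:
                rd_in[b], rd_out[b] = new_in, new_out
                changed = True

    # Walk each block and record only uses with several reaching defs.
    result = {}
    for b, insns in blocks.items():
        cur = defaultdict(set)
        for r, d in rd_in[b]:
            cur[r].add(d)
        for i, (kind, reg) in enumerate(insns):
            if kind == "def":
                cur[reg] = {(b, i)}
            elif len(cur[reg]) > 1:
                result[(b, i)] = frozenset(cur[reg])
    return result

# Diamond CFG: r1 is redefined on one arm only, r2 is never redefined.
blocks = {
    "entry": [("def", "r1"), ("def", "r2")],
    "left": [("def", "r1")],
    "right": [],
    "join": [("use", "r1"), ("use", "r2")],
}
succs = {"entry": ["left", "right"], "left": ["join"],
         "right": ["join"], "join": []}
print(multi_def_uses(blocks, succs))
```

Only the use of r1 in "join" is reported; the use of r2 has a single
reaching def, so no chain would be needed for it.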


> I guess here it is where we could experiment with region-based
> optimization.  If a loop (including the parent dummy loop) is too big,
> ignore it and only do LRS on smaller loops inside it.  Reaching
> definitions is insanely expensive on an entire function, but works well
> on smaller loops.

Heh, yes. In fact I have been working on a region-based version of web
because it is (or at least: used to be) a useful pass that isn't
enabled by default only because the underlying RD problem scales so badly.
My current collection of hacks doesn't bootstrap, doesn't even build
libgcc yet, but I plan to finish it for GCC 4.9. It's based on
identifying SEME regions using structural analysis, and DF's partial
CFG analysis (the latter is currently the problem).
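
The loop-size rule quoted above (skip a loop that is too big, look at the
smaller loops inside it) could be sketched roughly as follows. This only
illustrates the selection step on a made-up loop tree; it says nothing
about SEME regions or about how DF would analyze a partial CFG, and the
function name, the dict layout and the threshold are assumptions.

```python
def pick_regions(loop, max_insns):
    """Hypothetical helper: choose the loops to run the analysis on.

    loop: {"name": str, "insns": int, "children": [loop, ...]}, where
    "insns" counts the instructions of the loop including nested loops.
    A loop within the budget is taken whole; a loop over the budget is
    skipped and its children are considered instead.
    """
    if loop["insns"] <= max_insns:
        return [loop["name"]]
    picked = []
    for child in loop["children"]:
        picked += pick_regions(child, max_insns)
    return picked

# Made-up loop tree; "root" stands for the dummy loop around the function.
tree = {
    "name": "root", "insns": 10000, "children": [
        {"name": "A", "insns": 300, "children": []},
        {"name": "B", "insns": 5000, "children": [
            {"name": "B1", "insns": 200, "children": []},
            {"name": "B2", "insns": 900, "children": []},
        ]},
    ],
}
print(pick_regions(tree, 1000))
```

With a budget of 1000 instructions, the dummy loop and B are skipped and
the analysis would run on A, B1 and B2 only.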

FWIW: part of the problem for this particular test case is that there
are many registers with partial defs (vector registers) and the RD
problem doesn't (and probably cannot) keep track of one partial
def/use killing another partial def/use. This handling of vector regs
appears to be a general problem with much of the RTL infrastructure.
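
A toy model of why partial defs hurt, under the assumption described
above that only a def of the whole register kills earlier defs. Nothing
here is DF code: the function, the lane encoding and the 4-lane register
are invented for the example.

```python
def reaching_at_end(insns, nlanes, lane_aware):
    """Hypothetical model of straight-line code that only contains defs.

    insns: list of (reg, lanes); lanes is the set of lanes written.
    Returns {reg: sorted indices of defs still reaching the end}.
    With lane_aware=False a def kills earlier defs only when it writes
    every lane; with lane_aware=True a def dies once all of its lanes
    have been overwritten by later defs.
    """
    full = frozenset(range(nlanes))
    live = {}  # reg -> {insn index: lanes of that def still visible}
    for i, (reg, lanes) in enumerate(insns):
        lanes = frozenset(lanes)
        defs = live.setdefault(reg, {})
        if lanes == full:
            defs.clear()  # a full def kills everything before it
        elif lane_aware:
            for j in list(defs):
                remaining = defs[j] - lanes
                if remaining:
                    defs[j] = remaining
                else:
                    del defs[j]  # every lane of def j was overwritten
        # Otherwise a partial def kills nothing and the defs pile up.
        defs[i] = lanes
    return {reg: sorted(defs) for reg, defs in live.items()}

# One full def of v0, then lanes 0 and 1 are each written twice.
insns = [("v0", {0, 1, 2, 3}), ("v0", {0}), ("v0", {1}),
         ("v0", {0}), ("v0", {1})]
print(reaching_at_end(insns, 4, lane_aware=False))
print(reaching_at_end(insns, 4, lane_aware=True))
```

At register granularity all five defs reach the end; tracking lanes would
leave three (the full def for lanes 2-3, plus the last write to lane 0
and to lane 1). With hundreds of partial defs per register the gap grows
accordingly.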

Ciao!
Steven

