[RFA] The Integrated Register Allocator

Wed Apr 2 22:38:00 GMT 2008

Steven Bosscher wrote:
>
>
> Some more specific comments that came up while browsing through your patch:
>
> * IRA appears to have its own implementation of the LIVE problem,
> instead of using DF. I think it's a shame to return to multiple
> implementations of elementary dataflow problems like liveness.
>
>   
I guess i missed this in my quick reading.    this is not 
acceptable.     I understand the reasons why you were scanning the insn 
yourself in some places, but i missed the part where you were doing your 
own live.  

> * Likewise for the other dataflow solvers in your patch.  E.g. why not
> write calculate_save_in_out as a df_problem?
>   I bet IRA will show compile time problems for some "complicated CFG"
> test cases, like Brad Lucier's infamous Scheme interpreter, or Ada
> code with lots of SJLJ exception handling code (i.e. the kind of code
> that made us implement different worklist based solvers for DF to
> handle "bad" CFGs).
>
>   
I agree.

> * You have dropped local-alloc, bravo for that!  But what about cases
> where REG_NO_CONFLICT is still necessary to get good code?  IRA
> doesn't seem to have anything to replace this feature.
>
> * You use live ranges for allocation, but finding them is effectively
> duplication of the the work that web.c does, IIUC (do I?).  Any reason
> to not just run web.c before the register allocator, or base live
> range splitting on the web.c code?  Note you can hook into the web.c
> code (like see.c does) if you need something special done for each
> live range.
>
> * IRA has so many allocation algorithms.  One of the points of
> criticism on new-ra always was that it implemented too many register
> allocation algorithms without implementing one really well.  Is it the
> plan to settle on one at some point, or are all going to stay?  It
> seems to me that having many algorithms with one as the default is
> going to lead to bit-rot in the other ones (basically the same
> situation as keeping the existing RA after IRA is committed).  I'd
> much rather see GCC settle on one algorithm and make sure it works
> well.
>
> * The hunks that introduce get_call_invalidated_regs() calls are
> probably for the IPA RA bits?  It'd be nice to see this as a separate
> patch, if you go look for points to split the IRA patch into
> manageable hunks...
>
> * What is the status of df_invalidated_by_call after your patch?
> You've replaced it in one place (df-scan.c) with
> get_call_invalidated_regs() but it's not clear to me what you do with
> the other uses of it.  And, instead of changing df-scan.c in the place
> you did, can't you instead update the way df_invalidated_by_call is
> computed?
>
> * The print-rtl.c bits, what are they for?  This isn't clear to me,
> from the patch...
>
> * There is only one timevar for all of IRA.  I think it would be good
> to have different timevars for different phases of the allocator.
>  
> * Do you know where the hot spots are for IRA?  Functions like
> tune_allocno_costs_and_cover_classes()
> (O(n_allocnos*n_hardregs*n_calls)) and
> setup_allocno_cover_class_and_costs() (O(n_allocnos*n_hardregs)) look
> like they might be expensive, but there are no other obvious places.
> It'd be nice to know where to look if someone wants to attack that
> 3-10% extra compile time.
>
> * In struct allocno_live_range, you have the ints start and finish.
> What are they?  The comment "Program point range" isn't very helpful.
> Are these ints INSN_UIDs?  Something else that's internal for IRA??
>
> * The ira-max-loops-num default value is 50 in params.def, not 20 as
> described in invoke.texi (but kuddos for not forgetting about user
> documentation!).
>
> * the estimate_reg_pressure_cost change looks like a bad idea to me:
> 1.The function no longer does what it claims to do in the pre-function comment
> 2. It seems to me that you shouldn't change the invariant loop code
> motion heuristics to be dependent on whether IRA runs or not.
> 3. Your change basically tells the LICM algorithms to move all they
> can, regardless of register pressure.  Surely this can lead to levels
> of register pressure that even IRA can't resolve without excessive
> spilling?
>
> * The ira-max-loops-num parameter is yet another parameter to tell GCC
> to favor compile time over good code generation for very large
> functions.  This is unfortunate IMHO because it conflicts with
> inlining: When GCC inlines a lot in order to generate good code, it
> forces itself to fall back to worse algorithms for code generation
> further down in the passes pipe line. With LTO (and thus probably more
> inlining) this is only going to get worse. Is there any way to avoid
> this silly conflict?
>
> Gr.
> Steven
>