This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFA] The Integrated Register Allocator


Steven Bosscher wrote:


Some more specific comments that came up while browsing through your patch:

* IRA appears to have its own implementation of the LIVE problem,
instead of using DF. I think it's a shame to return to multiple
implementations of elementary dataflow problems like liveness.

I guess i missed this in my quick reading. this is not acceptable. I understand the reasons why you were scanning the insn yourself in some places, but i missed the part where you were doing your own live.

* Likewise for the other dataflow solvers in your patch.  E.g. why not
write calculate_save_in_out as a df_problem?
  I bet IRA will show compile time problems for some "complicated CFG"
test cases, like Brad Lucier's infamous Scheme interpreter, or Ada
code with lots of SJLJ exception handling code (i.e. the kind of code
that made us implement different worklist based solvers for DF to
handle "bad" CFGs).

I agree.

* You have dropped local-alloc, bravo for that!  But what about cases
where REG_NO_CONFLICT is still necessary to get good code?  IRA
doesn't seem to have anything to replace this feature.

* You use live ranges for allocation, but finding them is effectively
duplication of the the work that web.c does, IIUC (do I?).  Any reason
to not just run web.c before the register allocator, or base live
range splitting on the web.c code?  Note you can hook into the web.c
code (like see.c does) if you need something special done for each
live range.

* IRA has so many allocation algorithms.  One of the points of
criticism on new-ra always was that it implemented too many register
allocation algorithms without implementing one really well.  Is it the
plan to settle on one at some point, or are all going to stay?  It
seems to me that having many algorithms with one as the default is
going to lead to bit-rot in the other ones (basically the same
situation as keeping the existing RA after IRA is committed).  I'd
much rather see GCC settle on one algorithm and make sure it works
well.

* The hunks that introduce get_call_invalidated_regs() calls are
probably for the IPA RA bits?  It'd be nice to see this as a separate
patch, if you go look for points to split the IRA patch into
manageable hunks...

* What is the status of df_invalidated_by_call after your patch?
You've replaced it in one place (df-scan.c) with
get_call_invalidated_regs() but it's not clear to me what you do with
the other uses of it.  And, instead of changing df-scan.c in the place
you did, can't you instead update the way df_invalidated_by_call is
computed?

* The print-rtl.c bits, what are they for?  This isn't clear to me,
from the patch...

* There is only one timevar for all of IRA. I think it would be good
to have different timevars for different phases of the allocator.
* Do you know where the hot spots are for IRA? Functions like
tune_allocno_costs_and_cover_classes()
(O(n_allocnos*n_hardregs*n_calls)) and
setup_allocno_cover_class_and_costs() (O(n_allocnos*n_hardregs)) look
like they might be expensive, but there are no other obvious places.
It'd be nice to know where to look if someone wants to attack that
3-10% extra compile time.


* In struct allocno_live_range, you have the ints start and finish.
What are they?  The comment "Program point range" isn't very helpful.
Are these ints INSN_UIDs?  Something else that's internal for IRA??

* The ira-max-loops-num default value is 50 in params.def, not 20 as
described in invoke.texi (but kuddos for not forgetting about user
documentation!).

* the estimate_reg_pressure_cost change looks like a bad idea to me:
1.The function no longer does what it claims to do in the pre-function comment
2. It seems to me that you shouldn't change the invariant loop code
motion heuristics to be dependent on whether IRA runs or not.
3. Your change basically tells the LICM algorithms to move all they
can, regardless of register pressure.  Surely this can lead to levels
of register pressure that even IRA can't resolve without excessive
spilling?

* The ira-max-loops-num parameter is yet another parameter to tell GCC
to favor compile time over good code generation for very large
functions.  This is unfortunate IMHO because it conflicts with
inlining: When GCC inlines a lot in order to generate good code, it
forces itself to fall back to worse algorithms for code generation
further down in the passes pipe line. With LTO (and thus probably more
inlining) this is only going to get worse. Is there any way to avoid
this silly conflict?

Gr.
Steven


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]