This is the mail archive of the
mailing list for the GCC project.
Re: [RFC] Fix PR rtl-optimization/27616
- From: Roger Sayle <roger at eyesopen dot com>
- To: Eric Botcazou <ebotcazou at libertysurf dot fr>, Ian Lance Taylor <iant at google dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Tue, 22 Aug 2006 08:36:52 -0600 (MDT)
- Subject: Re: [RFC] Fix PR rtl-optimization/27616
Hi Eric and Ian,
On Fri, 11 Aug 2006, Eric Botcazou wrote:
> > While this is obviously a very unusual test case, I think that you
> > have set the cap much too high. I can't imagine a case which would
> > successfully simplify after even four recursive calls to fold_rtx_mem.
> > I think if you set the recursion limit to, say, ten, you will catch
> > anything that could possibly simplify, and that is likely to be quite
> > a bit fewer than the number of entries in the hash table.
> Since we are really talking about corner cases, the idea was to be quite
> conservative. However if the consensus is that a small fixed cap is good
> enough, fine with me.
> > I would also be interested in hearing what Roger Sayle thinks when he
> > gets back from vacation, but don't let that stop you from putting the
> > patch in in the meantime.
> OK, let's wait for Roger's input.
Interesting. I agree with both positions. Firstly, many thanks to
Eric for coming up with a solution to this P1. When I'd briefly looked
into the problem, I'd been stumped by the lack of a first order kludge.
Clearly, the correct solution is the proposed iteration cap.
I see no problem going with Eric's proposed deep limit of table_size+3,
which as explained in his comment is arguably the deepest recursion that
could usefully be simplified by cse.c. However, if I was writing the
code myself I'd probably take the pragmatic approach of going with Ian's
suggestion of using an arbitrary constant limit, perhaps 10. [I might
even be curious enough to instrument the code to discover its maximal
(and maximal useful) value during a GCC bootstrap]. It's a close call
and in my opinion at the discretion of the patch author.
No matter which solution is used we know from the age of this design
flaw that problematic cases are rare. For me, the trade-off is between
the remote possibility of missing a deep simplification opportunity vs.
requiring the overhead of tracking table_size in common case code,
and the potential of quadratic behaviour in CSE (if it were not for
PARAM_MAX_CSE_INSNS). Often not all of the table_size values in the
hash_table are relevant, and with the default max-cse-insns of 1000,
allowing the recursion to go several thousand deep does seem like
overkill. Whilst usually such "magic numbers" are frowned
upon, the precedent in fold-const.c:extract_muldiv is to go with the
much simpler, though imperfect solution. If anyone cared enough, they
could make it a param. Ahh, the classic science vs. engineering
trade-off; Never a universally correct answer. My opinion in such
50-50 cases, is that person actually doing the hard work gets to be
I'm happy with either variant going into mainline, and backported to
the 4.1 branch after a few days without problems.
Thanks again for fixing this. I was concerned that a fix for this
problem would either be more invasive, or disable more optimizations.