This is the mail archive of the gcc-patches
mailing list for the GCC project.
Re: [RS6000] Fix PR61098, Poor code setting count register
- From: Alan Modra <amodra at gmail dot com>
- To: David Edelsohn <dje dot gcc at gmail dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 14 May 2014 19:26:29 +0930
- Subject: Re: [RS6000] Fix PR61098, Poor code setting count register
- Authentication-results: sourceware.org; auth=none
- References: <20140508014846 dot GA5162 at bubble dot grove dot modra dot org> <CAGWvnyme6a+QLSxXwzUOxWAHWTokxZCySua_+25hUzaEVYvgPA at mail dot gmail dot com> <20140509024054 dot GE5162 at bubble dot grove dot modra dot org> <CAGWvny=3+XuqiWtxhB4HSXDWHRZMpNkhioTVeEpEs9C4vMWRGg at mail dot gmail dot com> <20140514030448 dot GJ5162 at bubble dot grove dot modra dot org> <CAGWvnymJhh1iR4wQW7GOBkAd=C6RH9eHMOJe8JimNrBsN3MOHg at mail dot gmail dot com>
On Tue, May 13, 2014 at 11:46:20PM -0400, David Edelsohn wrote:
> Danny may have re-organized the code, but I thought that it originally
> came from Tom Rix, if not earlier.
OK, I'm not trying to apportion blame. My name is on plenty of dodgy
code in the rs6000 backend too. :)
> I seem to remember problems in the past with late creation of TOC
> entries for constants causing problems, so it was easier to fall back
> to materializing all integer constants inline. I don't remember the
> PRs, but I think there were issues with creating a TOC if the late
> constant were the only TOC reference, or maybe the issue was buying a
> stack frame to materialize the TOC/GOT for a late constant. And a
> maximum 5-instruction sequence is not really bad relative to a load
> from the TOC (even with medium model / data in TOC). There are a lot
> of trade-offs with respect to I$ expansion versus the load hitting in
> the L1 D$.
Sure, but Steve will tell you that the 5 instruction sequence is both
slower due to all the dependent ops, and results in larger code+data
size. We definitely want to avoid it if possible, and pr61098 shows a
case taken from glibc math library code where there should be no
problem in using the TOC.
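
For readers unfamiliar with the sequence under discussion, here is a
minimal sketch in C that emulates the naive 64-bit build (lis, ori,
sldi, oris, ori). It is not code from the rs6000 backend, and the
function name build_const_inline is hypothetical. Each step consumes
the result of the previous one, which is the dependency chain referred
to above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emulate the naive 5-instruction build of a 64-bit constant on a
   64-bit PowerPC.  Hypothetical helper for illustration only.  */
static uint64_t
build_const_inline (uint64_t c, int *insn_count)
{
  uint64_t c3 = (c >> 48) & 0xffff;   /* bits 48..63 */
  uint64_t c2 = (c >> 32) & 0xffff;   /* bits 32..47 */
  uint64_t c1 = (c >> 16) & 0xffff;   /* bits 16..31 */
  uint64_t c0 = c & 0xffff;           /* bits 0..15 */
  uint64_t r;

  assert (insn_count != NULL);

  /* lis r,c3: load c3 << 16, sign-extended to 64 bits.  */
  r = c3 << 16;
  if (c3 & 0x8000)
    r |= 0xffffffff00000000ULL;

  r |= c2;          /* ori  r,r,c2 (depends on lis) */
  r <<= 32;         /* sldi r,r,32 (depends on ori); drops the sign bits */
  r |= c1 << 16;    /* oris r,r,c1 (depends on sldi) */
  r |= c0;          /* ori  r,r,c0 (depends on oris) */

  *insn_count = 5;
  return r;
}
```

Calling build_const_inline on any 64-bit value returns that value with
an instruction count of 5. A TOC load would replace the whole chain
with a single load, at the cost of a data reference.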
> Alpha emit_set_const does limit the number of instructions, but that
> is a search for a more efficient sequence than the naive sequence. The
> Alpha splitters use alpha_split_const_mov(), which tries
> alpha_emit_set_const() for an efficient sequence and then falls back
> to alpha_emit_set_long_const() for a naive sequence. Alpha uses PLUS
> instead of IOR because of the way its ISA works.
> alpha_emit_set_long_const() always will materialize the constant and
> does not check for a maximum number of instructions. This is why its
> comment says "fall back to straight forward decomposition".
> However, alpha_emit_set_long_const() and alpha_split_const_mov() can
> fail, presumably because emit_move_insn() fails, not because of
> reaching a maximum number of instructions.
> alpha_legitimate_constant_p() rejects expensive constants early. Once
> the splitter is invoked, it always tries to materialize the constant,
> but the splitter apparently can fail for other reasons.
No, that is wrong. alpha_emit_set_const does *not* always try to
materialize the constant inline. It does so for constants that need
more than three instructions only when TARGET_BUILD_CONSTANTS.
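
The policy just described can be sketched as a toy model in C. This is
not the alpha backend source: toy_insn_cost, toy_split_const and the
cost rule (one instruction per non-zero 16-bit chunk) are assumptions
made up for illustration. Only the shape of the decision is the point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cost model: one insn per non-zero 16-bit chunk, and at
   least one insn overall.  Not the real Alpha or rs6000 cost.  */
static int
toy_insn_cost (uint64_t c)
{
  int n = 0;
  for (int i = 0; i < 4; i++)
    if ((c >> (16 * i)) & 0xffff)
      n++;
  return n ? n : 1;
}

enum split_result { SPLIT_INLINE, SPLIT_FAIL };

/* Accept a short sequence.  Accept a longer one only when
   build_constants is set (standing in for TARGET_BUILD_CONSTANTS).
   Otherwise fail, so the constant is loaded from memory instead.  */
static enum split_result
toy_split_const (uint64_t c, int max_insns, bool build_constants)
{
  assert (max_insns > 0);
  if (toy_insn_cost (c) <= max_insns)
    return SPLIT_INLINE;
  if (build_constants)
    return SPLIT_INLINE;
  return SPLIT_FAIL;
}
```

With max_insns set to 3, a constant such as 0x1234 is built inline,
while one with four non-zero chunks is built inline only when
build_constants is true.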
> I don't mind exploring the benefits of tightening up
> rs6000_legitimate_const(), but I'm not sure it's an obvious win,
> especially with the potential failure corner cases.
Yes, those potential corner cases have me worried too.
> However, I want to have a better understanding about the part of the
> patch that removes the FAIL path from the splitters.
That part was really quite simple. I was removing dead code.
rs6000_emit_set_const has never returned anything but DEST, right from
the initial commit. It can't be called with DEST == NULL, so
"dest = gen_reg_rtx (mode)" is also dead code.
However, I retracted that patch because I now think
rs6000_emit_set_const should in fact sometimes result in the splitter
failing, exactly as is done in the alpha port.
Australia Development Lab, IBM