This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PR68432 00/26] Handle size/speed choices for internal functions
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: GCC Patches <gcc-patches at gcc dot gnu dot org>, richard dot sandiford at arm dot com
- Date: Wed, 25 Nov 2015 13:48:23 +0100
- Subject: Re: [PR68432 00/26] Handle size/speed choices for internal functions
- Authentication-results: sourceware.org; auth=none
- References: <874mgajmo9 dot fsf at e105548-lin dot cambridge dot arm dot com>
On Wed, Nov 25, 2015 at 1:20 PM, Richard Sandiford
<richard.sandiford@arm.com> wrote:
> This series fixes PR 68432, a regression caused by my internal-functions-
> for-optabs series. Some of the libm optabs in i386.md have a true HAVE_*
> condition but conditionally FAIL if we're optimising for size:
>
> if (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH
> && !flag_trapping_math)
> {
> if (TARGET_ROUND)
> emit_insn (gen_sse4_1_round<mode>2
> (operands[0], operands[1], GEN_INT (ROUND_MXCSR)));
> else if (optimize_insn_for_size_p ())
> FAIL;
> else
> ix86_expand_rint (operands[0], operands[1]);
> }
>
> This is going to cause problems if we want to make more decisions
> related to optabs at the gimple level.
>
> We've already had the same problem in rtl, where some patterns used
> to check optimize_insn_for_size_p in their C condition, and would then
> fail to match if an instruction were moved from a hot block to a cold
> block. Now rtl has two attributes, preferred_for_size and
> preferred_for_speed, that say whether a particular alternative of a
> particular instruction should be used when optimising for size or speed
> respectively. We try to honour a "false" value as much as possible,
> but it's not an absolute guarantee.
>
> The point of this series is to extend preferred_for_size and
> preferred_for_speed to define_expands, so that the attributes
> can be tested for optabs too.
So to not re-introduce the same position issue at GIMPLE (passes
querying optimize_bb_for_speed/size and with that querying
the optab for IFN support) when expanding an internal function
you ignore whether it actually "exists"? That is, IFN expansion
will skip the define_expand (which is maybe disabled)?
Otherwise cleaning this things up is nice. The question is whether
it really solves all the issues ;)
Richard.
> enabled, preferred_for_size and preferred_for_speed are supposed
> to be an inavariant property of a given (code, alternative) pair.
> They're not supposed to depend on a particular insn or its operands.
> However, the attribute functions still take an rtx_insn * as argument,
> so mistakes are only caught at runtime if we see a specific instruction
> for which the attributes conflict with the cached ones
> (recog.c:check_bool_attrs).
>
> Extending the attributes to define_expands means that we finally
> need to "fix" that and make the attributes take a (code, alternative)
> pair instead. Most ports were already structured to allow that.
> The two exceptions are ARM and NDS32.
>
> The problem for NDS32 is that "enabled" depends on "length", which
> needs access to an instruction in order to calculate branch lengths.
> This isn't a problem in practice because all instructions with
> operand-dependent lengths force "enabled" to 1. However,
> it's easier to enforce the restriction at compile time if we
> have an "is_16bit" attribute, to go along with the TARGET_16_BIT
> condition that controls whether 16-bit instructions can be used.
>
> The problem for ARM is that "enabled" depends on "type" and
> "use_literal_pool", both of which use match_operand tests in some cases.
> I think the "type" match_operands were actually OK for "enabled" in
> practice, but I think the "use_literal_pool" ones are a genuine bug.
> They are used when we have one alternative that accepts both memory and
> immediate operands. The alternative is supposed to be disabled for
> immediate operands when arm_disable_literal_pool is true, but not
> disabled for memory operands. However, the "enabled" cache only cares
> about the alternative number, so we can end up disabling memory operands
> if we cached based on an immediate operand, or allow immediate operands
> if we cached based on a memory operand. The series fixes that by
> splitting alternatives where necessary.
>
> Due to the define_subst patches, it's already syntactically possible to
> attach attributes to define_expands, but genattrtab.c currently ignores
> them. The series restricts these attributes to the new "code,alternative"
> style and then handles them in the same way as define_insn attributes.
>
> I realise this is rather invasive for stage 3, but I think it's
> worth fixing the bug "properly" rather than papering over it.
> The ARM "use_literal_pool" bug described above shows how easy
> it is for the enabled attribute to go wrong in subtle ways.
>
> The series is split into five parts:
>
> (1) Make the ARM and NDS32 changes described above
> (2) Add support for "code,alternative" attributes
> (3) Make all ports use them for enabled, preferred_for_size and
> preferred_for_speed
> (4) Use preferred_for_size and preferred_for_speed to decide whether
> to use internal functions
> (5) Convert the internal-function-related i386 patterns to use
> preferred_for_size instead of FAILing.
>
> (3) is a purely mechanical change. I think it counts as obvious if the
> other parts are OK.
>
> Tested by building GCC before and after the series on:
>
> aarch64-linux-gnueabi alpha-linux-gnu arc-elf arm-linux-gnueabi
> arm-linux-gnueabihf avr-rtems bfin-elf c6x-elf cr16-elf cris-elf
> epiphany-elf fr30-elf frv-linux-gnu ft32-elf h8300-elf
> ia64-linux-gnu iq2000-elf lm32-elf m32c-elf m32r-elf
> m68k-linux-gnu mcore-elf mep-elf microblaze-elf mips-linux-gnu
> mmix mn10300-elf moxie-elf msp430-elf nds32le-elf
> hppa64-hp-hpux11.23 nios2-linux-gnu nvptx-none pdp11
> powerpc-linux-gnu powerpc-eabispe rl78-elf rx-elf s390-linux-gnu
> sh-linux-gnu sparc-linux-gnu spu-elf tilegx-elf tilepro-elf
> xstormy16-elf v850-elf vax-netbsdelf visium-elf xtensa-elf
> x86_64-darwin
>
> and comparing the assembly output for gcc.dg, g++.dg and gcc.c-torture
> at -O2. There were no differences besides the usual timestamps.
>
> Also tested on x86_64-linux-gnu and arm-linux-gnueabihf. I will test
> on powerpc64-linux-gnu as well before committing. OK to install?
>
> Thanks,
> Richard
>