[PATCH 1/2] Add new RTX instruction class FILLER_INSN

Richard Sandiford richard.sandiford@arm.com
Wed Aug 19 10:52:01 GMT 2020

Andrea Corallo <andrea.corallo@arm.com> writes:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
>> Hi Andrea,
>> On Wed, Jul 22, 2020 at 12:02:33PM +0200, Andrea Corallo wrote:
>>> This first patch implements the addition of a new RTX instruction class
>>> FILLER_INSN, which has been whitelisted to allow placement of NOPs
>>> outside of a basic block.  This is to allow padding after unconditional
>>> branches.  This is favorable so that any performance gained from
>>> diluting branches is not paid straight back via excessive eating of
>>> nops.
>>> It was deemed that a new RTX class was less invasive than modifying
>>> behavior with regard to standard UNSPEC nops.
>> So I wonder if this cannot be done with some kind of NOTE, instead?
> Hi Segher,
> I was having a look into reworking this using an insn note as (IIUC)
> suggested.  The idea is appealing but looking into insn-notes.def I've
> found the following comment:
> "We are slowly removing the concept of insn-chain notes from the
> compiler.  Adding new codes to this file is STRONGLY DISCOURAGED.
> If you think you need one, look for other ways to express what you
> mean, such as register notes or bits in the basic-block structure."
> Would it still be justified in this case to proceed this way?  The other
> option would be to add the information into the basic-block or into
> struct rtx_jump_insn.
> My GCC experience is far from sufficient for me to have a formed opinion
> on this, but I'd probably bet on struct rtx_jump_insn as the better option.

Adding it to the basic block structure wouldn't work because we need
this information to survive until asm output time, and the cfg doesn't
last that long.  (Would be nice if it did, but that's a whole new can
of worms.)

Using REG_NOTES on the jump might be OK.  I guess the note value could
be the length in bytes.  shorten_branches would then need to look for
these notes and add the associated length after adding the length of
the insn itself.  There would then need to be some hook that final.c
can call to emit nops of the given length.
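
As a rough illustration of that bookkeeping, here is a toy model in plain
C.  It is not GCC code: struct toy_insn, toy_compute_addresses and
toy_emit_nops are all made-up names, and the fixed-size nop is an
assumption.  It only shows the padding being added after the insn's own
length, and a hook-like function emitting nops of the requested length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for an insn: its own length plus an optional padding
   request, playing the role of the hypothetical length-in-bytes note.  */
struct toy_insn
{
  int length;     /* Length of the insn itself, in bytes.  */
  int pad_bytes;  /* Bytes of nops requested after it; 0 means no note.  */
};

/* Record the start address of each insn in ADDRS and return the total
   size.  The padding is added after the length of the insn itself.  */
static int
toy_compute_addresses (const struct toy_insn *insns, size_t n, int *addrs)
{
  int addr = 0;
  for (size_t i = 0; i < n; i++)
    {
      assert (insns[i].length >= 0 && insns[i].pad_bytes >= 0);
      addrs[i] = addr;
      addr += insns[i].length + insns[i].pad_bytes;
    }
  return addr;
}

/* Stand-in for the hook that output code would call: write nops
   covering PAD_BYTES, assuming a fixed NOP_SIZE-byte nop.  Returns the
   number of nops written.  */
static int
toy_emit_nops (FILE *out, int pad_bytes, int nop_size)
{
  assert (nop_size > 0 && pad_bytes % nop_size == 0);
  int count = pad_bytes / nop_size;
  for (int i = 0; i < count; i++)
    fputs ("\tnop\n", out);
  return count;
}
```

With three 4-byte insns where the middle one (the jump) asks for 8 bytes
of padding, the third insn lands at address 16 rather than 8.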

I guess there's also the option of representing this in the same way
as a delayed branch sequence, which is to make the jump insn pattern:

  (sequence [(normal jump insn)
             (delayed insn 1)
             ...])

The members of the sequence are full insns, rather than just patterns.
For this use case, the delayed insns would all be nops.
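
To make the shape concrete, here is a toy model in plain C.  Again this
is not GCC code: struct toy_member, struct toy_sequence and the helper
functions are made-up names.  It assumes element 0 is the jump and every
later element is a padding nop, so the padding is simply whatever
follows the jump.

```c
#include <assert.h>
#include <stddef.h>

enum toy_kind { TOY_JUMP, TOY_NOP };

/* Toy member of a sequence: a full insn with its own kind and length.  */
struct toy_member
{
  enum toy_kind kind;
  int length;     /* Length in bytes.  */
};

/* Toy sequence: element 0 is the jump, the rest are the padding nops.  */
struct toy_sequence
{
  const struct toy_member *members;
  size_t count;
};

/* Total length of the sequence: the jump plus all trailing nops.  */
static int
toy_sequence_length (const struct toy_sequence *seq)
{
  assert (seq->count >= 1 && seq->members[0].kind == TOY_JUMP);
  int total = 0;
  for (size_t i = 0; i < seq->count; i++)
    total += seq->members[i].length;
  return total;
}

/* Number of padding bytes: everything after the jump.  */
static int
toy_sequence_padding (const struct toy_sequence *seq)
{
  return toy_sequence_length (seq) - seq->members[0].length;
}
```

The point of the sketch is that the padding needs no separate length
field here; it falls out of the lengths of the member insns.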

However, not much is prepared to handle the sequence representation
before the normal pass_machine_reorg position.  (The main dbr pass
itself is pass_delay_slots, but some targets run dbr within
pass_machine_reorg instead.)  There again, it isn't worth doing
layout optimisations earlier than pass_machine_reorg anyway.

