This is the mail archive of the gcc-patches@gcc.gnu.org mailing list
for the GCC project.
Re: RFA: reload infrastructure to fix PR target/21623
Ian Lance Taylor wrote:
[What do you gain making targetm const?]
The possibility of faster code when compiling with -combine, as it
becomes possible to inline the target functions. I admit this is a
long-term possibility.
But then, inlining a dispatch via another function pointer would not be
faster than not using the extra function pointer in the first place. And
it won't work when you intend to switch targetm's. So what is needed to
get the best of both worlds is a way to get a target-specific gcc_target
definition in which every field that actually is constant is declared
const. I suppose a port could list all the fields that are being written
to, and a gen* program can then generate a custom gcc_target definition.
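As a rough illustration of the "const where it actually is" idea, here
is a self-contained toy in C. Nothing in it is real GCC code: struct
toy_target, its two hooks, and toy_override_options are hypothetical
stand-ins for what a gen* program might emit for a port that writes to
only one field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-reduced stand-in for struct gcc_target.  */
struct toy_target
{
  /* This port never writes to this hook, so the generated definition
     declares it const; calls through it can then be folded/inlined.  */
  int (*const address_cost) (int addr);
  /* This port listed this field as "written to", so it stays mutable.  */
  int (*rtx_cost) (int code);
};

static int toy_address_cost (int addr) { return addr > 255 ? 2 : 1; }
static int toy_rtx_cost (int code) { return code * 4; }
static int toy_cheap_rtx_cost (int code) { (void) code; return 1; }

/* What the generated, port-specific definition might look like.  */
struct toy_target toy_targetm = { toy_address_cost, toy_rtx_cost };

/* A caller that dispatches through both kinds of field.  */
int
toy_total_cost (int addr, int code)
{
  assert (toy_targetm.rtx_cost != NULL);
  return toy_targetm.address_cost (addr) + toy_targetm.rtx_cost (code);
}

/* Only the non-const field may be reassigned at run time; assigning to
   toy_targetm.address_cost here would be rejected by the compiler.  */
void
toy_override_options (void)
{
  toy_targetm.rtx_cost = toy_cheap_rtx_cost;
}
```

The point of the split is that the compiler can see the const hook's
target at compile time, while the port keeps the freedom to change the
one field it said it writes to.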
Or, even simpler, have default_emit_secondary_reload insist on having
a single scratch [sic (1)] register, and emit instructions to copy from FROM to
the scratch register, and then from the scratch register to TO.
No, that is completely beside the point. You can use the default function
only when there is no secondary reload that needs to be handled in a
non-default way.
I'm not sure how that differs from what I said.
I don't think there are many ports - if any - that don't need reload
patterns. Having a simple default is no use if you can't use it.
You could call this default function from your hook, but you'd have to
decide first if you should do that. That means duplicating decision
logic from the time when you select the reload classes.
Indeed, it could make sense not to have a target hook for the insn
emitting at all, but have the regclass calculating code set a function
pointer for that. We could have more than one, so that separate steps can
be described with separate functions - which might even be gen_* functions
generated from the md file. This might be interwoven with the dependency
information. E.g. we could specify that the value in a particular scratch
register is to be calculated by using the addsi_mark3 expander with two
specified inputs. The calculation of the scratch register is then
dependent on these two inputs.
Using separate patterns for separate steps can not only make it easier
to break up the task and reuse already existing expanders, it is also
necessary in order to elide steps that become unnecessary because of
reload inheritance.
So, each of these expansions has a function pointer to call, a number of
values that it sets, and likely / possibly some inputs and/or internal
scratch registers. A possible refinement here is also that where
convenient, information about the number and kind of operands can be
specified with an instruction name, and translated into the appropriate
assignments with some gen* magic.
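To make the shape of such a per-step description concrete, here is a
small self-contained C sketch. It is a toy model, not GCC code: struct
toy_reload_step, the integer "value ids", and the availability array
standing in for reload inheritance are all assumptions made for the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_OPERANDS 4

/* Hypothetical emitter type; in the proposal this might be a gen_*
   function generated from the md file.  */
typedef void (*toy_emit_fn) (const int *operands);

/* One expansion step: a function to call plus operand counts.
   OPERANDS holds value ids: first the values set, then the inputs,
   then any internal scratch registers.  */
struct toy_reload_step
{
  toy_emit_fn emit;
  int n_sets;
  int n_inputs;
  int n_scratch;
  int operands[TOY_MAX_OPERANDS];
};

/* Emitter that does nothing; enough for the toy.  */
void toy_emit_nop (const int *operands) { (void) operands; }

/* Run STEPS in order.  A step is elided when every value it sets is
   already marked in AVAILABLE (a stand-in for reload inheritance).
   Returns the number of steps actually emitted.  */
int
toy_run_steps (const struct toy_reload_step *steps, size_t n_steps,
               bool *available)
{
  int emitted = 0;
  for (size_t i = 0; i < n_steps; i++)
    {
      const struct toy_reload_step *s = &steps[i];
      assert (s->n_sets + s->n_inputs + s->n_scratch <= TOY_MAX_OPERANDS);

      bool needed = false;
      for (int j = 0; j < s->n_sets; j++)
        if (!available[s->operands[j]])
          needed = true;
      if (!needed)
        continue;               /* All results inherited: skip the step.  */

      s->emit (s->operands);
      for (int j = 0; j < s->n_sets; j++)
        available[s->operands[j]] = true;
      emitted++;
    }
  return emitted;
}
```

With two steps, where the first computes a scratch value from two inputs
and the second uses that scratch value, marking the scratch value as
already available makes the first step drop out while the second is
still emitted.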
I think that what you are talking about at this point amounts to
writing the instructions which do the reload.
Not quite. There is still the possibility to use an expander that does
something clever depending on the exact register assignments. And we
don't need lots of temporary rtl that we might throw away again at the
start of the next find_reloads iteration.
Perhaps it would make
sense to produce a sequence of insns, plus a list of required
registers. For each required register we would point to the insn
which set it and the insn(s) which used it (plus the location(s)
within each insn where it is used).
While it makes sense to have a known good order, it should be understood
that some reloads can be interchanged in their order to optimize
register usage and / or reload inheritance.
We could explicitly represent the
value that the register will have, for reload inheritance.
That would be a REG_EQUAL note if the insn has a single_set?
The
dependencies are described by the insn list. The final insn in the
sequence would set the register being reloaded, or the memory location
to which the register is being stored.
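The bookkeeping suggested here (a sequence of insns, plus for each
required register the insn that sets it and the insns and operand
positions that use it) could be modelled roughly as below. This is a
self-contained C toy, not GCC's rtl: insns are just indices into the
sequence, and every name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_USES 4

/* One use of a required register: which insn in the sequence, and
   which operand position inside that insn.  */
struct toy_use
{
  int insn;
  int operand;
};

/* A register the reload sequence needs: the insn that sets it and
   the places where it is used.  */
struct toy_required_reg
{
  int set_insn;
  int n_uses;
  struct toy_use uses[TOY_MAX_USES];
};

/* Check that the ordering of the insn list respects the dependencies:
   each required register is set inside the sequence, and every use
   comes strictly after the setting insn.  */
bool
toy_sequence_ok (const struct toy_required_reg *regs, size_t n_regs,
                 int n_insns)
{
  for (size_t i = 0; i < n_regs; i++)
    {
      const struct toy_required_reg *r = &regs[i];
      assert (r->n_uses >= 0 && r->n_uses <= TOY_MAX_USES);

      if (r->set_insn < 0 || r->set_insn >= n_insns)
        return false;
      for (int j = 0; j < r->n_uses; j++)
        if (r->uses[j].insn <= r->set_insn || r->uses[j].insn >= n_insns)
          return false;
    }
  return true;
}
```

A check like this only validates one fixed order; it says nothing about
which alternative orders would also be valid.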
The possible drawback I see here is that you might want to emit
different insns depending on the particular register that reload
chooses. That could be handled in principle by using insn
alternatives. I'm not sure if it would be otherwise problematical.
A last resort would be to write the insn in some opaque form (maybe even
with an unspec) and take it apart with a post-reload splitter.
I think the performance issues from having so much temporary rtl would
be the worst impact.
Although I have to admit it would be kind of neat to have the reloads
represented as rtl with pseudos and then run cse over them (with measures in
place to keep register pressure bounded, of course).