This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [patch RFC] SH: PR target/21623


Kaz Kojima wrote:

Hi,

PR target/21623 is a 4.0/4.1 regression that I could reproduce
only with -O. The compiler complains about a move from the T register
to an FP register:

foo2.i:78: error: insn does not satisfy its constraints:
(insn 190 188 275 24 (set (reg/v:SI 76 fr12 [orig:169 n ] [169])
(reg:SI 147 t)) 129 {movsi_ie} (insn_list:REG_DEP_TRUE 188 (nil))
(nil))


I find it worrying that we even consider using a floating point register there. Why?
I have trouble checking out a functional unified tree with 4.0; do you know a
vintage for a source-compatible checkout of bfd/binutils/sim/newlib?


I can't find any reload information for this insn in the .greg
dump and it seems that the appended patch makes the reload happen
on this insn. I'm not sure if this is in the right direction and,
even if so, whether 9 is the appropriate cost value.


:REVIEWMAIL:

When you changed the cost, you probably perturbed the register allocation
so that the reload was not needed in the first place. Still, that is no
guarantee that the situation can never arise. For moves that cannot be
done directly, we need suitable clauses in
SECONDARY_INOUT_RELOAD_CLASS. IIRC there is also some
code that takes the secondary (and tertiary?) reloads into account to increase
the register move cost.
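
To illustrate what I mean by secondary and tertiary reloads, here is a toy model in C. It is not the code from sh.h or reload: the class names (with the _M suffix), secondary_class and move_count are all made up for this sketch, and the T -> general -> FPUL -> FP path (movt, lds, fsts) is my assumption of what the indirect route would look like.

```c
/* Toy model only; these are NOT the reg_class values from GCC's sh.h. */
enum reg_class_model {
    NO_REGS_M, T_REGS_M, GENERAL_REGS_M, FPUL_REGS_M, FP_REGS_M
};

/* Return the intermediate class needed to copy src into dst, or
   NO_REGS_M when the move is assumed to be a single instruction.
   The clauses below are illustrative assumptions.  */
static enum reg_class_model
secondary_class(enum reg_class_model dst, enum reg_class_model src)
{
    if (src == T_REGS_M && dst == FP_REGS_M)
        return FPUL_REGS_M;      /* assumed: go through FPUL (fsts) */
    if (src == T_REGS_M && dst == FPUL_REGS_M)
        return GENERAL_REGS_M;   /* assumed: movt, then lds */
    return NO_REGS_M;            /* everything else: direct */
}

/* Count the moves needed once every intermediate class is expanded.
   A cost function could scale its result by this count.  */
static int
move_count(enum reg_class_model dst, enum reg_class_model src)
{
    enum reg_class_model mid = secondary_class(dst, src);

    if (mid == NO_REGS_M)
        return 1;
    return move_count(mid, src) + move_count(dst, mid);
}
```

In this model move_count(FP_REGS_M, T_REGS_M) expands to three moves, which is the kind of information a move cost would have to reflect if the copy were done indirectly.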


However, we can actually do this move directly - more or less. It needs three
instructions, but no intermediate registers are required, and the scheduling
can also be done quite well.


For SH2e and SH3e, the following sequence allows execution in two or three cycles:
	bf/s	0f
	fldi0	fr12
	fldi1	fr12
0:


For SH4, the following is better because the branch can be dual-issued with the fldi1:
	fldi0	fr12
	bf	0f
	fldi1	fr12
0:


For this sequence, it also makes sense to split off the initial fldi0 to give more
scheduling freedom. I.e. the SH2e / SH3e sequence could be used in the move
pattern, and an SH4 splitter can then separate the fldi0 from the conditional fldi1.
(See one_cmpldi2+1 and cneg for an existing example of a fake conditional execution
pattern in sh.md)
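
As a sanity check that the two orderings are equivalent, here is a small C model of both sequences. It is a sketch under stated assumptions, not GCC or simulator code: fr12 is modelled as a float, fldi0 / fldi1 as stores of 0.0f / 1.0f, T as a bool, and the function names are made up for the illustration.

```c
#include <stdbool.h>

/* Toy model: fr12 is a float, T is a bool.  Function names are
   hypothetical and exist only for this illustration.  */

/* SH2e / SH3e ordering: bf/s with fldi0 in the delay slot.  */
static float
t_to_fr_sh2e(bool t)
{
    float fr12;
    bool branch_taken = !t;   /* bf/s branches when T is clear */

    fr12 = 0.0f;              /* delay slot: fldi0 fr12, always executed */
    if (!branch_taken)
        fr12 = 1.0f;          /* fldi1 fr12, skipped when the branch is taken */
    return fr12;
}

/* SH4 ordering: fldi0 first, then bf over the fldi1.  */
static float
t_to_fr_sh4(bool t)
{
    float fr12 = 0.0f;        /* fldi0 fr12 */

    if (t)                    /* bf 0f falls through only when T is set */
        fr12 = 1.0f;          /* fldi1 fr12 */
    return fr12;
}
```

Both functions return the same value for either state of T; the difference between the sequences is only in where the unconditional fldi0 sits relative to the branch, which is what the splitter would exploit.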


