This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH, MIPS] 74k madd scheduler tweaks
- From: Richard Sandiford <rdsandiford at googlemail dot com>
- To: Sandra Loosemore <sandra at codesourcery dot com>
- Cc: <gcc-patches at gcc dot gnu dot org>
- Date: Sat, 04 Aug 2012 14:48:06 +0100
- Subject: Re: [PATCH, MIPS] 74k madd scheduler tweaks
- References: <5019A869.firstname.lastname@example.org>
Sandra Loosemore <email@example.com> writes:
> The existing scheduler bypass information for madd on the 74k uses some
> bits copied from the 24k, and is not quite correct. This patch is based
> on one originally sent to us by MIPS and has been present in our local
> source base for years. I've confirmed that we are legally allowed to
> contribute this to the FSF; ok for mainline?
Sorry to ask, but do you have a record of why? Reason I ask is that...
> Index: gcc/config/mips/74k.md
> --- gcc/config/mips/74k.md (revision 189988)
> +++ gcc/config/mips/74k.md (working copy)
> @@ -168,10 +168,11 @@
> ;; mult/madd/msub->int_mfhilo : 4 cycles (default)
> ;; mult->madd/msub : 1 cycles
> ;; madd/msub->madd/msub : 1 cycles
> -(define_bypass 1 "r74k_int_mult,r74k_int_mul3" "r74k_int_madd"
> - "mips_linked_madd_p")
> -(define_bypass 1 "r74k_int_madd" "r74k_int_madd"
> - "mips_linked_madd_p")
> +(define_bypass 1 "r74k_int_mult" "r74k_int_madd")
> +(define_bypass 1 "r74k_int_madd" "r74k_int_madd")
> +(define_bypass 1 "r74k_int_mul3" "r74k_int_madd"
> + "mips_mult_madd_chain_bypass_p")
> ;; --------------------------------------------------------------
> ;; Floating Point Instructions
...this looks like a step backwards. Before reload, the original
version assumes that a multiplication feeding a madd is going to form
a chain _as long as_ the result of the multiplication is used as the
accumulator input to the madd. The condition is important because
r1 = r2 * r3
r6 = r1 * r4 + r5
(which is perfectly OK before reload) shouldn't form a chain.
mult and mul3 are equivalent at this stage because we're still using
pseudo registers and haven't yet introduced uses of LO. Having both
reservations in the same bypass makes sense because of that.
The original version should be OK after reload too. It should then
never be the case that r74k_int_mul3 feeds r74k_int_madd in a way
that satisfies mips_linked_madd_p, so the bypass is harmless but
becomes temporarily redundant. But it should always be the case
that r74k_int_mult only feeds r74k_int_madd in a way that satisfies
mips_linked_madd_p, so the _predicate_ should be harmless but temporarily
equivalent to "always true".
I.e. the original version should behave as:
(define_bypass 1 "r74k_int_mult" "r74k_int_madd")
(define_bypass 1 "r74k_int_madd" "r74k_int_madd")
after reload (with no bypass for r74k_int_mul3).
At least that's how it's supposed to work. The new version is
obviously equivalent in the post-reload case, but seems to be making
the pre-reload case worse. I'm wondering whether someone hit a
situation where mips_linked_madd_p wasn't working properly.