This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH]: GCC Scheduler support for R10000 on MIPS


Kumba <kumba@gentoo.org> writes:
> I've sent this up before, but I believe it just got lost in the shuffle.  Is 
> there anything else that I can do to assist in getting this into GCC?

Hmm.  Out of interest, when did you last submit it?  The last patch
I could see was:

    http://article.gmane.org/gmane.comp.gcc.patches/109344

which seemed like work in progress rather than a submission.  You said
you'd submit an updated patch:

    http://article.gmane.org/gmane.comp.gcc.patches/109444

but I don't remember seeing one, and I can't find any record of it
in the archives.

Sorry if you've been frustrated by the lack of action here.

Have you run the patch through the gcc testsuites?

> +;; VR1x000 pipeline description.
> +;;   Copyright (C) 2005, 2006 Free Software Foundation, Inc.

You had to update some of the types for 4.4, so I assume this should
include at least 2008.

> +;; This file is part of GCC.
> +
> +;; GCC is free software; you can redistribute it and/or modify it
> +;; under the terms of the GNU General Public License as published
> +;; by the Free Software Foundation; either version 2, or (at your
> +;; option) any later version.
> +
> +;; GCC is distributed in the hope that it will be useful, but WITHOUT
> +;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> +;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> +;; License for more details.
> +
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING.  If not, write to the
> +;; Free Software Foundation, 51 Franklin Street, Fifth Floor, Boston,
> +;; MA 02110-1301, USA.

Needs to be GPLv3.

> +;; This file overrides parts of generic.md.  It is derived from the
> +;; old define_function_unit description.

I don't think this is useful, given that there was no R10k
define_functin_unit description in FSF sources.

Minor nit, but you've sometimes used many blank lines to separate things.
Seems a bit excessive: two should be enough.

Coding convention nit: all comments should start with a capital letter
and end with ".", even if they aren't real sentences.

> +;; R12K/R14K/R16K are derivatives of R10K, thus copy its description
> +;; until specific tuning for each is added
> +
> +
> +;; R10000 has int queue, fp queue, address queue
> +(define_automaton "r10k_int, r10k_fp, r10k_addr")
> +
> +;; R10000 has 2 integer ALUs, fp-adder and fp-multiplier, load/store
> +(define_cpu_unit "r10k_alu1" "r10k_int")
> +(define_cpu_unit "r10k_alu2" "r10k_int")
> +(define_cpu_unit "r10k_fpadd" "r10k_fp")
> +(define_cpu_unit "r10k_fpmpy" "r10k_fp")
> +(define_cpu_unit "r10k_loadstore" "r10k_addr")
> +
> +;; R10000 has separate fp-div and fp-sqrt units as well and these can
> +;; execute in parallel, however their issue & completion logic is shared
> +;; by the fp-multiplier
> +(define_cpu_unit "r10k_fpdiv" "r10k_fp")
> +(define_cpu_unit "r10k_fpsqrt" "r10k_fp")

Do you actually model the relationship with the multiplier?  If not,
it might be worth a comment saying so.

Given that both r10k_fpdiv and r10k_fpsqrt are long-latency insns,
and that they can be used independently, we might get smaller
automata if we split r10k_fp into three: r10k_fp, r10k_fpdiv and
r10k_fpsqrt.  Could you try this and see how it affects the total
number of states?

> +;; Integer add/sub + logic ops, and mf/mt hi/lo can be done by alu1 or alu2
> +;; Miscellaneous arith goes here too (this is a guess)
> +(define_insn_reservation "r10k_arith" 1
> +  (and (eq_attr "cpu" "r10000,r12000,r14000,r16000")
> +       (eq_attr "type" "arith,mfhilo,mthilo,slt,clz,const,nop,trap"))
> +  "r10k_alu1 | r10k_alu2")

"logical" seems to be missing from the .md file.  Should it be in
this reservation?

> +;; Only ALU2 does int multiplications and divisions
> +;; R10K allows an int insn using register Lo to be issued
> +;; one cycle earlier than an insn using register Hi for
> +;; the insns below, however, we skip on doing this
> +;; for now until correct usage of lo_operand() is figured
> +;; out.

The way to implement that sort of thing is define_bypass.  E.g.
make the default latency work for LO, then add a define_bypass that is
conditional on the int insn using HI.  (I assume that way round would be
better because using LO is the common case.)  You'd have to use a custom
predicate, along the lines of mips_store_data_bypass_p.

No need to do that if you don't want, but I think the last sentence is a
little misleading as it stands.

> @@ -587,6 +587,10 @@ static const struct mips_cpu_info mips_c
>
>     /* MIPS IV processors. */
>     { "r8000", PROCESSOR_R8000, 4, 0 },
> +  { "r10000", PROCESSOR_R10000, 4, 0 },
> +  { "r12000", PROCESSOR_R12000, 4, 0 },
> +  { "r14000", PROCESSOR_R14000, 4, 0 },
> +  { "r16000", PROCESSOR_R16000, 4, 0 },
>     { "vr5000", PROCESSOR_R5000, 4, 0 },
>     { "vr5400", PROCESSOR_R5400, 4, 0 },
>     { "vr5500", PROCESSOR_R5500, 4, PTF_AVOID_BRANCHLIKELY },

The formatting here looks odd, but maybe it's just mailer mangling.

I'd prefer to map r12000, r14000 and r16000 to PROCESSOR_R10000,
rather than have four PROCESSOR_* values that do the same thing.
I take your point about leaving processor-specific tuning as
future work, but I think the split should be introduced as part
of that work rather than here.

> @@ -234,6 +238,10 @@ enum mips_code_readable_setting {
>   #define TARGET_MIPS5500             (mips_arch == PROCESSOR_R5500)
>   #define TARGET_MIPS7000             (mips_arch == PROCESSOR_R7000)
>   #define TARGET_MIPS9000             (mips_arch == PROCESSOR_R9000)
> +#define TARGET_MIPS10000            (mips_arch == PROCESSOR_R10000)
> +#define TARGET_MIPS12000            (mips_arch == PROCESSOR_R12000)
> +#define TARGET_MIPS14000            (mips_arch == PROCESSOR_R14000)
> +#define TARGET_MIPS16000            (mips_arch == PROCESSOR_R16000)
>   #define TARGET_SB1                  (mips_arch == PROCESSOR_SB1		\
>   				     || mips_arch == PROCESSOR_SB1A)
>   #define TARGET_SR71K                (mips_arch == PROCESSOR_SR71000)
> @@ -250,6 +258,10 @@ enum mips_code_readable_setting {
>   #define TUNE_MIPS6000               (mips_tune == PROCESSOR_R6000)
>   #define TUNE_MIPS7000               (mips_tune == PROCESSOR_R7000)
>   #define TUNE_MIPS9000               (mips_tune == PROCESSOR_R9000)
> +#define TUNE_MIPS10000              (mips_tune == PROCESSOR_R10000)
> +#define TUNE_MIPS12000              (mips_tune == PROCESSOR_R12000)
> +#define TUNE_MIPS14000              (mips_tune == PROCESSOR_R14000)
> +#define TUNE_MIPS16000              (mips_tune == PROCESSOR_R16000)
>   #define TUNE_SB1                    (mips_tune == PROCESSOR_SB1		\
>   				     || mips_tune == PROCESSOR_SB1A)
>   #define TUNE_24K		    (mips_tune == PROCESSOR_24KC	\

I generally try to avoid having TARGET_* and TUNE_* macros that
aren't used.  (Again, it's a case of "add it when you need it".)
There are some old macros that are unused, but still.

Looks good otherwise, thanks.

Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]