[Patch ARM] Refactor neon-schedgen.ml and regenerate A8 neon descriptions.

Wed Aug 25 14:12:00 GMT 2010

On Wed, 2010-08-25 at 12:04 +0100, Julian Brown wrote:
> On Wed, 25 Aug 2010 09:07:49 +0100
> Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> wrote:
> 
> > Hi,
> > 
> > This patch refactors neon-schedgen a bit to allow support for multiple
> > cores and generate multiple scheduler descriptions out of a single
> > source. I am only submitting the regenerated A8 descriptions and the
> > changes to the ML. 
> > 
> > The major change to neon-schedgen.ml is the addition of a core type,
> > thus adding support for a new core is essentially adding the new core
> > type, adding the core to the list of allCores and then ending up
> > regenerating all the insn_reservations and the bypasses from this. The
> > only bits that are still hand-written are the bits that bolt these
> > reservations and bypasses with the interactions with the other parts
> > of the scheduler description.
> 
> I might have the wrong end of the stick here but... neon.md is filled
> with "neon_type" specifications which are already hairy and verbose,

No that is not what I'm doing. 

> e.g.:
> 
>   [(set (attr "neon_type")
>       (if_then_else (ne (symbol_ref "<Is_float_mode>") (const_int 0))
>                     (if_then_else (ne (symbol_ref "<Is_d_reg>") (const_int 0))
>                                   (const_string "neon_fp_vadd_ddd_vabs_dd")
>                                   (const_string "neon_fp_vadd_qqq_vabs_qq"))
>                     (if_then_else (ne (symbol_ref "<Is_d_reg>") (const_int 0))
>                                   (if_then_else
>                                     (ne (symbol_ref "<Scalar_mul_8_16>") (const_int 0))
>                                     (const_string "neon_mul_ddd_8_16_qdd_16_8_long_32_16_long")
>                                     (const_string "neon_mul_qqq_8_16_32_ddd_32"))
>                                   (if_then_else (ne (symbol_ref "<Scalar_mul_8_16>") (const_int 0))
>                                     (const_string "neon_mul_qqq_8_16_32_ddd_32")
>                                     (const_string "neon_mul_qqq_8_16_32_ddd_32")))))]
> 
> It seems like adding cortex_a8 to the start of each of these won't
> exactly help make these clearer -- and what are we going to do about
> new cores? The real problem here is that these "neon_types" are far too
> strongly tied to one physical implementation of the NEON ISA. My
> (admittedly vague) notion is that we should be moving towards more
> "logical" units for the neon instructions, decoupling them from any
> particular implementation, and hopefully cut down on these rather opaque
> conditionals whilst doing it. Or is that in fact what you are
> suggesting?

Errr.. IIUC neon-schedgen.ml is responsible only for generating the
scheduler descriptions that use *neon_types* . All that this does is
prefix the define_insn_reservation names. You need a per-core
insn_reservation to be able to describe how it interacts with the rest
of the core and the latency might well be different in different cores. 

What I have done is pretty much the following and a rather straight
forward renaming exercise.

< a small hunk from the new A8 neon scheduler description is as below> 

-(define_insn_reservation "neon_int_1" 3
+(define_insn_reservation "cortex_a8_neon_int_1" 3
   (and (eq_attr "tune" "cortexa8")
        (eq_attr "neon_type" "neon_int_1"))
   "cortex_a8_neon_dp")

I'm not trying to change the values that "neon_types" can take. Notice
the cortex_a8 prefix is only for the insn reservations. I'm trying to
get to a point where these reservations can be made on a per-core basis.
neon-schedgen.ml generates define_insn_reservations and define_bypasses
on functional units in the A8 neon description that *use* the neon_types
to decide which instructions are mapped to which functional units. 

> 
> If you're going to hack neon-schedgen.ml (which I didn't write, and
> admit I don't have any great understanding of), would it make sense to
> create a new table for Cortex-A9 (etc.), rather than mixing
> definitions for all cores together in the existing availability_table?

There are situations where we might want to use common aspects of the engine in the table and thus a filter function appeared to be good enough
on the table rather than splitting this table into a "common across all cores" bit, and a per core bit.

> 
> Also a minor point on the ML: you can write,
> 
> let coreStr core = 
>                  match core with
>                    CortexA8 -> "cortex_a8" 
>                  | CortexA9 -> "cortex_a9"
> 
> as:
> 
> let coreStr = function
>     CortexA8 -> "cortex_a8"
>   | CortexA9 -> "cortex_a9"

Can do . I haven't written any ML in a long time.

> 
> I know there's no GNU style guide for ML, but please try to follow the
> existing formatting (re: indentation and so forth) if possible.

Thanks for the review. I thought I had done that but will go back and
double check and follow up with a patch. 

Cheers
Ramana

> 
> Cheers,
> 
> Julian