This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH] Reorganize rs6000_rtx_costs

From: Geoff Keating <geoffk at geoffk dot org>
To: Roger Sayle <roger at eyesopen dot com>
Cc: gcc-patches at gcc dot gnu dot org
Date: Mon, 5 Jul 2004 17:25:17 -0700
Subject: Re: [PATCH] Reorganize rs6000_rtx_costs
References: <Pine.LNX.4.44.0407051416180.10731-100000@www.eyesopen.com>

On 05/07/2004, at 1:30 PM, Roger Sayle wrote:

On 5 Jul 2004, Geoffrey Keating wrote:

At the moment, however, such optimization is premature; the rs6000
backend still thinks that "nand" and "nor" require two instructions, ...


Here's a thought.  The .md file already contains all the information
needed to get a nearly-perfect RTX_COSTS macro in the -Os case.  So
does the .md file for most other ports.  Would it be possible to use
that information?


In theory, it might be possible to write a genrtxcosts.c that looked
at both the define_insn's length attribute and even the DFA pipeline
description's latency for an insn's "type" attribute and generated
a default TARGET_RTX_COST for a backend.  However, this approach
would require all backends to be updated to the new DFA pipeline
descriptions, all of them to use the "length" attribute consistently,
and all of the pipeline timings to be at least as accurate as the
currently hand-crafted rtx_cost functions.

I'm most interested in the -Os case right now. I think that some hand-crafting for the speed case will always be necessary, but the size of an instruction is not something that needs much tweaking. (You could perhaps do the speed case by simply adding an additional attribute to the machine description.)

What I was thinking is this: suppose you wanted a completely correct TARGET_RTX_COST at -Os. You'd end up basically rewriting recog_insn. That's a huge amount of work to do by hand; insn-recog.c is something like 62k lines on ppc. Even if you did that, you'd end up having to do a similar amount of work for mips, for x86, for ia64...

Many of the limitations of the current TARGET_RTX_COST for powerpc exist because this work hasn't been done; it's only been implemented for a few special cases that seemed to be important. That's why it doesn't know about the multiply-add instructions, the logical instructions that perform a complement, the rotate-and-mask instructions, Scc operations, and so on. Historically this hasn't mattered much, but now that combine uses it (and so theoretically *every* possible combination needs to be correct) it matters more.
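
A minimal sketch of the length-based idea for the -Os case, not actual GCC code. It assumes a generator has already extracted each pattern's "length" attribute into a table; the table, the pattern names, and size_cost() are hypothetical. COSTS_N_INSNS uses GCC's scaling (four cost units per instruction), the 4-byte lengths for nand and nor assume a single fixed-width PowerPC instruction each, and the length of 8 for PLUS:DI is the figure quoted in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same scaling as GCC's macro: one instruction costs 4 units. */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical excerpt of what a generator might extract from the
   "length" attributes in a machine description.  */
struct insn_length {
  const char *pattern;   /* illustrative name, not a real insn name */
  int length_bytes;      /* value of the "length" attribute */
};

static const struct insn_length toy_lengths[] = {
  { "nand:SI", 4 },      /* assumed: one 4-byte instruction */
  { "nor:SI",  4 },      /* assumed: one 4-byte instruction */
  { "plus:DI", 8 },      /* length 8, as stated in the message */
};

/* Return the size cost of PATTERN, or -1 if the table has no entry.
   Assumes every instruction is 4 bytes wide, as on PowerPC.  */
int size_cost(const char *pattern)
{
  assert(pattern != NULL);
  for (size_t i = 0; i < sizeof toy_lengths / sizeof toy_lengths[0]; i++)
    if (strcmp(toy_lengths[i].pattern, pattern) == 0)
      return COSTS_N_INSNS(toy_lengths[i].length_bytes / 4);
  return -1;
}
```

With this table, size_cost("nand:SI") is COSTS_N_INSNS (1), so a combined nand no longer looks like two instructions. A pattern missing from the table returns -1 here; a real generator would need a fallback for that case.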

I suspect this will eventually be achievable, but there's a lot of
legacy code (machine description) that might need to be cleaned up
first.

I'd go the other way. I'd try to write automated RTX_COST generation first, and *then* worry about cleaning up machine descriptions if necessary. Until you have the automated generation, you won't know what the cleaned-up versions should look like anyway.

I'm also not sure how often the middle-end uses rtx_cost to estimate
the costs of simple RTL that isn't a recognized backend insn pattern.
For example, the cost of PLUS:DI or PLUS:TI which may require a
handful of native instructions.  In these cases "genrecog"-like
machinery alone wouldn't be sufficient.

Ideally, you would synthesize that information based on the algorithms optabs.c uses. I don't think this actually applies to PLUS, but IOR:DI would be synthesizable. (PLUS:DI actually is recognized by the ppc backend; it says it has length 8.)
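
A rough sketch of what synthesizing such a cost could look like for a logical operation like IOR:DI, assuming it is expanded one word at a time with no interaction between the words. The function name and its parameters are hypothetical, and this is not how optabs.c is structured internally. The same arithmetic would not be right for PLUS, where a carry has to be propagated between the words.

```c
#include <assert.h>

/* Same scaling as GCC's macro: one instruction costs 4 units. */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical helper: cost of a bitwise operation in a mode that is
   MODE_BITS wide, built from word-sized operations that each cost
   WORD_COST.  Assumes the pieces are independent (true for
   AND/IOR/XOR, not for PLUS).  */
int synthesized_logical_cost(int mode_bits, int word_bits, int word_cost)
{
  assert(mode_bits > 0 && word_bits > 0 && word_cost >= 0);
  int n_words = (mode_bits + word_bits - 1) / word_bits;  /* round up */
  return n_words * word_cost;
}
```

On a 32-bit target where a word-sized IOR costs COSTS_N_INSNS (1), synthesized_logical_cost(64, 32, COSTS_N_INSNS(1)) gives COSTS_N_INSNS (2) for IOR:DI.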

References:
- Re: [PATCH] Reorganize rs6000_rtx_costs
  - From: Roger Sayle
