This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] PR rtl-optimization/14851: Move three-linked combinepass before two-linked
- From: Uros Bizjak <uros at kss-loka dot si>
- To: Bernd Schmidt <bernds_cb1 at t-online dot de>
- Cc: gcc-patches at gcc dot gnu dot org, Roger Sayle <roger at eyesopen dot com>
- Date: Tue, 15 Mar 2005 12:38:33 +0100
- Subject: Re: [PATCH] PR rtl-optimization/14851: Move three-linked combinepass before two-linked
- References: <4235320D.601@kss-loka.si> <42356089.5030605@t-online.de>
Bernd Schmidt wrote:
>> I would like to resend a combiner patch that moves three-linked
>> combine pass before two-linked pass. A long explanation with a
>> benchmark result can be found in
>> http://gcc.gnu.org/ml/gcc-patches/2004-09/msg02193.html.
>>
>> * combine.c (combine_instructions): Move 'three linked insns'
>> pass before 'two linked insns' pass.
>
> This may or may not be a good idea, but the problem you describe
> sounds more like an issue with the machine description - it shouldn't
> be sensitive to such implementation details in combine.
Perhaps an even better solution would be for combine to look a bit
further ahead and to consider the RTX_COST of the new combined pattern.
This would prevent combine from simply taking the first combination that
matches some pattern.
To illustrate the problem, consider some multiple-input patterns, for
example the x87 fiop patterns (fiadd, fimul). Some of them can inherently
convert an integer operand to float. The testcase
double test1 (double a, short b)
{
  return a * b;
}
compiled with gcc -O2 -march=i386 produces the following RTL (before combine):
(insn 13 9 14 0 (set (reg:DF 63)
        (float:DF (reg/v:HI 60 [ b ]))) 131 {*floathidf2_i387}
    (insn_list:REG_DEP_TRUE 8 (nil))
    (expr_list:REG_DEAD (reg/v:HI 60 [ b ])
        (nil)))

(insn 14 13 18 0 (set (reg:DF 62)
        (mult:DF (reg:DF 63)
            (reg/v:DF 59 [ a ]))) 407 {*fop_df_comm_i387}
    (insn_list:REG_DEP_TRUE 6 (insn_list:REG_DEP_TRUE 13 (nil)))
    (expr_list:REG_DEAD (reg:DF 63)
        (expr_list:REG_DEAD (reg/v:DF 59 [ a ])
            (nil))))
Combine merges insn 14 with the first entry of its insn_list (insn 6),
which produces:
filds 16(%ebp)
fmull 8(%ebp)
This combination has a latency of (10 + 7).
If the second insn_list entry (insn 13) were taken instead, the resulting
asm code would be:
fldl 8(%ebp)
fimuls 16(%ebp)
with a latency of (7 + 7).
Uros.