This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC,PATCH] combine: Don't simplify subregs of promoted types
- From: Ian Lance Taylor <iant at google dot com>
- To: Andreas Krebbel <krebbel1 at de dot ibm dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: 15 Nov 2006 08:55:50 -0800
- Subject: Re: [RFC,PATCH] combine: Don't simplify subregs of promoted types
- References: <20060904112357.GA3964@de.ibm.com>
Replying to an old message. Sorry if I've missed something relevant
in the interim.
Andreas Krebbel <krebbel1@de.ibm.com> writes:
> gcc currently fails to make use of the information that the
> result of a function is already promoted to a wider type.
>
> This is especially annoying on 64bit where an int type is often
> used to indicate the error condition. E.g. the following code contains
> a pointless sign extend instruction in function g (when sibling call
> optimization is disabled):
>
> int f () { return 1; }
> int g () { return f (); }
>
> The DI result of f is accessed by a subreg with the /s flag which
> states that this is a subreg of a promoted type value. combine could
> optimize this when combining insn 9 with 10. The code in simplify-rtx.c:818
> (simplify_unary_operation_1) would then remove the sign_extend.
>
> (insn 8 7 9 2 (set (reg:DI 45)
>         (reg:DI 2 %r2)) 51 {*movdi_64} (insn_list:REG_DEP_TRUE 7 (nil))
>     (expr_list:REG_DEAD (reg:DI 2 %r2)
>         (nil)))
>
> (insn 9 8 10 2 (set (reg:SI 43 [ D.1515 ])
>         (subreg/s:SI (reg:DI 45) 4)) 55 {*movsi_zarch} (insn_list:REG_DEP_TRUE 8 (nil))
>     (expr_list:REG_DEAD (reg:DI 45)
>         (nil)))
>
> (insn 10 9 14 2 (set (reg:DI 47 [ D.1515 ])
>         (sign_extend:DI (reg:SI 43 [ D.1515 ]))) 113 {*extendsidi2} (insn_list:REG_DEP_TRUE 9 (nil))
>     (expr_list:REG_DEAD (reg:SI 43 [ D.1515 ])
>         (nil)))
>
> Unfortunately combine optimizes insn 8 and 9 first to (set (reg:SI 43) (reg:SI 2)).
> In this step we lose the subreg and with it the information that the value is
> already sign extended.
>
> The attached patch prevents combine from optimizing subregs with the /s flag set since
> this is valuable information. With the patch applied I get an 890 byte smaller cc1
> executable on s390x (-0.003%). The number of lgfr (32 -> 64bit sign extend) instructions goes
> down by 316, from 21399 to 21083 (-1.48%).
I'm not comfortable with this patch. You're disabling a (minor) set
of optimizations in the expectation that you are going to see a
sign_extend. If you don't see the sign_extend, then you've made the
code worse.
I think you touch on the right solution here:
> Another point is that combine doesn't take the promotion of return types into account when
> doing its nonzero_bits analysis. I see that the information for incoming arguments is used
> but not for the return type - is there a special reason for this?
It seems to me that if you fix the code to correctly record
nonzero_bits and sign_bit_copies for return values, then the right
thing will happen. In fact, there is already code to do this, in
check_conversions and record_promoted_value. The comments suggest
that it should fix this exact problem. Why is that not helping?
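
To make the sign_bit_copies point concrete, here is a minimal sketch in
plain C. It is not GCC code: the names sign_bit_copies64 and
sign_extend_is_redundant are hypothetical, and it inspects concrete
64-bit values, whereas combine reasons about RTL expressions and their
modes. It only illustrates why knowing that a DImode value has at least
33 sign bit copies makes a later SImode to DImode sign_extend of its low
part a no-op.

```c
#include <stdint.h>

/* Hypothetical helper, not GCC's implementation: count how many of the
   most significant bits of V are equal to its sign bit, including the
   sign bit itself.  The result is in the range 1..64.  */
static int
sign_bit_copies64 (int64_t v)
{
  uint64_t u = (uint64_t) v;
  uint64_t sign = u >> 63;
  int copies = 1;

  /* Walk down from bit 62 while the bits still match the sign bit.  */
  for (int bit = 62; bit >= 0 && ((u >> bit) & 1) == sign; bit--)
    copies++;

  return copies;
}

/* Hypothetical helper: sign-extending the low 32 bits of V to 64 bits
   leaves V unchanged exactly when bits 63..31 are all equal, i.e. when
   V has at least 33 sign bit copies.  */
static int
sign_extend_is_redundant (int64_t v)
{
  return sign_bit_copies64 (v) >= 33;
}
```

A register holding a promoted int return value, such as the 1 returned
by f above, passes this test. A DImode value such as 0x80000000 does
not, since its bit 31 differs from its sign bit.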
Ian