This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [RFC,PATCH] combine: Don't simplify subregs of promoted types


Replying to an old message.  Sorry if I've missed something relevant
in the interim.

Andreas Krebbel <krebbel1@de.ibm.com> writes:

> gcc currently fails to make use of the information that the
> result of a function is already promoted to a wider type.
> 
> This is especially annoying on 64bit where an int type is often
> used to indicate the error condition. E.g. the following code contains
> a pointless sign extend instruction in function g (when sibling call 
> optimization is disabled):
> 
> int f () { return 1; }
> int g () { return f (); }
> 
> The DI result of f is accessed by a subreg with the /s flag which
> states that this is a subreg of a promoted type value. combine could
> optimize this when combining insn 9 with 10. The code in simplify-rtx.c:818
> (simplify_unary_operation_1) would then remove the sign_extend.
> 
> (insn 8 7 9 2 (set (reg:DI 45)
>         (reg:DI 2 %r2)) 51 {*movdi_64} (insn_list:REG_DEP_TRUE 7 (nil))
>     (expr_list:REG_DEAD (reg:DI 2 %r2)
>         (nil)))
> 
> (insn 9 8 10 2 (set (reg:SI 43 [ D.1515 ])
>         (subreg/s:SI (reg:DI 45) 4)) 55 {*movsi_zarch} (insn_list:REG_DEP_TRUE 8 (nil))
>     (expr_list:REG_DEAD (reg:DI 45)
>         (nil)))
> 
> (insn 10 9 14 2 (set (reg:DI 47 [ D.1515 ])
>         (sign_extend:DI (reg:SI 43 [ D.1515 ]))) 113 {*extendsidi2} (insn_list:REG_DEP_TRUE 9 (nil))
>     (expr_list:REG_DEAD (reg:SI 43 [ D.1515 ])
>         (nil)))
> 
> Unfortunately combine optimizes insn 8 and 9 first to (set (reg:SI 43) (reg:SI 2)).
> In this step we lose the subreg and with it the information that the value is
> already sign extended.
> 
> The attached patch prevents combine from optimizing subregs with the /s flag set since
> this is valuable information. With the patch applied I get an 890 byte smaller cc1
> executable on s390x (-0.003%). The number of lgfr (32 -> 64bit sign extend) instructions goes
> down from 21399 to 21083 by 316 (-1.48%).
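
For readers following along, the redundancy described in the quoted text can
be modelled in plain C. This is only a sketch of the value-level behaviour,
not of combine itself: promoted_return, low_part and sign_extend are
hypothetical helper names standing in for the promoted DImode result of f,
the SImode subreg in insn 9, and the sign_extend in insn 10.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a promoted return value: the callee returns a
   32-bit int already sign-extended into a 64-bit register.  */
static int64_t promoted_return(int32_t v)
{
    return (int64_t) v;
}

/* Model of insn 9: take the low 32 bits of the 64-bit register.  */
static int32_t low_part(int64_t reg)
{
    uint32_t lo = (uint32_t) reg;   /* reduction modulo 2^32 */
    int32_t out;
    memcpy(&out, &lo, sizeof out);  /* reinterpret the bits as signed */
    return out;
}

/* Model of insn 10: sign-extend the 32-bit value to 64 bits.  */
static int64_t sign_extend(int32_t v)
{
    return (int64_t) v;
}

/* Returns 1 when extending the low part gives back the register as is,
   i.e. when the extend instruction would be pointless.  */
static int extend_is_redundant(int64_t reg)
{
    return sign_extend(low_part(reg)) == reg;
}

/* Small self-check of the model.  */
static void check_promoted_model(void)
{
    assert(extend_is_redundant(promoted_return(1)));
    assert(extend_is_redundant(promoted_return(-1)));
    assert(extend_is_redundant(promoted_return(INT32_MIN)));
    /* A register that is not a promoted int: the extend changes it.  */
    assert(!extend_is_redundant((int64_t) 0x80000000LL));
}
```

Every value produced by promoted_return passes the check, which is what the
/s flag on the subreg promises. Once the subreg has been folded away, nothing
in the RTL states that promise any more.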

I'm not comfortable with this patch.  You're disabling a (minor) set
of optimizations in the expectation that you are going to see a
sign_extend.  If you don't see the sign_extend, then you've made the
code worse.

I think you touch on the right solution here:

> Another point is that combine doesn't take the promotion of return types into account when
> doing its nonzero_bits analysis. I see that the information for incoming arguments is used
> but not for the return type - is there a special reason for this?

It seems to me that if you fix the code to correctly record
nonzero_bits and sign_bit_copies for return values, then the right
thing will happen.  In fact, there is already code to do this, in
check_conversions and record_promoted_value.  The comments suggest
that it should fix this exact problem.  Why is that not helping?
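
To make the sign_bit_copies idea concrete, here is a minimal sketch in C. It
is an illustration under my own assumptions, not the code in combine:
sign_bit_copies64 and extend_from_32_is_redundant are hypothetical names, and
the function counts copies for one known 64-bit value, whereas combine tracks
a conservative lower bound for a register.

```c
#include <assert.h>
#include <stdint.h>

/* Count how many of the high-order bits of a 64-bit value are equal to
   its sign bit, including the sign bit itself (so the result is 1..64).  */
static unsigned sign_bit_copies64(uint64_t x)
{
    uint64_t sign = x >> 63;
    unsigned n = 1;
    for (int bit = 62; bit >= 0; bit--) {
        if (((x >> bit) & 1) != sign)
            break;
        n++;
    }
    return n;
}

/* Sign-extending the low 32 bits to 64 bits cannot change the value when
   bits 63..31 all agree, that is, when there are at least 33 copies.  */
static int extend_from_32_is_redundant(uint64_t x)
{
    return sign_bit_copies64(x) >= 33;
}

/* Small self-check of the sketch.  */
static void check_sign_bit_copies(void)
{
    assert(sign_bit_copies64(1) == 63);
    assert(sign_bit_copies64(UINT64_MAX) == 64);
    assert(extend_from_32_is_redundant(0xFFFFFFFF80000000ULL));
    assert(!extend_from_32_is_redundant(0x0000000080000000ULL));
}
```

If the return register of a call to a function returning int were recorded
as having at least 33 sign bit copies, the sign_extend in insn 10 could be
dropped whether or not the subreg survives the earlier combination.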

Ian

