Summary: | (short)(((int)short_var) <<1) should be folded so that the shift is done in the short type | ||
---|---|---|---|
Product: | gcc | Reporter: | Andrew Pinski <pinskia> |
Component: | middle-end | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | RESOLVED FIXED | ||
Severity: | enhancement | CC: | gabravier, gcc-bugs, rguenth, roger |
Priority: | P3 | Keywords: | easyhack, missed-optimization, TREE |
Version: | 4.2.0 | ||
Target Milestone: | 13.0 | ||
Host: | Target: | ||
Build: | Known to work: | 13.0 | |
Known to fail: | Last reconfirmed: | 2021-09-02 00:00:00 | |
Bug Depends on: | |||
Bug Blocks: | 19986 |
Description
Andrew Pinski
2005-11-30 18:20:59 UTC
It also should be done for: int f1(void) { *a = (short)(((int)(unsigned short)*a) << 1); } Which is a little more complicated on the tree level than the RTL level: tree level: *a.1 = (short int) ((int) (short unsigned int) *a.1 << 1); RTL level just has a zero_extend. Confirmed. The first testcase is really just short *a; void f(void) { *a = *a << 1; } interestingly, the C frontend does not do integer promotion of unsigned short *a; void f(voif) { *a = *a << 1; } where *a should be promoted to int as of 6.3.1.8 and 6.5.7/3, which says "Integer promotions are performed on each of the operands". Now the question is how to read this, but either the C frontend does unnecessary promution for the signed case or it misses it for the unsigned case. Doh. The C frontend _does_ the promotion (in the unsigned case): (intD.0) *aD.1296 << 1 just convert.c:convert_to_integer "folds" it to a shift on unsigned short again. This transformation should be moved to fold instead. convert_to_integer contains case LSHIFT_EXPR: /* We can pass truncation down through left shifting when the shift count is a nonnegative constant and the target type is unsigned. */ if (TREE_CODE (TREE_OPERAND (expr, 1)) == INTEGER_CST && tree_int_cst_sgn (TREE_OPERAND (expr, 1)) >= 0 && TYPE_UNSIGNED (type) && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST) which for our case then (should) falls through to ... /* Don't do unsigned arithmetic where signed was wanted, or vice versa. Exception: if both of the original operands were unsigned then we can safely do the work as unsigned. Exception: shift operations take their type solely from the first argument. Exception: the LSHIFT_EXPR case above requires that we perform this operation unsigned lest we produce signed-overflow undefinedness. And we may need to do it as unsigned if we truncate to the original size. */ if (TYPE_UNSIGNED (TREE_TYPE (expr)) || (TYPE_UNSIGNED (TREE_TYPE (arg0)) && (TYPE_UNSIGNED (TREE_TYPE (arg1)) || ex_form == LSHIFT_EXPR || ex_form == RSHIFT_EXPR || ex_form == LROTATE_EXPR || ex_form == RROTATE_EXPR)) || ex_form == LSHIFT_EXPR) typex = lang_hooks.types.unsigned_type (typex); else typex = lang_hooks.types.signed_type (typex); now, this path seems to handle LSHIFT_EXPR of signed types, so the exeption above does not need to apply. Further, I don't understand the reasoning why we need to do the shift unsigned - we invoke undefined behavior for signed overflow, but the original code, (short) int << n invoked undefined behavior in truncating the int to short in case the value doesn't fit. Which of course still happens for the unsigned case. So, what would break with Index: convert.c =================================================================== *** convert.c (revision 107813) --- convert.c (working copy) *************** convert_to_integer (tree type, tree expr *** 512,518 **** the target type is unsigned. */ if (TREE_CODE (TREE_OPERAND (expr, 1)) == INTEGER_CST && tree_int_cst_sgn (TREE_OPERAND (expr, 1)) >= 0 - && TYPE_UNSIGNED (type) && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST) { /* If shift count is less than the width of the truncated type, --- 490,495 ---- *************** convert_to_integer (tree type, tree expr *** 607,614 **** || ex_form == LSHIFT_EXPR || ex_form == RSHIFT_EXPR || ex_form == LROTATE_EXPR ! || ex_form == RROTATE_EXPR)) ! || ex_form == LSHIFT_EXPR) typex = lang_hooks.types.unsigned_type (typex); else typex = lang_hooks.types.signed_type (typex); --- 584,590 ---- || ex_form == LSHIFT_EXPR || ex_form == RSHIFT_EXPR || ex_form == LROTATE_EXPR ! || ex_form == RROTATE_EXPR))) typex = lang_hooks.types.unsigned_type (typex); else typex = lang_hooks.types.signed_type (typex); ? I.e. 6.3.1.3/3 says that conversion from type T to signed type is implementation defined if the value doesn't fit. And we define it to reducing it modulo 2^N. So, for !flag_wrapv (short)((int)0x8000 << 1) == 0 but 0x8000 << 1 is undefined. So this transformation is only ok for flag_wrapv. I have two fixes. One really safe and one ok via IRC communication. I'm no longer working on this. So, is there still an optimization problem on that kind of basic thing ? Or have some things in GCC have changed since this time to make this irrelevant ? Looking at these same examples at .optimized even in latest trunk still shows the same conversions to int... 🤔 The optimization is still not done, the shift could be unconditionally made a logical shift but on the smaller type. This is now fixed (at the tree level) on mainline (and was fixed in GCC 13). For those wondering, LSHIFT_EXPR is well defined (in this case) in GIMPLE, hence the "optimized" dump currently looks like: void f () { short int * a.0_1; short int _2; short int _3; a.0_1 = a; _2 = *a.0_1; _3 = _2 << 1; *a.0_1 = _3; return; } |