Bug 25186

Summary: (short)(((int)short_var) <<1) should be folded so that the shift is done in the short type
Product: gcc Reporter: Andrew Pinski <pinskia>
Component: middle-endAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: enhancement CC: gabravier, gcc-bugs, rguenth, roger
Priority: P3 Keywords: easyhack, missed-optimization, TREE
Version: 4.2.0   
Target Milestone: 13.0   
Host: Target:
Build: Known to work: 13.0
Known to fail: Last reconfirmed: 2021-09-02 00:00:00
Bug Depends on:    
Bug Blocks: 19986    

Description Andrew Pinski 2005-11-30 18:20:59 UTC
Take the following example:
short *a;

int f(void)
{
  *a  = (short)(((int)*a) << 1);
}

the Shift should be done in the same type as *a.
This is done in simplify_subreg on the RTL level but we really should be able to do it in fold also.
Comment 1 Andrew Pinski 2005-11-30 18:32:29 UTC
It also should be done for:
int f1(void)
{
  *a  = (short)(((int)(unsigned short)*a) << 1);
}

Which is a little more complicated on the tree level than the RTL level:
tree level:
  *a.1 = (short int) ((int) (short unsigned int) *a.1 << 1);

RTL level just has a zero_extend.
Comment 2 Richard Biener 2005-12-01 10:53:06 UTC
Confirmed.  The first testcase is really just

short *a;
void f(void)
{
  *a = *a << 1;
}

interestingly, the C frontend does not do integer promotion of

unsigned short *a;
void f(voif)
{
  *a = *a << 1;
}

where *a should be promoted to int as of 6.3.1.8 and 6.5.7/3, which
says "Integer promotions are performed on each of the operands".  Now
the question is how to read this, but either the C frontend does
unnecessary promution for the signed case or it misses it for the unsigned
case.
Comment 3 Richard Biener 2005-12-01 11:01:31 UTC
Doh.  The C frontend _does_ the promotion (in the unsigned case):

 (intD.0) *aD.1296 << 1

just convert.c:convert_to_integer "folds" it to a shift on unsigned short
again.

This transformation should be moved to fold instead.
Comment 4 Richard Biener 2005-12-01 12:28:12 UTC
convert_to_integer contains

        case LSHIFT_EXPR:
          /* We can pass truncation down through left shifting
             when the shift count is a nonnegative constant and
             the target type is unsigned.  */
          if (TREE_CODE (TREE_OPERAND (expr, 1)) == INTEGER_CST
              && tree_int_cst_sgn (TREE_OPERAND (expr, 1)) >= 0 
              && TYPE_UNSIGNED (type)
              && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST)

which for our case then (should) falls through to

...
                    /* Don't do unsigned arithmetic where signed was wanted,
                       or vice versa.
                       Exception: if both of the original operands were
                       unsigned then we can safely do the work as unsigned.
                       Exception: shift operations take their type solely
                       from the first argument.
                       Exception: the LSHIFT_EXPR case above requires that
                       we perform this operation unsigned lest we produce
                       signed-overflow undefinedness.
                       And we may need to do it as unsigned
                       if we truncate to the original size.  */
                    if (TYPE_UNSIGNED (TREE_TYPE (expr))
                        || (TYPE_UNSIGNED (TREE_TYPE (arg0))
                            && (TYPE_UNSIGNED (TREE_TYPE (arg1))
                                || ex_form == LSHIFT_EXPR
                                || ex_form == RSHIFT_EXPR
                                || ex_form == LROTATE_EXPR
                                || ex_form == RROTATE_EXPR))
                        || ex_form == LSHIFT_EXPR)
                      typex = lang_hooks.types.unsigned_type (typex);
                    else
                      typex = lang_hooks.types.signed_type (typex);

now, this path seems to handle LSHIFT_EXPR of signed types, so the exeption
above does not need to apply.  Further, I don't understand the reasoning
why we need to do the shift unsigned - we invoke undefined behavior for
signed overflow, but the original code, (short) int << n invoked undefined
behavior in truncating the int to short in case the value doesn't fit.  Which
of course still happens for the unsigned case.

So, what would break with

Index: convert.c 
===================================================================
*** convert.c   (revision 107813)
--- convert.c   (working copy)
*************** convert_to_integer (tree type, tree expr
*** 512,518 ****
             the target type is unsigned.  */ 
          if (TREE_CODE (TREE_OPERAND (expr, 1)) == INTEGER_CST
              && tree_int_cst_sgn (TREE_OPERAND (expr, 1)) >= 0
-             && TYPE_UNSIGNED (type)
              && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST)
            {
              /* If shift count is less than the width of the truncated type,
--- 490,495 ----
*************** convert_to_integer (tree type, tree expr
*** 607,614 ****
                                || ex_form == LSHIFT_EXPR
                                || ex_form == RSHIFT_EXPR
                                || ex_form == LROTATE_EXPR
!                               || ex_form == RROTATE_EXPR)) 
!                       || ex_form == LSHIFT_EXPR)
                      typex = lang_hooks.types.unsigned_type (typex);
                    else
                      typex = lang_hooks.types.signed_type (typex);
--- 584,590 ----
                                || ex_form == LSHIFT_EXPR
                                || ex_form == RSHIFT_EXPR
                                || ex_form == LROTATE_EXPR
!                               || ex_form == RROTATE_EXPR)))
                      typex = lang_hooks.types.unsigned_type (typex);
                    else
                      typex = lang_hooks.types.signed_type (typex);

?
Comment 5 Richard Biener 2005-12-01 12:46:16 UTC
I.e. 6.3.1.3/3 says that conversion from type T to signed type is implementation defined if the value doesn't fit.  And we define it to reducing it modulo 2^N.

So, for !flag_wrapv

  (short)((int)0x8000 << 1) == 0

but
  0x8000 << 1  is undefined.

So this transformation is only ok for flag_wrapv.
Comment 6 Richard Biener 2005-12-01 14:17:56 UTC
I have two fixes.  One really safe and one ok via IRC communication.
Comment 7 Richard Biener 2006-04-05 13:49:05 UTC
I'm no longer working on this.
Comment 8 Gabriel Ravier 2021-09-02 02:03:11 UTC
So, is there still an optimization problem on that kind of basic thing ? Or have some things in GCC have changed since this time to make this irrelevant ? Looking at these same examples at .optimized even in latest trunk still shows the same conversions to int... 🤔
Comment 9 Richard Biener 2021-09-02 07:15:38 UTC
The optimization is still not done, the shift could be unconditionally made a logical shift but on the smaller type.
Comment 10 Roger Sayle 2023-04-22 08:06:50 UTC
This is now fixed (at the tree level) on mainline (and was fixed in GCC 13).
For those wondering, LSHIFT_EXPR is well defined (in this case) in GIMPLE,
hence the "optimized" dump currently looks like:
void f ()
{
  short int * a.0_1;
  short int _2;
  short int _3;

  a.0_1 = a;
  _2 = *a.0_1;
  _3 = _2 << 1;
  *a.0_1 = _3;
  return;

}