This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



useless cast blocking some optimization in gcc 4.7.3


    Hello,

I have identified a significant performance regression between GCC 4.6 and 4.7 (a pathological test case is enclosed).

After investigation, the cause is the += statement applied to two signed chars:
- Since 4.7, the addition is type-promoted to "int" when it is written "result += foo()".
- It is type-promoted to "unsigned char" when it is written "result = result + foo()".
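To make the two spellings concrete, here is a minimal sketch. The names narrow_t, foo, accumulate_compound and accumulate_explicit are hypothetical and not taken from the test case or from GCC. At the source level both forms compute the same value, so any difference comes only from how the compiler represents them internally.

```cpp
#include <cassert>

// Hypothetical stand-in for the 8-bit signed type used in the test case.
typedef signed char narrow_t;

// Hypothetical stand-in for foo() in the mail; it always returns 1.
inline narrow_t foo() { return 1; }

// Spelling 1: compound assignment.
inline narrow_t accumulate_compound(narrow_t result) {
    result += foo();
    return result;
}

// Spelling 2: explicit addition followed by assignment.
inline narrow_t accumulate_explicit(narrow_t result) {
    result = result + foo();
    return result;
}

// Compare both spellings for every start value whose sum still fits.
inline void check_same_values() {
    for (int v = -128; v <= 126; ++v) {
        assert(accumulate_compound((narrow_t)v)
               == accumulate_explicit((narrow_t)v));
    }
}
```

Calling check_same_values() should pass silently, since the language defines "a += b" as "a = a + b" with "a" evaluated only once.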

The "char->int->char" cast blocks some optimizations in later phases.
The promotion itself does not look wrong, so I extended the fold optimization to catch this case (patch enclosed).
The patch basically transforms:
    (TypeA) ((TypeB) a1 + (TypeB) a2)    /* with a1 and a2 of the signed type TypeA, and TypeB wider than TypeA */
into:
    a1 + a2
The same is done for subtraction and multiplication.
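As a rough illustration of why the casts can be dropped, the sketch below checks every pair of signed char values whose exact sum fits in the narrow type, and confirms that widening to int, adding, and narrowing again returns exactly that sum. The names narrow_t, add_via_int, sum_fits and count_mismatches are hypothetical. This only models the addition case at the source level; it is not the fold code itself.

```cpp
#include <climits>

// Hypothetical stand-in for TypeA in the mail; int plays the role of TypeB.
typedef signed char narrow_t;

// The shape described above: widen both operands, add, narrow the result.
inline narrow_t add_via_int(narrow_t a1, narrow_t a2) {
    return (narrow_t)((int)a1 + (int)a2);
}

// True when the exact sum is representable in narrow_t (no overflow).
inline bool sum_fits(narrow_t a1, narrow_t a2) {
    int sum = (int)a1 + (int)a2;
    return sum >= SCHAR_MIN && sum <= SCHAR_MAX;
}

// Count the non-overflowing pairs for which the round trip through int
// does not yield the exact sum.
inline int count_mismatches() {
    int mismatches = 0;
    for (int a = SCHAR_MIN; a <= SCHAR_MAX; ++a) {
        for (int b = SCHAR_MIN; b <= SCHAR_MAX; ++b) {
            narrow_t x = (narrow_t)a;
            narrow_t y = (narrow_t)b;
            if (sum_fits(x, y) && (int)add_via_int(x, y) != a + b) {
                ++mismatches;
            }
        }
    }
    return mismatches;
}
```

Pairs whose sum does not fit are skipped on purpose: those are the inputs the mail excludes.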

I believe this is legal for any valid a1/a2 input values (that is, values that do not overflow signed char). There are no new failures on the two tested targets: sh-superh-elf and x86_64-unknown-linux-gnu.
Should I file a Bugzilla entry to track this? Is it OK for trunk?

2013-04-08  Laurent Alfonsi  <laurent.alfonsi@st.com>

       * fold-const.c (fold_unary_loc): Suppress useless type promotion.


Thanks,
Laurent

#include <cstdio>

typedef char int8_t;
const int iterations = 20;
const int SIZE = 200;
int8_t data8[SIZE];

/******************************************************************************/

// Report a failure unless the accumulated result equals 200 converted to T.
template <typename T>
inline void check_result(T result) {
  if (result != T(200)) {
    printf("test failed %d!=%d\n", result, 200);
  }
}

/******************************************************************************/

// Helper that ignores its input and always returns 1.
template <typename T>
struct all_constants {
  static T get_one(T input) { return (T(1)); }
};

/******************************************************************************/

// Sum Input::get_one() over the array, repeated `iterations` times.
template <typename T, typename Input>
void test_constant(T* first, int count) {
  int i;

  for (i = 0; i < iterations; ++i) {
    T result = 0;
    for (int n = 0; n < count; ++n) {
      // The "+=" on a char type is the statement discussed in the mail.
      result += Input::get_one(first[n]);
    }
    check_result<T>(result);
  }
}

/******************************************************************************/

int main(int argc, char** argv)
{
  test_constant<int8_t, all_constants<int8_t> >(data8, SIZE);
  return 0;
}
--- ./gcc.orig/gcc/fold-const.c	2013-04-08 14:09:32.000000000 +0200
+++ ./gcc/gcc/fold-const.c	2013-04-08 11:08:16.000000000 +0200
@@ -8055,6 +8055,26 @@
 	    }
 	}
 
+      /* Convert (T1) ((T2)X + (T2)Y) into X + Y, 
+         if X and Y already have type T1 (integral only), and T2 > T1 */
+      if (INTEGRAL_TYPE_P (type)
+          && TYPE_OVERFLOW_UNDEFINED (type)
+	  && (TREE_CODE (op0) == PLUS_EXPR || TREE_CODE (op0) == MINUS_EXPR
+	     || TREE_CODE (op0) == MULT_EXPR)
+	  && TREE_CODE (TREE_OPERAND (op0, 0)) == NOP_EXPR
+	  && TREE_CODE (TREE_OPERAND (op0, 1)) == NOP_EXPR
+	  && type == TREE_TYPE (TREE_OPERAND (TREE_OPERAND (op0, 0), 0))
+	  && type == TREE_TYPE (TREE_OPERAND (TREE_OPERAND (op0, 1), 0))
+	  && TYPE_PRECISION (type) < TYPE_PRECISION (TREE_TYPE (op0)))
+	{
+	  tem = fold_build2_loc (loc, TREE_CODE (op0), type,
+	    		     fold_convert_loc (loc, type,
+	    				       TREE_OPERAND (op0, 0)),
+	    		     fold_convert_loc (loc, type,
+	    				       TREE_OPERAND (op0, 1)));
+	  return fold_convert_loc (loc, type, tem);
+	}
+
       tem = fold_convert_const (code, type, op0);
       return tem ? tem : NULL_TREE;
 
