This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH] Constant fold VIEW_CONVERT_EXPR (take 2)
- From: Roger Sayle <roger at eyesopen dot com>
- To: Richard Henderson <rth at redhat dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Wed, 12 Apr 2006 08:01:56 -0600 (MDT)
- Subject: [PATCH] Constant fold VIEW_CONVERT_EXPR (take 2)
Hi Richard,
The following patch is a significantly revised implementation of my
patch to constant fold VIEW_CONVERT_EXPR of constant nodes in "fold".
http://gcc.gnu.org/ml/gcc-patches/2006-01/msg00099.html
My apologies for taking so long to make the changes you requested.
On Wed, 4 Jan 2006, Richard Henderson wrote:
> On Tue, Jan 03, 2006 at 12:11:09AM -0700, Roger Sayle wrote:
> > + encode_view_convert_expr (unsigned char *ptr, tree expr)
>
> Why are complex and vectors handled in fold_view_convert_expr instead
> of here?
>
> > + if (i < HOST_BITS_PER_WIDE_INT)
> > + *ptr++ = (unsigned char) (TREE_INT_CST_LOW (expr) >> i);
> > + else if (i < 2 * HOST_BITS_PER_WIDE_INT)
> > + *ptr++ = (unsigned char) (TREE_INT_CST_HIGH (expr)
> > + >> (i - HOST_BITS_PER_WIDE_INT));
> > + /* It shouldn't matter what's done here, so fill with zero. */
>
> I think its clearer if you always lay down and read data in the buffer
> in the target byte ordering. At present you're doing double big-endian
> correction in the vector and fp code.
I now see and completely understand where you're going with these
requests. Although they don't affect the correctness of the patch,
using the target's in-memory byte ordering via a convenient API that
handles VECTOR_CST and COMPLEX_CST itself (rather than decomposing
them at a higher level) creates a useful function that should
be able to be reused elsewhere. I've no idea where just yet, but the
previous design's strategy provided no possibility for such re-use.
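As an illustration of the target byte ordering discussed above (not part of the patch itself), the following standalone sketch shows how the position of a byte within a value can be mapped to its offset in target memory. The function name target_byte_offset and its parameters are hypothetical; in GCC the corresponding values come from the target macros UNITS_PER_WORD, WORDS_BIG_ENDIAN and BYTES_BIG_ENDIAN rather than from arguments.

```c
#include <assert.h>

/* Hypothetical helper: map BYTE (0 = least significant) of a value that
   is TOTAL_BYTES wide to its offset in target memory.  The last three
   parameters stand in for GCC's UNITS_PER_WORD, WORDS_BIG_ENDIAN and
   BYTES_BIG_ENDIAN target macros.  */
static int
target_byte_offset (int byte, int total_bytes, int units_per_word,
                    int words_big_endian, int bytes_big_endian)
{
  int offset;

  assert (byte >= 0 && byte < total_bytes);

  if (total_bytes > units_per_word)
    {
      /* Multi-word value: pick the word first, then the byte in it.  */
      int words = total_bytes / units_per_word;
      int word = byte / units_per_word;

      if (words_big_endian)
        word = (words - 1) - word;
      offset = word * units_per_word;
      if (bytes_big_endian)
        offset += (units_per_word - 1) - (byte % units_per_word);
      else
        offset += byte % units_per_word;
    }
  else
    /* Value fits in one word: only the byte order matters.  */
    offset = bytes_big_endian ? (total_bytes - 1) - byte : byte;

  return offset;
}
```

With 4-byte words, the least significant byte of an 8-byte value lands at offset 0 on a fully little-endian target, at offset 7 on a fully big-endian target, and at offset 4 when only the word order is big-endian.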
The following revision now renames the encode and decode functions to
native_encode_expr and native_interpret_expr, which handle arbitrary
expressions, including VECTOR_CST and COMPLEX_CST themselves. Rather
than abort on unsupported tree codes/types, these routines now return
an error indication, allowing additional tree nodes such as STRING_CST,
CONSTRUCTOR, etc., to be added in the future with few or no changes to the
callers.
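The encode/interpret split with an error return can be sketched in isolation as follows. This is a simplified model, not GCC code: the names const_kind, struct constant, encode_const, interpret_const and view_convert are all hypothetical, and the sketch assumes the host and target share the same byte order and a 4-byte IEEE float, so memcpy is enough to produce the in-memory representation.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for GCC's constant tree nodes.  */
enum const_kind { CONST_INT32, CONST_FLOAT32, CONST_STRING };

struct constant
{
  enum const_kind kind;
  union
  {
    uint32_t i;
    float f;
    const char *s;
  } u;
};

/* Encode C into BUF of LEN bytes.  Return the number of bytes written,
   or 0 if the kind is unsupported or the buffer is too small.  */
static int
encode_const (const struct constant *c, unsigned char *buf, int len)
{
  switch (c->kind)
    {
    case CONST_INT32:
      if (len < (int) sizeof c->u.i)
        return 0;
      memcpy (buf, &c->u.i, sizeof c->u.i);
      return (int) sizeof c->u.i;
    case CONST_FLOAT32:
      if (len < (int) sizeof c->u.f)
        return 0;
      memcpy (buf, &c->u.f, sizeof c->u.f);
      return (int) sizeof c->u.f;
    default:
      /* Unsupported kind: report failure instead of aborting.  */
      return 0;
    }
}

/* Interpret LEN bytes of BUF as a constant of KIND and store it in OUT.
   Return 1 on success, 0 if the buffer cannot be interpreted.  */
static int
interpret_const (enum const_kind kind, const unsigned char *buf, int len,
                 struct constant *out)
{
  switch (kind)
    {
    case CONST_INT32:
      if (len < (int) sizeof out->u.i)
        return 0;
      memcpy (&out->u.i, buf, sizeof out->u.i);
      break;
    case CONST_FLOAT32:
      if (len < (int) sizeof out->u.f)
        return 0;
      memcpy (&out->u.f, buf, sizeof out->u.f);
      break;
    default:
      return 0;
    }
  out->kind = kind;
  return 1;
}

/* Reinterpret IN as a constant of KIND by going through a byte buffer.  */
static int
view_convert (enum const_kind kind, const struct constant *in,
              struct constant *out)
{
  unsigned char buf[8];
  int len = encode_const (in, buf, (int) sizeof buf);

  if (len == 0)
    return 0;
  return interpret_const (kind, buf, len, out);
}
```

Because both helpers report failure rather than aborting, supporting a new kind of constant only requires a new case in each switch; view_convert and its callers stay unchanged.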
As suggested by James Morrison, I've also added a new test case to show
that we can now constant fold more vector additions during tree-ssa.
The following patch has now been retested on i686-pc-linux-gnu,
ia64-unknown-linux-gnu, ia64-hp-hpux11.22, powerpc-unknown-linux-gnu,
powerpc-ibm-aix5.2.0.0 and powerpc-apple-darwin7.9.0, all default
languages (including Ada on x86/Linux) with no regressions. This
should cover a range of endianness and vector capabilities, and now
confirms that gnat (a heavy user of VIEW_CONVERT_EXPR) is happy.
I'm pleased with this rewrite. Ok for mainline?
2006-04-12 Roger Sayle <roger@eyesopen.com>
* fold-const.c (native_encode_expr): New function to encode
the target representation of an INTEGER_CST, REAL_CST, COMPLEX_CST
or VECTOR_CST into a specified buffer.
(native_interpret_expr): Inverse of native_encode_expr to convert
a target representation into an INTEGER_CST, REAL_CST etc...
(fold_view_convert_expr): New function to constant fold/evaluate a
VIEW_CONVERT_EXPR of a suitable constant expression.
(fold_unary) <VIEW_CONVERT_EXPR>: Call fold_view_convert_expr for
INTEGER_CST, REAL_CST, COMPLEX_CST and VECTOR_CST nodes. Change
call of build1 to fold_build1 when constructing a VIEW_CONVERT_EXPR.
* gcc.target/i386/20050113-1.c: Tweak testcase to reflect that casts
of integers to vector types are now constant expressions in C.
* gcc.dg/vect/vect-fold-1.c: New test case.
Index: fold-const.c
===================================================================
*** fold-const.c (revision 112817)
--- fold-const.c (working copy)
*************** fold_plusminus_mult_expr (enum tree_code
*** 6756,6761 ****
--- 6756,7063 ----
return NULL_TREE;
}
+ /* Subroutine of fold_view_convert_expr. Encode the INTEGER_CST,
+ REAL_CST, COMPLEX_CST or VECTOR_CST specified by EXPR into the
+ buffer PTR of length LEN bytes. Return the number of bytes
+ placed in the buffer, or zero upon failure. */
+
+ static int
+ native_encode_expr (tree expr, unsigned char *ptr, int len)
+ {
+ tree type = TREE_TYPE (expr);
+ int byte, offset, total_bytes;
+ unsigned char value;
+ int word, words;
+
+ if (TREE_CODE (expr) == INTEGER_CST)
+ {
+ total_bytes = GET_MODE_SIZE (TYPE_MODE (type));
+ if (total_bytes > len)
+ return 0;
+ words = total_bytes / UNITS_PER_WORD;
+
+ for (byte = 0; byte < total_bytes; byte++)
+ {
+ int bitpos = byte * BITS_PER_UNIT;
+ if (bitpos < HOST_BITS_PER_WIDE_INT)
+ value = (unsigned char) (TREE_INT_CST_LOW (expr) >> bitpos);
+ else
+ value = (unsigned char) (TREE_INT_CST_HIGH (expr)
+ >> (bitpos - HOST_BITS_PER_WIDE_INT));
+
+ if (total_bytes > UNITS_PER_WORD)
+ {
+ word = byte / UNITS_PER_WORD;
+ if (WORDS_BIG_ENDIAN)
+ word = (words - 1) - word;
+ offset = word * UNITS_PER_WORD;
+ if (BYTES_BIG_ENDIAN)
+ offset += (UNITS_PER_WORD - 1) - (byte % UNITS_PER_WORD);
+ else
+ offset += byte % UNITS_PER_WORD;
+ }
+ else
+ offset = BYTES_BIG_ENDIAN ? (total_bytes - 1) - byte : byte;
+ ptr[offset] = value;
+ }
+ return total_bytes;
+ }
+
+ if (TREE_CODE (expr) == REAL_CST)
+ {
+ /* There are always 32 bits in each long, no matter the size of
+ the host's long. We handle floating point representations with
+ up to 192 bits. */
+ long tmp[6];
+
+ total_bytes = GET_MODE_SIZE (TYPE_MODE (type));
+ if (total_bytes > len)
+ return 0;
+ words = total_bytes / UNITS_PER_WORD;
+
+ real_to_target (tmp, TREE_REAL_CST_PTR (expr), TYPE_MODE (type));
+
+ for (byte = 0; byte < total_bytes; byte++)
+ {
+ int bitpos = byte * BITS_PER_UNIT;
+ value = (unsigned char) (tmp[bitpos / 32] >> (bitpos & 31));
+
+ if (total_bytes > UNITS_PER_WORD)
+ {
+ word = byte / UNITS_PER_WORD;
+ if (FLOAT_WORDS_BIG_ENDIAN)
+ word = (words - 1) - word;
+ offset = word * UNITS_PER_WORD;
+ if (BYTES_BIG_ENDIAN)
+ offset += (UNITS_PER_WORD - 1) - (byte % UNITS_PER_WORD);
+ else
+ offset += byte % UNITS_PER_WORD;
+ }
+ else
+ offset = BYTES_BIG_ENDIAN ? (total_bytes - 1) - byte : byte;
+ ptr[offset] = value;
+ }
+ return total_bytes;
+ }
+
+ if (TREE_CODE (expr) == COMPLEX_CST)
+ {
+ tree part = TREE_REALPART (expr);
+ offset = native_encode_expr (part, ptr, len);
+ if (offset == 0)
+ return 0;
+ total_bytes = offset;
+ part = TREE_IMAGPART (expr);
+ offset = native_encode_expr (part, ptr+offset, len-offset);
+ if (offset != total_bytes)
+ return 0;
+ return total_bytes + offset;
+ }
+
+ if (TREE_CODE (expr) == VECTOR_CST)
+ {
+ tree elem, elements;
+ int i, size, count;
+
+ size = 0;
+ offset = 0;
+ elements = TREE_VECTOR_CST_ELTS (expr);
+ count = TYPE_VECTOR_SUBPARTS (TREE_TYPE (expr));
+ for (i = 0; i < count; i++)
+ {
+ if (elements)
+ {
+ elem = TREE_VALUE (elements);
+ elements = TREE_CHAIN (elements);
+ }
+ else
+ elem = NULL_TREE;
+
+ if (elem)
+ {
+ size = native_encode_expr (elem, ptr+offset, len-offset);
+ if (size == 0)
+ return 0;
+ }
+ else if (size != 0)
+ {
+ if (offset + size > len)
+ return 0;
+ memset (ptr+offset, 0, size);
+ }
+ else
+ return 0;
+ offset += size;
+ }
+ return offset;
+ }
+ return 0;
+ }
+
+ /* Subroutine of fold_view_convert_expr. Interpret the contents of
+ the buffer PTR of length LEN as a constant of type TYPE. For
+ INTEGRAL_TYPE_P we return an INTEGER_CST, for SCALAR_FLOAT_TYPE_P
+ we return a REAL_CST, etc... If the buffer cannot be interpreted,
+ return NULL_TREE. */
+
+ static tree
+ native_interpret_expr (tree type, unsigned char *ptr, int len)
+ {
+ enum machine_mode mode = TYPE_MODE (type);
+ int byte, offset, total_bytes;
+ unsigned char value;
+ int word, words;
+
+ if (INTEGRAL_TYPE_P (type))
+ {
+ unsigned HOST_WIDE_INT lo = 0;
+ HOST_WIDE_INT hi = 0;
+
+ total_bytes = GET_MODE_SIZE (TYPE_MODE (type));
+ if (total_bytes > len)
+ return NULL_TREE;
+ if (total_bytes * BITS_PER_UNIT > 2 * HOST_BITS_PER_WIDE_INT)
+ return NULL_TREE;
+ words = total_bytes / UNITS_PER_WORD;
+
+ for (byte = 0; byte < total_bytes; byte++)
+ {
+ int bitpos = byte * BITS_PER_UNIT;
+ if (total_bytes > UNITS_PER_WORD)
+ {
+ word = byte / UNITS_PER_WORD;
+ if (WORDS_BIG_ENDIAN)
+ word = (words - 1) - word;
+ offset = word * UNITS_PER_WORD;
+ if (BYTES_BIG_ENDIAN)
+ offset += (UNITS_PER_WORD - 1) - (byte % UNITS_PER_WORD);
+ else
+ offset += byte % UNITS_PER_WORD;
+ }
+ else
+ offset = BYTES_BIG_ENDIAN ? (total_bytes - 1) - byte : byte;
+ value = ptr[offset];
+
+ if (bitpos < HOST_BITS_PER_WIDE_INT)
+ lo |= (unsigned HOST_WIDE_INT) value << bitpos;
+ else
+ hi |= (unsigned HOST_WIDE_INT) value
+ << (bitpos - HOST_BITS_PER_WIDE_INT);
+ }
+
+ return force_fit_type (build_int_cst_wide (type, lo, hi),
+ 0, false, false);
+ }
+
+ if (SCALAR_FLOAT_TYPE_P (type))
+ {
+ /* There are always 32 bits in each long, no matter the size of
+ the host's long. We handle floating point representations with
+ up to 192 bits. */
+ REAL_VALUE_TYPE r;
+ long tmp[6];
+
+ total_bytes = GET_MODE_SIZE (TYPE_MODE (type));
+ if (total_bytes > len || total_bytes > 24)
+ return NULL_TREE;
+ words = total_bytes / UNITS_PER_WORD;
+
+ memset (tmp, 0, sizeof (tmp));
+ for (byte = 0; byte < total_bytes; byte++)
+ {
+ int bitpos = byte * BITS_PER_UNIT;
+ if (total_bytes > UNITS_PER_WORD)
+ {
+ word = byte / UNITS_PER_WORD;
+ if (FLOAT_WORDS_BIG_ENDIAN)
+ word = (words - 1) - word;
+ offset = word * UNITS_PER_WORD;
+ if (BYTES_BIG_ENDIAN)
+ offset += (UNITS_PER_WORD - 1) - (byte % UNITS_PER_WORD);
+ else
+ offset += byte % UNITS_PER_WORD;
+ }
+ else
+ offset = BYTES_BIG_ENDIAN ? (total_bytes - 1) - byte : byte;
+ value = ptr[offset];
+
+ tmp[bitpos / 32] |= (unsigned long)value << (bitpos & 31);
+ }
+
+ real_from_target (&r, tmp, mode);
+ return build_real (type, r);
+ }
+
+ if (TREE_CODE (type) == COMPLEX_TYPE)
+ {
+ tree etype, rpart, ipart;
+ int size;
+
+ etype = TREE_TYPE (type);
+ size = GET_MODE_SIZE (TYPE_MODE (etype));
+ if (size * 2 > len)
+ return NULL_TREE;
+ rpart = native_interpret_expr (etype, ptr, size);
+ if (!rpart)
+ return NULL_TREE;
+ ipart = native_interpret_expr (etype, ptr+size, size);
+ if (!ipart)
+ return NULL_TREE;
+ return build_complex (type, rpart, ipart);
+ }
+
+ if (TREE_CODE (type) == VECTOR_TYPE)
+ {
+ tree etype, elem, elements;
+ int i, size, count;
+
+ etype = TREE_TYPE (type);
+ size = GET_MODE_SIZE (TYPE_MODE (etype));
+ count = TYPE_VECTOR_SUBPARTS (type);
+ if (size * count > len)
+ return NULL_TREE;
+
+ elements = NULL_TREE;
+ for (i = count - 1; i >= 0; i--)
+ {
+ elem = native_interpret_expr (etype, ptr+(i*size), size);
+ if (!elem)
+ return NULL_TREE;
+ elements = tree_cons (NULL_TREE, elem, elements);
+ }
+ return build_vector (type, elements);
+ }
+
+ return NULL_TREE;
+ }
+
+
+ /* Fold a VIEW_CONVERT_EXPR of a constant expression EXPR to type
+ TYPE at compile-time. If we're unable to perform the conversion
+ return NULL_TREE. */
+
+ static tree
+ fold_view_convert_expr (tree type, tree expr)
+ {
+ /* We support up to 512-bit values (for V8DFmode). */
+ enum {
+ max_bytes = 64
+ };
+ unsigned char buffer[max_bytes];
+ int len;
+
+ /* Check that the host and target are sane. */
+ if (CHAR_BIT != 8 || BITS_PER_UNIT != 8)
+ return NULL_TREE;
+
+ len = native_encode_expr (expr, buffer, max_bytes);
+ if (len == 0)
+ return NULL_TREE;
+
+ return native_interpret_expr (type, buffer, len);
+ }
+
+
/* Fold a unary expression of code CODE and type TYPE with operand
OP0. Return the folded expression if folding is successful.
Otherwise, return NULL_TREE. */
*************** fold_unary (enum tree_code code, tree ty
*** 7095,7101 ****
case VIEW_CONVERT_EXPR:
if (TREE_CODE (op0) == VIEW_CONVERT_EXPR)
! return build1 (VIEW_CONVERT_EXPR, type, TREE_OPERAND (op0, 0));
return NULL_TREE;
case NEGATE_EXPR:
--- 7397,7408 ----
case VIEW_CONVERT_EXPR:
if (TREE_CODE (op0) == VIEW_CONVERT_EXPR)
! return fold_build1 (VIEW_CONVERT_EXPR, type, TREE_OPERAND (op0, 0));
! if (TREE_CODE (op0) == INTEGER_CST
! || TREE_CODE (op0) == REAL_CST
! || TREE_CODE (op0) == COMPLEX_CST
! || TREE_CODE (op0) == VECTOR_CST)
! return fold_view_convert_expr (type, op0);
return NULL_TREE;
case NEGATE_EXPR:
Index: testsuite/gcc.target/i386/20050113-1.c
===================================================================
*** testsuite/gcc.target/i386/20050113-1.c (revision 109033)
--- testsuite/gcc.target/i386/20050113-1.c (working copy)
***************
*** 3,6 ****
/* { dg-options "-mmmx" } */
typedef short int V __attribute__ ((vector_size (8)));
! static V v = (V) 0x00FF00FF00FF00FFLL; /* { dg-error "is not constant" } */
--- 3,6 ----
/* { dg-options "-mmmx" } */
typedef short int V __attribute__ ((vector_size (8)));
! static V v = (V) 0x00FF00FF00FF00FFLL;
New test case gcc.dg/vect/vect-fold-1.c:
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-dom1" } */
typedef unsigned char v4qi __attribute__ ((vector_size (4)));
v4qi c;
void foo()
{
v4qi a = { 1, 2, 3, 4 };
v4qi b = { 5, 6, 7, 8 };
c = a + b;
}
/* { dg-final { scan-tree-dump-times "c = { 6, 8, 10, 12 }" 1 "dom1" } } */
/* { dg-final { cleanup-tree-dump "dom1" } } */
Roger
--