[PATCH] Work-around for PR opt/8634

Glen Nakamura glen@imodulo.com
Mon Dec 9 16:49:00 GMT 2002


Aloha,

PR opt/8634 exposes a problem with const initializers of aggregate types.
GCC generates code for the initializer with RTX_UNCHANGING = 1, which implies
that the memory location is set at most once.  While this seems reasonable,
there are a few cases where this isn't true.

In the testcase provided with the PR, GCC generates code for the initializer
as eight separate single-byte stores that look something like the following:
  mem8[0] = CONST_A;
  mem8[1] = CONST_B;
  mem8[2] = CONST_C;
  mem8[3] = CONST_D;
  mem8[4] = CONST_E;
  mem8[5] = CONST_F;
  mem8[6] = CONST_G;
  mem8[7] = CONST_H;
The problem occurs when GCC converts the first store into a 32-bit load/store:
  mem32[0] = (mem32[0] & ~0xff) | CONST_A;
Now bytes mem8[1] through mem8[3] are each written twice, once by the widened
32-bit store and once by their own byte store, yet they are still flagged as
RTX_UNCHANGING.  This causes GCC to generate incorrect code.  I would
argue that GCC should *not* convert an 8-bit store into a 32-bit load/store
when the RTX_UNCHANGING flag is set; hopefully more experienced developers
can decide whether that needs to be fixed.
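
To make the double write concrete, here is a minimal model of the store
sequence above.  It is only a sketch: the write counters, the helper names
(store8, store8_widened, run_initializer) and the 'A'..'H' values standing in
for CONST_A..CONST_H are hypothetical, and nothing here is GCC code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: 8 bytes of memory plus a per-byte write counter.  */
static uint8_t mem8[8];
static unsigned writes[8];

/* Plain single-byte store.  */
static void store8(size_t i, uint8_t v)
{
    assert(i < 8);
    mem8[i] = v;
    writes[i]++;
}

/* The same store to byte 0, widened to a 32-bit read-modify-write:
   mem32[0] = (mem32[0] & ~0xff) | v.  The word is assembled from bytes by
   hand so the model does not depend on host endianness.  All four bytes
   of the word are written back.  */
static void store8_widened(uint8_t v)
{
    uint32_t w = (uint32_t)mem8[0] | (uint32_t)mem8[1] << 8
               | (uint32_t)mem8[2] << 16 | (uint32_t)mem8[3] << 24;
    w = (w & ~(uint32_t)0xff) | v;
    for (size_t i = 0; i < 4; i++) {
        mem8[i] = (uint8_t)(w >> (8 * i));
        writes[i]++;
    }
}

/* Run the initializer with the first store widened and return the highest
   per-byte write count.  */
static unsigned run_initializer(void)
{
    unsigned max = 0;
    for (size_t i = 0; i < 8; i++) {
        mem8[i] = 0;
        writes[i] = 0;
    }
    store8_widened('A');                 /* stands in for CONST_A */
    for (size_t i = 1; i < 8; i++)
        store8(i, (uint8_t)('A' + i));   /* CONST_B .. CONST_H */
    for (size_t i = 0; i < 8; i++)
        if (writes[i] > max)
            max = writes[i];
    return max;
}
```

Calling run_initializer () leaves the final bytes correct in this model,
but bytes 1 through 3 have been written twice, which is exactly what a
"set at most once" flag does not allow for.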

Other cases where a const initializer stores to the same location more than
once include:
(1)
  struct foo {
    char a; char b; char c; char d;
  };
  const struct foo bar = { a : 'A', d : 'D' };
GCC generates:
  memset (&bar, 0, sizeof (bar));
  bar.a = 'A';
  bar.d = 'D';

(2)
  const char foo[4] = { 'A', 'B' };
GCC generates:
  memset (foo, 0, sizeof (foo));
  foo[0] = 'A';
  foo[1] = 'B';
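
As a rough check of cases (1) and (2), the sketch below compares a
partially initialized non-static const aggregate against the hand-written
"clear, then store" sequence.  The function names are hypothetical, and
the struct uses the standard C99 designator spelling (.a = ...) rather
than the older GNU "a :" form shown above.  It demonstrates only the
language-level result, not what any particular GCC version emits.

```c
#include <assert.h>
#include <string.h>

struct foo {
    char a; char b; char c; char d;
};

/* Case (1): unnamed members of a partial initializer are zero, so the
   result matches clearing the object and then storing a and d.  */
static int struct_matches_two_step(void)
{
    const struct foo bar = { .a = 'A', .d = 'D' };
    struct foo manual;

    memset(&manual, 0, sizeof manual);
    manual.a = 'A';
    manual.d = 'D';
    /* All members are char, so there is no padding between them.  */
    return bar.a == manual.a && bar.b == manual.b
        && bar.c == manual.c && bar.d == manual.d;
}

/* Case (2): trailing array elements of a partial initializer are zero.  */
static int array_matches_two_step(void)
{
    const char foo[4] = { 'A', 'B' };
    char manual[4];

    memset(manual, 0, sizeof manual);
    manual[0] = 'A';
    manual[1] = 'B';
    return memcmp(foo, manual, sizeof foo) == 0;
}

/* Convenience wrapper that asserts both cases.  */
static void check_all(void)
{
    assert(struct_matches_two_step());
    assert(array_matches_two_step());
}
```

In the two-step form, bar.a, bar.d, foo[0] and foo[1] are each stored
twice: once by the memset and once by the explicit assignment.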

Anyway, here is the patch:

2002-12-09  Glen Nakamura  <glen@imodulo.com>

	* explow.c (maybe_set_unchanging): Don't flag non-static const
	aggregate type initializers with RTX_UNCHANGING.

--- gcc-3.3.orig/gcc/explow.c	2002-10-31 21:35:57.000000000 +0000
+++ gcc-3.3/gcc/explow.c	2002-10-31 21:35:57.000000000 +0000
@@ -657,6 +657,7 @@ maybe_set_unchanging (ref, t)
      has the same value.  Currently we simplify this to PARM_DECLs in the
      first case, and decls with TREE_CONSTANT initializers in the second.  */
   if ((TREE_READONLY (t) && DECL_P (t)
+       && (TREE_STATIC (t) || ! AGGREGATE_TYPE_P (TREE_TYPE (t)))
        && (TREE_CODE (t) == PARM_DECL
 	   || DECL_INITIAL (t) == NULL_TREE
 	   || TREE_CONSTANT (DECL_INITIAL (t))))

While this may not be the "correct" fix, it works as a simple work-around.
Regression tested on i686-pc-linux-gnu.
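
For readers who do not have the tree macros in their head, the clause added
by the patch can be read as a small predicate.  The sketch below models
only that one added line; struct decl_model and its two fields are
hypothetical stand-ins for TREE_STATIC (t) and
AGGREGATE_TYPE_P (TREE_TYPE (t)), and the rest of the condition in
maybe_set_unchanging is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two properties the added clause tests.  */
struct decl_model {
    bool is_static;     /* models TREE_STATIC (t) */
    bool is_aggregate;  /* models AGGREGATE_TYPE_P (TREE_TYPE (t)) */
};

/* Models: TREE_STATIC (t) || ! AGGREGATE_TYPE_P (TREE_TYPE (t)).
   Passing this clause is necessary but not sufficient for the flag to
   be set, since the other tests in the condition still apply.  */
static bool passes_added_clause(struct decl_model d)
{
    return d.is_static || !d.is_aggregate;
}

/* Walk the four combinations; only non-static aggregates are excluded.  */
static void check_truth_table(void)
{
    assert(passes_added_clause((struct decl_model){ true,  true  }));
    assert(passes_added_clause((struct decl_model){ true,  false }));
    assert(passes_added_clause((struct decl_model){ false, false }));
    assert(!passes_added_clause((struct decl_model){ false, true  }));
}
```

Only the non-static aggregate case fails the clause, which matches the
ChangeLog entry above.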

- glen


