[PATCH PR 21478] gimplify.c: improve initialization of sparse local arrays

Josh Conner jconner@apple.com
Mon May 9 23:04:00 GMT 2005


Currently, all local arrays that are initialized with constant values 
generate a constant template that is copied in when the scope is 
entered.  For example:

	void bar (char *);
	void foo (void)
	{
   		char bigarray[0x1000] = { 0 };

   		bar (&bigarray[0]);
	}
	> gcc test.c -c -o test.o
	> size test.o
	   text    data     bss     dec     hex filename
	   4180       0       0    4180    1054 test.o
	> gcc test.c -S -o test.s
	> cat test.s
	...
         .section        .rodata
         .type   C.0.1192, %object
         .size   C.0.1192, 4096
	C.0.1192:
         .space  4096
	...
	 mov     ip, #4096
         mov     r0, r3
         mov     r1, r2
         mov     r2, ip
         bl      memcpy
	...

The decision to use an initializer template is made in the function 
gimplify_init_constructor, where it decides to change a CONSTRUCTOR 
tree with constant initializers into an assignment, and generates the 
initializer template:

         /* If there are "lots" of initialized elements, and all of them
            are valid address constants, then the entire initializer can
            be dropped to memory, and then memcpy'd out.  */
         if (num_nonconstant_elements == 0)
	...

If we disable this conversion, and allow the tree to live on as a 
CONSTRUCTOR, we get behavior that is what we would like to see for the 
example above (i.e., memset to zero).  However, this is unoptimal for 
non-sparse arrays because we no longer ever use an initializer 
template.  There is already a test later in the function that 
determines sparseness (to decide whether to clear the array before 
assigning elements) -- by moving this test higher in the function, and 
then making the initialization template be generated conditionally 
based on the result of the test, we get good performance for both 
sparse and non-sparse arrays.

I benchmarked the attached patch on SPEC2000 CINT and saw no difference 
in the generated code for any of the bmarks.  There is a 25%+ reduction 
in code size on the FAAD2 open-source AAC library with this patch, 
which is where I discovered this issue.

Tested on arm-none-elf and i686-pc-cygwin-gcc with no regressions.

OK for the mainline?

- Josh
~~~~~~
Josh Conner

2005-05-09  Josh Conner <jconner@apple.com>
         * gimplify.c: PR 21478 - improve efficiency of sparse local
         array initialization.
Index: gimplify.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/gimplify.c,v
retrieving revision 2.126
diff -c -3 -p -r2.126 gimplify.c
*** gimplify.c  1 May 2005 23:23:33 -0000       2.126
--- gimplify.c  9 May 2005 22:36:01 -0000
*************** gimplify_init_constructor (tree *expr_p,
*** 2653,2662 ****
             break;
           }

         /* If there are "lots" of initialized elements, and all of them
            are valid address constants, then the entire initializer can
!          be dropped to memory, and then memcpy'd out.  */
!       if (num_nonconstant_elements == 0)
           {
             HOST_WIDE_INT size = int_size_in_bytes (type);
             unsigned int align;
--- 2653,2687 ----
             break;
           }

+       /* If there are "lots" of initialized elements, even discounting
+          those that are not address constants (and thus *must* be
+          computed at runtime), then partition the constructor into
+          constant and non-constant parts.  Block copy the constant
+          parts in, then generate code for the non-constant parts.  */
+       /* TODO.  There's code in cp/typeck.c to do this.  */
+
+       num_type_elements = count_type_elements (TREE_TYPE (ctor));
+
+       /* If there are "lots" of zeros, then block clear the object 
first.  */
+       if (num_type_elements - num_nonzero_elements > CLEAR_RATIO
+           && num_nonzero_elements < num_type_elements/4)
+         cleared = true;
+
+       /* ??? This bit ought not be needed.  For any element not 
present
+          in the initializer, we should simply set them to zero.  
Except
+          we'd need to *find* the elements that are not present, and 
that
+          requires trickery to avoid quadratic compile-time behavior in
+          large cases or excessive memory use in small cases.  */
+       else if (num_ctor_elements < num_type_elements)
+         cleared = true;
+
         /* If there are "lots" of initialized elements, and all of them
            are valid address constants, then the entire initializer can
!          be dropped to memory, and then memcpy'd out.  Don't do this
!          for sparse arrays, though, as it's more efficient to follow
!          the standard CONSTRUCTOR behavior of memset followed by
!          individual element initialization.  */
!       if (num_nonconstant_elements == 0 && !cleared)
           {
             HOST_WIDE_INT size = int_size_in_bytes (type);
             unsigned int align;
*************** gimplify_init_constructor (tree *expr_p,
*** 2702,2729 ****
               }
           }

-       /* If there are "lots" of initialized elements, even discounting
-          those that are not address constants (and thus *must* be
-          computed at runtime), then partition the constructor into
-          constant and non-constant parts.  Block copy the constant
-          parts in, then generate code for the non-constant parts.  */
-       /* TODO.  There's code in cp/typeck.c to do this.  */
-
-       num_type_elements = count_type_elements (TREE_TYPE (ctor));
-
-       /* If there are "lots" of zeros, then block clear the object 
first.  */
-       if (num_type_elements - num_nonzero_elements > CLEAR_RATIO
-           && num_nonzero_elements < num_type_elements/4)
-         cleared = true;
-
-       /* ??? This bit ought not be needed.  For any element not 
present
-          in the initializer, we should simply set them to zero.  
Except
-          we'd need to *find* the elements that are not present, and 
that
-          requires trickery to avoid quadratic compile-time behavior in
-          large cases or excessive memory use in small cases.  */
-       else if (num_ctor_elements < num_type_elements)
-         cleared = true;
-
         if (cleared)
           {
             /* Zap the CONSTRUCTOR element list, which simplifies this 
case.
--- 2727,2732 ----



More information about the Gcc-patches mailing list