[PATCH PR 21478] gimplify.c: improve initialization of sparse local arrays
Josh Conner
jconner@apple.com
Mon May 9 23:04:00 GMT 2005
Currently, all local arrays that are initialized with constant values
generate a constant template that is copied in when the scope is
entered. For example:
void bar (char *);
void foo (void)
{
char bigarray[0x1000] = { 0 };
bar (&bigarray[0]);
}
> gcc test.c -c -o test.o
> size test.o
text data bss dec hex filename
4180 0 0 4180 1054 test.o
> gcc test.c -S -o test.s
> cat test.s
...
.section .rodata
.type C.0.1192, %object
.size C.0.1192, 4096
C.0.1192:
.space 4096
...
mov ip, #4096
mov r0, r3
mov r1, r2
mov r2, ip
bl memcpy
...
The decision to use an initializer template is made in the function
gimplify_init_constructor, where it decides to change a CONSTRUCTOR
tree with constant initializers into an assignment, and generates the
initializer template:
/* If there are "lots" of initialized elements, and all of them
are valid address constants, then the entire initializer can
be dropped to memory, and then memcpy'd out. */
if (num_nonconstant_elements == 0)
...
If we disable this conversion, and allow the tree to live on as a
CONSTRUCTOR, we get behavior that is what we would like to see for the
example above (i.e., memset to zero). However, this is unoptimal for
non-sparse arrays because we no longer ever use an initializer
template. There is already a test later in the function that
determines sparseness (to decide whether to clear the array before
assigning elements) -- by moving this test higher in the function, and
then making the initialization template be generated conditionally
based on the result of the test, we get good performance for both
sparse and non-sparse arrays.
I benchmarked the attached patch on SPEC2000 CINT and saw no difference
in the generated code for any of the bmarks. There is a 25%+ reduction
in code size on the FAAD2 open-source AAC library with this patch,
which is where I discovered this issue.
Tested on arm-none-elf and i686-pc-cygwin-gcc with no regressions.
OK for the mainline?
- Josh
~~~~~~
Josh Conner
2005-05-09 Josh Conner <jconner@apple.com>
* gimplify.c: PR 21478 - improve efficiency of sparse local
array initialization.
Index: gimplify.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/gimplify.c,v
retrieving revision 2.126
diff -c -3 -p -r2.126 gimplify.c
*** gimplify.c 1 May 2005 23:23:33 -0000 2.126
--- gimplify.c 9 May 2005 22:36:01 -0000
*************** gimplify_init_constructor (tree *expr_p,
*** 2653,2662 ****
break;
}
/* If there are "lots" of initialized elements, and all of them
are valid address constants, then the entire initializer can
! be dropped to memory, and then memcpy'd out. */
! if (num_nonconstant_elements == 0)
{
HOST_WIDE_INT size = int_size_in_bytes (type);
unsigned int align;
--- 2653,2687 ----
break;
}
+ /* If there are "lots" of initialized elements, even discounting
+ those that are not address constants (and thus *must* be
+ computed at runtime), then partition the constructor into
+ constant and non-constant parts. Block copy the constant
+ parts in, then generate code for the non-constant parts. */
+ /* TODO. There's code in cp/typeck.c to do this. */
+
+ num_type_elements = count_type_elements (TREE_TYPE (ctor));
+
+ /* If there are "lots" of zeros, then block clear the object
first. */
+ if (num_type_elements - num_nonzero_elements > CLEAR_RATIO
+ && num_nonzero_elements < num_type_elements/4)
+ cleared = true;
+
+ /* ??? This bit ought not be needed. For any element not
present
+ in the initializer, we should simply set them to zero.
Except
+ we'd need to *find* the elements that are not present, and
that
+ requires trickery to avoid quadratic compile-time behavior in
+ large cases or excessive memory use in small cases. */
+ else if (num_ctor_elements < num_type_elements)
+ cleared = true;
+
/* If there are "lots" of initialized elements, and all of them
are valid address constants, then the entire initializer can
! be dropped to memory, and then memcpy'd out. Don't do this
! for sparse arrays, though, as it's more efficient to follow
! the standard CONSTRUCTOR behavior of memset followed by
! individual element initialization. */
! if (num_nonconstant_elements == 0 && !cleared)
{
HOST_WIDE_INT size = int_size_in_bytes (type);
unsigned int align;
*************** gimplify_init_constructor (tree *expr_p,
*** 2702,2729 ****
}
}
- /* If there are "lots" of initialized elements, even discounting
- those that are not address constants (and thus *must* be
- computed at runtime), then partition the constructor into
- constant and non-constant parts. Block copy the constant
- parts in, then generate code for the non-constant parts. */
- /* TODO. There's code in cp/typeck.c to do this. */
-
- num_type_elements = count_type_elements (TREE_TYPE (ctor));
-
- /* If there are "lots" of zeros, then block clear the object
first. */
- if (num_type_elements - num_nonzero_elements > CLEAR_RATIO
- && num_nonzero_elements < num_type_elements/4)
- cleared = true;
-
- /* ??? This bit ought not be needed. For any element not
present
- in the initializer, we should simply set them to zero.
Except
- we'd need to *find* the elements that are not present, and
that
- requires trickery to avoid quadratic compile-time behavior in
- large cases or excessive memory use in small cases. */
- else if (num_ctor_elements < num_type_elements)
- cleared = true;
-
if (cleared)
{
/* Zap the CONSTRUCTOR element list, which simplifies this
case.
--- 2727,2732 ----
More information about the Gcc-patches
mailing list