This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Help with generating 'memset' for loop initialization


On Tue, Dec 20, 2011 at 2:23 PM, Rohit Arul Raj <rohitarulraj@gmail.com> wrote:
> Hello All,
>
> With the code given below, I expected the ppc compiler (e500mc v4.6.2)
> to generate a 'memset' zero call for the loop initialization (at '-O3'),
> but it generates a loop.
>
> Case:1
>
> int a[18], b[18];
> foo () {
>   int i;
>
>   for (i=0; i < 18; i++)
>     a[i] = 0;
> }
>
> With the '-ftree-loop-distribute-patterns' flag, however, the test
> case shown below (taken from the gcc doc) does make the compiler
> generate 'memset' zero.
>
> Case:2
>
> int a[18], b[18];
> foo () {
>   int i;
>
>   for (i=0; i < 18; i++) {
>     a[i] = 0;            /* (A) */
>     b[i] = a[i] + i;     /* (B) */
>   }
> }
>
> Here statements (A) and (B) are split into two loops, and for the 1st
> loop the compiler generates a 'memset' zero call. Isn't the same
> optimization supposed to happen with case (1)?
>
> Also, for statement (A) in case (2): when the loop runs fewer than 18
> iterations the compiler unrolls it, and when it runs 18 or more a
> 'memset' zero call is generated.
>
> Looking at 'gcc/tree-loop-distribution.c' file,
>
> static int
> ldist_gen (struct loop *loop, struct graph *rdg,
>            VEC (int, heap) *starting_vertices)
> {
>   ...
>   BITMAP_FREE (processed);
>   nbp = VEC_length (bitmap, partitions);
>
>   if (nbp <= 1
>       || partition_contains_all_rw (rdg, partitions))
>     goto ldist_done;    /* (Z) */
>
>   if (dump_file && (dump_flags & TDF_DETAILS))
>     dump_rdg_partitions (dump_file, partitions);
>
>   FOR_EACH_VEC_ELT (bitmap, partitions, i, partition)
>     /* (Y): code for generating built-in 'memset' is called from here.  */
>     if (!generate_code_for_partition (loop, partition, i < nbp - 1))
>       goto ldist_done;
>
>   rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);
>   update_ssa (TODO_update_ssa_only_virtuals | TODO_update_ssa);
>
>  ldist_done:
>
>   BITMAP_FREE (remaining_stmts);
>
>   .........
>   return nbp;
> }
>
> Per statement (Z), if the number of partitions (distributed loops) is
> <= 1, the code at (Y) that generates the built-in call is not executed.
>
> Would it be a good solution to relax this conditional check so that a
> single loop (which is not split) is handled as well? Or is there
> another place/pass where we can implement this?

Well, at least we do not want to create any code if the builtin code
generation would fail.

Richard.

> Regards,
> Rohit
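
For reference, the transformation discussed in this thread can be sketched by hand in C. This is only an illustration of what "distributing" the loop and replacing the zeroing part with 'memset' means at the source level, not a statement of what any particular GCC version emits. The function names (foo_loop, foo_memset, bar_fused, bar_distributed, check_equivalence) and the macro N are hypothetical and do not appear in the original message.

```c
#include <assert.h>
#include <string.h>

#define N 18            /* array length used in the posted test cases */

int a[N], b[N];

/* Case 1 as posted: a plain zero-initialization loop. */
void foo_loop(void)
{
    int i;
    for (i = 0; i < N; i++)
        a[i] = 0;
}

/* Hand-written equivalent of Case 1: all-bits-zero is the int value 0. */
void foo_memset(void)
{
    memset(a, 0, sizeof a);
}

/* Case 2 as posted: statements (A) and (B) share one loop. */
void bar_fused(void)
{
    int i;
    for (i = 0; i < N; i++) {
        a[i] = 0;           /* (A) */
        b[i] = a[i] + i;    /* (B) */
    }
}

/* Case 2 distributed by hand: (A) becomes a memset, (B) keeps its own loop. */
void bar_distributed(void)
{
    int i;
    memset(a, 0, sizeof a);
    for (i = 0; i < N; i++)
        b[i] = a[i] + i;
}

/* Checks that the hand-distributed forms produce the same arrays. */
void check_equivalence(void)
{
    int expect_b[N];
    int i;

    /* Case 1: both variants must clear every element of a. */
    memset(a, 0xff, sizeof a);
    foo_loop();
    for (i = 0; i < N; i++)
        assert(a[i] == 0);
    memset(a, 0xff, sizeof a);
    foo_memset();
    for (i = 0; i < N; i++)
        assert(a[i] == 0);

    /* Case 2: record the fused result, scribble, then compare. */
    bar_fused();
    memcpy(expect_b, b, sizeof b);
    memset(a, 0xff, sizeof a);
    memset(b, 0xff, sizeof b);
    bar_distributed();
    for (i = 0; i < N; i++) {
        assert(a[i] == 0);
        assert(b[i] == expect_b[i]);
    }
}
```

Calling check_equivalence() from a small main() exercises both cases. Comparing the assembly of foo_loop at '-O3' with and without '-ftree-loop-distribute-patterns' is one way to see whether a given compiler build performs the replacement for the single-statement loop.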

