[PATCH] More aggressive __builtin_memcpy optimizations (PR middle-end/29215)

H.J. Lu hjl.tools@gmail.com
Fri Nov 21 08:35:00 GMT 2008


On Thu, Nov 20, 2008 at 7:54 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Nov 19, 2008 at 06:23:28PM +0100, Richard Guenther wrote:
>> The patch is ok.
>
> Before checking in, I've bootstrapped/regtested it on ia64-linux
> in addition to the usual x86_64-linux bootstrap/regtest.
>
> Unfortunately this revealed a few issues:
>
> 1) on ia64 allocatable_function_3.f90 ICEd at -O1 and higher, because
>   memcpy from a TREE_STATIC readonly VAR_DECL with a CONSTRUCTOR
>   initializer was optimized into a store, but TREE_PURPOSE in the
>   ctor elements weren't filled up, which upsets e.g.
>   gimplify_init_ctor_eval or categorize_ctor_elements_1.
>   When this wasn't optimized, Fortran FE managed to survive,
>   because none of these routines were called.  Either we could teach
>   these what to do if purpose isn't present, or, as done in the patch
>   below, fill in TREE_PURPOSE like the C/C++ FEs do.
> 2) when looking why that test didn't fail on x86_64-linux as well,
>   I found out the
>      if (!var_decl_component_p (srcvar)
>          /* Accept: memcpy (*char_var, "test", 1); that simplify
>             to char_var='t';  */
>          || is_gimple_min_invariant (srcvar)
>          || readonly_data_expr (src))
>        return NULL_TREE;
>   lines (in current SVN, my previous patch just moved these around).
>   On ia64 src was actually emitted into .sdata section, even when
>   read-only and thus memcpy was optimized, on x86_64 where src
>   lived in .rodata it wasn't.  This logic seems to be very much
>   broken, I mean if something is var_decl_component_p, then it
>   won't be gimple invariant, and whether it is in readonly data
>   doesn't matter too.  return NULL_TREE here means "don't optimize".
>   I've tried changing this into !var_decl_component_p () &&
>   !is_gimple_min_invariant () && !readonly_data_expr (),
>   but that created invalid GIMPLE, assigning STRING_CST into something.
>   So the patch below just removes those, optimization of the
>   memcpy (*char_var, "test", 1) can be looked at later, but
>   would certainly need extra code.
> 3) the new testcase failed on ia64, because memcpy wasn't optimized there.
>   memcpy is optimized only if the other pointer is known to be sufficiently
>   aligned (in the testcase it wasn't) or if unaligned access is allowed
>   and not slow (true on x86_64, false on ia64).  I've adjusted
>   the testcase so that gcc knows the buffer is aligned.
> 4) array_memcpy_3.f90 test failed to match on x86_64-linux after the
>   2) changes, memcpy is now optimized into a store, the pattern has
>   been adjusted not to care if memcpy has been optimized or not.
>
> Ok for trunk (together with the earlier patch)?
>
> 2008-11-20  Jakub Jelinek  <jakub@redhat.com>
>
>        PR middle-end/29215
>        * builtins.c (fold_builtin_memory_op): Remove
>        is_gimple_min_invariant and readonly_data_expr src check.
>
>        * trans-array.c (trans_array_constructor_value,
>        gfc_build_constant_array_constructor): Fill in TREE_PURPOSE.
>
>        * gfortran.dg/array_memcpy_3.f90: Adjust pattern to match even
>        memcpy optimized into ref-all store.
>        * gcc.dg/pr29215.c: Adjust so that all stores are known to be
>        aligned.
>

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38208


-- 
H.J.



More information about the Gcc-patches mailing list