[RFC, WIP] tree-ssa-strlen optimization pass
Richard Guenther
richard.guenther@gmail.com
Mon Sep 5 08:55:00 GMT 2011
On Fri, Sep 2, 2011 at 6:50 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> The following patch contains a WIP implementation of a new pass,
> which attempts to track C string lengths and perform various
> optimizations using that information.
>
> Optimizations it currently performs:
> 1) optimizing away strlen calls, if the string length is known
> at that point already (either constant or some expression)
> 2) replacement of strcpy calls with memcpy if the string length
> of the source is known
> 3) replacement of strcat with either memcpy (if source length
> is known) or strcpy (if not), if the destination length
> before the call is known
> - over the years I've seen way too much spaghetti code
> doing many strcat calls to the same string one after another
> During bootstrap/regtest (excluding the newly added testcases)
> 1) hits 184 times on x86_64-linux, 182 tmes on i686-linux,
> 2) hits 158 times on x86_64 and 159 times on i686,
> 3) into memcpy hits 33 times and 3) into strcpy 2 times.
>
> Example from gcc sources that is optimized:
> filename = (char *) alloca (strlen (module_name) + strlen (MODULE_EXTENSION)
> + 1);
> strcpy (filename, module_name);
> strcat (filename, MODULE_EXTENSION);
> which can be optimized into
> filename = (char *) alloca ((tmp1 = strlen (module_name)) + (tmp2 = strlen (MODULE_EXTENSION))
> + 1);
> memcpy (filename, module_name, tmp1 + 1);
> memcpy (filename + tmp1, MODULE_EXTENSION, tmp2 + 1);
>
> Some further optimizations I'm currently considering for the pass:
> - handle *p = 0; stores like memcpy (p, "", 1)
> - lame coders often do *p = 0; strcat (p, str1); strcat (p, str2);
> - if a memcpy call (either original or strcpy/strcat transformed into it)
> copies known src string length + 1 and the immediately following
> .MEM use (or possibly non-immediately if there are only non-aliasing
> ones?) is a strcpy/(or to be optimized strcat or non-zero length memcpy)
> call which overwrites the final '\0', decrease the memcpy size by one
> - if source length for strcpy isn't known and the destination length
> is needed for optimizations, for -fhosted, glibc and with stpcpy
> compatible prototype in headers consider transforming that strcpy
> into stpcpy and use result - dst as string length (and see whether
> following optimizations are able to optimize series of unknown
> source length strcpy+strcat into a chain of stpcpy calls)
> - similarly if string length of strcat destination isn't known but is
> helpful for optimization consider optimizing strcat into strlen+stpcpy
>
> Any comments related to the implementation, or examples of real-world
> lame C string length code sequences that would be nice to optimize
> will be greatly appreciated.
>
> The patch has been bootstrapped/regtested on x86_64-linux and i686-linux,
> no regressions, but I'm not proposing it for trunk yet (would like to
> implement at least a few of the above mentioned optimizations), just am
> posting it early as a pass preview.
>
> 2011-09-02 Jakub Jelinek <jakub@redhat.com>
>
> * common.opt: Add -ftree-strlen option.
Maybe sth more generic? -foptimize-string-ops? Eventually guard
the existing string op foldings with that flag as well.
> * Makefile.in (OBJS): Add tree-ssa-strlen.o.
> (tree-sssa-strlen.o): Add dependencies.
> * opts.c (default_options_table): Enable -ftree-strlen
> by default at -O2 if not -Os.
> * passes.c (init_optimization_passes): Add pass_strlen
> after pass_object_sizes.
> * timevar.def (TV_TREE_STRLEN): New timevar.
> * tree-pass.h (pass_strlen): Declare.
> * tree-ssa-strlen.c: New file.
>
> * gcc.dg/strlenopt-1.c: New test.
> * gcc.dg/strlenopt-2.c: New test.
> * gcc.dg/strlenopt-3.c: New test.
> * gcc.dg/strlenopt.h: New file.
>
> --- gcc/common.opt.jj 2011-08-26 18:41:44.000000000 +0200
> +++ gcc/common.opt 2011-08-30 10:57:36.000000000 +0200
> @@ -1953,6 +1953,10 @@ ftree-fre
> Common Report Var(flag_tree_fre) Optimization
> Enable Full Redundancy Elimination (FRE) on trees
>
> +ftree-strlen
> +Common Report Var(flag_tree_strlen) Optimization
> +Enable string length optimizations on trees
> +
> ftree-loop-distribution
> Common Report Var(flag_tree_loop_distribution) Optimization
> Enable loop distribution on trees
> --- gcc/Makefile.in.jj 2011-08-26 18:41:44.000000000 +0200
> +++ gcc/Makefile.in 2011-09-02 15:43:02.000000000 +0200
> @@ -1472,6 +1472,7 @@ OBJS = \
> tree-ssa-reassoc.o \
> tree-ssa-sccvn.o \
> tree-ssa-sink.o \
> + tree-ssa-strlen.o \
> tree-ssa-structalias.o \
> tree-ssa-ter.o \
> tree-ssa-threadedge.o \
> @@ -3157,6 +3158,9 @@ tree-ssa-ccp.o : tree-ssa-ccp.c $(TREE_F
> $(TREE_DUMP_H) $(BASIC_BLOCK_H) $(TREE_PASS_H) langhooks.h \
> tree-ssa-propagate.h value-prof.h $(FLAGS_H) $(TARGET_H) $(DIAGNOSTIC_CORE_H) \
> $(DBGCNT_H) tree-pretty-print.h gimple-pretty-print.h gimple-fold.h
> +tree-ssa-strlen.o : tree-ssa-strlen.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
> + $(TREE_FLOW_H) $(TREE_PASS_H) domwalk.h alloc-pool.h tree-ssa-propagate.h \
> + gimple-pretty-print.h
> tree-sra.o : tree-sra.c $(CONFIG_H) $(SYSTEM_H) coretypes.h alloc-pool.h \
> $(TM_H) $(TREE_H) $(GIMPLE_H) $(CGRAPH_H) $(TREE_FLOW_H) \
> $(IPA_PROP_H) $(DIAGNOSTIC_H) statistics.h $(TREE_DUMP_H) $(TIMEVAR_H) \
> --- gcc/opts.c.jj 2011-06-30 17:58:03.000000000 +0200
> +++ gcc/opts.c 2011-09-02 15:53:06.000000000 +0200
> @@ -484,6 +484,7 @@ static const struct default_options defa
> { OPT_LEVELS_2_PLUS, OPT_falign_jumps, NULL, 1 },
> { OPT_LEVELS_2_PLUS, OPT_falign_labels, NULL, 1 },
> { OPT_LEVELS_2_PLUS, OPT_falign_functions, NULL, 1 },
> + { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_ftree_strlen, NULL, 1 },
Why not -Os? Doesn't it remove strlen calls?
> /* -O3 optimizations. */
> { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
> --- gcc/passes.c.jj 2011-07-12 07:58:48.000000000 +0200
> +++ gcc/passes.c 2011-08-31 12:03:29.000000000 +0200
> @@ -1,7 +1,7 @@
> /* Top level of GCC compilers (cc1, cc1plus, etc.)
> Copyright (C) 1987, 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
> - 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
> - Free Software Foundation, Inc.
> + 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
> + 2011 Free Software Foundation, Inc.
>
> This file is part of GCC.
>
> @@ -1321,6 +1321,7 @@ init_optimization_passes (void)
> NEXT_PASS (pass_forwprop);
> NEXT_PASS (pass_phiopt);
> NEXT_PASS (pass_object_sizes);
> + NEXT_PASS (pass_strlen);
> NEXT_PASS (pass_ccp);
> NEXT_PASS (pass_copy_prop);
> NEXT_PASS (pass_cse_sincos);
> --- gcc/timevar.def.jj 2011-05-04 10:14:08.000000000 +0200
> +++ gcc/timevar.def 2011-08-30 10:54:32.000000000 +0200
> @@ -1,7 +1,7 @@
> /* This file contains the definitions for timing variables used to
> measure run-time performance of the compiler.
> Copyright (C) 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008,
> - 2009, 2010
> + 2009, 2010, 2011
> Free Software Foundation, Inc.
> Contributed by Alex Samuel <samuel@codesourcery.com>
>
> @@ -183,6 +183,7 @@ DEFTIMEVAR (TV_TREE_COPY_RENAME , "
> DEFTIMEVAR (TV_TREE_SSA_VERIFY , "tree SSA verifier")
> DEFTIMEVAR (TV_TREE_STMT_VERIFY , "tree STMT verifier")
> DEFTIMEVAR (TV_TREE_SWITCH_CONVERSION, "tree switch initialization conversion")
> +DEFTIMEVAR (TV_TREE_STRLEN , "tree strlen optimization")
> DEFTIMEVAR (TV_CGRAPH_VERIFY , "callgraph verifier")
> DEFTIMEVAR (TV_DOM_FRONTIERS , "dominance frontiers")
> DEFTIMEVAR (TV_DOMINANCE , "dominance computation")
> --- gcc/tree-pass.h.jj 2011-07-08 15:09:38.000000000 +0200
> +++ gcc/tree-pass.h 2011-08-30 10:55:27.000000000 +0200
> @@ -412,6 +412,7 @@ extern struct gimple_opt_pass pass_diagn
> extern struct gimple_opt_pass pass_expand_omp;
> extern struct gimple_opt_pass pass_expand_omp_ssa;
> extern struct gimple_opt_pass pass_object_sizes;
> +extern struct gimple_opt_pass pass_strlen;
> extern struct gimple_opt_pass pass_fold_builtins;
> extern struct gimple_opt_pass pass_stdarg;
> extern struct gimple_opt_pass pass_early_warn_uninitialized;
> --- gcc/tree-ssa-strlen.c.jj 2011-08-30 10:56:51.000000000 +0200
> +++ gcc/tree-ssa-strlen.c 2011-09-02 15:50:55.000000000 +0200
> @@ -0,0 +1,1288 @@
> +/* String length optimization
> + Copyright (C) 2011 Free Software Foundation, Inc.
> + Contributed by Jakub Jelinek <jakub@redhat.com>
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +GNU General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3. If not see
> +<http://www.gnu.org/licenses/>. */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tree-flow.h"
> +#include "tree-pass.h"
> +#include "domwalk.h"
> +#include "alloc-pool.h"
> +#include "tree-ssa-propagate.h"
> +#include "gimple-pretty-print.h"
> +
> +/* Array indexed by SSA_NAME_VERSION. 0 means unknown, positive value
> + is an index into strinfo vector, negative value stands for
> + string length of a string literal (~strlen). */
> +static int *ssa_ver_to_stridx;
> +
> +/* Number of currently active string indexes plus one. */
> +static int max_stridx;
USe a VEC?
> +/* String information record. */
> +typedef struct strinfo_struct
> +{
> + /* String length of this string. */
> + tree length;
> + /* Any of the corresponding pointers for querying alias oracle. */
> + tree ptr;
> + /* Reference count. Any changes to strinfo entry possibly shared
> + with dominating basic blocks need unshare_strinfo first, except
> + for dont_invalidate which affects only the immediately next
> + maybe_invalidate. */
> + int refcount;
> + /* Copy of index. get_strinfo (si->idx) should return si; */
> + int idx;
> + /* These 3 fields are for chaining related string pointers together.
> + E.g. for
> + bl = strlen (b); dl = strlen (d); strcpy (a, b); c = a + bl;
> + strcpy (c, d); e = c + dl;
> + strinfo(a) -> strinfo(c) -> strinfo(e)
> + All have ->first field equal to strinfo(a)->idx and are doubly
> + chained through prev/next fields. The later strinfos are required
> + to point into the same string with zero or more bytes after
> + the previous pointer and all bytes in between the two pointers
> + must be non-zero. Functions like strcpy or memcpy are supposed
> + to adjust all previous strinfo lengths, but not following strinfo
> + lengths (those are uncertain, usually invalidated during
> + maybe_invalidate, except when the alias oracle knows better).
> + Functions like strcat on the other side adjust the whole
> + related strinfo chain.
> + They are updated lazily, so to use the chain the same first fields
> + and si->prev->next == si->idx needs to be verified. */
> + int first;
> + int next;
> + int prev;
> + /* A flag for the next maybe_invalidate that this strinfo shouldn't
> + be invalidated. Always cleared by maybe_invalidate. */
> + bool dont_invalidate;
> +} *strinfo;
> +DEF_VEC_P(strinfo);
> +DEF_VEC_ALLOC_P(strinfo,heap);
> +
> +/* Pool for allocating strinfo_struct entries. */
> +static alloc_pool strinfo_pool;
> +
> +/* Vector mapping positive string indexes to strinfo, for the
> + current basic block. The first pointer in the vector is special,
> + it is either NULL, meaning the vector isn't shared, or it is
> + a basic block pointer to the owner basic_block if shared.
> + If some other bb wants to modify the vector, the vector needs
> + to be unshared first, and only the owner bb is supposed to free it. */
> +static VEC(strinfo, heap) *stridx_to_strinfo;
> +
> +struct stridxlist
> +{
> + struct stridxlist *next;
> + HOST_WIDE_INT offset;
> + int idx;
> +};
> +
> +struct decl_stridxlist_map
> +{
> + struct tree_map_base base;
> + struct stridxlist list;
> +};
> +
> +/* Hash table for mapping decls to a chained list of offset -> idx
> + mappings. */
> +static htab_t decl_to_stridxlist_htab;
> +
> +/* Hash a from tree in a decl_stridxlist_map. */
> +
> +static unsigned int
> +decl_to_stridxlist_hash (const void *item)
> +{
> + return DECL_UID (((const struct decl_stridxlist_map *) item)->base.from);
> +}
> +
> +/* Free a decl_stridxlist_map. Callback for htab_delete. */
> +
> +static void
> +decl_to_stridxlist_free (void *item)
> +{
> + struct stridxlist *next;
> + struct stridxlist *list = ((struct decl_stridxlist_map *) item)->list.next;
> +
> + while (list)
> + {
> + next = list->next;
> + XDELETE (list);
> + list = next;
> + }
> + XDELETE (item);
Maybe use an obstack or alloc-pool dependent on re-use?
> +}
> +
> +/* Return string index for EXP. */
> +
> +static int
> +get_stridx (tree exp)
> +{
> + tree l;
> +
> + if (TREE_CODE (exp) == SSA_NAME)
> + return ssa_ver_to_stridx[SSA_NAME_VERSION (exp)];
> +
> + if (TREE_CODE (exp) == ADDR_EXPR && decl_to_stridxlist_htab)
> + {
> + HOST_WIDE_INT off;
> + tree base = get_addr_base_and_unit_offset (TREE_OPERAND (exp, 0),
> + &off);
> + if (base && DECL_P (base))
> + {
> + struct decl_stridxlist_map ent, *e;
> + ent.base.from = base;
> + e = (struct decl_stridxlist_map *)
> + htab_find_with_hash (decl_to_stridxlist_htab, &ent,
> + DECL_UID (base));
> + if (e)
> + {
> + struct stridxlist *list = &e->list;
> + do
> + {
> + if (list->offset == off)
> + return list->idx;
> + list = list->next;
> + }
> + while (list);
> + }
> + }
> + }
> +
> + l = c_strlen (exp, 0);
> + if (l != NULL_TREE
> + && host_integerp (l, 1))
> + {
> + unsigned HOST_WIDE_INT len = tree_low_cst (l, 1);
> + if (len == (unsigned int) len
> + && (int) len >= 0)
> + return ~(int) len;
> + }
> + return 0;
> +}
> +
> +/* Return true if strinfo vector is shared with the immediate dominator. */
> +
> +static inline bool
> +strinfo_shared (void)
> +{
> + return VEC_length (strinfo, stridx_to_strinfo)
> + && VEC_index (strinfo, stridx_to_strinfo, 0) != NULL;
> +}
> +
> +/* Unshare strinfo vector that is shared with the immediate dominator. */
> +
> +static void
> +unshare_strinfo_vec (void)
> +{
> + strinfo si;
> + unsigned int i = 0;
> +
> + gcc_assert (strinfo_shared ());
> + stridx_to_strinfo = VEC_copy (strinfo, heap, stridx_to_strinfo);
> + for (i = 1; VEC_iterate (strinfo, stridx_to_strinfo, i, si); ++i)
> + if (si != NULL)
> + si->refcount++;
> + VEC_replace (strinfo, stridx_to_strinfo, 0, NULL);
> +}
> +
> +/* Attempt to create a string index for ADDR_EXPR exp.
> + Return a pointer to the location where the string index can
> + be stored (if 0) or is stored, or NULL if this can't be tracked. */
> +
> +static int *
> +addr_stridxptr (tree exp)
> +{
> + void **slot;
> + struct decl_stridxlist_map ent;
> + struct stridxlist *list;
> + HOST_WIDE_INT off;
> +
> + tree base = get_addr_base_and_unit_offset (TREE_OPERAND (exp, 0), &off);
> + if (base == NULL_TREE || !DECL_P (base))
> + return NULL;
> +
> + if (decl_to_stridxlist_htab == NULL)
> + decl_to_stridxlist_htab
> + = htab_create (64, decl_to_stridxlist_hash, tree_map_base_eq,
> + decl_to_stridxlist_free);
> + ent.base.from = base;
> + slot = htab_find_slot_with_hash (decl_to_stridxlist_htab, &ent,
> + DECL_UID (base), INSERT);
> + if (*slot)
> + {
> + int i;
> + list = &((struct decl_stridxlist_map *)*slot)->list;
> + for (i = 0; i < 16; i++)
> + {
> + if (list->offset == off)
> + return &list->idx;
> + if (list->next == NULL)
> + break;
> + }
> + if (i == 16)
> + return NULL;
> + list->next = XNEW (struct stridxlist);
> + list = list->next;
> + }
> + else
> + {
> + struct decl_stridxlist_map *e = XNEW (struct decl_stridxlist_map);
> + e->base.from = base;
> + *slot = (void *) e;
> + list = &e->list;
> + }
> + list->next = NULL;
> + list->offset = off;
> + list->idx = 0;
> + return &list->idx;
> +}
> +
> +/* Create a new string index, or return 0 if reached limit. */
> +
> +static int
> +new_stridx (tree exp)
> +{
> + int idx;
> + if (max_stridx == 1000)
I suppose make this a #define or --param
> + return 0;
> + if (TREE_CODE (exp) == SSA_NAME)
> + {
> + idx = max_stridx++;
> + ssa_ver_to_stridx[SSA_NAME_VERSION (exp)] = idx;
> + return idx;
> + }
> + if (TREE_CODE (exp) == ADDR_EXPR)
> + {
> + int *pidx = addr_stridxptr (exp);
> + if (pidx != NULL)
> + {
> + gcc_assert (*pidx == 0);
> + *pidx = max_stridx++;
> + return *pidx;
> + }
> + }
> + return 0;
> +}
> +
> +/* Create a new strinfo. */
> +
> +static strinfo
> +new_strinfo (tree ptr, int idx, tree length)
> +{
> + strinfo si = (strinfo) pool_alloc (strinfo_pool);
> + si->length = length;
> + si->ptr = ptr;
> + si->refcount = 1;
> + si->idx = idx;
> + si->first = 0;
> + si->prev = 0;
> + si->next = 0;
> + si->dont_invalidate = false;
> + return si;
> +}
> +
> +/* Decrease strinfo refcount and free it if not referenced anymore. */
> +
> +static inline void
> +free_strinfo (strinfo si)
> +{
> + if (si && --si->refcount == 0)
> + pool_free (strinfo_pool, si);
> +}
> +
> +/* Return strinfo vector entry IDX. */
> +
> +static inline strinfo
> +get_strinfo (int idx)
> +{
> + if (VEC_length (strinfo, stridx_to_strinfo) <= (unsigned int) idx)
> + return NULL;
> + return VEC_index (strinfo, stridx_to_strinfo, idx);
> +}
> +
> +/* Set strinfo in the vector entry IDX to SI. */
> +
> +static inline void
> +set_strinfo (int idx, strinfo si)
> +{
> + if (VEC_length (strinfo, stridx_to_strinfo) && VEC_index (strinfo, stridx_to_strinfo, 0))
> + unshare_strinfo_vec ();
> + if (VEC_length (strinfo, stridx_to_strinfo) <= (unsigned int) idx)
> + VEC_safe_grow_cleared (strinfo, heap, stridx_to_strinfo, idx + 1);
> + VEC_replace (strinfo, stridx_to_strinfo, idx, si);
> +}
> +
> +/* Invalidate string length information for strings whose length
> + might change due to stores in stmt. */
> +
> +static bool
> +maybe_invalidate (gimple stmt)
> +{
> + strinfo si;
> + unsigned int i;
> + bool nonempty = false;
> +
> + for (i = 1; VEC_iterate (strinfo, stridx_to_strinfo, i, si); ++i)
> + if (si != NULL)
> + {
> + if (!si->dont_invalidate)
> + {
> + ao_ref r;
> + ao_ref_init_from_ptr_and_size (&r, si->ptr, NULL_TREE);
> + if (stmt_may_clobber_ref_p_1 (stmt, &r))
> + {
> + set_strinfo (i, NULL);
> + free_strinfo (si);
> + continue;
> + }
> + }
> + si->dont_invalidate = false;
> + nonempty = true;
> + }
> + return nonempty;
> +}
> +
> +/* Unshare strinfo record SI, if it has recount > 1 or
> + if stridx_to_strinfo vector is shared with some other
> + bbs. */
> +
> +static strinfo
> +unshare_strinfo (strinfo si)
> +{
> + strinfo nsi;
> +
> + if (si->refcount == 1 && !strinfo_shared ())
> + return si;
> +
> + nsi = new_strinfo (si->ptr, si->idx, si->length);
> + nsi->first = si->first;
> + nsi->prev = si->prev;
> + nsi->next = si->next;
> + set_strinfo (si->idx, nsi);
> + free_strinfo (si);
> + return nsi;
> +}
> +
> +/* Return first strinfo in the related strinfo chain
> + if all strinfos in between belong to the chain, otherwise
> + NULL. */
> +
> +static strinfo
> +verify_related_strinfos (strinfo origsi)
> +{
> + strinfo si = origsi, psi;
> +
> + if (origsi->first == 0)
> + return NULL;
> + for (; si->prev; si = psi)
> + {
> + if (si->first != origsi->first)
> + return NULL;
> + psi = get_strinfo (si->prev);
> + if (psi == NULL)
> + return NULL;
> + if (psi->next != si->idx)
> + return NULL;
> + }
> + if (si->idx != si->first)
> + return NULL;
> + return si;
> +}
> +
> +/* Note that PTR, a pointer SSA_NAME initialized in the current stmt, points
> + to a zero-length string and if possible chain it to a related strinfo
> + chain whose part is or might be CHAINSI. */
> +
> +static strinfo
> +zero_length_string (tree ptr, strinfo chainsi)
> +{
> + strinfo si;
> + int idx;
> + gcc_checking_assert (TREE_CODE (ptr) == SSA_NAME
> + && get_stridx (ptr) == 0);
> +
> + if (chainsi != NULL)
> + {
> + if (verify_related_strinfos (chainsi))
> + {
> + for (; chainsi->next; chainsi = si)
> + {
> + si = get_strinfo (chainsi->next);
> + if (si == NULL
> + || si->first != chainsi->first
> + || si->prev != chainsi->idx)
> + break;
> + }
> + if (integer_zerop (chainsi->length))
> + {
> + if (chainsi->next)
> + {
> + chainsi = unshare_strinfo (chainsi);
> + chainsi->next = 0;
> + }
> + ssa_ver_to_stridx [SSA_NAME_VERSION (ptr)] = chainsi->idx;
> + return chainsi;
> + }
> + }
> + else if (chainsi->first || chainsi->prev || chainsi->next)
> + {
> + chainsi = unshare_strinfo (chainsi);
> + chainsi->first = 0;
> + chainsi->prev = 0;
> + chainsi->next = 0;
> + }
> + }
> + idx = new_stridx (ptr);
> + if (idx == 0)
> + return NULL;
> + si = new_strinfo (ptr, idx, build_int_cst (size_type_node, 0));
> + set_strinfo (idx, si);
> + if (chainsi != NULL)
> + {
> + chainsi = unshare_strinfo (chainsi);
> + if (chainsi->first == 0)
> + chainsi->first = chainsi->idx;
> + chainsi->next = idx;
> + si->prev = chainsi->idx;
> + si->first = chainsi->first;
> + }
> + return si;
> +}
> +
> +/* For strinfo ORIGSI whose length has been just updated
> + update also related strinfo lengths (add ADJ to each,
> + but don't adjust ORIGSI). */
> +
> +static void
> +adjust_related_strinfos (location_t loc, strinfo origsi, tree adj)
> +{
> + strinfo si = verify_related_strinfos (origsi);
> +
> + if (si == NULL)
> + return;
> +
> + while (1)
> + {
> + strinfo nsi;
> +
> + if (si != origsi)
> + {
> + tree tem;
> +
> + si = unshare_strinfo (si);
> + tem = fold_convert_loc (loc, TREE_TYPE (si->length), adj);
> + si->length = fold_build2_loc (loc, PLUS_EXPR,
> + TREE_TYPE (si->length), si->length,
> + tem);
> + si->dont_invalidate = true;
> + }
> + if (si->next == 0)
> + return;
> + nsi = get_strinfo (si->next);
> + if (nsi == NULL
> + || nsi->first != si->first
> + || nsi->prev != si->idx)
> + return;
> + si = nsi;
> + }
> +}
> +
> +/* Find if there are other SSA_NAME pointers equal to PTR
> + for which we don't track their string lengths yet. If so, use
> + IDX for them. */
> +
> +static void
> +find_equal_ptrs (tree ptr, int idx)
> +{
> + if (TREE_CODE (ptr) != SSA_NAME)
> + return;
> + while (1)
> + {
> + gimple stmt = SSA_NAME_DEF_STMT (ptr);
> + if (!gimple_assign_single_p (stmt)
> + && (!gimple_assign_cast_p (stmt)
> + || !POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt)))))
> + return;
I'd prefer postive checks to guard code below. So, you're handling
SSA name copies, conversions from pointers and assignments from
invariants. You could simply do
if (!is+gimple_assign (stmt))
return;
switch (gimple_assign_rhs_code (stmt))
{
case SSA_NAME:
..
case ADDR_EXPR:
..
CASE_CONVERT:
...
default:
return;
}
> + ptr = gimple_assign_rhs1 (stmt);
> + if (TREE_CODE (ptr) != SSA_NAME)
> + {
> + if (TREE_CODE (ptr) == ADDR_EXPR)
> + {
> + int *pidx = addr_stridxptr (ptr);
> + if (pidx != NULL && *pidx == 0)
> + *pidx = idx;
> + return;
> + }
> + return;
> + }
> + if (ssa_ver_to_stridx[SSA_NAME_VERSION (ptr)] != 0)
> + return;
> + ssa_ver_to_stridx[SSA_NAME_VERSION (ptr)] = idx;
> + }
> +}
> +
> +/* Handle a strlen call. If strlen of the argument is known, replace
> + the strlen call with the known value, otherwise remember that strlen
> + of the argument is stored in the lhs SSA_NAME. */
> +
> +static void
> +handle_builtin_strlen (gimple_stmt_iterator *gsi)
> +{
> + int idx;
> + tree src;
> + gimple stmt = gsi_stmt (*gsi);
> + tree lhs = gimple_call_lhs (stmt);
> +
> + if (lhs == NULL_TREE)
> + return;
> +
> + src = gimple_call_arg (stmt, 0);
> + idx = get_stridx (src);
> + if (idx)
> + {
> + if (idx < 0)
> + {
> + tree rhs = build_int_cst (TREE_TYPE (lhs), ~idx);
> + if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> + {
> + fprintf (dump_file, "Optimizing: ");
> + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> + }
> + if (update_call_from_tree (gsi, rhs))
> + {
> + update_stmt (gsi_stmt (*gsi));
> + if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> + {
> + fprintf (dump_file, "into: ");
> + print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM);
> + }
> + }
> + else if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> + fprintf (dump_file, "not possible.\n");
> + }
> + else
> + {
> + strinfo si = get_strinfo (idx);
> + if (si != NULL)
> + {
> + tree rhs = si->length;
> + if (!useless_type_conversion_p (TREE_TYPE (lhs),
> + TREE_TYPE (rhs)))
> + rhs = fold_convert_loc (gimple_location (stmt),
> + TREE_TYPE (lhs), si->length);
> + if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> + {
> + fprintf (dump_file, "Optimizing: ");
> + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> + }
> + if (!update_call_from_tree (gsi, rhs))
> + {
> + rhs = force_gimple_operand_gsi (gsi, rhs, true, NULL_TREE,
> + true, GSI_SAME_STMT);
> + if (!update_call_from_tree (gsi, rhs))
if update_call_from_tree fails then gimplify_and_update_call_from_tree
will always succeed. See gimple_fold_call.
> + {
> + if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> + fprintf (dump_file, "not possible.\n");
> + return;
> + }
> + }
> + update_stmt (gsi_stmt (*gsi));
> + if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> + {
> + fprintf (dump_file, "into: ");
> + print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM);
> + }
> + if (TREE_CODE (si->length) != SSA_NAME
> + && TREE_CODE (si->length) != INTEGER_CST
> + && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs))
> + {
> + si = unshare_strinfo (si);
> + si->length = lhs;
> + }
> + }
> + }
> + }
> + else if (!SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs))
> + {
> + idx = new_stridx (src);
> + if (idx)
> + {
> + strinfo si = new_strinfo (src, idx, lhs);
> + set_strinfo (idx, si);
> + find_equal_ptrs (src, idx);
> + }
> + }
> +}
> +
> +/* Handle a strcpy-like ({st{r,p}cpy,__st{r,p}cpy_chk}) call.
> + If strlen of the second argument is known, strlen of the first argument
> + is the same after this call. Furthermore, attempt to convert it to
> + memcpy. */
> +
> +static void
> +handle_builtin_strcpy (enum built_in_function bcode, gimple_stmt_iterator *gsi)
> +{
> + int idx, didx;
> + tree src, dst, len, lhs, rhs, args, type, fn, oldlen;
> + gimple stmt = gsi_stmt (*gsi);
> + strinfo si, dsi, olddsi, zsi;
> + location_t loc;
> +
> + src = gimple_call_arg (stmt, 1);
> + dst = gimple_call_arg (stmt, 0);
> + idx = get_stridx (src);
> + if (idx <= 0)
> + return;
> +
> + si = get_strinfo (idx);
> + if (si == NULL)
> + return;
> +
> + didx = get_stridx (dst);
> + olddsi = NULL;
> + oldlen = NULL_TREE;
> + if (didx > 0)
> + olddsi = get_strinfo (didx);
> + else if (didx < 0)
> + return;
> + else
> + {
> + didx = new_stridx (dst);
> + if (didx == 0)
> + return;
> + }
> + if (olddsi != NULL)
> + {
> + oldlen = olddsi->length;
> + dsi = unshare_strinfo (olddsi);
> + dsi->length = si->length;
> + /* Break the chain, so adjust_related_strinfo on later pointers in
> + the chain won't adjust this one anymore. */
> + dsi->next = 0;
> + }
> + else
> + {
> + dsi = new_strinfo (dst, didx, si->length);
> + set_strinfo (didx, dsi);
> + find_equal_ptrs (dst, didx);
> + }
> + dsi->dont_invalidate = true;
> + loc = gimple_location (stmt);
> + if (olddsi != NULL)
> + {
> + tree adj = NULL_TREE;
> + if (integer_zerop (oldlen))
> + adj = si->length;
> + else if (TREE_CODE (oldlen) == INTEGER_CST
> + || TREE_CODE (si->length) == INTEGER_CST)
> + adj = fold_build2_loc (loc, MINUS_EXPR,
> + TREE_TYPE (si->length), si->length,
> + fold_convert_loc (loc, TREE_TYPE (si->length),
> + oldlen));
> + if (adj != NULL_TREE)
> + adjust_related_strinfos (loc, dsi, adj);
> + }
> + /* strcpy src may not overlap dst, so src doesn't need to be
> + invalidated either. */
> + si->dont_invalidate = true;
> +
> + lhs = gimple_call_lhs (stmt);
> + fn = NULL_TREE;
> + zsi = NULL;
> + switch (bcode)
> + {
> + case BUILT_IN_STRCPY:
> + fn = implicit_built_in_decls[BUILT_IN_MEMCPY];
> + if (lhs)
> + ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] = didx;
> + break;
> + case BUILT_IN_STRCPY_CHK:
> + fn = built_in_decls[BUILT_IN_MEMCPY_CHK];
> + if (lhs)
> + ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] = didx;
> + break;
> + case BUILT_IN_STPCPY:
> + /* This would need adjustment of the lhs (subtract one),
> + or detection that the trailing '\0' doesn't need to be
> + written, if it will be immediately overwritten.
> + fn = built_in_decls[BUILT_IN_MEMPCPY]; */
> + if (lhs)
> + zsi = zero_length_string (lhs, dsi);
> + break;
> + case BUILT_IN_STPCPY_CHK:
> + /* This would need adjustment of the lhs (subtract one),
> + or detection that the trailing '\0' doesn't need to be
> + written, if it will be immediately overwritten.
> + fn = built_in_decls[BUILT_IN_MEMPCPY_CHK]; */
> + if (lhs)
> + zsi = zero_length_string (lhs, dsi);
> + break;
> + default:
> + gcc_unreachable ();
> + }
> + if (zsi != NULL)
> + zsi->dont_invalidate = true;
> +
> + if (fn == NULL_TREE)
> + return;
> +
> + args = TYPE_ARG_TYPES (TREE_TYPE (fn));
> + type = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (args)));
> +
> + len = fold_convert_loc (loc, type, si->length);
> + len = fold_build2_loc (loc, PLUS_EXPR, type, len, build_int_cst (type, 1));
> + len = force_gimple_operand_gsi (gsi, len, true, NULL_TREE, true,
> + GSI_SAME_STMT);
> + if (gimple_call_num_args (stmt) == 2)
> + rhs = build_call_expr_loc (loc, fn, 3, dst, src, len);
> + else
> + rhs = build_call_expr_loc (loc, fn, 4, dst, src, len,
> + gimple_call_arg (stmt, 2));
> + if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> + {
> + fprintf (dump_file, "Optimizing: ");
> + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> + }
> + if (update_call_from_tree (gsi, rhs))
Btw, it would be nice if you'd not use build_call_expr and then gimplify
the call but instead build a new gimple call directly ... or modify it
in-place.
> + {
> + update_stmt (gsi_stmt (*gsi));
> + if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> + {
> + fprintf (dump_file, "into: ");
> + print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM);
> + }
> + }
> + else if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> + fprintf (dump_file, "not possible.\n");
> +}
> +
> +/* Handle a memcpy-like ({mem{,p}cpy,__mem{,p}cpy_chk}) call.
> + If strlen of the second argument is known and length of the third argument
> + is that plus one, strlen of the first argument is the same after this
> + call. */
> +
> +static void
> +handle_builtin_memcpy (enum built_in_function bcode, gimple_stmt_iterator *gsi)
> +{
> + int idx, didx;
> + tree src, dst, len, lhs, oldlen, newlen;
> + gimple stmt = gsi_stmt (*gsi);
> + strinfo si, dsi, olddsi;
> +
> + len = gimple_call_arg (stmt, 2);
> + src = gimple_call_arg (stmt, 1);
> + dst = gimple_call_arg (stmt, 0);
> + idx = get_stridx (src);
> + if (idx == 0)
> + return;
> +
> + if (idx > 0)
> + {
> + gimple def_stmt;
> +
> + /* Handle memcpy (x, y, l) where l is strlen (y) + 1. */
> + si = get_strinfo (idx);
> + if (si == NULL)
> + return;
> + if (TREE_CODE (len) != SSA_NAME)
> + return;
> + def_stmt = SSA_NAME_DEF_STMT (len);
> + if (!is_gimple_assign (def_stmt)
> + || gimple_assign_rhs_code (def_stmt) != PLUS_EXPR
> + || gimple_assign_rhs1 (def_stmt) != si->length
> + || !integer_onep (gimple_assign_rhs2 (def_stmt)))
> + return;
> + }
> + else
> + {
> + si = NULL;
> + /* Handle memcpy (x, "abcd", 5) or
> + memcpy (x, "abc\0uvw", 7). */
> + if (!host_integerp (len, 1)
> + || (unsigned HOST_WIDE_INT) tree_low_cst (len, 1)
> + <= (unsigned HOST_WIDE_INT) ~idx)
> + return;
> + }
> +
> + didx = get_stridx (dst);
> + olddsi = NULL;
> + if (didx > 0)
> + olddsi = get_strinfo (didx);
> + else if (didx < 0)
> + return;
> + else
> + {
> + didx = new_stridx (dst);
> + if (didx == 0)
> + return;
> + }
> + if (si != NULL)
> + newlen = si->length;
> + else
> + newlen = build_int_cst (TREE_TYPE (len), ~idx);
> + oldlen = NULL_TREE;
> + if (olddsi != NULL)
> + {
> + dsi = unshare_strinfo (olddsi);
> + oldlen = olddsi->length;
> + dsi->length = newlen;
> + /* Break the chain, so adjust_related_strinfo on later pointers in
> + the chain won't adjust this one anymore. */
> + dsi->next = 0;
> + }
> + else
> + {
> + dsi = new_strinfo (dst, didx, newlen);
> + set_strinfo (didx, dsi);
> + find_equal_ptrs (dst, didx);
> + }
> + dsi->dont_invalidate = true;
> + if (olddsi != NULL)
> + {
> + tree adj = NULL_TREE;
> + location_t loc = gimple_location (stmt);
> + if (integer_zerop (oldlen))
> + adj = dsi->length;
> + else if (TREE_CODE (oldlen) == INTEGER_CST
> + || TREE_CODE (dsi->length) == INTEGER_CST)
> + adj = fold_build2_loc (loc, MINUS_EXPR,
> + TREE_TYPE (dsi->length), dsi->length,
> + fold_convert_loc (loc, TREE_TYPE (dsi->length),
> + oldlen));
> + if (adj != NULL_TREE)
> + adjust_related_strinfos (loc, dsi, adj);
> + }
> + /* memcpy src may not overlap dst, so src doesn't need to be
> + invalidated either. */
> + if (si != NULL)
> + si->dont_invalidate = true;
> +
> + lhs = gimple_call_lhs (stmt);
> + switch (bcode)
> + {
> + case BUILT_IN_MEMCPY:
> + case BUILT_IN_MEMCPY_CHK:
> + if (lhs)
> + ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] = didx;
> + break;
> + case BUILT_IN_MEMPCPY:
> + case BUILT_IN_MEMPCPY_CHK:
> + break;
> + default:
> + gcc_unreachable ();
> + }
> +}
> +
> +/* Handle a strcat-like ({strcat,__strcat_chk}) call.
> + If strlen of the second argument is known, strlen of the first argument
> + is increased by the length of the second argument. Furthermore, attempt
> + to convert it to memcpy/strcpy if the length of the first argument
> + is known. */
> +
> +static void
> +handle_builtin_strcat (enum built_in_function bcode, gimple_stmt_iterator *gsi)
> +{
> + int idx, didx;
> + tree src, dst, dstlen, len, lhs, rhs, args, type, fn, objsz;
> + gimple stmt = gsi_stmt (*gsi);
> + strinfo si, dsi;
> + location_t loc;
> +
> + src = gimple_call_arg (stmt, 1);
> + dst = gimple_call_arg (stmt, 0);
> +
> + didx = get_stridx (dst);
> + if (didx <= 0)
> + return;
> +
> + dsi = get_strinfo (didx);
> + if (dsi == NULL)
> + return;
> +
> + /* This should have been folded, don't handle it here. */
> + idx = get_stridx (src);
> + if (idx < 0)
> + return;
> +
> + si = NULL;
> + if (idx)
> + si = get_strinfo (idx);
> +
> + loc = gimple_location (stmt);
> + dstlen = dsi->length;
> + if (si != NULL)
> + {
> + dsi = unshare_strinfo (dsi);
> + dsi->dont_invalidate = true;
> + dsi->length = fold_build2_loc (loc, PLUS_EXPR, TREE_TYPE (dsi->length),
> + dsi->length, si->length);
> + adjust_related_strinfos (loc, dsi, dstlen);
> + }
> + else
> + {
> + set_strinfo (didx, NULL);
> + free_strinfo (dsi);
> + }
> + if (si != NULL)
> + /* strcat src may not overlap dst, so src doesn't need to be
> + invalidated either. */
> + si->dont_invalidate = true;
> +
> + lhs = gimple_call_lhs (stmt);
> + /* For now. Could remove the lhs from the call and add
> + lhs = dst; afterwards. */
> + if (lhs)
> + return;
> +
> + fn = NULL_TREE;
> + objsz = NULL_TREE;
> + switch (bcode)
> + {
> + case BUILT_IN_STRCAT:
> + if (si)
> + fn = implicit_built_in_decls[BUILT_IN_MEMCPY];
> + else
> + fn = implicit_built_in_decls[BUILT_IN_STRCPY];
> + break;
> + case BUILT_IN_STRCAT_CHK:
> + if (si)
> + fn = built_in_decls[BUILT_IN_MEMCPY_CHK];
> + else
> + fn = built_in_decls[BUILT_IN_STRCPY_CHK];
> + objsz = gimple_call_arg (stmt, 2);
> + break;
> + default:
> + gcc_unreachable ();
> + }
> +
> + if (fn == NULL_TREE)
> + return;
> +
> + len = NULL_TREE;
> + if (si)
> + {
> + args = TYPE_ARG_TYPES (TREE_TYPE (fn));
> + type = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (args)));
> +
> + len = fold_convert_loc (loc, type, si->length);
> + len = fold_build2_loc (loc, PLUS_EXPR, type, len,
> + build_int_cst (type, 1));
> + len = force_gimple_operand_gsi (gsi, len, true, NULL_TREE, true,
> + GSI_SAME_STMT);
> + }
> + dst = fold_build2_loc (loc, POINTER_PLUS_EXPR,
> + TREE_TYPE (dst), dst,
> + fold_convert_loc (loc, sizetype, dstlen));
> + dst = force_gimple_operand_gsi (gsi, dst, true, NULL_TREE, true,
> + GSI_SAME_STMT);
> + if (si)
> + rhs = build_call_expr_loc (loc, fn, 3 + (objsz != NULL_TREE),
> + dst, src, len, objsz);
> + else
> + rhs = build_call_expr_loc (loc, fn, 2 + (objsz != NULL_TREE),
> + dst, src, objsz);
> + if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> + {
> + fprintf (dump_file, "Optimizing: ");
> + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> + }
> + if (update_call_from_tree (gsi, rhs))
> + {
> + update_stmt (gsi_stmt (*gsi));
> + if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> + {
> + fprintf (dump_file, "into: ");
> + print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM);
> + }
> + }
> + else if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> + fprintf (dump_file, "not possible.\n");
> +}
> +
> +/* Attempt to optimize a single statement at *GSI using string length
> + knowledge. */
> +
> +static void
> +strlen_optimize_stmt (gimple_stmt_iterator *gsi)
> +{
> + gimple stmt = gsi_stmt (*gsi);
> + tree lhs;
> +
> + if (is_gimple_call (stmt))
> + {
> + tree callee = gimple_call_fndecl (stmt);
> + if (callee && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL)
> + switch (DECL_FUNCTION_CODE (callee))
> + {
> + case BUILT_IN_STRLEN:
> + handle_builtin_strlen (gsi);
> + break;
> + case BUILT_IN_STRCPY:
> + case BUILT_IN_STRCPY_CHK:
> + case BUILT_IN_STPCPY:
> + case BUILT_IN_STPCPY_CHK:
> + handle_builtin_strcpy (DECL_FUNCTION_CODE (callee), gsi);
> + break;
> + case BUILT_IN_MEMCPY:
> + case BUILT_IN_MEMCPY_CHK:
> + case BUILT_IN_MEMPCPY:
> + case BUILT_IN_MEMPCPY_CHK:
> + handle_builtin_memcpy (DECL_FUNCTION_CODE (callee), gsi);
> + break;
> + case BUILT_IN_STRCAT:
> + case BUILT_IN_STRCAT_CHK:
> + handle_builtin_strcat (DECL_FUNCTION_CODE (callee), gsi);
> + break;
> + default:
> + break;
> + }
> + }
> + else if (is_gimple_assign (stmt)
> + && (lhs = gimple_assign_lhs (stmt)) != NULL_TREE
> + && TREE_CODE (lhs) == SSA_NAME
> + && POINTER_TYPE_P (TREE_TYPE (lhs)))
> + {
> + if (gimple_assign_single_p (stmt)
> + || (gimple_assign_cast_p (stmt)
> + && POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt)))))
> + {
> + int idx = get_stridx (gimple_assign_rhs1 (stmt));
> + ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] = idx;
> + }
> + else if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR)
> + {
> + int idx = get_stridx (gimple_assign_rhs1 (stmt));
> + if (idx > 0)
> + {
> + strinfo si = get_strinfo (idx);
> + if (si != NULL)
> + {
> + tree off = gimple_assign_rhs2 (stmt);
> + if (operand_equal_p (si->length, off, 0))
> + zero_length_string (lhs, si);
> + else if (TREE_CODE (off) == SSA_NAME)
> + {
> + gimple def_stmt = SSA_NAME_DEF_STMT (off);
> + if (gimple_assign_single_p (def_stmt)
> + && operand_equal_p (si->length,
> + gimple_assign_rhs1 (def_stmt),
> + 0))
> + zero_length_string (lhs, si);
> + }
> + }
> + }
> + else if (idx < 0)
> + {
> + tree off = gimple_assign_rhs2 (stmt);
> + if (host_integerp (off, 1)
> + && (unsigned HOST_WIDE_INT) tree_low_cst (off, 1)
> + <= (unsigned HOST_WIDE_INT) ~idx)
> + ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)]
> + = ~(int) tree_low_cst (off, 1);
> + }
> + }
> + }
> +
> + if (gimple_vdef (stmt))
> + maybe_invalidate (stmt);
> +}
> +
> +/* Recursively call maybe_invalidate on stmts that might be executed
> + in between dombb and current bb and that contain a vdef. Stop when
> + *count stmts are inspected, or if the whole strinfo vector has
> + been invalidated. */
> +
> +static void
> +do_invalidate (basic_block dombb, gimple phi, bitmap visited, int *count)
> +{
> + unsigned int i, n = gimple_phi_num_args (phi);
> +
> + for (i = 0; i < n; i++)
> + {
> + tree vuse = gimple_phi_arg_def (phi, i);
> + gimple stmt = SSA_NAME_DEF_STMT (vuse);
> + basic_block bb = gimple_bb (stmt);
> + if (bb == NULL
> + || !bitmap_set_bit (visited, bb->index)
> + || !dominated_by_p (CDI_DOMINATORS, bb, dombb))
> + continue;
> + while (1)
> + {
> + if (gimple_code (stmt) == GIMPLE_PHI)
> + {
> + do_invalidate (dombb, stmt, visited, count);
> + if (*count == 0)
> + return;
> + break;
> + }
> + if (--*count == 0)
> + return;
> + if (!maybe_invalidate (stmt))
> + {
> + *count = 0;
> + return;
> + }
> + vuse = gimple_vuse (stmt);
> + stmt = SSA_NAME_DEF_STMT (vuse);
> + if (gimple_bb (stmt) != bb)
> + {
> + bb = gimple_bb (stmt);
> + if (bb == NULL
> + || !bitmap_set_bit (visited, bb->index)
> + || !dominated_by_p (CDI_DOMINATORS, bb, dombb))
> + break;
> + }
> + }
> + }
> +}
> +
> +/* Callback for walk_dominator_tree. Attempt to optimize various
> + string ops by remembering string lenths pointed by pointer SSA_NAMEs. */
> +
> +static void
> +strlen_enter_block (struct dom_walk_data *walk_data ATTRIBUTE_UNUSED,
> + basic_block bb)
> +{
> + gimple_stmt_iterator gsi;
> + basic_block dombb = get_immediate_dominator (CDI_DOMINATORS, bb);
> +
> + if (dombb == NULL)
> + stridx_to_strinfo = NULL;
> + else
> + {
> + stridx_to_strinfo = (VEC(strinfo, heap) *) dombb->aux;
> + if (stridx_to_strinfo)
> + {
> + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> + {
> + gimple phi = gsi_stmt (gsi);
> + if (!is_gimple_reg (gimple_phi_result (phi)))
> + {
> + bitmap visited = BITMAP_ALLOC (NULL);
> + int count_vdef = 100;
> + do_invalidate (dombb, phi, visited, &count_vdef);
> + BITMAP_FREE (visited);
> + break;
> + }
> + }
> + }
> + }
> +
> + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> + strlen_optimize_stmt (&gsi);
> +
> + bb->aux = stridx_to_strinfo;
> + if (VEC_length (strinfo, stridx_to_strinfo) && !strinfo_shared ())
> + VEC_replace (strinfo, stridx_to_strinfo, 0, (strinfo) bb);
> +}
> +
> +/* Callback for walk_dominator_tree. Free strinfo vector if it is
> + owned by the current bb, clear bb->aux. */
> +
> +static void
> +strlen_leave_block (struct dom_walk_data *walk_data ATTRIBUTE_UNUSED,
> + basic_block bb)
> +{
> + if (bb->aux)
> + {
> + stridx_to_strinfo = (VEC(strinfo, heap) *) bb->aux;
> + if (VEC_length (strinfo, stridx_to_strinfo)
> + && VEC_index (strinfo, stridx_to_strinfo, 0) == (strinfo) bb)
> + {
> + unsigned int i;
> + strinfo si;
> +
> + for (i = 1; VEC_iterate (strinfo, stridx_to_strinfo, i, si); ++i)
> + free_strinfo (si);
> + VEC_free (strinfo, heap, stridx_to_strinfo);
> + }
> + bb->aux = NULL;
> + }
> +}
> +
> +/* Main entry point. */
> +
> +static unsigned int
> +tree_ssa_strlen (void)
> +{
> + struct dom_walk_data walk_data;
> +
> + ssa_ver_to_stridx = XCNEWVEC (int, num_ssa_names);
> + max_stridx = 1;
> + strinfo_pool = create_alloc_pool ("strinfo_struct pool",
> + sizeof (struct strinfo_struct), 64);
> +
> + calculate_dominance_info (CDI_DOMINATORS);
> +
> + /* String length optimization is implemented as a walk of the dominator
> + tree and a forward walk of statements within each block. */
> + walk_data.dom_direction = CDI_DOMINATORS;
> + walk_data.initialize_block_local_data = NULL;
> + walk_data.before_dom_children = strlen_enter_block;
> + walk_data.after_dom_children = strlen_leave_block;
> + walk_data.block_local_data_size = 0;
> + walk_data.global_data = NULL;
> +
> + /* Initialize the dominator walker. */
> + init_walk_dominator_tree (&walk_data);
> +
> + /* Recursively walk the dominator tree. */
> + walk_dominator_tree (&walk_data, ENTRY_BLOCK_PTR);
> +
> + /* Finalize the dominator walker. */
> + fini_walk_dominator_tree (&walk_data);
> +
> + XDELETEVEC (ssa_ver_to_stridx);
> + free_alloc_pool (strinfo_pool);
> + if (decl_to_stridxlist_htab)
> + {
> + htab_delete (decl_to_stridxlist_htab);
> + decl_to_stridxlist_htab = NULL;
> + }
> +
> + return 0;
> +}
> +
> +static bool
> +gate_strlen (void)
> +{
> + return flag_tree_strlen != 0;
> +}
Overall this looks good - it feels a bit tree-ish, but I suppose support
for non-constant lengths requires this. It would be nice to avoid
building new tree calls - yeah, all our call folding still does this ... :/
Thanks,
Richard.
> +struct gimple_opt_pass pass_strlen =
> +{
> + {
> + GIMPLE_PASS,
> + "strlen", /* name */
> + gate_strlen, /* gate */
> + tree_ssa_strlen, /* execute */
> + NULL, /* sub */
> + NULL, /* next */
> + 0, /* static_pass_number */
> + TV_TREE_STRLEN, /* tv_id */
> + PROP_cfg | PROP_ssa, /* properties_required */
> + 0, /* properties_provided */
> + 0, /* properties_destroyed */
> + 0, /* todo_flags_start */
> + TODO_ggc_collect
> + | TODO_verify_ssa /* todo_flags_finish */
> + }
> +};
> --- gcc/testsuite/gcc.dg/strlenopt-1.c.jj 2011-09-02 10:39:12.000000000 +0200
> +++ gcc/testsuite/gcc.dg/strlenopt-1.c 2011-09-02 15:54:23.000000000 +0200
> @@ -0,0 +1,45 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fdump-tree-strlen" } */
> +
> +#include "strlenopt.h"
> +
> +__attribute__((noinline, noclone)) char *
> +foo (char *p, char *r)
> +{
> + char *q = malloc (strlen (p) + strlen (r) + 64);
> + if (q == NULL) return NULL;
> + /* This strcpy can be optimized into memcpy, using the remembered
> + strlen (p). */
> + strcpy (q, p);
> + /* These two strcat can be optimized into memcpy. The first one
> + could be even optimized into a *ptr = '/'; store as the '\0'
> + is immediately overwritten. */
> + strcat (q, "/");
> + strcat (q, "abcde");
> + /* Due to inefficient PTA (PR50262) the above calls invalidate
> + string length of r, so it is optimized just into strcpy instead
> + of memcpy. */
> + strcat (q, r);
> + return q;
> +}
> +
> +int
> +main ()
> +{
> + char *volatile p = "string1";
> + char *volatile r = "string2";
> + char *q = foo (p, r);
> + if (q != NULL)
> + {
> + if (strcmp (q, "string1/abcdestring2"))
> + abort ();
> + free (q);
> + }
> + return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "memcpy \\(" 3 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "strcpy \\(" 1 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
> +/* { dg-final { cleanup-tree-dump "strlen" } } */
> --- gcc/testsuite/gcc.dg/strlenopt-2.c.jj 2011-09-02 15:31:33.000000000 +0200
> +++ gcc/testsuite/gcc.dg/strlenopt-2.c 2011-09-02 15:54:28.000000000 +0200
> @@ -0,0 +1,47 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fdump-tree-strlen" } */
> +
> +#include "strlenopt.h"
> +
> +__attribute__((noinline, noclone)) char *
> +foo (char *p, char *r)
> +{
> + char buf[26];
> + if (strlen (p) + strlen (r) + 9 > 26)
> + return NULL;
> + /* This strcpy can be optimized into memcpy, using the remembered
> + strlen (p). */
> + strcpy (buf, p);
> + /* These two strcat can be optimized into memcpy. The first one
> + could be even optimized into a *ptr = '/'; store as the '\0'
> + is immediately overwritten. */
> + strcat (buf, "/");
> + strcat (buf, "abcde");
> + /* This strcpy can be optimized into memcpy, using the remembered
> + strlen (r). */
> + strcat (buf, r);
> + /* And this can be optimized into memcpy too. */
> + strcat (buf, "fg");
> + return strdup (buf);
> +}
> +
> +int
> +main ()
> +{
> + char *volatile p = "string1";
> + char *volatile r = "string2";
> + char *q = foo (p, r);
> + if (q != NULL)
> + {
> + if (strcmp (q, "string1/abcdestring2fg"))
> + abort ();
> + free (q);
> + }
> + return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "memcpy \\(" 5 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
> +/* { dg-final { cleanup-tree-dump "strlen" } } */
> --- gcc/testsuite/gcc.dg/strlenopt-3.c.jj 2011-09-02 11:18:52.000000000 +0200
> +++ gcc/testsuite/gcc.dg/strlenopt-3.c 2011-09-02 15:54:32.000000000 +0200
> @@ -0,0 +1,63 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fdump-tree-strlen -fdump-tree-optimized" } */
> +
> +#include "strlenopt.h"
> +
> +__attribute__((noinline, noclone)) size_t
> +fn1 (char *p, char *q)
> +{
> + size_t s = strlen (q);
> + strcpy (p, q);
> + return s - strlen (p);
> +}
> +
> +__attribute__((noinline, noclone)) size_t
> +fn2 (char *p, char *q)
> +{
> + size_t s = strlen (q);
> + memcpy (p, q, s + 1);
> + return s - strlen (p);
> +}
> +
> +__attribute__((noinline, noclone)) size_t
> +fn3 (char *p)
> +{
> + memcpy (p, "abcd", 5);
> + return strlen (p);
> +}
> +
> +__attribute__((noinline, noclone)) size_t
> +fn4 (char *p)
> +{
> + memcpy (p, "efg\0hij", 6);
> + return strlen (p);
> +}
> +
> +int
> +main ()
> +{
> + char buf[64];
> + char *volatile p = buf;
> + char *volatile q = "ABCDEF";
> + buf[7] = 'G';
> + if (fn1 (p, q) != 0 || memcmp (buf, "ABCDEF\0G", 8))
> + abort ();
> + q = "HIJ";
> + if (fn2 (p + 1, q) != 0 || memcmp (buf, "AHIJ\0F\0G", 8))
> + abort ();
> + buf[6] = 'K';
> + if (fn3 (p + 1) != 4 || memcmp (buf, "Aabcd\0KG", 8))
> + abort ();
> + if (fn4 (p) != 3 || memcmp (buf, "efg\0hiKG", 8))
> + abort ();
> + return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "memcpy \\(" 4 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */
> +/* { dg-final { cleanup-tree-dump "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "return 0" 3 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "return 4" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "return 3" 1 "optimized" } } */
> +/* { dg-final { cleanup-tree-dump "optimized" } } */
> --- gcc/testsuite/gcc.dg/strlenopt.h.jj 2011-09-02 10:36:04.000000000 +0200
> +++ gcc/testsuite/gcc.dg/strlenopt.h 2011-09-02 12:39:08.000000000 +0200
> @@ -0,0 +1,57 @@
> +/* This is a replacement of needed parts from stdlib.h and string.h
> + for -ftree-strlen testing, to ensure we are testing the builtins
> + rather than whatever the OS has in its headers. */
> +
> +#define NULL ((void *) 0)
> +typedef __SIZE_TYPE__ size_t;
> +extern void abort (void);
> +void *malloc (size_t);
> +void free (void *);
> +char *strdup (const char *);
> +size_t strlen (const char *);
> +void *memcpy (void *__restrict, const void *__restrict, size_t);
> +char *strcpy (char *__restrict, const char *__restrict);
> +char *strcat (char *__restrict, const char *__restrict);
> +int memcmp (const void *, const void *, size_t);
> +int strcmp (const char *, const char *);
> +#ifdef USE_GNU
> +void *mempcpy (void *__restrict, const void *__restrict, size_t);
> +char *stpcpy (char *__restrict, const char *__restrict);
> +#endif
> +
> +#if defined(FORTIFY_SOURCE) && FORTIFY_SOURCE > 0 && __OPTIMIZE__
> +# define bos(ptr) __builtin_object_size (x, FORTIFY_SOURCE > 0)
> +# define bos0(ptr) __builtin_object_size (x, 0)
> +
> +extern inline __attribute__((gnu_inline, always_inline, artificial)) void *
> +memcpy (void *__restrict dest, const void *__restrict src, size_t len)
> +{
> + return __builtin___memcpy_chk (dest, src, len, bos0 (dest));
> +}
> +
> +extern inline __attribute__((gnu_inline, always_inline, artificial)) char *
> +strcpy (char *__restrict dest, const char *__restrict src)
> +{
> + return __builtin___strcpy_chk (dest, src, bos (dest));
> +}
> +
> +extern inline __attribute__((gnu_inline, always_inline, artificial)) char *
> +strcat (char *__restrict dest, const char *__restrict src)
> +{
> + return __builtin___strcat_chk (dest, src, bos (dest));
> +}
> +
> +# ifdef USE_GNU
> +extern inline __attribute__((gnu_inline, always_inline, artificial)) void *
> +mempcpy (void *__restrict dest, const void *__restrict src, size_t len)
> +{
> + return __builtin___mempcpy_chk (dest, src, len, bos0 (dest));
> +}
> +
> +extern inline __attribute__((gnu_inline, always_inline, artificial)) char *
> +stpcpy (char *__restrict dest, const char *__restrict src)
> +{
> + return __builtin___stpcpy_chk (dest, src, bos (dest));
> +}
> +# endif
> +#endif
>
> Jakub
>
More information about the Gcc-patches
mailing list