[RFC, WIP] tree-ssa-strlen optimization pass

Richard Guenther richard.guenther@gmail.com
Mon Sep 5 08:55:00 GMT 2011


On Fri, Sep 2, 2011 at 6:50 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> The following patch contains a WIP implementation of a new pass,
> which attempts to track C string lengths and perform various
> optimizations using that information.
>
> Optimizations it currently performs:
> 1) optimizing away strlen calls, if the string length is known
>   at that point already (either constant or some expression)
> 2) replacement of strcpy calls with memcpy if the string length
>   of the source is known
> 3) replacement of strcat with either memcpy (if source length
>   is known) or strcpy (if not), if the destination length
>   before the call is known
>   - over the years I've seen way too much spaghetti code
>   doing many strcat calls to the same string one after another
> During bootstrap/regtest (excluding the newly added testcases)
> 1) hits 184 times on x86_64-linux, 182 times on i686-linux,
> 2) hits 158 times on x86_64 and 159 times on i686,
> 3) into memcpy hits 33 times and 3) into strcpy 2 times.
>
> Example from gcc sources that is optimized:
>  filename = (char *) alloca (strlen (module_name) + strlen (MODULE_EXTENSION)
>                              + 1);
>  strcpy (filename, module_name);
>  strcat (filename, MODULE_EXTENSION);
> which can be optimized into
>  filename = (char *) alloca ((tmp1 = strlen (module_name)) + (tmp2 = strlen (MODULE_EXTENSION))
>                              + 1);
>  memcpy (filename, module_name, tmp1 + 1);
>  memcpy (filename + tmp1, MODULE_EXTENSION, tmp2 + 1);
>
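
For readers following along, the transformation in the example above can be
sketched as standalone C.  This is only an illustration of the before/after
shapes, not code from the patch: the function names are made up, and malloc
stands in for alloca so the buffers can be returned and compared.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Original shape: strcat has to rescan dst to find its terminating NUL.
   (Hypothetical helper, not part of the patch.)  */
static char *
concat_naive (const char *a, const char *b)
{
  char *dst = (char *) malloc (strlen (a) + strlen (b) + 1);
  assert (dst != NULL);
  strcpy (dst, a);
  strcat (dst, b);
  return dst;
}

/* Shape after the optimization as described: both lengths are computed
   once and reused, so each copy is a memcpy of a known size.  */
static char *
concat_known_len (const char *a, const char *b)
{
  size_t la = strlen (a);
  size_t lb = strlen (b);
  char *dst = (char *) malloc (la + lb + 1);
  assert (dst != NULL);
  memcpy (dst, a, la + 1);       /* copies a and its NUL, as in the example */
  memcpy (dst + la, b, lb + 1);  /* overwrites that NUL with b and b's NUL */
  return dst;
}
```

Both helpers should produce the same string for any pair of valid C strings;
the second one just never searches for the end of dst.
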
> Some further optimizations I'm currently considering for the pass:
> - handle *p = 0; stores like memcpy (p, "", 1)
>  - lame coders often do *p = 0; strcat (p, str1); strcat (p, str2);
> - if a memcpy call (either original or strcpy/strcat transformed into it)
>  copies known src string length + 1 and the immediately following
>  .MEM use (or possibly non-immediately if there are only non-aliasing
>  ones?) is a strcpy/(or to be optimized strcat or non-zero length memcpy)
>  call which overwrites the final '\0', decrease the memcpy size by one
> - if source length for strcpy isn't known and the destination length
>  is needed for optimizations, for -fhosted, glibc and with stpcpy
>  compatible prototype in headers consider transforming that strcpy
>  into stpcpy and use result - dst as string length (and see whether
>  following optimizations are able to optimize series of unknown
>  source length strcpy+strcat into a chain of stpcpy calls)
> - similarly if string length of strcat destination isn't known but is
>  helpful for optimization consider optimizing strcat into strlen+stpcpy
>
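
A rough sketch of the stpcpy idea from the list above, as I understand it.
The helper names are invented for illustration, and it assumes a libc that
provides stpcpy (POSIX.1-2008, glibc); it shows the source-level equivalence
only, not what the pass would emit.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* assumption: needed to declare stpcpy */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The "*p = 0; strcat; strcat" pattern: every strcat rescans buf.
   (Hypothetical helper.)  */
static size_t
build_with_strcat (char *buf, const char *s1, const char *s2)
{
  *buf = 0;
  strcat (buf, s1);
  strcat (buf, s2);
  return strlen (buf);
}

/* Same result as a chain of stpcpy calls.  stpcpy returns a pointer to
   the terminating NUL it wrote, so result - buf is the string length
   and no rescan is needed.  */
static size_t
build_with_stpcpy (char *buf, const char *s1, const char *s2)
{
  char *end = stpcpy (buf, s1);
  end = stpcpy (end, s2);
  return (size_t) (end - buf);
}
```

The caller has to provide a buffer large enough for both strings plus the
terminating NUL in either variant.
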
> Any comments related to the implementation, or examples of real-world
> lame C string length code sequences that would be nice to optimize
> will be greatly appreciated.
>
> The patch has been bootstrapped/regtested on x86_64-linux and i686-linux,
> no regressions, but I'm not proposing it for trunk yet (would like to
> implement at least a few of the above mentioned optimizations), just am
> posting it early as a pass preview.
>
> 2011-09-02  Jakub Jelinek  <jakub@redhat.com>
>
>        * common.opt: Add -ftree-strlen option.

Maybe something more generic, e.g. -foptimize-string-ops?  Eventually the
existing string op foldings could be guarded with that flag as well.

>        * Makefile.in (OBJS): Add tree-ssa-strlen.o.
>        (tree-ssa-strlen.o): Add dependencies.
>        * opts.c (default_options_table): Enable -ftree-strlen
>        by default at -O2 if not -Os.
>        * passes.c (init_optimization_passes): Add pass_strlen
>        after pass_object_sizes.
>        * timevar.def (TV_TREE_STRLEN): New timevar.
>        * tree-pass.h (pass_strlen): Declare.
>        * tree-ssa-strlen.c: New file.
>
>        * gcc.dg/strlenopt-1.c: New test.
>        * gcc.dg/strlenopt-2.c: New test.
>        * gcc.dg/strlenopt-3.c: New test.
>        * gcc.dg/strlenopt.h: New file.
>
> --- gcc/common.opt.jj   2011-08-26 18:41:44.000000000 +0200
> +++ gcc/common.opt      2011-08-30 10:57:36.000000000 +0200
> @@ -1953,6 +1953,10 @@ ftree-fre
>  Common Report Var(flag_tree_fre) Optimization
>  Enable Full Redundancy Elimination (FRE) on trees
>
> +ftree-strlen
> +Common Report Var(flag_tree_strlen) Optimization
> +Enable string length optimizations on trees
> +
>  ftree-loop-distribution
>  Common Report Var(flag_tree_loop_distribution) Optimization
>  Enable loop distribution on trees
> --- gcc/Makefile.in.jj  2011-08-26 18:41:44.000000000 +0200
> +++ gcc/Makefile.in     2011-09-02 15:43:02.000000000 +0200
> @@ -1472,6 +1472,7 @@ OBJS = \
>        tree-ssa-reassoc.o \
>        tree-ssa-sccvn.o \
>        tree-ssa-sink.o \
> +       tree-ssa-strlen.o \
>        tree-ssa-structalias.o \
>        tree-ssa-ter.o \
>        tree-ssa-threadedge.o \
> @@ -3157,6 +3158,9 @@ tree-ssa-ccp.o : tree-ssa-ccp.c $(TREE_F
>    $(TREE_DUMP_H) $(BASIC_BLOCK_H) $(TREE_PASS_H) langhooks.h \
>    tree-ssa-propagate.h value-prof.h $(FLAGS_H) $(TARGET_H) $(DIAGNOSTIC_CORE_H) \
>    $(DBGCNT_H) tree-pretty-print.h gimple-pretty-print.h gimple-fold.h
> +tree-ssa-strlen.o : tree-ssa-strlen.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
> +   $(TREE_FLOW_H) $(TREE_PASS_H) domwalk.h alloc-pool.h tree-ssa-propagate.h \
> +   gimple-pretty-print.h
>  tree-sra.o : tree-sra.c $(CONFIG_H) $(SYSTEM_H) coretypes.h alloc-pool.h \
>    $(TM_H) $(TREE_H) $(GIMPLE_H) $(CGRAPH_H) $(TREE_FLOW_H) \
>    $(IPA_PROP_H) $(DIAGNOSTIC_H) statistics.h $(TREE_DUMP_H) $(TIMEVAR_H) \
> --- gcc/opts.c.jj       2011-06-30 17:58:03.000000000 +0200
> +++ gcc/opts.c  2011-09-02 15:53:06.000000000 +0200
> @@ -484,6 +484,7 @@ static const struct default_options defa
>     { OPT_LEVELS_2_PLUS, OPT_falign_jumps, NULL, 1 },
>     { OPT_LEVELS_2_PLUS, OPT_falign_labels, NULL, 1 },
>     { OPT_LEVELS_2_PLUS, OPT_falign_functions, NULL, 1 },
> +    { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_ftree_strlen, NULL, 1 },

Why not at -Os as well?  Doesn't it remove strlen calls?

>     /* -O3 optimizations.  */
>     { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
> --- gcc/passes.c.jj     2011-07-12 07:58:48.000000000 +0200
> +++ gcc/passes.c        2011-08-31 12:03:29.000000000 +0200
> @@ -1,7 +1,7 @@
>  /* Top level of GCC compilers (cc1, cc1plus, etc.)
>    Copyright (C) 1987, 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
> -   1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
> -   Free Software Foundation, Inc.
> +   1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
> +   2011  Free Software Foundation, Inc.
>
>  This file is part of GCC.
>
> @@ -1321,6 +1321,7 @@ init_optimization_passes (void)
>       NEXT_PASS (pass_forwprop);
>       NEXT_PASS (pass_phiopt);
>       NEXT_PASS (pass_object_sizes);
> +      NEXT_PASS (pass_strlen);
>       NEXT_PASS (pass_ccp);
>       NEXT_PASS (pass_copy_prop);
>       NEXT_PASS (pass_cse_sincos);
> --- gcc/timevar.def.jj  2011-05-04 10:14:08.000000000 +0200
> +++ gcc/timevar.def     2011-08-30 10:54:32.000000000 +0200
> @@ -1,7 +1,7 @@
>  /* This file contains the definitions for timing variables used to
>    measure run-time performance of the compiler.
>    Copyright (C) 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008,
> -   2009, 2010
> +   2009, 2010, 2011
>    Free Software Foundation, Inc.
>    Contributed by Alex Samuel <samuel@codesourcery.com>
>
> @@ -183,6 +183,7 @@ DEFTIMEVAR (TV_TREE_COPY_RENAME          , "
>  DEFTIMEVAR (TV_TREE_SSA_VERIFY       , "tree SSA verifier")
>  DEFTIMEVAR (TV_TREE_STMT_VERIFY      , "tree STMT verifier")
>  DEFTIMEVAR (TV_TREE_SWITCH_CONVERSION, "tree switch initialization conversion")
> +DEFTIMEVAR (TV_TREE_STRLEN           , "tree strlen optimization")
>  DEFTIMEVAR (TV_CGRAPH_VERIFY         , "callgraph verifier")
>  DEFTIMEVAR (TV_DOM_FRONTIERS         , "dominance frontiers")
>  DEFTIMEVAR (TV_DOMINANCE             , "dominance computation")
> --- gcc/tree-pass.h.jj  2011-07-08 15:09:38.000000000 +0200
> +++ gcc/tree-pass.h     2011-08-30 10:55:27.000000000 +0200
> @@ -412,6 +412,7 @@ extern struct gimple_opt_pass pass_diagn
>  extern struct gimple_opt_pass pass_expand_omp;
>  extern struct gimple_opt_pass pass_expand_omp_ssa;
>  extern struct gimple_opt_pass pass_object_sizes;
> +extern struct gimple_opt_pass pass_strlen;
>  extern struct gimple_opt_pass pass_fold_builtins;
>  extern struct gimple_opt_pass pass_stdarg;
>  extern struct gimple_opt_pass pass_early_warn_uninitialized;
> --- gcc/tree-ssa-strlen.c.jj    2011-08-30 10:56:51.000000000 +0200
> +++ gcc/tree-ssa-strlen.c       2011-09-02 15:50:55.000000000 +0200
> @@ -0,0 +1,1288 @@
> +/* String length optimization
> +   Copyright (C) 2011 Free Software Foundation, Inc.
> +   Contributed by Jakub Jelinek <jakub@redhat.com>
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tree-flow.h"
> +#include "tree-pass.h"
> +#include "domwalk.h"
> +#include "alloc-pool.h"
> +#include "tree-ssa-propagate.h"
> +#include "gimple-pretty-print.h"
> +
> +/* Array indexed by SSA_NAME_VERSION.  0 means unknown, positive value
> +   is an index into strinfo vector, negative value stands for
> +   string length of a string literal (~strlen).  */
> +static int *ssa_ver_to_stridx;
> +
> +/* Number of currently active string indexes plus one.  */
> +static int max_stridx;

Use a VEC?
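
As an aside on the index encoding described in the comment above
ssa_ver_to_stridx (0 unknown, positive an index, negative ~strlen of a
literal): a minimal standalone sketch of that encoding, with invented helper
names, assuming lengths that do not fit in an int are simply treated as
unknown.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Encode a literal's length as a negative index, or return 0 ("unknown")
   if it does not fit in an int.  ~len equals -len - 1, so length 0 maps
   to -1 and never collides with the "unknown" value 0.
   (Hypothetical helper, not from the patch.)  */
static int
encode_literal_len (size_t len)
{
  if (len > (size_t) INT_MAX)
    return 0;
  return ~(int) len;
}

/* Recover the length from a negative index.  */
static size_t
decode_literal_len (int idx)
{
  assert (idx < 0);
  return (size_t) ~idx;
}
```

This is why the strlen handler can rebuild the constant with ~idx when the
index is negative.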

> +/* String information record.  */
> +typedef struct strinfo_struct
> +{
> +  /* String length of this string.  */
> +  tree length;
> +  /* Any of the corresponding pointers for querying alias oracle.  */
> +  tree ptr;
> +  /* Reference count.  Any changes to strinfo entry possibly shared
> +     with dominating basic blocks need unshare_strinfo first, except
> +     for dont_invalidate which affects only the immediately next
> +     maybe_invalidate.  */
> +  int refcount;
> +  /* Copy of index.  get_strinfo (si->idx) should return si;  */
> +  int idx;
> +  /* These 3 fields are for chaining related string pointers together.
> +     E.g. for
> +     bl = strlen (b); dl = strlen (d); strcpy (a, b); c = a + bl;
> +     strcpy (c, d); e = c + dl;
> +     strinfo(a) -> strinfo(c) -> strinfo(e)
> +     All have ->first field equal to strinfo(a)->idx and are doubly
> +     chained through prev/next fields.  The later strinfos are required
> +     to point into the same string with zero or more bytes after
> +     the previous pointer and all bytes in between the two pointers
> +     must be non-zero.  Functions like strcpy or memcpy are supposed
> +     to adjust all previous strinfo lengths, but not following strinfo
> +     lengths (those are uncertain, usually invalidated during
> +     maybe_invalidate, except when the alias oracle knows better).
> +     Functions like strcat on the other side adjust the whole
> +     related strinfo chain.
> +     They are updated lazily, so to use the chain the same first fields
> +     and si->prev->next == si->idx needs to be verified.  */
> +  int first;
> +  int next;
> +  int prev;
> +  /* A flag for the next maybe_invalidate that this strinfo shouldn't
> +     be invalidated.  Always cleared by maybe_invalidate.  */
> +  bool dont_invalidate;
> +} *strinfo;
> +DEF_VEC_P(strinfo);
> +DEF_VEC_ALLOC_P(strinfo,heap);
> +
> +/* Pool for allocating strinfo_struct entries.  */
> +static alloc_pool strinfo_pool;
> +
> +/* Vector mapping positive string indexes to strinfo, for the
> +   current basic block.  The first pointer in the vector is special,
> +   it is either NULL, meaning the vector isn't shared, or it is
> +   a basic block pointer to the owner basic_block if shared.
> +   If some other bb wants to modify the vector, the vector needs
> +   to be unshared first, and only the owner bb is supposed to free it.  */
> +static VEC(strinfo, heap) *stridx_to_strinfo;
> +
> +struct stridxlist
> +{
> +  struct stridxlist *next;
> +  HOST_WIDE_INT offset;
> +  int idx;
> +};
> +
> +struct decl_stridxlist_map
> +{
> +  struct tree_map_base base;
> +  struct stridxlist list;
> +};
> +
> +/* Hash table for mapping decls to a chained list of offset -> idx
> +   mappings.  */
> +static htab_t decl_to_stridxlist_htab;
> +
> +/* Hash a from tree in a decl_stridxlist_map.  */
> +
> +static unsigned int
> +decl_to_stridxlist_hash (const void *item)
> +{
> +  return DECL_UID (((const struct decl_stridxlist_map *) item)->base.from);
> +}
> +
> +/* Free a decl_stridxlist_map.  Callback for htab_delete.  */
> +
> +static void
> +decl_to_stridxlist_free (void *item)
> +{
> +  struct stridxlist *next;
> +  struct stridxlist *list = ((struct decl_stridxlist_map *) item)->list.next;
> +
> +  while (list)
> +    {
> +      next = list->next;
> +      XDELETE (list);
> +      list = next;
> +    }
> +  XDELETE (item);

Maybe use an obstack or alloc-pool dependent on re-use?
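
To illustrate what I mean by re-use, here is a generic free-list pool in
plain C.  It is not GCC's alloc-pool or obstack API; the type and function
names are made up, and the node layout only loosely mirrors stridxlist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node, loosely modelled on struct stridxlist.  */
struct node
{
  struct node *next;
  long offset;
  int idx;
};

/* Hypothetical pool: released nodes are kept on a free list.  */
struct node_pool
{
  struct node *free_list;
};

/* Hand out a recycled node if one is available, else malloc a new one.  */
static struct node *
node_pool_get (struct node_pool *pool)
{
  struct node *n = pool->free_list;
  if (n != NULL)
    pool->free_list = n->next;
  else
    n = (struct node *) malloc (sizeof *n);
  return n;
}

/* Return a node to the pool instead of freeing it.  */
static void
node_pool_put (struct node_pool *pool, struct node *n)
{
  n->next = pool->free_list;
  pool->free_list = n;
}

/* Free everything still held by the pool.  */
static void
node_pool_release (struct node_pool *pool)
{
  while (pool->free_list != NULL)
    {
      struct node *next = pool->free_list->next;
      free (pool->free_list);
      pool->free_list = next;
    }
}
```

If the entries are only ever freed all at once when the hash table is
deleted, an obstack-style bulk release would probably be the simpler fit;
a free list like the above pays off only when nodes are recycled in between.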

> +}
> +
> +/* Return string index for EXP.  */
> +
> +static int
> +get_stridx (tree exp)
> +{
> +  tree l;
> +
> +  if (TREE_CODE (exp) == SSA_NAME)
> +    return ssa_ver_to_stridx[SSA_NAME_VERSION (exp)];
> +
> +  if (TREE_CODE (exp) == ADDR_EXPR && decl_to_stridxlist_htab)
> +    {
> +      HOST_WIDE_INT off;
> +      tree base = get_addr_base_and_unit_offset (TREE_OPERAND (exp, 0),
> +                                                &off);
> +      if (base && DECL_P (base))
> +       {
> +         struct decl_stridxlist_map ent, *e;
> +         ent.base.from = base;
> +         e = (struct decl_stridxlist_map *)
> +             htab_find_with_hash (decl_to_stridxlist_htab, &ent,
> +                                  DECL_UID (base));
> +         if (e)
> +           {
> +             struct stridxlist *list = &e->list;
> +             do
> +               {
> +                 if (list->offset == off)
> +                   return list->idx;
> +                 list = list->next;
> +               }
> +             while (list);
> +           }
> +       }
> +    }
> +
> +  l = c_strlen (exp, 0);
> +  if (l != NULL_TREE
> +      && host_integerp (l, 1))
> +    {
> +      unsigned HOST_WIDE_INT len = tree_low_cst (l, 1);
> +      if (len == (unsigned int) len
> +         && (int) len >= 0)
> +       return ~(int) len;
> +    }
> +  return 0;
> +}
> +
> +/* Return true if strinfo vector is shared with the immediate dominator.  */
> +
> +static inline bool
> +strinfo_shared (void)
> +{
> +  return VEC_length (strinfo, stridx_to_strinfo)
> +        && VEC_index (strinfo, stridx_to_strinfo, 0) != NULL;
> +}
> +
> +/* Unshare strinfo vector that is shared with the immediate dominator.  */
> +
> +static void
> +unshare_strinfo_vec (void)
> +{
> +  strinfo si;
> +  unsigned int i = 0;
> +
> +  gcc_assert (strinfo_shared ());
> +  stridx_to_strinfo = VEC_copy (strinfo, heap, stridx_to_strinfo);
> +  for (i = 1; VEC_iterate (strinfo, stridx_to_strinfo, i, si); ++i)
> +    if (si != NULL)
> +      si->refcount++;
> +  VEC_replace (strinfo, stridx_to_strinfo, 0, NULL);
> +}
> +
> +/* Attempt to create a string index for ADDR_EXPR exp.
> +   Return a pointer to the location where the string index can
> +   be stored (if 0) or is stored, or NULL if this can't be tracked.  */
> +
> +static int *
> +addr_stridxptr (tree exp)
> +{
> +  void **slot;
> +  struct decl_stridxlist_map ent;
> +  struct stridxlist *list;
> +  HOST_WIDE_INT off;
> +
> +  tree base = get_addr_base_and_unit_offset (TREE_OPERAND (exp, 0), &off);
> +  if (base == NULL_TREE || !DECL_P (base))
> +    return NULL;
> +
> +  if (decl_to_stridxlist_htab == NULL)
> +    decl_to_stridxlist_htab
> +      = htab_create (64, decl_to_stridxlist_hash, tree_map_base_eq,
> +                    decl_to_stridxlist_free);
> +  ent.base.from = base;
> +  slot = htab_find_slot_with_hash (decl_to_stridxlist_htab, &ent,
> +                                  DECL_UID (base), INSERT);
> +  if (*slot)
> +    {
> +      int i;
> +      list = &((struct decl_stridxlist_map *)*slot)->list;
> +      for (i = 0; i < 16; i++)
> +       {
> +         if (list->offset == off)
> +           return &list->idx;
> +         if (list->next == NULL)
> +           break;
> +       }
> +      if (i == 16)
> +       return NULL;
> +      list->next = XNEW (struct stridxlist);
> +      list = list->next;
> +    }
> +  else
> +    {
> +      struct decl_stridxlist_map *e = XNEW (struct decl_stridxlist_map);
> +      e->base.from = base;
> +      *slot = (void *) e;
> +      list = &e->list;
> +    }
> +  list->next = NULL;
> +  list->offset = off;
> +  list->idx = 0;
> +  return &list->idx;
> +}
> +
> +/* Create a new string index, or return 0 if reached limit.  */
> +
> +static int
> +new_stridx (tree exp)
> +{
> +  int idx;
> +  if (max_stridx == 1000)

I suppose this should be a #define or a --param.

> +    return 0;
> +  if (TREE_CODE (exp) == SSA_NAME)
> +    {
> +      idx = max_stridx++;
> +      ssa_ver_to_stridx[SSA_NAME_VERSION (exp)] = idx;
> +      return idx;
> +    }
> +  if (TREE_CODE (exp) == ADDR_EXPR)
> +    {
> +      int *pidx = addr_stridxptr (exp);
> +      if (pidx != NULL)
> +       {
> +         gcc_assert (*pidx == 0);
> +         *pidx = max_stridx++;
> +         return *pidx;
> +       }
> +    }
> +  return 0;
> +}
> +
> +/* Create a new strinfo.  */
> +
> +static strinfo
> +new_strinfo (tree ptr, int idx, tree length)
> +{
> +  strinfo si = (strinfo) pool_alloc (strinfo_pool);
> +  si->length = length;
> +  si->ptr = ptr;
> +  si->refcount = 1;
> +  si->idx = idx;
> +  si->first = 0;
> +  si->prev = 0;
> +  si->next = 0;
> +  si->dont_invalidate = false;
> +  return si;
> +}
> +
> +/* Decrease strinfo refcount and free it if not referenced anymore.  */
> +
> +static inline void
> +free_strinfo (strinfo si)
> +{
> +  if (si && --si->refcount == 0)
> +    pool_free (strinfo_pool, si);
> +}
> +
> +/* Return strinfo vector entry IDX.  */
> +
> +static inline strinfo
> +get_strinfo (int idx)
> +{
> +  if (VEC_length (strinfo, stridx_to_strinfo) <= (unsigned int) idx)
> +    return NULL;
> +  return VEC_index (strinfo, stridx_to_strinfo, idx);
> +}
> +
> +/* Set strinfo in the vector entry IDX to SI.  */
> +
> +static inline void
> +set_strinfo (int idx, strinfo si)
> +{
> +  if (VEC_length (strinfo, stridx_to_strinfo) && VEC_index (strinfo, stridx_to_strinfo, 0))
> +    unshare_strinfo_vec ();
> +  if (VEC_length (strinfo, stridx_to_strinfo) <= (unsigned int) idx)
> +    VEC_safe_grow_cleared (strinfo, heap, stridx_to_strinfo, idx + 1);
> +  VEC_replace (strinfo, stridx_to_strinfo, idx, si);
> +}
> +
> +/* Invalidate string length information for strings whose length
> +   might change due to stores in stmt.  */
> +
> +static bool
> +maybe_invalidate (gimple stmt)
> +{
> +  strinfo si;
> +  unsigned int i;
> +  bool nonempty = false;
> +
> +  for (i = 1; VEC_iterate (strinfo, stridx_to_strinfo, i, si); ++i)
> +    if (si != NULL)
> +      {
> +       if (!si->dont_invalidate)
> +         {
> +           ao_ref r;
> +           ao_ref_init_from_ptr_and_size (&r, si->ptr, NULL_TREE);
> +           if (stmt_may_clobber_ref_p_1 (stmt, &r))
> +             {
> +               set_strinfo (i, NULL);
> +               free_strinfo (si);
> +               continue;
> +             }
> +         }
> +       si->dont_invalidate = false;
> +       nonempty = true;
> +      }
> +  return nonempty;
> +}
> +
> +/* Unshare strinfo record SI, if it has refcount > 1 or
> +   if stridx_to_strinfo vector is shared with some other
> +   bbs.  */
> +
> +static strinfo
> +unshare_strinfo (strinfo si)
> +{
> +  strinfo nsi;
> +
> +  if (si->refcount == 1 && !strinfo_shared ())
> +    return si;
> +
> +  nsi = new_strinfo (si->ptr, si->idx, si->length);
> +  nsi->first = si->first;
> +  nsi->prev = si->prev;
> +  nsi->next = si->next;
> +  set_strinfo (si->idx, nsi);
> +  free_strinfo (si);
> +  return nsi;
> +}
> +
> +/* Return first strinfo in the related strinfo chain
> +   if all strinfos in between belong to the chain, otherwise
> +   NULL.  */
> +
> +static strinfo
> +verify_related_strinfos (strinfo origsi)
> +{
> +  strinfo si = origsi, psi;
> +
> +  if (origsi->first == 0)
> +    return NULL;
> +  for (; si->prev; si = psi)
> +    {
> +      if (si->first != origsi->first)
> +       return NULL;
> +      psi = get_strinfo (si->prev);
> +      if (psi == NULL)
> +       return NULL;
> +      if (psi->next != si->idx)
> +       return NULL;
> +    }
> +  if (si->idx != si->first)
> +    return NULL;
> +  return si;
> +}
> +
> +/* Note that PTR, a pointer SSA_NAME initialized in the current stmt, points
> +   to a zero-length string and if possible chain it to a related strinfo
> +   chain whose part is or might be CHAINSI.  */
> +
> +static strinfo
> +zero_length_string (tree ptr, strinfo chainsi)
> +{
> +  strinfo si;
> +  int idx;
> +  gcc_checking_assert (TREE_CODE (ptr) == SSA_NAME
> +                      && get_stridx (ptr) == 0);
> +
> +  if (chainsi != NULL)
> +    {
> +      if (verify_related_strinfos (chainsi))
> +       {
> +         for (; chainsi->next; chainsi = si)
> +           {
> +             si = get_strinfo (chainsi->next);
> +             if (si == NULL
> +                 || si->first != chainsi->first
> +                 || si->prev != chainsi->idx)
> +               break;
> +           }
> +         if (integer_zerop (chainsi->length))
> +           {
> +             if (chainsi->next)
> +               {
> +                 chainsi = unshare_strinfo (chainsi);
> +                 chainsi->next = 0;
> +               }
> +             ssa_ver_to_stridx [SSA_NAME_VERSION (ptr)] = chainsi->idx;
> +             return chainsi;
> +           }
> +       }
> +      else if (chainsi->first || chainsi->prev || chainsi->next)
> +       {
> +         chainsi = unshare_strinfo (chainsi);
> +         chainsi->first = 0;
> +         chainsi->prev = 0;
> +         chainsi->next = 0;
> +       }
> +    }
> +  idx = new_stridx (ptr);
> +  if (idx == 0)
> +    return NULL;
> +  si = new_strinfo (ptr, idx, build_int_cst (size_type_node, 0));
> +  set_strinfo (idx, si);
> +  if (chainsi != NULL)
> +    {
> +      chainsi = unshare_strinfo (chainsi);
> +      if (chainsi->first == 0)
> +       chainsi->first = chainsi->idx;
> +      chainsi->next = idx;
> +      si->prev = chainsi->idx;
> +      si->first = chainsi->first;
> +    }
> +  return si;
> +}
> +
> +/* For strinfo ORIGSI whose length has been just updated
> +   update also related strinfo lengths (add ADJ to each,
> +   but don't adjust ORIGSI).  */
> +
> +static void
> +adjust_related_strinfos (location_t loc, strinfo origsi, tree adj)
> +{
> +  strinfo si = verify_related_strinfos (origsi);
> +
> +  if (si == NULL)
> +    return;
> +
> +  while (1)
> +    {
> +      strinfo nsi;
> +
> +      if (si != origsi)
> +       {
> +         tree tem;
> +
> +         si = unshare_strinfo (si);
> +         tem = fold_convert_loc (loc, TREE_TYPE (si->length), adj);
> +         si->length = fold_build2_loc (loc, PLUS_EXPR,
> +                                       TREE_TYPE (si->length), si->length,
> +                                       tem);
> +         si->dont_invalidate = true;
> +       }
> +      if (si->next == 0)
> +       return;
> +      nsi = get_strinfo (si->next);
> +      if (nsi == NULL
> +         || nsi->first != si->first
> +         || nsi->prev != si->idx)
> +       return;
> +      si = nsi;
> +    }
> +}
> +
> +/* Find if there are other SSA_NAME pointers equal to PTR
> +   for which we don't track their string lengths yet.  If so, use
> +   IDX for them.  */
> +
> +static void
> +find_equal_ptrs (tree ptr, int idx)
> +{
> +  if (TREE_CODE (ptr) != SSA_NAME)
> +    return;
> +  while (1)
> +    {
> +      gimple stmt = SSA_NAME_DEF_STMT (ptr);
> +      if (!gimple_assign_single_p (stmt)
> +         && (!gimple_assign_cast_p (stmt)
> +             || !POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt)))))
> +       return;

I'd prefer positive checks to guard the code below.  So, you're handling
SSA name copies, conversions from pointers and assignments from
invariants.  You could simply do

  if (!is_gimple_assign (stmt))
    return;
  switch (gimple_assign_rhs_code (stmt))
    {
    case SSA_NAME:
      ...
    case ADDR_EXPR:
      ...
    CASE_CONVERT:
      ...
    default:
      return;
    }

> +      ptr = gimple_assign_rhs1 (stmt);
> +      if (TREE_CODE (ptr) != SSA_NAME)
> +       {
> +         if (TREE_CODE (ptr) == ADDR_EXPR)
> +           {
> +             int *pidx = addr_stridxptr (ptr);
> +             if (pidx != NULL && *pidx == 0)
> +               *pidx = idx;
> +             return;
> +           }
> +         return;
> +       }
> +      if (ssa_ver_to_stridx[SSA_NAME_VERSION (ptr)] != 0)
> +       return;
> +      ssa_ver_to_stridx[SSA_NAME_VERSION (ptr)] = idx;
> +    }
> +}
> +
> +/* Handle a strlen call.  If strlen of the argument is known, replace
> +   the strlen call with the known value, otherwise remember that strlen
> +   of the argument is stored in the lhs SSA_NAME.  */
> +
> +static void
> +handle_builtin_strlen (gimple_stmt_iterator *gsi)
> +{
> +  int idx;
> +  tree src;
> +  gimple stmt = gsi_stmt (*gsi);
> +  tree lhs = gimple_call_lhs (stmt);
> +
> +  if (lhs == NULL_TREE)
> +    return;
> +
> +  src = gimple_call_arg (stmt, 0);
> +  idx = get_stridx (src);
> +  if (idx)
> +    {
> +      if (idx < 0)
> +       {
> +         tree rhs = build_int_cst (TREE_TYPE (lhs), ~idx);
> +         if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> +           {
> +             fprintf (dump_file, "Optimizing: ");
> +             print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> +           }
> +         if (update_call_from_tree (gsi, rhs))
> +           {
> +             update_stmt (gsi_stmt (*gsi));
> +             if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> +               {
> +                 fprintf (dump_file, "into: ");
> +                 print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM);
> +               }
> +           }
> +         else if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> +           fprintf (dump_file, "not possible.\n");
> +       }
> +      else
> +       {
> +         strinfo si = get_strinfo (idx);
> +         if (si != NULL)
> +           {
> +             tree rhs = si->length;
> +             if (!useless_type_conversion_p (TREE_TYPE (lhs),
> +                                             TREE_TYPE (rhs)))
> +               rhs = fold_convert_loc (gimple_location (stmt),
> +                                       TREE_TYPE (lhs), si->length);
> +             if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> +               {
> +                 fprintf (dump_file, "Optimizing: ");
> +                 print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> +               }
> +             if (!update_call_from_tree (gsi, rhs))
> +               {
> +                 rhs = force_gimple_operand_gsi (gsi, rhs, true, NULL_TREE,
> +                                                 true, GSI_SAME_STMT);
> +                 if (!update_call_from_tree (gsi, rhs))

If update_call_from_tree fails, then gimplify_and_update_call_from_tree
will always succeed.  See gimple_fold_call.

> +                   {
> +                     if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> +                       fprintf (dump_file, "not possible.\n");
> +                     return;
> +                   }
> +               }
> +             update_stmt (gsi_stmt (*gsi));
> +             if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> +               {
> +                 fprintf (dump_file, "into: ");
> +                 print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM);
> +               }
> +             if (TREE_CODE (si->length) != SSA_NAME
> +                 && TREE_CODE (si->length) != INTEGER_CST
> +                 && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs))
> +               {
> +                 si = unshare_strinfo (si);
> +                 si->length = lhs;
> +               }
> +           }
> +       }
> +    }
> +  else if (!SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs))
> +    {
> +      idx = new_stridx (src);
> +      if (idx)
> +       {
> +         strinfo si = new_strinfo (src, idx, lhs);
> +         set_strinfo (idx, si);
> +         find_equal_ptrs (src, idx);
> +       }
> +    }
> +}
> +
> +/* Handle a strcpy-like ({st{r,p}cpy,__st{r,p}cpy_chk}) call.
> +   If strlen of the second argument is known, strlen of the first argument
> +   is the same after this call.  Furthermore, attempt to convert it to
> +   memcpy.  */
> +
> +static void
> +handle_builtin_strcpy (enum built_in_function bcode, gimple_stmt_iterator *gsi)
> +{
> +  int idx, didx;
> +  tree src, dst, len, lhs, rhs, args, type, fn, oldlen;
> +  gimple stmt = gsi_stmt (*gsi);
> +  strinfo si, dsi, olddsi, zsi;
> +  location_t loc;
> +
> +  src = gimple_call_arg (stmt, 1);
> +  dst = gimple_call_arg (stmt, 0);
> +  idx = get_stridx (src);
> +  if (idx <= 0)
> +    return;
> +
> +  si = get_strinfo (idx);
> +  if (si == NULL)
> +    return;
> +
> +  didx = get_stridx (dst);
> +  olddsi = NULL;
> +  oldlen = NULL_TREE;
> +  if (didx > 0)
> +    olddsi = get_strinfo (didx);
> +  else if (didx < 0)
> +    return;
> +  else
> +    {
> +      didx = new_stridx (dst);
> +      if (didx == 0)
> +       return;
> +    }
> +  if (olddsi != NULL)
> +    {
> +      oldlen = olddsi->length;
> +      dsi = unshare_strinfo (olddsi);
> +      dsi->length = si->length;
> +      /* Break the chain, so adjust_related_strinfo on later pointers in
> +        the chain won't adjust this one anymore.  */
> +      dsi->next = 0;
> +    }
> +  else
> +    {
> +      dsi = new_strinfo (dst, didx, si->length);
> +      set_strinfo (didx, dsi);
> +      find_equal_ptrs (dst, didx);
> +    }
> +  dsi->dont_invalidate = true;
> +  loc = gimple_location (stmt);
> +  if (olddsi != NULL)
> +    {
> +      tree adj = NULL_TREE;
> +      if (integer_zerop (oldlen))
> +       adj = si->length;
> +      else if (TREE_CODE (oldlen) == INTEGER_CST
> +              || TREE_CODE (si->length) == INTEGER_CST)
> +       adj = fold_build2_loc (loc, MINUS_EXPR,
> +                              TREE_TYPE (si->length), si->length,
> +                              fold_convert_loc (loc, TREE_TYPE (si->length),
> +                                                oldlen));
> +      if (adj != NULL_TREE)
> +       adjust_related_strinfos (loc, dsi, adj);
> +    }
> +  /* strcpy src may not overlap dst, so src doesn't need to be
> +     invalidated either.  */
> +  si->dont_invalidate = true;
> +
> +  lhs = gimple_call_lhs (stmt);
> +  fn = NULL_TREE;
> +  zsi = NULL;
> +  switch (bcode)
> +    {
> +    case BUILT_IN_STRCPY:
> +      fn = implicit_built_in_decls[BUILT_IN_MEMCPY];
> +      if (lhs)
> +       ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] = didx;
> +      break;
> +    case BUILT_IN_STRCPY_CHK:
> +      fn = built_in_decls[BUILT_IN_MEMCPY_CHK];
> +      if (lhs)
> +       ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] = didx;
> +      break;
> +    case BUILT_IN_STPCPY:
> +      /* This would need adjustment of the lhs (subtract one),
> +        or detection that the trailing '\0' doesn't need to be
> +        written, if it will be immediately overwritten.
> +      fn = built_in_decls[BUILT_IN_MEMPCPY];  */
> +      if (lhs)
> +       zsi = zero_length_string (lhs, dsi);
> +      break;
> +    case BUILT_IN_STPCPY_CHK:
> +      /* This would need adjustment of the lhs (subtract one),
> +        or detection that the trailing '\0' doesn't need to be
> +        written, if it will be immediately overwritten.
> +      fn = built_in_decls[BUILT_IN_MEMPCPY_CHK];  */
> +      if (lhs)
> +       zsi = zero_length_string (lhs, dsi);
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }
> +  if (zsi != NULL)
> +    zsi->dont_invalidate = true;
> +
> +  if (fn == NULL_TREE)
> +    return;
> +
> +  args = TYPE_ARG_TYPES (TREE_TYPE (fn));
> +  type = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (args)));
> +
> +  len = fold_convert_loc (loc, type, si->length);
> +  len = fold_build2_loc (loc, PLUS_EXPR, type, len, build_int_cst (type, 1));
> +  len = force_gimple_operand_gsi (gsi, len, true, NULL_TREE, true,
> +                                 GSI_SAME_STMT);
> +  if (gimple_call_num_args (stmt) == 2)
> +    rhs = build_call_expr_loc (loc, fn, 3, dst, src, len);
> +  else
> +    rhs = build_call_expr_loc (loc, fn, 4, dst, src, len,
> +                              gimple_call_arg (stmt, 2));
> +  if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> +    {
> +      fprintf (dump_file, "Optimizing: ");
> +      print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> +    }
> +  if (update_call_from_tree (gsi, rhs))

Btw, it would be nicer not to build a tree with build_call_expr and then
gimplify the call. Instead, build a new gimple call directly, or modify
the existing call in-place.

> +    {
> +      update_stmt (gsi_stmt (*gsi));
> +      if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> +       {
> +         fprintf (dump_file, "into: ");
> +         print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM);
> +       }
> +    }
> +  else if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> +    fprintf (dump_file, "not possible.\n");
> +}
> +
> +/* Handle a memcpy-like ({mem{,p}cpy,__mem{,p}cpy_chk}) call.
> +   If strlen of the second argument is known and length of the third argument
> +   is that plus one, strlen of the first argument is the same after this
> +   call.  */
> +
> +static void
> +handle_builtin_memcpy (enum built_in_function bcode, gimple_stmt_iterator *gsi)
> +{
> +  int idx, didx;
> +  tree src, dst, len, lhs, oldlen, newlen;
> +  gimple stmt = gsi_stmt (*gsi);
> +  strinfo si, dsi, olddsi;
> +
> +  len = gimple_call_arg (stmt, 2);
> +  src = gimple_call_arg (stmt, 1);
> +  dst = gimple_call_arg (stmt, 0);
> +  idx = get_stridx (src);
> +  if (idx == 0)
> +    return;
> +
> +  if (idx > 0)
> +    {
> +      gimple def_stmt;
> +
> +      /* Handle memcpy (x, y, l) where l is strlen (y) + 1.  */
> +      si = get_strinfo (idx);
> +      if (si == NULL)
> +       return;
> +      if (TREE_CODE (len) != SSA_NAME)
> +       return;
> +      def_stmt = SSA_NAME_DEF_STMT (len);
> +      if (!is_gimple_assign (def_stmt)
> +         || gimple_assign_rhs_code (def_stmt) != PLUS_EXPR
> +         || gimple_assign_rhs1 (def_stmt) != si->length
> +         || !integer_onep (gimple_assign_rhs2 (def_stmt)))
> +       return;
> +    }
> +  else
> +    {
> +      si = NULL;
> +      /* Handle memcpy (x, "abcd", 5) or
> +        memcpy (x, "abc\0uvw", 7).  */
> +      if (!host_integerp (len, 1)
> +         || (unsigned HOST_WIDE_INT) tree_low_cst (len, 1)
> +            <= (unsigned HOST_WIDE_INT) ~idx)
> +       return;
> +    }
> +
> +  didx = get_stridx (dst);
> +  olddsi = NULL;
> +  if (didx > 0)
> +    olddsi = get_strinfo (didx);
> +  else if (didx < 0)
> +    return;
> +  else
> +    {
> +      didx = new_stridx (dst);
> +      if (didx == 0)
> +       return;
> +    }
> +  if (si != NULL)
> +    newlen = si->length;
> +  else
> +    newlen = build_int_cst (TREE_TYPE (len), ~idx);
> +  oldlen = NULL_TREE;
> +  if (olddsi != NULL)
> +    {
> +      dsi = unshare_strinfo (olddsi);
> +      oldlen = olddsi->length;
> +      dsi->length = newlen;
> +      /* Break the chain, so adjust_related_strinfos on later pointers in
> +        the chain won't adjust this one anymore.  */
> +      dsi->next = 0;
> +    }
> +  else
> +    {
> +      dsi = new_strinfo (dst, didx, newlen);
> +      set_strinfo (didx, dsi);
> +      find_equal_ptrs (dst, didx);
> +    }
> +  dsi->dont_invalidate = true;
> +  if (olddsi != NULL)
> +    {
> +      tree adj = NULL_TREE;
> +      location_t loc = gimple_location (stmt);
> +      if (integer_zerop (oldlen))
> +       adj = dsi->length;
> +      else if (TREE_CODE (oldlen) == INTEGER_CST
> +              || TREE_CODE (dsi->length) == INTEGER_CST)
> +       adj = fold_build2_loc (loc, MINUS_EXPR,
> +                              TREE_TYPE (dsi->length), dsi->length,
> +                              fold_convert_loc (loc, TREE_TYPE (dsi->length),
> +                                                oldlen));
> +      if (adj != NULL_TREE)
> +       adjust_related_strinfos (loc, dsi, adj);
> +    }
> +  /* memcpy src may not overlap dst, so src doesn't need to be
> +     invalidated either.  */
> +  if (si != NULL)
> +    si->dont_invalidate = true;
> +
> +  lhs = gimple_call_lhs (stmt);
> +  switch (bcode)
> +    {
> +    case BUILT_IN_MEMCPY:
> +    case BUILT_IN_MEMCPY_CHK:
> +      if (lhs)
> +       ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] = didx;
> +      break;
> +    case BUILT_IN_MEMPCPY:
> +    case BUILT_IN_MEMPCPY_CHK:
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }
> +}
> +
> +/* Handle a strcat-like ({strcat,__strcat_chk}) call.
> +   If strlen of the second argument is known, strlen of the first argument
> +   is increased by the length of the second argument.  Furthermore, attempt
> +   to convert it to memcpy/strcpy if the length of the first argument
> +   is known.  */
> +
> +static void
> +handle_builtin_strcat (enum built_in_function bcode, gimple_stmt_iterator *gsi)
> +{
> +  int idx, didx;
> +  tree src, dst, dstlen, len, lhs, rhs, args, type, fn, objsz;
> +  gimple stmt = gsi_stmt (*gsi);
> +  strinfo si, dsi;
> +  location_t loc;
> +
> +  src = gimple_call_arg (stmt, 1);
> +  dst = gimple_call_arg (stmt, 0);
> +
> +  didx = get_stridx (dst);
> +  if (didx <= 0)
> +    return;
> +
> +  dsi = get_strinfo (didx);
> +  if (dsi == NULL)
> +    return;
> +
> +  /* This should have been folded, don't handle it here.  */
> +  idx = get_stridx (src);
> +  if (idx < 0)
> +    return;
> +
> +  si = NULL;
> +  if (idx)
> +    si = get_strinfo (idx);
> +
> +  loc = gimple_location (stmt);
> +  dstlen = dsi->length;
> +  if (si != NULL)
> +    {
> +      dsi = unshare_strinfo (dsi);
> +      dsi->dont_invalidate = true;
> +      dsi->length = fold_build2_loc (loc, PLUS_EXPR, TREE_TYPE (dsi->length),
> +                                    dsi->length, si->length);
> +      adjust_related_strinfos (loc, dsi, dstlen);
> +    }
> +  else
> +    {
> +      set_strinfo (didx, NULL);
> +      free_strinfo (dsi);
> +    }
> +  if (si != NULL)
> +    /* strcat src may not overlap dst, so src doesn't need to be
> +       invalidated either.  */
> +    si->dont_invalidate = true;
> +
> +  lhs = gimple_call_lhs (stmt);
> +  /* For now.  Could remove the lhs from the call and add
> +     lhs = dst; afterwards.  */
> +  if (lhs)
> +    return;
> +
> +  fn = NULL_TREE;
> +  objsz = NULL_TREE;
> +  switch (bcode)
> +    {
> +    case BUILT_IN_STRCAT:
> +      if (si)
> +       fn = implicit_built_in_decls[BUILT_IN_MEMCPY];
> +      else
> +       fn = implicit_built_in_decls[BUILT_IN_STRCPY];
> +      break;
> +    case BUILT_IN_STRCAT_CHK:
> +      if (si)
> +       fn = built_in_decls[BUILT_IN_MEMCPY_CHK];
> +      else
> +       fn = built_in_decls[BUILT_IN_STRCPY_CHK];
> +      objsz = gimple_call_arg (stmt, 2);
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }
> +
> +  if (fn == NULL_TREE)
> +    return;
> +
> +  len = NULL_TREE;
> +  if (si)
> +    {
> +      args = TYPE_ARG_TYPES (TREE_TYPE (fn));
> +      type = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (args)));
> +
> +      len = fold_convert_loc (loc, type, si->length);
> +      len = fold_build2_loc (loc, PLUS_EXPR, type, len,
> +                            build_int_cst (type, 1));
> +      len = force_gimple_operand_gsi (gsi, len, true, NULL_TREE, true,
> +                                     GSI_SAME_STMT);
> +    }
> +  dst = fold_build2_loc (loc, POINTER_PLUS_EXPR,
> +                        TREE_TYPE (dst), dst,
> +                        fold_convert_loc (loc, sizetype, dstlen));
> +  dst = force_gimple_operand_gsi (gsi, dst, true, NULL_TREE, true,
> +                                 GSI_SAME_STMT);
> +  if (si)
> +    rhs = build_call_expr_loc (loc, fn, 3 + (objsz != NULL_TREE),
> +                              dst, src, len, objsz);
> +  else
> +    rhs = build_call_expr_loc (loc, fn, 2 + (objsz != NULL_TREE),
> +                              dst, src, objsz);
> +  if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> +    {
> +      fprintf (dump_file, "Optimizing: ");
> +      print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> +    }
> +  if (update_call_from_tree (gsi, rhs))
> +    {
> +      update_stmt (gsi_stmt (*gsi));
> +      if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> +       {
> +         fprintf (dump_file, "into: ");
> +         print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM);
> +       }
> +    }
> +  else if (dump_file && (dump_flags & TDF_DETAILS) != 0)
> +    fprintf (dump_file, "not possible.\n");
> +}
> +
> +/* Attempt to optimize a single statement at *GSI using string length
> +   knowledge.  */
> +
> +static void
> +strlen_optimize_stmt (gimple_stmt_iterator *gsi)
> +{
> +  gimple stmt = gsi_stmt (*gsi);
> +  tree lhs;
> +
> +  if (is_gimple_call (stmt))
> +    {
> +      tree callee = gimple_call_fndecl (stmt);
> +      if (callee && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL)
> +       switch (DECL_FUNCTION_CODE (callee))
> +         {
> +         case BUILT_IN_STRLEN:
> +           handle_builtin_strlen (gsi);
> +           break;
> +         case BUILT_IN_STRCPY:
> +         case BUILT_IN_STRCPY_CHK:
> +         case BUILT_IN_STPCPY:
> +         case BUILT_IN_STPCPY_CHK:
> +           handle_builtin_strcpy (DECL_FUNCTION_CODE (callee), gsi);
> +           break;
> +         case BUILT_IN_MEMCPY:
> +         case BUILT_IN_MEMCPY_CHK:
> +         case BUILT_IN_MEMPCPY:
> +         case BUILT_IN_MEMPCPY_CHK:
> +           handle_builtin_memcpy (DECL_FUNCTION_CODE (callee), gsi);
> +           break;
> +         case BUILT_IN_STRCAT:
> +         case BUILT_IN_STRCAT_CHK:
> +           handle_builtin_strcat (DECL_FUNCTION_CODE (callee), gsi);
> +           break;
> +         default:
> +           break;
> +         }
> +    }
> +  else if (is_gimple_assign (stmt)
> +          && (lhs = gimple_assign_lhs (stmt)) != NULL_TREE
> +          && TREE_CODE (lhs) == SSA_NAME
> +          && POINTER_TYPE_P (TREE_TYPE (lhs)))
> +    {
> +      if (gimple_assign_single_p (stmt)
> +         || (gimple_assign_cast_p (stmt)
> +             && POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt)))))
> +       {
> +         int idx = get_stridx (gimple_assign_rhs1 (stmt));
> +         ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] = idx;
> +       }
> +      else if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR)
> +       {
> +         int idx = get_stridx (gimple_assign_rhs1 (stmt));
> +         if (idx > 0)
> +           {
> +             strinfo si = get_strinfo (idx);
> +             if (si != NULL)
> +               {
> +                 tree off = gimple_assign_rhs2 (stmt);
> +                 if (operand_equal_p (si->length, off, 0))
> +                   zero_length_string (lhs, si);
> +                 else if (TREE_CODE (off) == SSA_NAME)
> +                   {
> +                     gimple def_stmt = SSA_NAME_DEF_STMT (off);
> +                     if (gimple_assign_single_p (def_stmt)
> +                         && operand_equal_p (si->length,
> +                                             gimple_assign_rhs1 (def_stmt),
> +                                             0))
> +                       zero_length_string (lhs, si);
> +                   }
> +               }
> +           }
> +         else if (idx < 0)
> +           {
> +             tree off = gimple_assign_rhs2 (stmt);
> +             if (host_integerp (off, 1)
> +                 && (unsigned HOST_WIDE_INT) tree_low_cst (off, 1)
> +                    <= (unsigned HOST_WIDE_INT) ~idx)
> +               ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)]
> +                 = ~(~idx - (int) tree_low_cst (off, 1));
> +           }
> +       }
> +    }
> +
> +  if (gimple_vdef (stmt))
> +    maybe_invalidate (stmt);
> +}
> +
> +/* Recursively call maybe_invalidate on stmts that might be executed
> +   in between dombb and current bb and that contain a vdef.  Stop when
> +   *count stmts are inspected, or if the whole strinfo vector has
> +   been invalidated.  */
> +
> +static void
> +do_invalidate (basic_block dombb, gimple phi, bitmap visited, int *count)
> +{
> +  unsigned int i, n = gimple_phi_num_args (phi);
> +
> +  for (i = 0; i < n; i++)
> +    {
> +      tree vuse = gimple_phi_arg_def (phi, i);
> +      gimple stmt = SSA_NAME_DEF_STMT (vuse);
> +      basic_block bb = gimple_bb (stmt);
> +      if (bb == NULL
> +         || !bitmap_set_bit (visited, bb->index)
> +         || !dominated_by_p (CDI_DOMINATORS, bb, dombb))
> +       continue;
> +      while (1)
> +       {
> +         if (gimple_code (stmt) == GIMPLE_PHI)
> +           {
> +             do_invalidate (dombb, stmt, visited, count);
> +             if (*count == 0)
> +               return;
> +             break;
> +           }
> +         if (--*count == 0)
> +           return;
> +         if (!maybe_invalidate (stmt))
> +           {
> +             *count = 0;
> +             return;
> +           }
> +         vuse = gimple_vuse (stmt);
> +         stmt = SSA_NAME_DEF_STMT (vuse);
> +         if (gimple_bb (stmt) != bb)
> +           {
> +             bb = gimple_bb (stmt);
> +             if (bb == NULL
> +                 || !bitmap_set_bit (visited, bb->index)
> +                 || !dominated_by_p (CDI_DOMINATORS, bb, dombb))
> +               break;
> +           }
> +       }
> +    }
> +}
> +
> +/* Callback for walk_dominator_tree.  Attempt to optimize various
> +   string ops by remembering string lengths pointed to by pointer SSA_NAMEs.  */
> +
> +static void
> +strlen_enter_block (struct dom_walk_data *walk_data ATTRIBUTE_UNUSED,
> +                   basic_block bb)
> +{
> +  gimple_stmt_iterator gsi;
> +  basic_block dombb = get_immediate_dominator (CDI_DOMINATORS, bb);
> +
> +  if (dombb == NULL)
> +    stridx_to_strinfo = NULL;
> +  else
> +    {
> +      stridx_to_strinfo = (VEC(strinfo, heap) *) dombb->aux;
> +      if (stridx_to_strinfo)
> +       {
> +         for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> +           {
> +             gimple phi = gsi_stmt (gsi);
> +             if (!is_gimple_reg (gimple_phi_result (phi)))
> +               {
> +                 bitmap visited = BITMAP_ALLOC (NULL);
> +                 int count_vdef = 100;
> +                 do_invalidate (dombb, phi, visited, &count_vdef);
> +                 BITMAP_FREE (visited);
> +                 break;
> +               }
> +           }
> +       }
> +    }
> +
> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> +    strlen_optimize_stmt (&gsi);
> +
> +  bb->aux = stridx_to_strinfo;
> +  if (VEC_length (strinfo, stridx_to_strinfo) && !strinfo_shared ())
> +    VEC_replace (strinfo, stridx_to_strinfo, 0, (strinfo) bb);
> +}
> +
> +/* Callback for walk_dominator_tree.  Free strinfo vector if it is
> +   owned by the current bb, clear bb->aux.  */
> +
> +static void
> +strlen_leave_block (struct dom_walk_data *walk_data ATTRIBUTE_UNUSED,
> +                   basic_block bb)
> +{
> +  if (bb->aux)
> +    {
> +      stridx_to_strinfo = (VEC(strinfo, heap) *) bb->aux;
> +      if (VEC_length (strinfo, stridx_to_strinfo)
> +         && VEC_index (strinfo, stridx_to_strinfo, 0) == (strinfo) bb)
> +       {
> +         unsigned int i;
> +         strinfo si;
> +
> +         for (i = 1; VEC_iterate (strinfo, stridx_to_strinfo, i, si); ++i)
> +           free_strinfo (si);
> +         VEC_free (strinfo, heap, stridx_to_strinfo);
> +       }
> +      bb->aux = NULL;
> +    }
> +}
> +
> +/* Main entry point.  */
> +
> +static unsigned int
> +tree_ssa_strlen (void)
> +{
> +  struct dom_walk_data walk_data;
> +
> +  ssa_ver_to_stridx = XCNEWVEC (int, num_ssa_names);
> +  max_stridx = 1;
> +  strinfo_pool = create_alloc_pool ("strinfo_struct pool",
> +                                   sizeof (struct strinfo_struct), 64);
> +
> +  calculate_dominance_info (CDI_DOMINATORS);
> +
> +  /* String length optimization is implemented as a walk of the dominator
> +     tree and a forward walk of statements within each block.  */
> +  walk_data.dom_direction = CDI_DOMINATORS;
> +  walk_data.initialize_block_local_data = NULL;
> +  walk_data.before_dom_children = strlen_enter_block;
> +  walk_data.after_dom_children = strlen_leave_block;
> +  walk_data.block_local_data_size = 0;
> +  walk_data.global_data = NULL;
> +
> +  /* Initialize the dominator walker.  */
> +  init_walk_dominator_tree (&walk_data);
> +
> +  /* Recursively walk the dominator tree.  */
> +  walk_dominator_tree (&walk_data, ENTRY_BLOCK_PTR);
> +
> +  /* Finalize the dominator walker.  */
> +  fini_walk_dominator_tree (&walk_data);
> +
> +  XDELETEVEC (ssa_ver_to_stridx);
> +  free_alloc_pool (strinfo_pool);
> +  if (decl_to_stridxlist_htab)
> +    {
> +      htab_delete (decl_to_stridxlist_htab);
> +      decl_to_stridxlist_htab = NULL;
> +    }
> +
> +  return 0;
> +}
> +
> +static bool
> +gate_strlen (void)
> +{
> +  return flag_tree_strlen != 0;
> +}

Overall this looks good - it feels a bit tree-ish, but I suppose support
for non-constant lengths requires this.  It would be nice to avoid
building new tree calls - yeah, all our call folding still does this ... :/
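
For readers following along, the effect of the strcpy/strcat handlers can be
sketched at the source level. This is a hand-written illustration and not
compiler output; the function names concat_naive and concat_tracked are
made up for the example, and the real pass works on GIMPLE, not on C source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* What the programmer writes: strcat has to rescan q to find its end.  */
static char *
concat_naive (const char *p, const char *r)
{
  char *q = malloc (strlen (p) + strlen (r) + 1);
  if (q == NULL)
    return NULL;
  strcpy (q, p);
  strcat (q, r);
  return q;
}

/* Hand-written equivalent of what the pass aims for: both lengths are
   computed once and reused, so the copies become memcpy calls.  */
static char *
concat_tracked (const char *p, const char *r)
{
  size_t plen = strlen (p);
  size_t rlen = strlen (r);
  char *q = malloc (plen + rlen + 1);
  if (q == NULL)
    return NULL;
  memcpy (q, p, plen + 1);        /* was: strcpy (q, p) */
  memcpy (q + plen, r, rlen + 1); /* was: strcat (q, r) */
  return q;
}
```

Both functions return the same string; the second one avoids rescanning
the destination because the length of q after the first copy is already
known to be plen.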

Thanks,
Richard.

> +struct gimple_opt_pass pass_strlen =
> +{
> + {
> +  GIMPLE_PASS,
> +  "strlen",                    /* name */
> +  gate_strlen,                 /* gate */
> +  tree_ssa_strlen,             /* execute */
> +  NULL,                                /* sub */
> +  NULL,                                /* next */
> +  0,                           /* static_pass_number */
> +  TV_TREE_STRLEN,              /* tv_id */
> +  PROP_cfg | PROP_ssa,         /* properties_required */
> +  0,                           /* properties_provided */
> +  0,                           /* properties_destroyed */
> +  0,                           /* todo_flags_start */
> +  TODO_ggc_collect
> +    | TODO_verify_ssa          /* todo_flags_finish */
> + }
> +};
> --- gcc/testsuite/gcc.dg/strlenopt-1.c.jj       2011-09-02 10:39:12.000000000 +0200
> +++ gcc/testsuite/gcc.dg/strlenopt-1.c  2011-09-02 15:54:23.000000000 +0200
> @@ -0,0 +1,45 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fdump-tree-strlen" } */
> +
> +#include "strlenopt.h"
> +
> +__attribute__((noinline, noclone)) char *
> +foo (char *p, char *r)
> +{
> +  char *q = malloc (strlen (p) + strlen (r) + 64);
> +  if (q == NULL) return NULL;
> +  /* This strcpy can be optimized into memcpy, using the remembered
> +     strlen (p).  */
> +  strcpy (q, p);
> +  /* These two strcat can be optimized into memcpy.  The first one
> +     could be even optimized into a *ptr = '/'; store as the '\0'
> +     is immediately overwritten.  */
> +  strcat (q, "/");
> +  strcat (q, "abcde");
> +  /* Due to inefficient PTA (PR50262) the above calls invalidate
> +     string length of r, so it is optimized just into strcpy instead
> +     of memcpy.  */
> +  strcat (q, r);
> +  return q;
> +}
> +
> +int
> +main ()
> +{
> +  char *volatile p = "string1";
> +  char *volatile r = "string2";
> +  char *q = foo (p, r);
> +  if (q != NULL)
> +    {
> +      if (strcmp (q, "string1/abcdestring2"))
> +       abort ();
> +      free (q);
> +    }
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "memcpy \\(" 3 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "strcpy \\(" 1 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
> +/* { dg-final { cleanup-tree-dump "strlen" } } */
> --- gcc/testsuite/gcc.dg/strlenopt-2.c.jj       2011-09-02 15:31:33.000000000 +0200
> +++ gcc/testsuite/gcc.dg/strlenopt-2.c  2011-09-02 15:54:28.000000000 +0200
> @@ -0,0 +1,47 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fdump-tree-strlen" } */
> +
> +#include "strlenopt.h"
> +
> +__attribute__((noinline, noclone)) char *
> +foo (char *p, char *r)
> +{
> +  char buf[26];
> +  if (strlen (p) + strlen (r) + 9 > 26)
> +    return NULL;
> +  /* This strcpy can be optimized into memcpy, using the remembered
> +     strlen (p).  */
> +  strcpy (buf, p);
> +  /* These two strcat can be optimized into memcpy.  The first one
> +     could be even optimized into a *ptr = '/'; store as the '\0'
> +     is immediately overwritten.  */
> +  strcat (buf, "/");
> +  strcat (buf, "abcde");
> +  /* This strcat can be optimized into memcpy, using the remembered
> +     strlen (r).  */
> +  strcat (buf, r);
> +  /* And this can be optimized into memcpy too.  */
> +  strcat (buf, "fg");
> +  return strdup (buf);
> +}
> +
> +int
> +main ()
> +{
> +  char *volatile p = "string1";
> +  char *volatile r = "string2";
> +  char *q = foo (p, r);
> +  if (q != NULL)
> +    {
> +      if (strcmp (q, "string1/abcdestring2fg"))
> +       abort ();
> +      free (q);
> +    }
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "memcpy \\(" 5 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
> +/* { dg-final { cleanup-tree-dump "strlen" } } */
> --- gcc/testsuite/gcc.dg/strlenopt-3.c.jj       2011-09-02 11:18:52.000000000 +0200
> +++ gcc/testsuite/gcc.dg/strlenopt-3.c  2011-09-02 15:54:32.000000000 +0200
> @@ -0,0 +1,63 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fdump-tree-strlen -fdump-tree-optimized" } */
> +
> +#include "strlenopt.h"
> +
> +__attribute__((noinline, noclone)) size_t
> +fn1 (char *p, char *q)
> +{
> +  size_t s = strlen (q);
> +  strcpy (p, q);
> +  return s - strlen (p);
> +}
> +
> +__attribute__((noinline, noclone)) size_t
> +fn2 (char *p, char *q)
> +{
> +  size_t s = strlen (q);
> +  memcpy (p, q, s + 1);
> +  return s - strlen (p);
> +}
> +
> +__attribute__((noinline, noclone)) size_t
> +fn3 (char *p)
> +{
> +  memcpy (p, "abcd", 5);
> +  return strlen (p);
> +}
> +
> +__attribute__((noinline, noclone)) size_t
> +fn4 (char *p)
> +{
> +  memcpy (p, "efg\0hij", 6);
> +  return strlen (p);
> +}
> +
> +int
> +main ()
> +{
> +  char buf[64];
> +  char *volatile p = buf;
> +  char *volatile q = "ABCDEF";
> +  buf[7] = 'G';
> +  if (fn1 (p, q) != 0 || memcmp (buf, "ABCDEF\0G", 8))
> +    abort ();
> +  q = "HIJ";
> +  if (fn2 (p + 1, q) != 0 || memcmp (buf, "AHIJ\0F\0G", 8))
> +    abort ();
> +  buf[6] = 'K';
> +  if (fn3 (p + 1) != 4 || memcmp (buf, "Aabcd\0KG", 8))
> +    abort ();
> +  if (fn4 (p) != 3 || memcmp (buf, "efg\0hiKG", 8))
> +    abort ();
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "memcpy \\(" 4 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */
> +/* { dg-final { cleanup-tree-dump "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "return 0" 3 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "return 4" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "return 3" 1 "optimized" } } */
> +/* { dg-final { cleanup-tree-dump "optimized" } } */
> --- gcc/testsuite/gcc.dg/strlenopt.h.jj 2011-09-02 10:36:04.000000000 +0200
> +++ gcc/testsuite/gcc.dg/strlenopt.h    2011-09-02 12:39:08.000000000 +0200
> @@ -0,0 +1,57 @@
> +/* This is a replacement of needed parts from stdlib.h and string.h
> +   for -ftree-strlen testing, to ensure we are testing the builtins
> +   rather than whatever the OS has in its headers.  */
> +
> +#define NULL ((void *) 0)
> +typedef __SIZE_TYPE__ size_t;
> +extern void abort (void);
> +void *malloc (size_t);
> +void free (void *);
> +char *strdup (const char *);
> +size_t strlen (const char *);
> +void *memcpy (void *__restrict, const void *__restrict, size_t);
> +char *strcpy (char *__restrict, const char *__restrict);
> +char *strcat (char *__restrict, const char *__restrict);
> +int memcmp (const void *, const void *, size_t);
> +int strcmp (const char *, const char *);
> +#ifdef USE_GNU
> +void *mempcpy (void *__restrict, const void *__restrict, size_t);
> +char *stpcpy (char *__restrict, const char *__restrict);
> +#endif
> +
> +#if defined(FORTIFY_SOURCE) && FORTIFY_SOURCE > 0 && __OPTIMIZE__
> +# define bos(ptr) __builtin_object_size (ptr, FORTIFY_SOURCE > 0)
> +# define bos0(ptr) __builtin_object_size (ptr, 0)
> +
> +extern inline __attribute__((gnu_inline, always_inline, artificial)) void *
> +memcpy (void *__restrict dest, const void *__restrict src, size_t len)
> +{
> +  return __builtin___memcpy_chk (dest, src, len, bos0 (dest));
> +}
> +
> +extern inline __attribute__((gnu_inline, always_inline, artificial)) char *
> +strcpy (char *__restrict dest, const char *__restrict src)
> +{
> +  return __builtin___strcpy_chk (dest, src, bos (dest));
> +}
> +
> +extern inline __attribute__((gnu_inline, always_inline, artificial)) char *
> +strcat (char *__restrict dest, const char *__restrict src)
> +{
> +  return __builtin___strcat_chk (dest, src, bos (dest));
> +}
> +
> +# ifdef USE_GNU
> +extern inline __attribute__((gnu_inline, always_inline, artificial)) void *
> +mempcpy (void *__restrict dest, const void *__restrict src, size_t len)
> +{
> +  return __builtin___mempcpy_chk (dest, src, len, bos0 (dest));
> +}
> +
> +extern inline __attribute__((gnu_inline, always_inline, artificial)) char *
> +stpcpy (char *__restrict dest, const char *__restrict src)
> +{
> +  return __builtin___stpcpy_chk (dest, src, bos (dest));
> +}
> +# endif
> +#endif
>
>        Jakub
>

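The index convention that handle_builtin_memcpy relies on in the quoted
patch can be modelled in a few lines. This is a minimal sketch under the
reading that an index of 0 means nothing is known, a positive index names
a tracked strinfo entry, and a negative index idx stands for a string
literal of length ~idx. The helper names literal_stridx and
memcpy_defines_length are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: encode a string literal as the complement of its
   length, so the result is always negative.  */
static int
literal_stridx (const char *lit)
{
  return ~(int) strlen (lit);
}

/* Model of the constant-source check: memcpy (x, "literal", n) defines
   strlen (x) only if n also covers the terminating NUL, that is n > ~idx.
   Returns 1 and stores the resulting length in *len, otherwise returns 0.  */
static int
memcpy_defines_length (int idx, size_t n, size_t *len)
{
  if (idx >= 0 || n <= (size_t) ~idx)
    return 0;
  *len = (size_t) ~idx;
  return 1;
}
```

With this model, copying 5 bytes from "abcd" records a length of 4, and
copying 7 bytes from "abc\0uvw" records a length of 3, matching the two
cases named in the patch comment.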

