This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: strlen optimization of decompose strcat?!


On Fri, Oct 21, 2011 at 03:20:54PM +0200, Andreas Krebbel wrote:
> on s390 a strcat is already decomposed by fold_builtin_strcat into a
> strlen and a strcpy.  Due to that 3 strlenopt testcases currently
> fail: strlenopt-17g.c, strlenopt-4.c, strlenopt-4g.c.

Well, fold_builtin_strcat does such kind of optimization on all targets,
but only for known lengths of the second argument.

> Perhaps we can remove the strcat folding relying on the strlen
> optimization pass to catch all relevant cases?

I think that would be the best thing to do, nuke the
if (optimize_insn_for_speed_p ()) part of fold_builtin_strcat and
add it instead into handle_builtin_strcat if we'd keep it as strcat
after that pass.  That said, your change makes sense and the nuking
of that folding can be done separately.

> *** gcc/tree-ssa-strlen.c.orig
> --- gcc/tree-ssa-strlen.c
> *************** get_string_length (strinfo si)
> *** 397,403 ****
>         callee = gimple_call_fndecl (stmt);
>         gcc_assert (callee && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL);
>         lhs = gimple_call_lhs (stmt);
> !       gcc_assert (builtin_decl_implicit_p (BUILT_IN_STRCPY));
>         /* unshare_strinfo is intentionally not called here.  The (delayed)
>   	 transformation of strcpy or strcat into stpcpy is done at the place
>   	 of the former strcpy/strcat call and so can affect all the strinfos
> --- 397,403 ----
>         callee = gimple_call_fndecl (stmt);
>         gcc_assert (callee && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL);
>         lhs = gimple_call_lhs (stmt);
> !       gcc_assert (builtin_decl_implicit_p (BUILT_IN_STPCPY));
>         /* unshare_strinfo is intentionally not called here.  The (delayed)
>   	 transformation of strcpy or strcat into stpcpy is done at the place
>   	 of the former strcpy/strcat call and so can affect all the strinfos

The above hunk is correct.

> *************** handle_builtin_strcpy (enum built_in_fun
> *** 1115,1127 ****
>     dsi->writable = true;
>     dsi->dont_invalidate = true;
>   
> !   if (dsi->length == NULL_TREE)
>       {
>         /* If string length of src is unknown, use delayed length
>   	 computation.  If string lenth of dst will be needed, it
>   	 can be computed by transforming this strcpy call into
>   	 stpcpy and subtracting dst from the return value.  */
> !       dsi->stmt = stmt;
>         return;
>       }
>   
> --- 1115,1144 ----
>     dsi->writable = true;
>     dsi->dont_invalidate = true;
>   
> !   if (builtin_decl_implicit_p (BUILT_IN_STPCPY) && dsi->length == NULL_TREE)

Why this?  dsi->length will be NULL only if src length is unknown.
And at that point
      case BUILT_IN_STRCPY:
      case BUILT_IN_STRCPY_CHK:
        if (lhs != NULL_TREE || !builtin_decl_implicit_p (BUILT_IN_STPCPY))
          return;
        break;
should have already returned if it is not true.

>       {
> +       strinfo psi;
> + 
>         /* If string length of src is unknown, use delayed length
>   	 computation.  If string lenth of dst will be needed, it
>   	 can be computed by transforming this strcpy call into
>   	 stpcpy and subtracting dst from the return value.  */
> !       psi = get_strinfo (dsi->prev);

I think you should do
  if (dsi->prev != 0 && verify_related_strinfos (dsi) != NULL)
    {
      psi = get_strinfo (dsi->prev);
      if (psi->endptr == dsi->ptr)
	...
    }

> ! 	  strinfo npsi = unshare_strinfo (psi);

No need to add a new pointer for that.  Just do psi = unshare_strinfo (psi);

> ! 	  npsi->stmt = stmt;
> ! 	  npsi->next = 0;
> ! 	  npsi->length = NULL_TREE;
> ! 	  npsi->endptr = NULL_TREE;
> ! 	  npsi->dont_invalidate = true;
> ! 	}
> !       else
> ! 	dsi->stmt = stmt;

Do you really need to clear npsi->next and not set dsi->stmt
if you tweak npsi->stmt?  I mean, if you have:
#define _GNU_SOURCE /* to make stpcpy prototype visible */
#include <string.h>

size_t
foo (char *p, char *q)
{
  size_t l1, l2;
  char *r = strchr (p, '\0');
  strcpy (r, q);
  l1 = strlen (p);
  l2 = strlen (r); /* Or with swapped order.  */
  return l1 + l2;
}

you can compute either l1, or l2, or both using get_string_length
strcpy -> stpcpy transformation (and in any order).
Perhaps we could avoid invalidating not just the previous one,
but also all earlier ones with endptr == dsi->ptr, say if you have

size_t
bar (char *p, char *q)
{
  size_t l1, l2, l3;
  char *r = strchr (p, '\0');
  strcpy (r, "abcde");
  char *s = strchr (r, '\0');
  strcpy (s, q);
  l1 = strlen (p);
  l2 = strlen (r);
  l3 = strlen (s);
  return l1 + l2 + l3;
}

No matter what, it needs to have sufficient testsuite coverage added,
using something that will not depend on the HAVE_movstr strcat
folding (i.e. strchr + strcpy (or strlen + strcpy)).

	Jakub


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]