This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: strlen optimization of decompose strcat?!
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Andreas Krebbel <krebbel at linux dot vnet dot ibm dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Fri, 21 Oct 2011 16:17:13 +0200
- Subject: Re: strlen optimization of decompose strcat?!
- References: <20111021132054.GA6892@bart>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Fri, Oct 21, 2011 at 03:20:54PM +0200, Andreas Krebbel wrote:
> on s390 a strcat is already decomposed by fold_builtin_strcat into a
> strlen and a strcpy. Due to that 3 strlenopt testcases currently
> fail: strlenopt-17g.c, strlenopt-4.c, strlenopt-4g.c.
Well, fold_builtin_strcat does such kind of optimization on all targets,
but only for known lengths of the second argument.
> Perhaps we can remove the strcat folding relying on the strlen
> optimization pass to catch all relevant cases?
I think that would be the best thing to do, nuke the
if (optimize_insn_for_speed_p ()) part of fold_builtin_strcat and
add it instead into handle_builtin_strcat if we'd keep it as strcat
after that pass. That said, your change makes sense and the nuking
of that folding can be done separately.
> *** gcc/tree-ssa-strlen.c.orig
> --- gcc/tree-ssa-strlen.c
> *************** get_string_length (strinfo si)
> *** 397,403 ****
> callee = gimple_call_fndecl (stmt);
> gcc_assert (callee && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL);
> lhs = gimple_call_lhs (stmt);
> ! gcc_assert (builtin_decl_implicit_p (BUILT_IN_STRCPY));
> /* unshare_strinfo is intentionally not called here. The (delayed)
> transformation of strcpy or strcat into stpcpy is done at the place
> of the former strcpy/strcat call and so can affect all the strinfos
> --- 397,403 ----
> callee = gimple_call_fndecl (stmt);
> gcc_assert (callee && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL);
> lhs = gimple_call_lhs (stmt);
> ! gcc_assert (builtin_decl_implicit_p (BUILT_IN_STPCPY));
> /* unshare_strinfo is intentionally not called here. The (delayed)
> transformation of strcpy or strcat into stpcpy is done at the place
> of the former strcpy/strcat call and so can affect all the strinfos
The above hunk is correct.
> *************** handle_builtin_strcpy (enum built_in_fun
> *** 1115,1127 ****
> dsi->writable = true;
> dsi->dont_invalidate = true;
>
> ! if (dsi->length == NULL_TREE)
> {
> /* If string length of src is unknown, use delayed length
> computation. If string lenth of dst will be needed, it
> can be computed by transforming this strcpy call into
> stpcpy and subtracting dst from the return value. */
> ! dsi->stmt = stmt;
> return;
> }
>
> --- 1115,1144 ----
> dsi->writable = true;
> dsi->dont_invalidate = true;
>
> ! if (builtin_decl_implicit_p (BUILT_IN_STPCPY) && dsi->length == NULL_TREE)
Why this? dsi->length will be NULL only if src length is unknown.
And at that point
case BUILT_IN_STRCPY:
case BUILT_IN_STRCPY_CHK:
if (lhs != NULL_TREE || !builtin_decl_implicit_p (BUILT_IN_STPCPY))
return;
break;
should have already returned if it is not true.
> {
> + strinfo psi;
> +
> /* If string length of src is unknown, use delayed length
> computation. If string lenth of dst will be needed, it
> can be computed by transforming this strcpy call into
> stpcpy and subtracting dst from the return value. */
> ! psi = get_strinfo (dsi->prev);
I think you should do
if (dsi->prev != 0 && verify_related_strinfos (dsi) != NULL)
{
psi = get_strinfo (dsi->prev);
if (psi->endptr == dsi->ptr)
...
}
> ! strinfo npsi = unshare_strinfo (psi);
No need to add a new pointer for that. Just do psi = unshare_strinfo (psi);
> ! npsi->stmt = stmt;
> ! npsi->next = 0;
> ! npsi->length = NULL_TREE;
> ! npsi->endptr = NULL_TREE;
> ! npsi->dont_invalidate = true;
> ! }
> ! else
> ! dsi->stmt = stmt;
Do you really need to clear npsi->next and not set dsi->stmt
if you tweak npsi->stmt? I mean, if you have:
#define _GNU_SOURCE /* to make stpcpy prototype visible */
#include <string.h>
size_t
foo (char *p, char *q)
{
size_t l1, l2;
char *r = strchr (p, '\0');
strcpy (r, q);
l1 = strlen (p);
l2 = strlen (r); /* Or with swapped order. */
return l1 + l2;
}
you can compute either l1, or l2, or both using get_string_length
strcpy -> stpcpy transformation (and in any order).
Perhaps we could avoid invalidating not just the previous one,
but also all earlier ones with endptr == dsi->ptr, say if you have
size_t
bar (char *p, char *q)
{
size_t l1, l2, l3;
char *r = strchr (p, '\0');
strcpy (r, "abcde");
char *s = strchr (r, '\0');
strcpy (s, q);
l1 = strlen (p);
l2 = strlen (r);
l3 = strlen (s);
return l1 + l2 + l3;
}
No matter what, it needs to have sufficient testsuite coverage added,
using something that will not depend on the HAVE_movstr strcat
folding (i.e. strchr + strcpy (or strlen + strcpy)).
Jakub