Bug 86400 - set<string>::set<char (*)[2]> constructor does not work with array argument
Summary: set<string>::set<char (*)[2]> constructor does not work with array argument
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 8.1.1
: P2 normal
Target Milestone: 8.2
Assignee: Martin Sebor
URL:
Keywords: patch, wrong-code
Depends on:
Blocks: strlen
  Show dependency treegraph
 
Reported: 2018-07-04 08:42 UTC by Andreas Schwab
Modified: 2018-07-05 16:56 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail: 8.1.0
Last reconfirmed: 2018-07-04 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Andreas Schwab 2018-07-04 08:42:23 UTC
This fails at -O2, but not at -O.  The problem appears to be the basic_string(const char*) constructor miscomputing the arguments for _M_construct, with traits_type::length(__s) returning 0.

$ cat set.cc
#include <cassert>
#include <string>
#include <set>

void
foo1 ()
{
  static char root[2] = {"/"};
  std::set<std::string, std::less<std::string>> d (&root, &root + 1);
  bool b = d.find ("/") != d.end ();
  assert (b);
}

void
foo2 ()
{
  static char root[1][2] = {"/"};
  std::set<std::string, std::less<std::string>> d (root, root + 1);
  bool b = d.find ("/") != d.end ();
  assert (b);
}

int
main ()
{
  foo1 ();
  foo2 ();
}
$ g++ -O2 -g -Wall set.cc -o set
$ ./set
set: set.cc:20: void foo2(): Assertion `b' failed.
Comment 1 Richard Biener 2018-07-04 08:45:51 UTC
Note it works with the old string ABI so maybe it's a libstdc++ issue after all.
Comment 2 Jonathan Wakely 2018-07-04 20:02:42 UTC
It fails with -O1 -foptimize-strlen -fipa-sra -finline-small-functions
Comment 3 Jonathan Wakely 2018-07-04 22:35:52 UTC
The set constructs a std::string element from *root i.e. "/" with this constructor:

514  basic_string(const _CharT* __s, const _Alloc& __a = _Alloc())
515  : _M_dataplus(_M_local_data(), __a)
516  { _M_construct(__s, __s ? __s + traits_type::length(__s) : __s+npos); }

Which then calls _M_construct with the same pointer for __beg and __end:

std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*> (__end=0x6020a0 <foo2()::root> "/", __beg=0x6020a0 <foo2()::root> "/", this=0x7fffffffd320) at /home/jwakely/gcc/9/include/c++/9.0.0/bits/basic_string.h:516

This constructs an empty string object, and so of course the d.find("/") call returns the end iterator.

The problem is that traits_type::length(__s) returns 0, which is consistent with my earlier observation that -foptimize-strlen is required to reproduce the failure.
Comment 4 Jonathan Wakely 2018-07-04 22:40:36 UTC
# /home/jwakely/gcc/9/include/c++/9.0.0/bits/basic_string.h:236:           _M_construct(__beg, __end, _Tag());
	.loc 3 236 11 is_stmt 0
	movl	$_ZZ4foo2vE4root, %edx	#,
	movq	%rdx, %rsi	#,
	movq	%rsp, %rdi	#,
.LEHB0:
	call	_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_M_constructIPKcEEvT_S8_St20forward_iterator_tag	#


The output of -fdump-tree-optimized:


  <bb 2> [local count: 8687547398]:
  MEM[(struct  &)&d] ={v} {CLOBBER};
  MEM[(struct  &)&d] ={v} {CLOBBER};
  MEM[(struct  &)&d + 8] ={v} {CLOBBER};
  MEM[(struct _Rb_tree_header *)&d + 8B]._M_header._M_color = 0;
  MEM[(struct _Rb_tree_header *)&d + 8B]._M_header._M_parent = 0B;
  MEM[(struct _Rb_tree_header *)&d + 8B]._M_header._M_left = &MEM[(struct _Rb_tree_header *)&d + 8B]._M_header;
  MEM[(struct _Rb_tree_header *)&d + 8B]._M_header._M_right = &MEM[(struct _Rb_tree_header *)&d + 8B]._M_header;
  MEM[(struct _Rb_tree_header *)&d + 8B]._M_node_count = 0;
  MEM[(struct _Rb_tree_const_iterator *)&__position] = &MEM[(struct _Rb_tree *)&d]._M_impl.D.30202._M_header;
  MEM[(struct  &)&D.38131] ={v} {CLOBBER};
  MEM[(struct  &)&D.38131] ={v} {CLOBBER};
  MEM[(struct _Alloc_hider *)&D.38131]._M_p = &D.38131.D.18738._M_local_buf;
  iftmp.26_33 = &root;
  std::__cxx11::basic_string<char>::_M_construct<const char*> (&D.38131, &root, iftmp.26_33, D.38135);

iftmp.26_33 is the __end argument, and is set to the same value as __beg.
Comment 5 Jonathan Wakely 2018-07-04 22:45:54 UTC
#include <cassert>
void
foo2 ()
{
  static char root[1][2] = {"/"};
  auto len = __builtin_strlen(*root);
  assert(len == 1);
}

int
main ()
{
  foo2 ();
}


$ g++ set.cc -O1 -foptimize-strlen
$ ./a.out
a.out: set.cc:7: void foo2(): Assertion `len == 1' failed.
Aborted (core dumped)
Comment 6 Jonathan Wakely 2018-07-04 22:50:12 UTC
Reduced further:

void
foo2 ()
{
  static char root[1][2] = {"/"};
  auto len = __builtin_strlen(*root);
  if (len != 1)
          __builtin_abort();
}

int
main ()
{
  foo2 ();
}


This started to fail with r255790

    PR middle-end/83373 - False positive reported by -Wstringop-overflow                                                                                                                                                                                                                                                     
    PR tree-optimization/78450 - strlen(s) return value can be assumed to be less than the size of s                                                                                                                                                                                                                         
                                                                                                                                                                                                                                                                                                                             
    gcc/ChangeLog:                                                                                                                                                                                                                                                                                                           
                                                                                                                                                                                                                                                                                                                             
            PR middle-end/83373                                                                                                                                                                                                                                                                                              
            PR tree-optimization/78450                                                                                                                                                                                                                                                                                       
            * tree-ssa-strlen.c (maybe_set_strlen_range): New function.                                                                                                                                                                                                                                                      
            (handle_builtin_strlen): Call it.                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                             
    gcc/testsuite/ChangeLog:                                                                                                                                                                                                                                                                                                 
                                                                                                                                                                                                                                                                                                                             
            PR middle-end/83373                                                                                                                                                                                                                                                                                              
            PR tree-optimization/78450                                                                                                                                                                                                                                                                                       
            * gcc.dg/pr83373.c: New test.                                                                                                                                                                                                                                                                                    
            * gcc.dg/strlenopt-36.c: New test.                                                                                                                                                                                                                                                                               
            * gcc.dg/strlenopt-37.c: New test.
Comment 7 Martin Sebor 2018-07-05 00:15:38 UTC
Confirmed.  The strlen pass sees:

  static char root[1][2] = {"/"};
  ...
  _1 = __builtin_strlen (&root);


but the code in maybe_set_strlen_range() assumes the argument has the correct type (i.e., char*, not char[][2]) and uses TYPE_DOMAIN to determine the upper bound of the array which returns the bound of the outer array in this case.  Using TYPE_SIZE instead avoids the problem:

Index: gcc/tree-ssa-strlen.c
===================================================================
--- gcc/tree-ssa-strlen.c	(revision 262418)
+++ gcc/tree-ssa-strlen.c	(working copy)
@@ -1156,22 +1156,17 @@ maybe_set_strlen_range (tree lhs, tree src, tree b
       if (src_is_array && !array_at_struct_end_p (src))
 	{
 	  tree type = TREE_TYPE (src);
-	  if (tree dom = TYPE_DOMAIN (type))
-	    {
-	      tree maxval = TYPE_MAX_VALUE (dom);
-	      if (maxval)
-		max = wi::to_wide (maxval);
-	      else
-		max = wi::zero (min.get_precision ());
+	  if (tree size = TYPE_SIZE_UNIT (type))
+	    if (size && TREE_CODE (size) == INTEGER_CST)
+	      max = wi::to_wide (size);
 
-	      /* For strlen() the upper bound above is equal to
-		 the longest string that can be stored in the array
-		 (i.e., it accounts for the terminating nul.  For
-		 strnlen() bump up the maximum by one since the array
-		 need not be nul-terminated.  */
-	      if (bound)
-		++max;
-	    }
+	  /* For strlen() the upper bound above is equal to
+	     the longest string that can be stored in the array
+	     (i.e., it accounts for the terminating nul.  For
+	     strnlen() bump up the maximum by one since the array
+	     need not be nul-terminated.  */
+	  if (!bound && max != 0)
+	    --max;
 	}
       else
 	{
Comment 8 Martin Sebor 2018-07-05 02:18:03 UTC
Patch: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00220.html
Comment 9 rguenther@suse.de 2018-07-05 07:29:21 UTC
On Thu, 5 Jul 2018, msebor at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86400
> 
> Martin Sebor <msebor at gcc dot gnu.org> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>            Keywords|                            |wrong-code
>              Status|NEW                         |ASSIGNED
>              Blocks|                            |83819
>            Assignee|unassigned at gcc dot gnu.org      |msebor at gcc dot gnu.org
> 
> --- Comment #7 from Martin Sebor <msebor at gcc dot gnu.org> ---
> Confirmed.  The strlen pass sees:
> 
>   static char root[1][2] = {"/"};
>   ...
>   _1 = __builtin_strlen (&root);
> 
> 
> but the code in maybe_set_strlen_range() assumes the argument has the correct
> type (i.e., char*, not char[][2])

But the address of an array type is a pointer to an array type
in GENERIC.

> and uses TYPE_DOMAIN to determine the upper
> bound of the array which returns the bound of the outer array in this case. 
> Using TYPE_SIZE instead avoids the problem:
> 
> Index: gcc/tree-ssa-strlen.c
> ===================================================================
> --- gcc/tree-ssa-strlen.c       (revision 262418)
> +++ gcc/tree-ssa-strlen.c       (working copy)
> @@ -1156,22 +1156,17 @@ maybe_set_strlen_range (tree lhs, tree src, tree b
>        if (src_is_array && !array_at_struct_end_p (src))
>         {
>           tree type = TREE_TYPE (src);
> -         if (tree dom = TYPE_DOMAIN (type))
> -           {
> -             tree maxval = TYPE_MAX_VALUE (dom);
> -             if (maxval)
> -               max = wi::to_wide (maxval);
> -             else
> -               max = wi::zero (min.get_precision ());
> +         if (tree size = TYPE_SIZE_UNIT (type))

That should indeed include sub-dimensions.  Note the above
is only correct if BITS_PER_UNIT == CHAR_TYPE_SIZE (we probably
do not have any target where that isn't true).

> +           if (size && TREE_CODE (size) == INTEGER_CST)
> +             max = wi::to_wide (size);
> 
> -             /* For strlen() the upper bound above is equal to
> -                the longest string that can be stored in the array
> -                (i.e., it accounts for the terminating nul.  For
> -                strnlen() bump up the maximum by one since the array
> -                need not be nul-terminated.  */
> -             if (bound)
> -               ++max;
> -           }
> +         /* For strlen() the upper bound above is equal to
> +            the longest string that can be stored in the array
> +            (i.e., it accounts for the terminating nul.  For
> +            strnlen() bump up the maximum by one since the array
> +            need not be nul-terminated.  */
> +         if (!bound && max != 0)
> +           --max;
>         }
>        else
>         {
> 
> 
> Referenced Bugs:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83819
> [Bug 83819] [meta-bug] missing strlen optimizations
>
Comment 10 Martin Sebor 2018-07-05 14:36:41 UTC
Author: msebor
Date: Thu Jul  5 14:36:09 2018
New Revision: 262438

URL: https://gcc.gnu.org/viewcvs?rev=262438&root=gcc&view=rev
Log:
PR tree-optimization/86400 - set<string>::set<char (*)[2]) constructor does not work with array argument

gcc/ChangeLog:
	* tree-ssa-strlen.c (maybe_set_strlen_range): Use type size rather
	than its domain to compute its the upper bound of a char array.

gcc/testsuite/ChangeLog:
	* gcc.dg/strlenopt-47.c: New test.
	* gcc.dg/strlenopt-48.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/strlenopt-47.c
    trunk/gcc/testsuite/gcc.dg/strlenopt-48.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-strlen.c
Comment 11 Martin Sebor 2018-07-05 16:43:22 UTC
Author: msebor
Date: Thu Jul  5 16:42:50 2018
New Revision: 262446

URL: https://gcc.gnu.org/viewcvs?rev=262446&root=gcc&view=rev
Log:
Backport from trunk.

PR tree-optimization/86400 - set<string>::set<char (*)[2]) constructor does not
work with array argument.

gcc/ChangeLog:
	* gcc.dg/strlenopt-47.c: New test.
	* gcc.dg/strlenopt-48.c: New test.

gcc/testsuite/ChangeLog:
	* gcc.dg/strlenopt-47.c: New test.
	* gcc.dg/strlenopt-48.c: New test.


Added:
    branches/gcc-8-branch/gcc/testsuite/gcc.dg/strlenopt-47.c
    branches/gcc-8-branch/gcc/testsuite/gcc.dg/strlenopt-48.c
Modified:
    branches/gcc-8-branch/gcc/ChangeLog
    branches/gcc-8-branch/gcc/testsuite/ChangeLog
    branches/gcc-8-branch/gcc/tree-ssa-strlen.c
Comment 12 Martin Sebor 2018-07-05 16:44:21 UTC
Fixed on trunk (9.0) and 8.2.0.