Bug 7961 - compare( char *) implemented incorrectly.
Summary: compare( char *) implemented incorrectly.
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 3.1.1
: P3 normal
Target Milestone: ---
Assignee: Paolo Carlini
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2002-09-17 20:36 UTC by john.carter
Modified: 2003-07-25 17:33 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description john.carter 2002-09-17 20:36:00 UTC
In bits/basic_string.h

  template<typename _CharT, typename _Traits, typename _Alloc>
    inline bool
    operator==(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
	       const _CharT* __rhs)
    { return __lhs.compare(__rhs) == 0; }

Which invokes in bits/basic_string.tc....

  template<typename _CharT, typename _Traits, typename _Alloc>
    int
    basic_string<_CharT, _Traits, _Alloc>::
    compare(const _CharT* __s) const
    {
      size_type __size = this->size();
      int __r = traits_type::compare(_M_data(), __s, __size);
      if (!__r)
	__r = __size - traits_type::length(__s);
      return __r;
    }

Which invokes ...

bits/char_traits.h

      static int 
      compare(const char_type* __s1, const char_type* __s2, size_t __n)
      { return memcmp(__s1, __s2, __n); }

Release:
gcc-3.1.1

Environment:
All.

How-To-Repeat:
So this bit of code can possibly segviolate if it happens to be in the wrong place at the wrong time....

  string lhs( "abc");
  
  lhs.append( '\0', 1);
 
  lhs += "def";

  lhs == "abc"
Comment 1 john.carter 2002-09-17 20:36:00 UTC
Fix:
A correct implementation would be...
  template<typename _CharT, typename _Traits, typename _Alloc>
    int
    basic_string<_CharT, _Traits, _Alloc>::
    compare(const _CharT* __s) const
    {
      size_type __size = this->size();
      size_type __s_size = traits_types::length(__s);
      size_type __min = __size;
      if ( __size  > __s_size) 
        __min = __s_size;

      int __r = traits_type::compare(_M_data(), __s, __min);
      if (!__r)
	__r = __size - _s_size;
       
      return __r;
    }


I haven't checked, but I suspect other code using the mem* functions in char_traits.h may suffer from the same problem.
Comment 2 Andreas Schwab 2002-09-18 10:39:54 UTC
From: Andreas Schwab <schwab@suse.de>
To: john.carter@tait.co.nz
Cc: gcc-gnats@gcc.gnu.org
Subject: Re: libstdc++/7961: compare( char *) implemented incorrectly.
Date: Wed, 18 Sep 2002 10:39:54 +0200

 john.carter@tait.co.nz writes:
 
 |> A correct implementation would be...
 |>   template<typename _CharT, typename _Traits, typename _Alloc>
 |>     int
 |>     basic_string<_CharT, _Traits, _Alloc>::
 |>     compare(const _CharT* __s) const
 |>     {
 |>       size_type __size = this->size();
 |>       size_type __s_size = traits_types::length(__s);
 |>       size_type __min = __size;
 |>       if ( __size  > __s_size) 
 |>         __min = __s_size;
 |> 
 |>       int __r = traits_type::compare(_M_data(), __s, __min);
 |>       if (!__r)
 |> 	__r = __size - _s_size;
 |>        
 |>       return __r;
 |>     }
 
 This is not correct either, because __size - __s_size may overflow the
 range of int.  Try this instead:
 
       if (!__r)
 	__r = (__size > __s_size) - (__size < __s_size);
 
 Andreas.
 
 -- 
 Andreas Schwab, SuSE Labs, schwab@suse.de
 SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg
 Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
 "And now for something completely different."

Comment 3 john.carter 2002-09-19 09:27:14 UTC
From: John Carter <john.carter@tait.co.nz>
To: Andreas Schwab <schwab@suse.de>
Cc: john.carter@tait.co.nz, gcc-gnats@gcc.gnu.org
Subject: Re: libstdc++/7961: compare( char *) implemented incorrectly.
Date: Thu, 19 Sep 2002 09:27:14 +1200 (NZST)

 If we going to do two comparisons and a subtract why not
 
    template<typename _CharT, typename _Traits, typename _Alloc>
      int
      basic_string<_CharT, _Traits, _Alloc>::
      compare(const _CharT* __s) const
      {
        size_type __size = this->size();
        size_type __s_size = traits_types::length(__s);
 
        if (__size == __s_size) 
           return traits_type::compare(_M_data(), __s, __min);
 
        size_type __min = __size;
        int __result = -1;
        if ( __size  > __s_size) {
          __result = 1;
          __min = __s_size;
        }
  
        int __r = traits_type::compare(_M_data(), __s, __min);
        if (__r)
          return __r;
 
        return __result;
      }
 
 Hmm, I'm not sure that is better, probably need to look at the
 assembler generated or benchmark it. 
 
 Even faster would be to fold in an implementation of strcmp. (We
 already scan the c-string once with the trait_types::length,
 cosmically speaking we shouldn't need to do that.). 
 
 Unfortunately, just using strcmp directly is probably a bad idea as
 you have to take care. A string can have null's in any place but a c
 string can only have it at the end. 
 
  ie. 
    string abcNull( "abc\0");
 
    abcNull > "abc" 
    strcmp( abcNull.c_str(), "abs") == 0
 
 I would go for correctness and simplicity over speed.
 
    
 
 On Wed, 18 Sep 2002, Andreas Schwab wrote:
 
 > john.carter@tait.co.nz writes:
 > 
 > |> A correct implementation would be...
 > |>   template<typename _CharT, typename _Traits, typename _Alloc>
 > |>     int
 > |>     basic_string<_CharT, _Traits, _Alloc>::
 > |>     compare(const _CharT* __s) const
 > |>     {
 > |>       size_type __size = this->size();
 > |>       size_type __s_size = traits_types::length(__s);
 > |>       size_type __min = __size;
 > |>       if ( __size  > __s_size) 
 > |>         __min = __s_size;
 > |> 
 > |>       int __r = traits_type::compare(_M_data(), __s, __min);
 > |>       if (!__r)
 > |> 	__r = __size - _s_size;
 > |>        
 > |>       return __r;
 > |>     }
 > 
 > This is not correct either, because __size - __s_size may overflow the
 > range of int.  Try this instead:
 > 
 >       if (!__r)
 > 	__r = (__size > __s_size) - (__size < __s_size);
 > 
 > Andreas.
 > 
 > 
 
 -- 
 
 
 John Carter                             Phone : (64)(3) 358 6639
 Tait Electronics                        Fax   : (64)(3) 359 4632
 PO Box 1645 Christchurch                Email : john.carter@tait.co.nz
 New Zealand
 
 Good Ideas:
 Ruby                 - http://www.ruby-lang-org - The best of perl,python,scheme without the pain.
 Valgrind             - http://developer.kde.org/~sewardj/ - memory debugger for x86-GNU/Linux
 Free your books      - http://www.bookcrossing.com
 

Comment 4 john.carter 2002-09-19 09:40:37 UTC
From: John Carter <john.carter@tait.co.nz>
To: John Carter <john.carter@tait.co.nz>
Cc: Andreas Schwab <schwab@suse.de>, gcc-gnats@gcc.gnu.org
Subject: Re: libstdc++/7961: compare( char *) implemented incorrectly.
Date: Thu, 19 Sep 2002 09:40:37 +1200 (NZST)

 On Thu, 19 Sep 2002, John Carter wrote:
 
 >    strcmp( abcNull.c_str(), "abs") == 0
 
 > I would go for correctness and simplicity over speed.
 
 Damn! I was typing too fast, being too fancy...
 Should be...
     strcmp( abcNull.c_str(), "abc") == 0
 
 
 -- 
 
 
 John Carter                             Phone : (64)(3) 358 6639
 Tait Electronics                        Fax   : (64)(3) 359 4632
 PO Box 1645 Christchurch                Email : john.carter@tait.co.nz
 New Zealand
 
 Good Ideas:
 Ruby                 - http://www.ruby-lang-org - The best of perl,python,scheme without the pain.
 Valgrind             - http://developer.kde.org/~sewardj/ - memory debugger for x86-GNU/Linux
 Free your books      - http://www.bookcrossing.com
 
Comment 5 Paolo Carlini 2002-11-01 02:23:27 UTC
State-Changed-From-To: open->closed
State-Changed-Why: Not a bug.
    Indeed, no segfaults can be produced at run-time with code
    like the following
    // ------------
    #include <string>
    #include <cassert>
    
    int main()
    {
      std::string lhs("abc");
    
      lhs.append(1, '\0');
      lhs += "def";
    
      assert( lhs != "abc" );
    }
    // -------------
    and variants thereof. From the glibc2.3.1 documentation:
     - Function: int memcmp (const void *A1, const void *A2, size_t SIZE)
         The function `memcmp' compares the SIZE bytes of memory beginning
         at A1 against the SIZE bytes of memory beginning at A2.  The value
         returned has the same sign as the difference between the first
         differing pair of bytes (interpreted as `unsigned char' objects,
         then promoted to `int').
    
         If the contents of the two blocks are equal, `memcmp' returns `0'.
    
    that is, it seems to me that there is absolutely *nothing*
    wrong with a '\0' embedded in the string: its just a byte
    like any other.
    Thanks for your report, Paolo.
Comment 6 Paolo Carlini 2002-11-01 02:36:42 UTC
Responsible-Changed-From-To: unassigned->paolo
Responsible-Changed-Why: Analyzed.
Comment 7 Paolo Carlini 2002-11-01 02:55:28 UTC
State-Changed-From-To: closed->analyzed
State-Changed-Why: Ok, sorry. Now I see your point: in the memcmp() we may end
    up reading past the end of __s2.
    However, it's difficult to construct an actual testcase...
    Paolo.
Comment 8 Paolo Carlini 2002-11-01 07:27:36 UTC
State-Changed-From-To: analyzed->closed
State-Changed-Why: Fixed for 3.2.1 and 3.3 with:
    http://gcc.gnu.org/ml/libstdc++/2002-11/msg00000.html
Comment 9 Paolo Carlini 2002-11-01 15:21:17 UTC
From: paolo@gcc.gnu.org
To: gcc-gnats@gcc.gnu.org
Cc:  
Subject: libstdc++/7961
Date: 1 Nov 2002 15:21:17 -0000

 CVSROOT:	/cvs/gcc
 Module name:	gcc
 Changes by:	paolo@gcc.gnu.org	2002-11-01 07:21:17
 
 Modified files:
 	libstdc++-v3   : ChangeLog 
 	libstdc++-v3/include/bits: basic_string.tcc 
 
 Log message:
 	2002-11-01  John Carter  <john.carter@tait.co.nz>
 	
 	PR libstdc++/7961
 	* include/bits/basic_string.tcc
 	(compare(const _CharT* __s)): Don't access __s past its length.
 
 Patches:
 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/ChangeLog.diff?cvsroot=gcc&r1=1.1407&r2=1.1408
 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/include/bits/basic_string.tcc.diff?cvsroot=gcc&r1=1.26&r2=1.27
 

Comment 10 Paolo Carlini 2002-11-01 15:25:27 UTC
From: paolo@gcc.gnu.org
To: gcc-gnats@gcc.gnu.org
Cc:  
Subject: libstdc++/7961
Date: 1 Nov 2002 15:25:27 -0000

 CVSROOT:	/cvs/gcc
 Module name:	gcc
 Branch: 	gcc-3_2-branch
 Changes by:	paolo@gcc.gnu.org	2002-11-01 07:25:27
 
 Modified files:
 	libstdc++-v3   : ChangeLog 
 	libstdc++-v3/include/bits: basic_string.tcc 
 
 Log message:
 	2002-11-01  John Carter  <john.carter@tait.co.nz>
 	
 	PR libstdc++/7961
 	* include/bits/basic_string.tcc
 	(compare(const _CharT* __s)): Don't access __s past its length.
 
 Patches:
 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_2-branch&r1=1.1057.2.159.2.41&r2=1.1057.2.159.2.42
 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/include/bits/basic_string.tcc.diff?cvsroot=gcc&only_with_tag=gcc-3_2-branch&r1=1.20.2.5.2.1&r2=1.20.2.5.2.2