This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: Is ISO memcmp("abc","ade",10000) safe?


Roger Sayle <roger at www dot eyesopen dot com> writes:

| I'd like to ask the C/C++ language lawyers about their interpretation
| of the definitions of memcmp given in the relevant ANSI, ISO, POSIX,
| SVID and BSD specifications.
| 
| Is memcmp("abc","ade",10000) safe by the standards?

Not according to the formal wording.  Footnote 262 is interesting:

       262) The contents of ``holes'' used as padding for purposes of
          alignment within  structure  objects  are  indeterminate.
          Strings shorter than their allocated space and unions may
          also cause problems in comparison.

The only formal guarantee one has is:

       [#1]  The sign of a nonzero value returned by the comparison
       functions memcmp, strcmp, and strncmp is determined  by  the
       sign  of the difference between the values of the first pair
       of characters  (both  interpreted  as  unsigned  char)  that
       differ in the objects being compared.


In particular, an inefficient implementation may notice that the second
characters differ, remember that fact, and still go on to read all ten
thousand characters of both objects.
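
To make that concrete, here is a minimal sketch of such an implementation.
The name slow_memcmp is hypothetical and is not taken from any real C
library; it only illustrates behaviour that the wording quoted above does
not appear to forbid. It returns a value with the required sign, yet reads
all n bytes of both objects.

```c
#include <stddef.h>

/* Hypothetical memcmp variant (illustration only, not a real libc routine).
 * It records the sign of the first difference but keeps reading until all
 * n bytes of both objects have been visited. */
int slow_memcmp(const void *s1, const void *s2, size_t n)
{
    const unsigned char *p = s1;
    const unsigned char *q = s2;
    int result = 0;

    for (size_t i = 0; i < n; i++) {
        /* Both objects are read at offset i even after a difference
         * has already been recorded. */
        unsigned char a = p[i];
        unsigned char b = q[i];

        if (result == 0 && a != b)
            result = (a < b) ? -1 : 1;
    }
    return result;
}
```

Called as slow_memcmp("abc", "ade", 10000), this would read far past the
end of both 4-byte arrays, even though the result is already decided at
the second character.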

| The question has come up following the discussion of GNATS PR
| optimization/10339, which concerns GCC's optimization of strncmp
| into memcmp for performance reasons.  The issue is that although
| "strncmp" is not allowed to access/fault memory beyond the first
| byte that differs, it is unclear whether "memcmp" has the same
| requirement.

strncmp has an even stronger requirement:

       [#2]   The   strncmp  function  compares  not  more  than  n
       characters (*characters that follow a null character are  not
       compared*)  from  the  array  pointed  to  by s1 to the array
       pointed to by s2.
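
For contrast, a rough sketch of a comparison loop that honours this wording
might look as follows. The name sketch_strncmp is hypothetical; this is not
how any particular library implements strncmp, only an illustration of the
"characters that follow a null character are not compared" rule.

```c
#include <stddef.h>

/* Hypothetical strncmp-like routine (illustration only).
 * It stops at the first differing character, at a null character,
 * or after n characters, whichever comes first. */
int sketch_strncmp(const char *s1, const char *s2, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Compare as unsigned char, as the standard requires. */
        unsigned char a = (unsigned char)s1[i];
        unsigned char b = (unsigned char)s2[i];

        if (a != b)
            return (a < b) ? -1 : 1;
        if (a == '\0')
            return 0;   /* nothing after the null character is read */
    }
    return 0;
}
```

With this loop, sketch_strncmp("abc", "ade", 10000) reads only the first
two characters of each string, so the oversized n does no harm.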

| If this is indeed the case, GCC's current transformation of
| strncmp("abc","ade",10000) into memcmp("abc","ade",10000),
| for example if it knows bytes must differ before the first NUL
| byte is encountered, is not guaranteed to be safe.  i.e. there
| is nothing in the language standards to prevent the arguments
| to memcmp from being inefficiently compared from offset 9999
| backwards.

According to my reading, that transformation is highly unsafe.

-- Gaby

