This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Is ISO memcmp("abc","ade",10000) safe?
- From: Gabriel Dos Reis <gdr at integrable-solutions dot net>
- To: Roger Sayle <roger at www dot eyesopen dot com>
- Cc: gcc-patches at gcc dot gnu dot org, <gcc at gcc dot gnu dot org>, Andreas Schwab <schwab at suse dot de>
- Date: 15 Apr 2003 16:48:49 +0200
- Subject: Re: Is ISO memcmp("abc","ade",10000) safe?
- Organization: Integrable Solutions
- References: <Pine.LNX.4.44.0304150703350.14393-100000@www.eyesopen.com>
Roger Sayle <roger at www dot eyesopen dot com> writes:
| I'd like to ask the C/C++ language lawyers about their interpretation
| of the definitions of memcmp given in the relevant ANSI, ISO, POSIX,
| SVID and BSD specifications.
|
| Is memcmp("abc","ade",10000) safe by the standards?
Not according to the formal wording. Footnote 262 is interesting:
262) The contents of ``holes'' used as padding for purposes of
alignment within structure objects are indeterminate.
Strings shorter than their allocated space and unions may
also cause problems in comparison.
The only formal guarantee one has is:
[#1] The sign of a nonzero value returned by the comparison
functions memcmp, strcmp, and strncmp is determined by the
sign of the difference between the values of the first pair
of characters (both interpreted as unsigned char) that
differ in the objects being compared.
In particular, an inefficient implementation may note that the second
characters differ, remember that fact, and still go on to read all
ten thousand characters.
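
To make that concrete, here is a minimal sketch of such an implementation.
The name full_scan_memcmp is hypothetical and this is not taken from any real
C library; it only illustrates that the sign guarantee quoted above can be
met while every one of the n bytes is still read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical memcmp-like function (illustration only): the sign of the
   result follows the first differing pair, but both objects are read in
   full, so both must really be at least n bytes long. */
int full_scan_memcmp(const void *s1, const void *s2, size_t n)
{
    const unsigned char *p = s1;
    const unsigned char *q = s2;
    int result = 0;

    assert(s1 != NULL && s2 != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Both bytes are read on every iteration, difference or not. */
        unsigned char a = p[i];
        unsigned char b = q[i];

        /* Only the first differing pair decides the sign. */
        if (result == 0 && a != b)
            result = (a < b) ? -1 : 1;
    }
    return result;
}
```

With two 8-byte arrays holding "abc" and "ade", calling it with n = 8 is fine
and yields a negative value. Calling it with n = 10000 on the 4-byte string
literals would read far past both objects.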
| The question has come up following the discussion of GNATS PR
| optimization/10339, which concerns GCC's optimization of strncmp
| into memcmp for performance reasons. The issue is that although
| "strncmp" is not allowed to access/fault memory beyond the first
| byte that differs, it is unclear whether "memcmp" has the same
| requirement.
strncmp has an even stronger requirement:
[#2] The strncmp function compares not more than n
characters (*characters that follow a null character are not
compared*) from the array pointed to by s1 to the array
pointed to by s2.
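
For contrast, a minimal sketch of a comparison loop that honours that
wording. The name sketch_strncmp is hypothetical and this is not how any
particular library implements strncmp; the point is only that nothing after a
null character (or after the first difference) is ever read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strncmp-like function (illustration only). */
int sketch_strncmp(const char *s1, const char *s2, size_t n)
{
    assert(s1 != NULL && s2 != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned char a = (unsigned char)s1[i];
        unsigned char b = (unsigned char)s2[i];

        if (a != b)
            return (a < b) ? -1 : 1;  /* first differing pair decides */
        if (a == '\0')
            return 0;                 /* both strings ended: stop reading */
    }
    return 0;                         /* first n characters are equal */
}
```

Here sketch_strncmp("abc", "ade", 10000) reads only the first two characters
of each string before returning a negative value, so the large n does no harm.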
| If this is indeed the case, GCC's current transformation of
| strncmp("abc","ade",10000) into memcmp("abc","ade",10000),
| for example if it knows bytes must differ before the first NUL
| byte is encountered, is not guaranteed to be safe. i.e. there
| is nothing in the language standards to prevent the arguments
| to memcmp from being inefficiently compared from offset 9999
| backwards.
According to my reading, that transformation is highly unsafe.
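
The backwards comparison you describe can be sketched as follows. The name
backward_memcmp is hypothetical and the code is an illustration only; it
returns the sign the standard asks for, yet its very first read is at offset
n - 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical memcmp-like function (illustration only) that walks from
   the last byte down to the first. */
int backward_memcmp(const void *s1, const void *s2, size_t n)
{
    const unsigned char *p = s1;
    const unsigned char *q = s2;
    int result = 0;

    assert(s1 != NULL && s2 != NULL);
    for (size_t i = n; i-- > 0; ) {
        unsigned char a = p[i];
        unsigned char b = q[i];

        /* A difference at a lower index overrides one found earlier at a
           higher index, so the lowest differing pair wins in the end. */
        if (a != b)
            result = (a < b) ? -1 : 1;
    }
    return result;
}
```

Given memcmp("abc", "ade", 10000), an implementation like this would touch
offset 9999 of two 4-byte objects before it ever looked at the bytes that
differ.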
-- Gaby