This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552, 86711, 86714) )

On 08/03/2018 07:00 AM, Bernd Edlinger wrote:
On 08/02/18 22:34, Martin Sebor wrote:
On 08/02/2018 12:56 PM, Bernd Edlinger wrote:
On 08/02/18 15:26, Bernd Edlinger wrote:

   /* If the length can be computed at compile-time, return it.  */
-  len = c_strlen (src, 0);
+  tree array;
+  tree len = c_strlen (src, 0, &array);

You know the c_strlen tries to compute wide character sizes,
but strlen does not do that, strlen (L"abc") should give 1
(or 0 on a BE machine)
I wonder if that is correct.


 static tree
-fold_builtin_strlen (location_t loc, tree type, tree arg)
+fold_builtin_strlen (location_t loc, tree fndecl, tree type, tree arg)
   if (!validate_arg (arg, POINTER_TYPE))
     return NULL_TREE;
-      tree len = c_strlen (arg, 0);
+      tree arr = NULL_TREE;
+      tree len = c_strlen (arg, 0, &arr);

Is it possible to write a test case where strlen(L"test") reaches this point?
what will c_strlen return then?

Yes, of course it is:

$ cat y.c
int f(char *x)
   return __builtin_strlen(x);

int main ()
   return f((char*)&L"abcdef"[0]);
$ gcc -O3 -S y.c
$ cat y.s
    movl    $6, %eax

The reason is that c_strlen tries to fold wide chars at all.
I do not know when that was introduced, was that already before your last patches?

The function itself was introduced in 1992 if not earlier,
before wide strings even existed.  AFAICS, it has always
accepted strings of all widths.  Until r241489 (in GCC 7)
it computed their length in bytes, not characters.  I don't
know if that was on purpose or if it was just never changed
to compute the length in characters when wide strings were
first introduced.  From the name I assume it's the latter.
The difference wasn't detected until sprintf tests were added
for wide string directives.  The ChangeLog description for
the change reads: Correctly handle wide strings.  I didn't
consider pathological cases like strlen (L"abc").  It
shouldn't be difficult to change to fix this case.

Oh, oh, oh....

$ cat y3.c
int main ()
   char c[100];
   int x = __builtin_sprintf (c, "%S", L"\uFFFF");

   __builtin_printf("%d %ld\n", x,__builtin_strlen(c));

$ gcc-4.8 -O3 -std=c99 y3.c
$ ./a.out
-1 0
$ gcc -O3 y3.c
$ ./a.out
1 0
$ echo $LANG

I would have expected L"\uFFFF" to converted to UTF-8
or another encoding, so the return value if sprintf is
far from obvious, and probably language dependent.

Why do you think it is a good idea to use really every
opportunity to optimize totally unnecessary things like
using the return value from the sprintf function as it is?

Did you never think this adds a significant maintenance
burden on the rest of us as well?

Your condescending tone is uncalled for, and you clearly speak
out of ignorance.  I don't owe you an explanation but as I have
said multiple times: most of my work, including the sprintf pass,
is primarily motivated by detecting bugs like buffer overflow.
Optimization is only a secondary goal (but bug detection depends
on it).  It may come as a shock to you but mistakes happen.
That's why it's important to make an effort to detect them.
This is one is a simple typo (handling %S the same way as %s
instead of %ls.

If you are incapable of a professional tone I would suggest
you go harass someone else.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]