This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Add a character size parameter to c_strlen/get_range_strlen

On 08/17/2018 05:01 PM, Bernd Edlinger wrote:
On 08/17/18 22:17, Martin Sebor wrote:
On 08/17/2018 12:44 PM, Bernd Edlinger wrote:
On 08/17/18 20:23, Martin Sebor wrote:
On 08/17/2018 06:14 AM, Joseph Myers wrote:
On Fri, 17 Aug 2018, Jeff Law wrote:

On 08/16/2018 05:01 PM, Joseph Myers wrote:
On Thu, 16 Aug 2018, Jeff Law wrote:

restores previous behavior.  The sprintf bits want to count element
sized chunks, which for wchars is 4 bytes (that count will then be

   /* Compute the range the argument's length can be in.  */
-  fmtresult slen = get_string_length (arg);
+  int count_by = dir.specifier == 'S' || dir.modifier == FMT_LEN_l ? 4 : 1;

I don't see how a hardcoded 4 is correct here.  Surely you need to example
wchar_type_node to determine its actual size for this target.
We did kick this around a little.  IIRC Martin didn't think that it was
worth handling the 2 byte wchar case.

Sorry, I think we may have miscommunicated -- I didn't think it
was useful to pass a size of the character type to the function.
I agree that passing in a hardcoded constant doesn't seem right
(even if GCC's wchar_t were always 4 bytes wide).

I'm still not sure I see the benefit of passing in the expected
element size given that the patch causes c_strlen() fail when
the actual element size doesn't match ELTSIZE.  If the caller
asks for the number of bytes in a const wchar_t array it should
get back the number bytes.  (I could see it fail if the caller
asked for the number of words in a char array whose size was
not evenly divisible by wordsize.)

I think in this case c_strlen should use the type which the %S format
uses at runtime, otherwise it will not have anything to do with
the reality.

%S is not what I'm talking about.

Failing in the case I described (a caller asking for the size
in bytes of a constant object whose type is larger) prevents
callers that don't care about the type from detecting the actual
size of a constant.

Specifically for sprintf, it means that the buffer overflow
below is no longer diagnosed:

   struct S { char a[2]; void (*pf)(void); };

   void f (struct S *p)
     const char *q = (char*)L"\x41424344\x45464748";

     sprintf (p->a, "%s", q);

There may be no other analyses that would benefit from this
ability today but there easily could be.  There certainly
are optimizations that depend on c_getstr() returning
a pointer to the constant object regardless of its type
(memchr being one of them).

Yes, I agree.

Coincidentally that is exactly what in my follow-up patch implements.

If you call c_getstr(x) you get a valid zero-terminated single-byte
string or nothing.

If you call c_getstr(x, &memsize) you get a pointer to a memory
of the specified memsize, regardless of the underlying type.
and whether or not zero terminated.

Most functions in GCC call c_strlen() at some point to determine
the length of a string.  They need to be able to detect the missing
nul -- it would double the amount of processing to have them also
call c_getstr() when c_strlen() fails to see if the failure happens
to be due to a missing nul -- it the overwhelming majority of cases
it won't be.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]