[Bug tree-optimization/82946] member pointer defeats strlen optimization involving a string literal

rguenth at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Mon Nov 13 09:08:00 GMT 2017


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82946

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Martin Sebor from comment #0)
> In the program below, while GCC optimizes the strlen call in f() to a
> constant it doesn't do the same for the equivalent function g().
> 
> I suspect this is caused by the same underlying assumptions as pr80944:
> i.e., that the strcpy (a->d, "123") call could change a->d if a->d pointed
> at or into itself.  While that might be true in other circumstances, it's
> not possible here.  Since the array at a->d is subsequently accessed by the
> call to strlen, the strcpy call cannot change a->d in a valid program
> because "123" (or any other string literal) cannot be a valid representation
> of a pointer.  (The only way for a conforming program to obtain a valid
> pointer is by assigning to it the value of another valid pointer.  Even if
> the bit pattern of the literal "123" happened to match a valid address in a
> program, copying the literal into a pointer and then using that pointer is
> undefined.)
> 
> So a->d can be assumed not to change in either function and the strlen
> optimization below is safe in both.
> 
> $ cat a.c && gcc -O2 -S -Wall -fdump-tree-optimized=/dev/stdout a.c
> 
> char* strcpy (char*, const char*);
> __SIZE_TYPE__ strlen (const char*);
> 
> struct A { char *d; };
> 
> unsigned f (struct A *a)
> {
>   char *d = a->d;
>   strcpy (d, "123");
>   return strlen (d);   // folded into 3
> }
> 
> unsigned g (struct A *a)
> {
>   strcpy (a->d, "123");
>   return strlen (a->d);   // not folded but can be
> }
> 
> 
> ;; Function f (f, funcdef_no=0, decl_uid=1898, cgraph_uid=0, symbol_order=0)
> 
> f (struct A * a)
> {
>   char * d;
> 
>   <bb 2> [local count: 10000]:
>   d_4 = a_3(D)->d;
>   __builtin_memcpy (d_4, "123", 4);
>   return 3;
> 
> }
> 
> 
> 
> ;; Function g (g, funcdef_no=1, decl_uid=1902, cgraph_uid=1, symbol_order=1)
> 
> g (struct A * a)
> {
>   char * _1;
>   char * _2;
>   long unsigned int _3;
>   unsigned int _7;
> 
>   <bb 2> [local count: 10000]:
>   _1 = a_5(D)->d;
>   __builtin_memcpy (_1, "123", 4);
>   _2 = a_5(D)->d; 

Clearly this is because GCC has to assume that a_5(D)->d may point to
itself, in which case the memcpy clobbers it, so the member has to be
reloaded before the strlen call.

Can't see how you can rule that out for a valid program.  Thus - INVALID?
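
For illustration, here is a minimal sketch of the self-pointing case GCC
has to allow for.  It is not taken from the report: the function name
store_clobbers_member and the buffers before/after are hypothetical, and
it assumes sizeof (char *) >= 4 so that the 4-byte store fits inside the
member.  It only compares the object representation of a.d before and
after the store; the clobbered pointer is never dereferenced.

```c
#include <assert.h>
#include <string.h>

struct A { char *d; };

/* Returns 1 if storing "123" through a.d changed the bytes of a.d itself.
   Assumes sizeof (char *) >= 4 so the 4-byte store stays inside the member.  */
int store_clobbers_member (void)
{
  struct A a;
  unsigned char before[sizeof a.d], after[sizeof a.d];

  assert (sizeof a.d >= 4);

  a.d = (char *) &a;                  /* a.d points at the object holding a.d */
  memcpy (before, &a.d, sizeof a.d);  /* snapshot the pointer's bytes */

  memcpy (a.d, "123", 4);             /* same store as strcpy (a.d, "123") */
  memcpy (after, &a.d, sizeof a.d);   /* snapshot again after the store */

  /* Only the representations are compared; a.d is not used as a pointer
     after the store.  */
  return memcmp (before, after, sizeof a.d) != 0;
}
```

On typical targets I would expect this to return 1, i.e. the store through
a.d does modify a.d, which is why the second load in g() is not removed.
Whether a later strlen (a->d) in such a program is still valid is the point
under discussion in comment #0.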

Richard.

>   _3 = strlen (_2);
>   _7 = (unsigned int) _3;
>   return _7;
> 
> }

