Bug 105217 - Track realloc input pointer to improve object size detection when reallocated object has not moved
Summary: Track realloc input pointer to improve object size detection when reallocated...
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 12.0
: P3 normal
Target Milestone: 13.0
Assignee: Siddhesh Poyarekar
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-04-11 13:03 UTC by Martin Liška
Modified: 2022-10-19 01:47 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2022-04-11 00:00:00


Attachments
test-case (956 bytes, text/x-csrc)
2022-04-11 13:03 UTC, Martin Liška
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Liška 2022-04-11 13:03:17 UTC
Created attachment 52781 [details]
test-case

Isolated from autogen, where we originally created the following issue:
https://sourceforge.net/p/autogen/bugs/212/

I isolated that to the attached test-case:

$ head -c 20k </dev/urandom > /tmp/1
$ gcc snippet.c -O2 -D_FORTIFY_SOURCE=3 -g && ./a.out /tmp/1
fread: data=0x2052c0, rem_sz=16340
.. read rdct=16340
realloc to=0x20a490-0x20f489 (newsize=20473)
.. diferent buffer!
fread: data=0x20e484, rem_sz=4096
.. read rdct=4096
realloc to=0x20a490-0x210489 (newsize=24569)
fread: data=0x20f484, rem_sz=4096
*** buffer overflow detected ***: terminated
Aborted (core dumped)

$ clang snippet.c -O2 -D_FORTIFY_SOURCE=3 -g && ./a.out /tmp/1
fread: data=0x4052c0, rem_sz=16340
.. read rdct=16340
realloc to=0x40a490-0x40f489 (newsize=20473)
.. diferent buffer!
fread: data=0x40e484, rem_sz=4096
.. read rdct=4096
realloc to=0x40a490-0x410489 (newsize=24569)
fread: data=0x40f484, rem_sz=4096
.. read rdct=44
fread: data=0x40f4b0, rem_sz=4052
.. read rdct=0
Comment 1 Martin Liška 2022-04-11 13:39:50 UTC
Started with r12-6482-g06bc1b0c539e3a60.
Comment 2 Siddhesh Poyarekar 2022-04-11 15:59:37 UTC
OK, taking a closer look, it looks like clang simply fails to fortify fread (probably due to https://reviews.llvm.org/D109967 or something similar).  Modifying the code to use __fread_chk directly:

    size_t rdct = __fread_chk (data, __builtin_dynamic_object_size (data, 0), (size_t)1, rem_sz, fp);

causes clang to crash too because it too comes up with the same __bdos estimate for size:

```
fread: data=0xf792c0 (dsize: 16344, size: 18446744073709551615), rem_sz=16340
.. read rdct=16340
realloc to=0xf7e490-0xf83489 (newsize=20473)
.. diferent buffer!
fread: data=0xf82484 (dsize: 4101, size: 18446744073709551615), rem_sz=4096
.. read rdct=4096
realloc to=0xf7e490-0xf84489 (newsize=24569)
fread: data=0xf83484 (dsize: 5, size: 18446744073709551615), rem_sz=4096
*** buffer overflow detected ***: terminated
Aborted (core dumped)
```

dsize and size are the actual values that __bdos and __bos resolve to; I simply modified the fprintf to this:

    fprintf(stderr, "fread: data=%p (dsize: %zu, size: %zu), rem_sz=%d\n", data, __builtin_dynamic_object_size (data, 0), __builtin_object_size (data, 0), rem_sz);

I haven't looked too closely at the failure mechanism (I will tomorrow), but this has got me inclined to think that it's an actual autogen bug that got exposed with _FORTIFY_SOURCE=3.
Comment 3 Siddhesh Poyarekar 2022-04-12 11:22:24 UTC
Here's a simplified version that shows the problem:

typedef __SIZE_TYPE__ size_t;                                     
                                                                  
#define OLD 40                                                    
#define NEW 80                                                    
#define OFF 10                                                    
                                                                  
int                                                               
main (void)                                                       
{                                                                 
  char *p = __builtin_malloc (OLD);                               
  char *q = 0;                                                    
  char *dst = p + OFF;                                            
                                                                  
  q = __builtin_realloc (p, NEW);                                 
                                                                  
  if (q == 0)                                                     
    __builtin_unreachable ();                                     
                                                                  
  if (q != p)                                                     
    {                                                             
      p = q;                                                      
      dst = q + OFF;                                              
    }                                                             
                                                                  
  __builtin_printf ("old size: %zu, new size: %zu, __bdos: %zu\n",
                    OLD - OFF, NEW - OFF,                         
                    __builtin_dynamic_object_size (dst, 0));      
}                                                                 


The problem is when realloc does not result in a different buffer (q == p); `dst` is left untouched assuming that it's the same object.  I doubt if this is a portable assumption, since the C standard says that value of q (and consequently dst?) will have become invalid beyond its lifetime, i.e. the moment it is realloc'd:

6.2.4 Storage durations of objects

...
The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.
...

It just happens so that with the glibc malloc implementation it remains valid but ISTM that applications should not rely on it.

Jakub, would you concur with this?
Comment 4 Jakub Jelinek 2022-04-13 14:48:39 UTC
I think pedantically both #c0 and #c3 or even
int
foo (void)
{
  char *p = __builtin_malloc (32);
  char *q = __builtin_realloc (p, 64);
  if (p == q)
    return __builtin_object_size (p, 0);
  else
    return -1;
}
are invalid (the pointer that is passed to realloc can't be used subsequently).  If the comparison is in integral type, it is fuzzier.
The problem is that in real-world, this is a very common thing to do, either with pointer or integral comparisons, often one needs to find out if it has been reallocated and adjust say some embedded pointers in the allocation or some other pointers so that they point into the new allocation rather than the old one and for optimization purposes it is complete waste of time to do it when the allocation wasn't actually moved.
The above also returns 32 when it should return 64, even with very old gcc versions.
On the other side, the standard making that invalid makes a lot of sense, otherwise we couldn't assume anything in case of
  char *p = __builtin_malloc (32);
  bar (p);
  return __builtin_object_size (p, 0);
because if we don't see into bar, it could do something like
  if (__builtin_realloc (p, 33) != p)
    exit (25);
or similar (or say realloc to smaller size, then bos2/bos3).
Even if the realloc is in the same function as the malloc, we might not know that it is that exact pointer passed to realloc (say pointer is passed to some function, that stores the pointer into global var, another function returns it, we then pass it to realloc, or any other way of obfuscation).
I'm afraid in those cases we should just point at the standard that it is undefined.
Then there is the case where we can clearly see that the pointer from malloc is passed to realloc or can trace it to such easily.  I'd say in that case it would be worthwhile to do some extra work.
For __bos the simplest solution would be if we detect something like that (e.g. that the SSA_NAME passed to realloc has uses dominated by the realloc call (though, even figuring that can mean we e.g. mark gimple stmts in each bb with increasing uids to determine like reassoc what stmt is before another one) just to punt, say we don't know anything about the SSA_NAME's size, or use conservative choice from both malloc and realloc (maximum for bos0/bos1, minimum for bos2/bos3).
For __bdos perhaps the same.  Another possibility would be to temporarily split the SSA_NAME passed to realloc, kind like old VRP was introducing ASSERT_EXPRs.
So, basically when we see:
  whatever = realloc (p_34, ...);
rewrite that (temporarily?) to:
  p_121 = p_34;
  whatever = realloc (p_121, ...);
and change all p_34 uses dominated by the realloc stmt to p_121, and add the
p_121 = p_34; stmt to some hash table or otherwise mark it so that we wouldn't propagate the objsz knowledge from p_34 to p_121, but instead set it on the realloc call.  That won't cover the integral comparisons though I'm afraid...
Comment 5 Siddhesh Poyarekar 2022-04-19 15:07:04 UTC
(In reply to Jakub Jelinek from comment #4)
> Then there is the case where we can clearly see that the pointer from malloc
> is passed to realloc or can trace it to such easily.  I'd say in that case
> it would be worthwhile to do some extra work.
> For __bos the simplest solution would be if we detect something like that
> (e.g. that the SSA_NAME passed to realloc has uses dominated by the realloc
> call (though, even figuring that can mean we e.g. mark gimple stmts in each
> bb with increasing uids to determine like reassoc what stmt is before
> another one) just to punt, say we don't know anything about the SSA_NAME's
> size, or use conservative choice from both malloc and realloc (maximum for
> bos0/bos1, minimum for bos2/bos3).
> For __bdos perhaps the same.  Another possibility would be to temporarily
> split the SSA_NAME passed to realloc, kind like old VRP was introducing
> ASSERT_EXPRs.
> So, basically when we see:
>   whatever = realloc (p_34, ...);
> rewrite that (temporarily?) to:
>   p_121 = p_34;
>   whatever = realloc (p_121, ...);
> and change all p_34 uses dominated by the realloc stmt to p_121, and add the
> p_121 = p_34; stmt to some hash table or otherwise mark it so that we
> wouldn't propagate the objsz knowledge from p_34 to p_121, but instead set
> it on the realloc call.  That won't cover the integral comparisons though
> I'm afraid...

This sounds like a gcc 13+ project.  Can we downgrade this since the reproducer is technically invalid and we're only going to attempt to support a limited subset of such uses?
Comment 6 Martin Liška 2022-04-20 07:41:15 UTC
Setting priority back to P3.