Bug 100417 - False positive -Wmaybe-uninitalized, malloc and start/end argument to another function
Summary: False positive -Wmaybe-uninitalized, malloc and start/end argument to another...
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 11.1.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: diagnostic
: 100496 (view as bug list)
Depends on:
Blocks: Wuninitialized 100496
  Show dependency treegraph
 
Reported: 2021-05-04 14:15 UTC by James Bonfield
Modified: 2022-12-06 20:19 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail: 11.1.0, 12.0
Last reconfirmed: 2021-05-04 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description James Bonfield 2021-05-04 14:15:20 UTC
With gcc 11.1.0 and -Wall, if we malloc a block of memory and pass this pointer to a function taking a const char * argument then we get a "may be used uninitialized" warning.  That is perhaps a reasonable assessment given the data cannot be modified by the function and nor was it initialised by the malloc.

However in the scenario of passing in both a char * and const char * pointer to the same block of memory, the warning is still reported.  This is a false positive.

An example usage would be a function which writes to a pointer, but also has a second pointer to the end of the buffer to act as bounds checking.  This is a valid idiom as an alternative to maintaining a remaining-length argument.

The simple code below demonstrates this issue:

#include <stdio.h>
#include <stdlib.h>

void fill(char *start, const char *end); // works if const is removed

char *func(void) {
    char *dat;
    //if (!(dat = calloc(1, 10))) // also works
    if (!(dat = malloc(10)))
        return NULL;

    fill(dat, dat+10);
    return dat;
}


$ gcc -Wa
ll -c _.c
_.c: In function 'func':
_.c:12:5: warning: '<unknown>' may be used uninitialized [-Wmaybe-uninitialized]
   12 |     fill(dat, dat+10);
      |     ^~~~~~~~~~~~~~~~~
_.c:4:6: note: by argument 2 of type 'const char *' to 'fill' declared here
    4 | void fill(char *start, const char *end);
      |      ^~~~


Using "char *end" in fill or calloc instead of malloc removes this warning.
The more complete example where this test case was culled from is here:

https://github.com/samtools/htslib/pull/1280/checks?check_run_id=2499621360
from this code https://github.com/samtools/htslib/blob/636c4207e1f4a6a1f310aecfa9b1e6bd13e921f6/cram/cram_codecs.c#L1901-L1938

My belief is that as dat and dat+10 are both pointers to the same object (NB dat+9 gives the same warning so it's not a 1-past-the-end thing) then validation should be done against the weakest form of pointer (char * in this case) for all pointers to that object.

I can confirm that -Wextra -fno-strict-aliasing -fwrapv does not change this behaviour.  

Gcc was built from source, using ../configure --prefix=/software/badger/opt/gcc/11.1.0 --disable-multilib --disable-bootstrap and contrib/download_prerequisites.

Regards,
James
Comment 1 Martin Sebor 2021-05-04 19:37:37 UTC
The warning here is expected (it was considered when the -Wmaybe-uninitialized enhancement was added) but let me confirm this as a bug for the inconsistency below and because even though the expected suppression works via attribute access none, it triggers other unhelpful warnings.  The warning should be issued consistently and I've become convinced that attribute access none with no size argument should not expect the pointer points to an object of nonzero size.

I'm not sure that implementing a heuristic for the likely rare [T*, const T*) range is worthwhile since it would come at the risk of false negatives.  A solution I'd like better is extending attribute access to apply two pointer arguments in addition to pointer and size.

$ (set -x && cat a.c && gcc -D'__attribute__(x)=' -O2 -S -Wall a.c && gcc -O2 -S -Wall a.c)
+ cat a.c
__attribute__ ((access (none, 2)))
void f (int *, const int *);

void f1 (void)
{
  int a[2];
  f (a, a + 1);    // -Wmaybe-uninitialized
}

void f2 (void)
{
  int a[2];
  f (a, a + 2);    // no warning
}

void g1 (void)
{
  int *p = __builtin_malloc (sizeof (*p) * 2);
  f (p, p + 1);    // -Wmaybe-uninitialized
}

void g2 (void)
{
  int *p = __builtin_malloc (sizeof (*p) * 2);
  f (p, p + 2);    // -Wmaybe-uninitialized
}

+ gcc '-D__attribute__(x)=' -O2 -S -Wall a.c
a.c: In function ‘f1’:
a.c:7:3: warning: ‘a’ may be used uninitialized [-Wmaybe-uninitialized]
    7 |   f (a, a + 1);    // -Wmaybe-uninitialized
      |   ^~~~~~~~~~~~
a.c:2:6: note: by argument 2 of type ‘const int *’ to ‘f’ declared here
    2 | void f (int *, const int *);
      |      ^
a.c:6:7: note: ‘a’ declared here
    6 |   int a[2];
      |       ^
a.c: In function ‘g1’:
a.c:19:3: warning: ‘<unknown>’ may be used uninitialized [-Wmaybe-uninitialized]
   19 |   f (p, p + 1);    // -Wmaybe-uninitialized
      |   ^~~~~~~~~~~~
a.c:2:6: note: by argument 2 of type ‘const int *’ to ‘f’ declared here
    2 | void f (int *, const int *);
      |      ^
a.c: In function ‘g2’:
a.c:25:3: warning: ‘<unknown>’ may be used uninitialized [-Wmaybe-uninitialized]
   25 |   f (p, p + 2);    // -Wmaybe-uninitialized
      |   ^~~~~~~~~~~~
a.c:2:6: note: by argument 2 of type ‘const int *’ to ‘f’ declared here
    2 | void f (int *, const int *);
      |      ^
+ gcc -O2 -S -Wall a.c
a.c: In function ‘f2’:
a.c:13:3: warning: ‘f’ expecting 4 bytes in a region of size 0 [-Wstringop-overread]
   13 |   f (a, a + 2);    // no warning
      |   ^~~~~~~~~~~~
a.c:12:7: note: at offset 8 into source object ‘a’ of size 8
   12 |   int a[2];
      |       ^
a.c:2:6: note: in a call to function ‘f’ declared with attribute ‘access (none, 2)’
    2 | void f (int *, const int *);
      |      ^
a.c: In function ‘g2’:
a.c:25:3: warning: ‘f’ expecting 4 bytes in a region of size 0 [-Wstringop-overread]
   25 |   f (p, p + 2);    // -Wmaybe-uninitialized
      |   ^~~~~~~~~~~~
a.c:24:12: note: at offset 8 into source object of size 8 allocated by ‘__builtin_malloc’
   24 |   int *p = __builtin_malloc (sizeof (*p) * 2);
      |            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
a.c:2:6: note: in a call to function ‘f’ declared with attribute ‘access (none, 2)’
    2 | void f (int *, const int *);
      |      ^
Comment 2 John Marshall 2021-05-04 22:02:19 UTC
See also https://github.com/samtools/htslib/pull/1275#issuecomment-831799708 (onwards) and https://github.com/samtools/htslib/pull/1280 for the initial observation of this in James's original code. The diagnostic message was

cram/cram_codecs.c: In function 'cram_xdelta_encode_char':
cram/cram_codecs.c:1916:19: error: 'cp_end' may be used uninitialized [-Werror=maybe-uninitialized]
 1916 |             cp += c->vv->varint_put32(cp, cp_end, zigzag16(c->u.e_xdelta.last));

which initially misled us into investigating the cp_end pointer itself's initialisation. IMHO the diagnostic would be more accurate as something like "data pointed to by 'cp_end' may be used uninitialized".

For code like James's example,

    extern void fill(char *start, const char *end);
    ...
    char *dat = malloc(10);
    ...
    fill(dat, dat+10);

in which the 'end' value is demonstrably one-past-the-end-of the malloced array, it should be clear that the pointed-to data is not going to be read (or if it was read, the warning should be something more severe!). It seems to me that in this case the warning is always going to be a false positive so should be suppressed.

> I've become convinced that attribute access none with no size argument should not expect the pointer points to an object of nonzero size.

+1 to this; we also observed that adding access(none, end) caused other spurious error messages.

I suppose there is no real need for the end parameter to be const char *, and making it char * like the other pointer parameter into the same buffer does prevent the warning. So in general that may be the best way to avoid this issue without resorting to attributes. Unfortunately in the case of varint_put32() making that change would require making similar changes to a surprisingly large number of sibling functions across the project's (and a submodule's) source files.
Comment 3 James Bonfield 2021-05-05 08:22:23 UTC
I'd definitely be in favour of John's rewording of the warning - "data pointed to by ...".  This definitely led us up the garden path for a while.

While I agree the code could have been written differently, such as not using a const char *, or having a length parameter instead of end pointer, I don't feel it is the compilers responsibility to prefer one style over another.  That said, I concur that the minimal change to ignoring this check for zero sized objects would catch the more common idioms of bounds checking.

Thank you for the response.
Comment 4 Martin Sebor 2021-05-05 14:48:58 UTC
I agree with improving the wording (I think pr99944 is basically about the same thing).
Comment 5 Martin Sebor 2021-05-10 21:39:06 UTC
*** Bug 100496 has been marked as a duplicate of this bug. ***