Bug 105755 - -Wanalyzer-null-dereference regression compiling Emacs
Status: REOPENED
Alias: None
Product: gcc
Classification: Unclassified
Component: analyzer
Version: 12.1.1
Importance: P3 normal
Target Milestone: ---
Assignee: David Malcolm
URL:
Keywords: diagnostic
Depends on:
Blocks: Wanalyzer-null-dereference
Reported: 2022-05-28 03:49 UTC by Paul Eggert
Modified: 2024-02-16 19:41 UTC
CC List: 4 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2024-02-16 00:00:00


Attachments
compile with 'gcc -fanalyzer -O2 -S' to see the false positive (1.06 KB, text/plain)
2022-05-28 03:49 UTC, Paul Eggert

Description Paul Eggert 2022-05-28 03:49:14 UTC
Created attachment 53047
compile with 'gcc -fanalyzer -O2 -S' to see the false positive

GCC 12.1.1 20220507 (Red Hat 12.1.1-1) on x86-64 has a false positive compiling the attached program w.i, which is a stripped-down version of GNU Emacs master. Compile it this way:

gcc -fanalyzer -O2 -S w.i

and it generates the following incorrect output. GCC 11.2 compiles the code cleanly so this is a regression.

In function ‘PSEUDOVECTORP’,
    inlined from ‘SUB_CHAR_TABLE_P’ at w.i:154:10,
    inlined from ‘CHAR_TABLE_REF_ASCII’ at w.i:169:28:
w.i:53:56: warning: dereference of NULL ‘*tbl.ascii’ [CWE-476] [-Wanalyzer-null-dereference]
   52 |           && ((((union vectorlike_header *)
      |                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~             
   53 |                 ((char *) XLP ((a)) - Lisp_Vectorlike))->size
      |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
  ‘word_boundary_p’: events 1-2
    |
    |  187 | word_boundary_p (Lisp_Object char_script_table, int c1, int c2)
    |      | ^~~~~~~~~~~~~~~
    |      | |
    |      | (1) entry to ‘word_boundary_p’
    |  188 | {
    |  189 |   return EQ (CHAR_TABLE_REF (char_script_table, c1),
    |      |          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |      |          |
    |      |          (2) calling ‘CHAR_TABLE_REF’ from ‘word_boundary_p’
    |  190 |              CHAR_TABLE_REF (char_script_table, c2));
    |      |              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |
    +--> ‘CHAR_TABLE_REF’: events 3-6
           |
           |  179 | CHAR_TABLE_REF (Lisp_Object ct, int idx)
           |      | ^~~~~~~~~~~~~~
           |      | |
           |      | (3) entry to ‘CHAR_TABLE_REF’
           |  180 | {
           |  181 |   return (ASCII_CHAR_P (idx)
           |      |          ~~~~~~~~~~~~~~~~~~~
           |  182 |           ? CHAR_TABLE_REF_ASCII (ct, idx)
           |      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           |      |             |
           |      |             (5) ...to here
           |      |             (6) calling ‘CHAR_TABLE_REF_ASCII’ from ‘CHAR_TABLE_REF’
           |  183 |           : char_table_ref (ct, idx));
           |      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~
           |      |           |
           |      |           (4) following ‘true’ branch...
           |
           +--> ‘CHAR_TABLE_REF_ASCII’: events 7-15
                  |
                  |  164 | CHAR_TABLE_REF_ASCII (Lisp_Object ct, ptrdiff_t idx)
                  |      | ^~~~~~~~~~~~~~~~~~~~
                  |      | |
                  |      | (7) entry to ‘CHAR_TABLE_REF_ASCII’
                  |......
                  |  169 |       Lisp_Object val = (! SUB_CHAR_TABLE_P (tbl->ascii) ? tbl->ascii
                  |      |                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                  |  170 |                          : XSUB_CHAR_TABLE (tbl->ascii)->contents[idx]);
                  |      |                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                  |      |                          |
                  |      |                          (8) following ‘false’ branch...
                  |  171 |       if (NILP (val))
                  |      |          ~
                  |      |          |
                  |      |          (9) ...to here
                  |      |          (10) following ‘true’ branch...
                  |  172 |         val = tbl->defalt;
                  |      |         ~~~~~~~~~~~~~~~~~
                  |      |             |
                  |      |             (11) ...to here
                  |  173 |       if (!NILP (val) || NILP (tbl->parent))
                  |      |          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                  |      |          |            |  |
                  |      |          |            |  (13) ...to here
                  |      |          |            (14) following ‘true’ branch...
                  |      |          (12) following ‘false’ branch (when ‘val’ is NULL)...
                  |  174 |         return val;
                  |      |                ~~~
                  |      |                |
                  |      |                (15) ...to here
                  |
           <------+
           |
         ‘CHAR_TABLE_REF’: event 16
           |
           |  182 |           ? CHAR_TABLE_REF_ASCII (ct, idx)
           |      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           |      |             |
           |      |             (16) returning to ‘CHAR_TABLE_REF’ from ‘CHAR_TABLE_REF_ASCII’
           |
    <------+
    |
  ‘word_boundary_p’: events 17-18
    |
    |  189 |   return EQ (CHAR_TABLE_REF (char_script_table, c1),
    |      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |      |          |
    |      |          (17) returning to ‘word_boundary_p’ from ‘CHAR_TABLE_REF’
    |      |          (18) calling ‘CHAR_TABLE_REF’ from ‘word_boundary_p’
    |  190 |              CHAR_TABLE_REF (char_script_table, c2));
    |      |              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |
    +--> ‘CHAR_TABLE_REF’: events 19-22
           |
           |  179 | CHAR_TABLE_REF (Lisp_Object ct, int idx)
           |      | ^~~~~~~~~~~~~~
           |      | |
           |      | (19) entry to ‘CHAR_TABLE_REF’
           |  180 | {
           |  181 |   return (ASCII_CHAR_P (idx)
           |      |          ~~~~~~~~~~~~~~~~~~~
           |  182 |           ? CHAR_TABLE_REF_ASCII (ct, idx)
           |      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           |      |             |
           |      |             (21) ...to here
           |      |             (22) calling ‘CHAR_TABLE_REF_ASCII’ from ‘CHAR_TABLE_REF’
           |  183 |           : char_table_ref (ct, idx));
           |      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~
           |      |           |
           |      |           (20) following ‘true’ branch...
           |
           +--> ‘CHAR_TABLE_REF_ASCII’: events 23-26
                  |
                  |   51 |   return (TAGGEDP (a, Lisp_Vectorlike)
                  |      |          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                  |   52 |           && ((((union vectorlike_header *)
                  |      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                  |      |           |
                  |      |           (24) following ‘true’ branch...
                  |   53 |                 ((char *) XLP ((a)) - Lisp_Vectorlike))->size
                  |      |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                  |      |                                                        |
                  |      |                                                        (25) ...to here
                  |      |                                                        (26) dereference of NULL ‘*tbl.ascii’
                  |   54 |                & 0x400000003f000000)
                  |      |                ~~~~~~~~~~~~~~~~~~~~~
                  |   55 |               == (0x4000000000000000 | (code << 24))));
                  |      |               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                  |......
                  |  164 | CHAR_TABLE_REF_ASCII (Lisp_Object ct, ptrdiff_t idx)
                  |      | ^~~~~~~~~~~~~~~~~~~~
                  |      | |
                  |      | (23) entry to ‘CHAR_TABLE_REF_ASCII’
                  |
Comment 1 GCC Commits 2023-03-09 21:21:33 UTC
The master branch has been updated by David Malcolm <dmalcolm@gcc.gnu.org>:

https://gcc.gnu.org/g:4214bdb1d77ebee04d12f66c831730ed67fedf55

commit r13-6565-g4214bdb1d77ebee04d12f66c831730ed67fedf55
Author: David Malcolm <dmalcolm@redhat.com>
Date:   Thu Mar 9 16:21:02 2023 -0500

    testsuite: add various -Wanalyzer-null-dereference false +ve test cases
    
    There are various -Wanalyzer-null-dereference false +ves in bugzilla
    that I've been attempting to fix.  Unfortunately I haven't made much
    progress, but it seems worth at least capturing the reduced
    reproducers as test cases, to make it easier to spot changes in
    behavior.
    
    gcc/testsuite/ChangeLog:
            PR analyzer/102671
            PR analyzer/105755
            PR analyzer/108251
            PR analyzer/108400
            * gcc.dg/analyzer/null-deref-pr102671-1.c: New test, reduced
            from Emacs.
            * gcc.dg/analyzer/null-deref-pr102671-2.c: Likewise.
            * gcc.dg/analyzer/null-deref-pr105755.c: Likewise.
            * gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c:
            New test, reduced from haproxy's src/ssl_sample.c.
            * gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c:
            Likewise.
            * gcc.dg/analyzer/null-deref-pr108400-SoftEtherVPN-WebUi.c: New
            test, reduced from SoftEtherVPN's src/Cedar/WebUI.c.
    
    Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Comment 2 nightstrike 2024-01-28 08:19:03 UTC
(In reply to GCC Commits from comment #1)
>             * gcc.dg/analyzer/null-deref-pr105755.c: Likewise.

In this test, the XLI function casts a pointer to EMACS_INT, which is typedef'd as long int. On any LLP64 system long is narrower than a pointer, so the cast produces a -Wpointer-to-int-cast warning. That type should be __UINTPTR_TYPE__ instead.
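A minimal sketch of the point above, not taken from the testcase: the names Lisp_X, EMACS_UINT_SKETCH, xli_sketch, xil_sketch and round_trips are hypothetical and only illustrate that an integer type derived from uintptr_t can hold a pointer on both LP64 and LLP64 targets, whereas long cannot on LLP64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the test's Lisp_Object; not copied from
   null-deref-pr105755.c.  */
struct Lisp_X;
typedef struct Lisp_X *Lisp_Object;

/* uintptr_t is wide enough to hold a pointer on LP64 and LLP64 alike,
   unlike long, which is 32 bits on LLP64.  */
typedef uintptr_t EMACS_UINT_SKETCH;

/* Pointer to integer, going through void * so the conversion is the
   one uintptr_t is specified for.  */
static EMACS_UINT_SKETCH
xli_sketch (Lisp_Object o)
{
  return (EMACS_UINT_SKETCH) (void *) o;
}

/* Integer back to pointer.  */
static Lisp_Object
xil_sketch (EMACS_UINT_SKETCH i)
{
  return (Lisp_Object) (void *) i;
}

/* Returns nonzero if P survives a pointer -> integer -> pointer trip.  */
int
round_trips (void *p)
{
  assert (sizeof (EMACS_UINT_SKETCH) >= sizeof (void *));
  Lisp_Object o = (Lisp_Object) p;
  return xil_sketch (xli_sketch (o)) == o;
}
```

With long in place of uintptr_t, the cast in xli_sketch is where an LLP64 compiler would be expected to emit -Wpointer-to-int-cast.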
Comment 3 David Malcolm 2024-02-16 14:35:52 UTC
Current status of reproducer on Compiler Explorer:
  GCC trunk: no warning: https://godbolt.org/z/o6ecKKa8e
  GCC 13.2:  no warning: https://godbolt.org/z/z7hdYx1Y7
  GCC 12.3:  false +ve:  https://godbolt.org/z/8W7c68GoT
  GCC 11.4:  no warning: https://godbolt.org/z/5vv5KWsTP
Comment 4 David Malcolm 2024-02-16 14:39:45 UTC
Looks like this was fixed sometime in GCC 13; resolving as WORKSFORME.

Feel free to reopen if you have a reproducer that triggers on a more recent GCC.
Comment 5 nightstrike 2024-02-16 19:35:02 UTC
(In reply to David Malcolm from comment #4)
> Looks like this was fixed sometime in GCC 13; resolving as WORKSFORME.
> 
> Feel free to reopen if you have a reproducer that triggers on a more recent
> GCC.

The testcase still fails.  To be clear, I'm referring to null-deref-pr105755.c:

Executing on host: /tmp/gcc/src/gcc-git/_w/gcc/xgcc -B/tmp/gcc/src/gcc-git/_w/gcc/  exceptions_enabled705865.cc  -fdiagnostics-plain-output  -Wno-complain-wrong-lang -S -o exceptions_enabled705865.s    (timeout = 300)
spawn -ignore SIGHUP /tmp/gcc/src/gcc-git/_w/gcc/xgcc -B/tmp/gcc/src/gcc-git/_w/gcc/ exceptions_enabled705865.cc -fdiagnostics-plain-output -Wno-complain-wrong-lang -S -o exceptions_enabled705865.s
FAIL: gcc.dg/analyzer/null-deref-pr105755.c (test for excess errors)
Excess errors:
/tmp/gcc/src/gcc-git/gcc/testsuite/gcc.dg/analyzer/null-deref-pr105755.c:19:10: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]

testcase /tmp/gcc/src/gcc-git/gcc/testsuite/gcc.dg/analyzer/analyzer.exp completed in 0 seconds


With this change:

--- a/gcc/testsuite/gcc.dg/analyzer/null-deref-pr105755.c
+++ b/gcc/testsuite/gcc.dg/analyzer/null-deref-pr105755.c
@@ -2,7 +2,7 @@
 /* { dg-additional-options "-Wno-analyzer-too-complex -Wno-analyzer-symbol-too-complex -O2" } */

 typedef long int ptrdiff_t;
-typedef long int EMACS_INT;
+typedef __UINTPTR_TYPE__ EMACS_INT;
 typedef long int intmax_t;

 enum Lisp_Type


Then I get this:

PASS: gcc.dg/analyzer/null-deref-pr105755.c (test for excess errors)



If I open your godbolt links, they aren't using a Windows target compiler, so they aren't exercising an LLP64 target.

Generally speaking, most of the analyzer testsuite hard-codes type definitions that are only correct on some targets.  For instance, in the diff I just posted, you can see that the lines before and after also assume the underlying types of ptrdiff_t and intmax_t instead of using compiler builtins or just including the relevant headers.  This really needs to be fixed across the whole testsuite.
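As a rough illustration of the "compiler builtins" option, a header-free preamble could look like the sketch below. It assumes a GCC-compatible compiler, which predefines __PTRDIFF_TYPE__, __UINTPTR_TYPE__ and __INTMAX_TYPE__; the static assertions and the helper pointer_fits_in_emacs_int are additions for illustration and are not part of the testcase.

```c
/* Target-independent typedefs built from macros that GCC-compatible
   compilers predefine, so no header and no assumption about the width
   of long is needed.  */
typedef __PTRDIFF_TYPE__ ptrdiff_t;
typedef __UINTPTR_TYPE__ EMACS_INT;
typedef __INTMAX_TYPE__ intmax_t;

/* Compile-time checks that hold on LP64 and LLP64 alike.  */
_Static_assert (sizeof (EMACS_INT) >= sizeof (void *),
                "EMACS_INT must be able to hold a pointer");
_Static_assert (sizeof (ptrdiff_t) == sizeof ((char *) 0 - (char *) 0),
                "ptrdiff_t must match the type of a pointer difference");
_Static_assert (sizeof (intmax_t) >= sizeof (long long),
                "intmax_t must be at least as wide as long long");

/* Hypothetical helper: nonzero when a pointer fits in EMACS_INT.  */
int
pointer_fits_in_emacs_int (void)
{
  return sizeof (EMACS_INT) >= sizeof (void *);
}
```

Including <stddef.h> and <stdint.h> would give the same types; the macro form keeps a reduced testcase free of system headers.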
Comment 6 nightstrike 2024-02-16 19:39:28 UTC
(In reply to nightstrike from comment #5)
> If I open your godbolt links, they aren't using a Windows target compiler,
> so they aren't exercising an LLP64 target.

For instance:
https://godbolt.org/z/4Mx96Wjvd

<source>: In function 'XLI':
<source>:15:10: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
   15 |   return (EMACS_INT) o;
      |          ^
ASM generation compiler returned: 0
ERROR: The inherited access control list (ACL) or access control entry (ACE) could not be built.

<source>: In function 'XLI':
<source>:15:10: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
   15 |   return (EMACS_INT) o;
      |          ^
Execution build compiler returned: 0
Program returned: 254
Comment 7 Iain Sandoe 2024-02-16 19:41:05 UTC
See comments #5 and #6.