I experimented with -fanalyzer on Xen, given all the recent work on Linux. We're quite similar, but one area where we are very different is accessing per-cpu variables.

For architectural reasons (i.e. because we were virtualising Linux, and Linux uses %gs for its per-cpu variables), Xen doesn't. In Xen, we have a block of metadata at the base of the stack, and the stack is suitably aligned such that we can do something like this:

    static inline struct cpu_info *get_cpu_info(void)
    {
        register unsigned long sp asm("rsp");

        return (struct cpu_info *)((sp | (STACK_SIZE - 1)) + 1) - 1;
    }

Which turns into roughly:

    ptr = ((rsp | 0x7fff) + 1) - sizeof(struct cpu_info)

which is correct and works because of the alignment of the stack in the first place.

Unfortunately, it triggers:

    ./arch/x86/include/asm/current.h:95:5: error: use of uninitialized value 'sp' [CWE-457] [-Werror=analyzer-use-of-uninitialized-value]

reliably, every time macros such as `current` get expanded, which is everywhere.

The reality is that the stack pointer is never uninitialised. It is unpredictable in the general case, but implementations can account for and remove that unpredictability.

The normal trick to hide a variable from uninitialised handling (e.g. asm("" : "+g"(var)); ) doesn't work, as it suffers from the same error.

Is there any way to tell -fanalyzer that this value really isn't uninitialised? I can't see anything obvious.

I can work around the warning by doing:

    unsigned long sp;

    asm ( "mov %%rsp, %0" : "=r" (sp) );

but this impacts code generation quite substantially. This primitive is used all over the place, and the regular C form undergoes far better CSE than the explicit mov to retrieve the stack pointer.
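To make the mask arithmetic above concrete, here is a minimal sketch that takes the stack pointer as a plain integer argument instead of reading %rsp. The `STACK_SIZE` value is an assumption chosen to match the 0x7fff mask quoted above, and the `struct cpu_info` layout and the `cpu_info_addr` name are hypothetical stand-ins, not Xen's real definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32 KiB stacks, matching the 0x7fff mask quoted above. */
#define STACK_SIZE ((uintptr_t)0x8000)

/* Hypothetical stand-in for Xen's struct cpu_info; the real layout differs. */
struct cpu_info {
    void *current;
    unsigned int processor_id;
};

/*
 * Same arithmetic as get_cpu_info(), with the stack pointer passed in:
 * round up to the top of the aligned stack, then step back one cpu_info.
 */
static inline uintptr_t cpu_info_addr(uintptr_t sp)
{
    uintptr_t stack_top = (sp | (STACK_SIZE - 1)) + 1;

    return stack_top - sizeof(struct cpu_info);
}
```

Any two stack pointer values inside the same STACK_SIZE-aligned block give the same address, which is the property the rest of the code relies on. A value in the next block gives a different address.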
Can you use __builtin_frame_address instead?
__builtin_frame_address() does appear to resolve the warning, but the knock-on effect for code generation is even worse than the asm() block. It forces a frame-pointer setup in all functions that use it (which is most functions in Xen), even leaf functions, and despite -fomit-frame-pointer, which in turn causes spilling of other registers now that %rbp isn't usable.
Perhaps it works if you declare the register variable in file scope.
(In reply to Andreas Schwab from comment #3)
> Perhaps it works if you declare the register variable in file scope.

Huh. I honestly expected that not to compile, but it appears to, and it appears to work.

There is minor perturbation in the build, but as far as I can see, it's just slightly different register/instruction scheduling.

Why does being at global scope change the diagnostic?
Minimal reproducer: https://godbolt.org/z/E6EEY1WT6

Am I right in understanding that:

    register unsigned long sp asm("rsp");

is intended as a way to read the %rsp register? If so, I think the analyzer might be failing to grok that idiom.
(In reply to David Malcolm from comment #5)
> Minimal reproducer: https://godbolt.org/z/E6EEY1WT6
>
> Am I right in understanding that:
> register unsigned long sp asm("rsp");
> is intended as a way to read the %rsp register?

Ultimately, yes.

More generally, this just creates a way to access the specific register. It is the only way, for example, to create an asm constraint on %r8, so the following is common to see for MSABI:

    register unsigned long param asm ("%r8");

    param = 4;
    asm ("tdcall/whatever" : "+r" (param) ...);

(Example here is from Intel's new TDX technology, but the actual asm instruction isn't important.)
The master branch has been updated by David Malcolm <dmalcolm@gcc.gnu.org>:

https://gcc.gnu.org/g:20bd258d0fa09837b3a93478ef92d8789cbcd442

commit r13-6420-g20bd258d0fa09837b3a93478ef92d8789cbcd442
Author: David Malcolm <dmalcolm@redhat.com>
Date:   Thu Mar 2 14:01:19 2023 -0500

    analyzer: fix uninit false +ves reading from DECL_HARD_REGISTER [PR108968]

    gcc/analyzer/ChangeLog:
            PR analyzer/108968
            * region-model.cc (region_model::get_rvalue_1): Handle VAR_DECLs
            with a DECL_HARD_REGISTER by returning UNKNOWN.

    gcc/testsuite/ChangeLog:
            PR analyzer/108968
            * gcc.dg/analyzer/uninit-pr108968-register.c: New test.

    Signed-off-by: David Malcolm <dmalcolm@redhat.com>
I've attempted to work around this with the above patch (for gcc 13).

As written, this ought to suppress the "uninit" false positive, but I didn't have a good kind of symbolic value to use for the resulting pointer. Hence the analyzer will treat the result of get_cpu_info as an "unknowable" pointer, which might lead to a chain of follow-up false positives if there's logic in the code being analyzed that relies on dereferencing the result and getting consistent results.

Can you attach a typical preprocessed source file from Xen (the GPL licensed part) that was showing this (use -E), so I can poke at it to see how well this workaround works - thanks!

Keeping open in case this needs further work, and to possibly track backporting to GCC 12.
Thank you for the fix. I've recompiled master and this uninitialised warning has gone.

Unfortunately, Xen isn't GCC-13 clean (seems like a real bug in Xen), and the analyser has pointed out various other things which I'm still looking into. I don't see anything which looks like it is a new knock-on effect from this change.

Our code does fundamentally rely on get_cpu_info() always returning the same pointer (on a single CPU). For example, `current` is defined as `get_cpu_info()->current` and we do expect that to yield the same pointer when used multiple times.

Even if the analyser were interpreting the generated asm, there's no way it could prove this without knowing the size/alignment constraints of our stacks.

Would a const annotation on get_cpu_info() be likely to help? It occurs to me that this is true in all cases that the compiler could legitimately reason about. (It would only cease being true if we fell off our stack, at which point UB is the very least of our worries.)
From trying this out, a const attribute doesn't alter the code generation in the slightest, so I presume GCC has already figured the const-ness out.
(In reply to Andrew Cooper from comment #9)
[...snip...]
> Would a const annotation on get_cpu_info() be likely to help? It occurs to
> me that this is true in all cases that the compiler could legitimately
> reason about. (It would only cease being true if we fell off our stack, at
> which point UB is the very least of our worries.)

Probably not (without further patching of the analyzer, at least).

For functions it can't see the definition of, the analyzer will respect const annotations and treat such a function as always returning the same results when given the same set of arguments.

However, I don't think it will respect a const annotation on a function it can see the definition of; I think in your case it will simply try to (badly) simulate the insides of get_cpu_info. To what extent that's going to lead to false positives is hard to say.
(In reply to Andrew Cooper from comment #9)
[...snip...]
> Our code does fundamentally rely on get_cpu_info() always returning the same
> pointer (on a single CPU). For example, `current` is defined as
> `get_cpu_info()->current` and we do expect that to yield the same pointer
> when used multiple times.
>
> Even if the analyser was interpreting the generated asm, there's no way it
> could prove this without knowing the size/alignment constraints of our
> stacks.

Another issue is that even if the analyzer "knows" that get_cpu_info() always returns the same value, it doesn't know what memory is being pointed to, and so has to assume that in:

    T old_value = get_cpu_info()->current;
    some_function_call ();
    T new_value = get_cpu_info()->current;

that old_value doesn't necessarily equal new_value, since some_function_call () could have modified the value of "current".
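The scenario described above can be shown with a small self-contained sketch. Everything here is a hypothetical stand-in: `get_cpu_info_stub` returns a fixed static object rather than deriving a pointer from the stack, and `some_function_call` deliberately rewrites `current` to show why a stable pointer does not imply a stable value.

```c
struct vcpu { int id; };
struct cpu_info { struct vcpu *current; };

/* Hypothetical stand-ins for the per-cpu block and two vcpus. */
static struct cpu_info fake_info;
static struct vcpu vcpu_a = { 1 }, vcpu_b = { 2 };

/* Always returns the same pointer, as get_cpu_info() is expected to. */
static struct cpu_info *get_cpu_info_stub(void)
{
    return &fake_info;
}

/* A callee that changes which vcpu is current, e.g. a context switch. */
static void some_function_call(void)
{
    get_cpu_info_stub()->current = &vcpu_b;
}

/* Returns 1 if `current` differs before and after the call. */
static int current_changed_across_call(void)
{
    fake_info.current = &vcpu_a;

    struct vcpu *old_value = get_cpu_info_stub()->current;
    some_function_call();
    struct vcpu *new_value = get_cpu_info_stub()->current;

    return old_value != new_value;
}
```

Here the pointer returned by the stub never changes, yet the two loads of `current` differ, so an analyzer that cannot see into the callee has to consider both outcomes.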
I've constructed an example which might be the knock-on effect you were worried about?

    void foo(char *other)
    {
        char *ptr = NULL;

        if ( current->domain )
            ptr = other;

        asm volatile ("cmc");

        if ( current->domain )
            ptr[0] = ~ptr[0];
    }

yields

arch/x86/tmp.c: In function 'foo':
arch/x86/tmp.c:14:22: error: dereference of NULL 'ptr' [CWE-476] [-Werror=analyzer-null-dereference]
   14 |         ptr[0] = ~ptr[0];
      |                   ~~~^~~
  'foo': events 1-5
    |
    |    8 |     if ( current->domain )
    |      |        ^
    |      |        |
    |      |        (1) following 'false' branch...
    |......
    |   11 |     asm volatile ("cmc");
    |      |     ~~~
    |      |     |
    |      |     (2) ...to here
    |   12 |
    |   13 |     if ( current->domain )
    |      |        ~
    |      |        |
    |      |        (3) following 'true' branch...
    |   14 |         ptr[0] = ~ptr[0];
    |      |         ~~~       ~~~~~~
    |      |         |            |
    |      |         |            (5) dereference of NULL 'ptr'
    |      |         (4) ...to here
    |
Created attachment 54572
Preprocessed example
Wow that's a lot of junk getting included for the minimal include set I could easily make.

It occurs to me only after posting that you're liable to fail at:

    asm ( ".include \"arch/x86/include/asm/asm-macros.h\"" );

which always trips things up. You can safely drop it if you're just interested in the analyser behaviour.
Minimized version of attachment 54572:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
struct cpu_info {
  /* [...snip...] */
  struct vcpu *current_vcpu;
  /* [...snip...] */
};

struct vcpu {
  /* [...snip...] */
  struct domain *domain;
  /* [...snip...] */
};

static __inline__ struct cpu_info *get_cpu_info_from_stack(unsigned long sp)
{
    return (struct cpu_info *)((sp | ((((1L) << 12) << 3) - 1)) + 1) - 1;
}

static __inline__ struct cpu_info *get_cpu_info(void)
{
    register unsigned long sp asm("rsp");

    return get_cpu_info_from_stack(sp);
}

void foo(char *other)
{
    char *ptr = ((void*)0);

    if ( ((get_cpu_info()->current_vcpu))->domain )
        ptr = other;

    asm volatile ("cmc");

    if ( ((get_cpu_info()->current_vcpu))->domain )
        ptr[0] = ~ptr[0];
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...where trunk emits:

test.c:35:22: warning: dereference of NULL 'ptr' [CWE-476] [-Wanalyzer-null-dereference]
   35 |         ptr[0] = ~ptr[0];
      |                   ~~~^~~
  'foo': events 1-6
    |
    |   27 |     char *ptr = ((void*)0);
    |      |           ^~~
    |      |           |
    |      |           (1) 'ptr' is NULL
    |   28 |
    |   29 |     if ( ((get_cpu_info()->current_vcpu))->domain )
    |      |        ~
    |      |        |
    |      |        (2) following 'false' branch...
    |......
    |   32 |     asm volatile ("cmc");
    |      |     ~~~
    |      |     |
    |      |     (3) ...to here
    |   33 |
    |   34 |     if ( ((get_cpu_info()->current_vcpu))->domain )
    |      |        ~
    |      |        |
    |      |        (4) following 'true' branch...
    |   35 |         ptr[0] = ~ptr[0];
    |      |         ~~~~~~
    |      |         |
    |      |         (5) ...to here
    |      |         (6) dereference of NULL 'ptr'
    |
Looks like it doesn't even need the asm stmt at line 32 to consider that it could take the false-then-true path.
The releases/gcc-12 branch has been updated by David Malcolm <dmalcolm@gcc.gnu.org>:

https://gcc.gnu.org/g:833d822ff0e83478a4fe536d55dfb22cde8ddc40

commit r12-9366-g833d822ff0e83478a4fe536d55dfb22cde8ddc40
Author: David Malcolm <dmalcolm@redhat.com>
Date:   Wed Mar 29 14:16:49 2023 -0400

    analyzer: fix uninit false +ves reading from DECL_HARD_REGISTER [PR108968]

    Cherrypicked from r13-6749-g430d7d88c1a123.

    gcc/analyzer/ChangeLog:
            PR analyzer/108968
            * region-model.cc (region_model::get_rvalue_1): Handle VAR_DECLs
            with a DECL_HARD_REGISTER by returning UNKNOWN.

    gcc/testsuite/ChangeLog:
            PR analyzer/108968
            * gcc.dg/analyzer/uninit-pr108968-register.c: New test.

    Signed-off-by: David Malcolm <dmalcolm@redhat.com>