This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



RE: Question about static code analysis features in GCC


Hi

Richard, I've implemented a simple nop-pass as you described and am now investigating a path forward for static code analysis.
I'm trying to modify e.g. the cp pass so that its workers can be called from my analysis pass.

I also found some other work, done by Alexander Ivanov Sotirov, called "Vulncheck".
The patch is available at "http://gcc.vulncheck.org/".
It seems to contain some work that might be useful to continue?
Why was this patch not applied to GCC trunk?

Was a question from Sotirov about additional features left unanswered, or was it answered off-list?
http://gcc.gnu.org/ml/gcc/2007-09/msg00549.html

I guess constant propagation etc. is done by other workers/passes in GCC today, so it's better to use the available workers.
But having started to read his paper, it seems to me that some parts could be usable?
Also, Sotirov has an "ssa-tree" approach to analysis, whereas Volanchi (http://mygcc.free.fr) uses a pretty-printer and pattern-matching approach.
(Which, as I understand it, stopped this patch from being applied to official GCC.)

Or is it even better just to do it as a plugin-pass using MELT or something similar?

Thanks and Best Regards
/Fredrik
________________________________________
From: Richard Guenther [richard.guenther@gmail.com]
Sent: Wednesday, February 16, 2011 11:17
To: sarah@hederstierna.com
Cc: gcc@gcc.gnu.org
Subject: Re: Question about static code analysis features in GCC

On Wed, Feb 16, 2011 at 8:54 AM, sarah@hederstierna.com
<fredrik@hederstierna.com> wrote:
> Hi
>
> Thanks for your answer. I just discovered, though, that the array-bounds error could be caught by the "-Warray-bounds" warning.
> I guess this analysis is done in Value Range Propagation, "tree-vrp.c".
> The testcases I tried (plus my example code) did not warn though; is it a bug?

The array-bounds warning only works when VRP is enabled, which by
default it is only at -O2.  In simple testcases the accesses are
usually optimized away.

> testsuite/gcc.dg/Warray-bounds.c
> testsuite/gcc.dg/Warray-bounds-2.c
> testsuite/gcc.dg/Warray-bounds-3.c
> testsuite/gcc.dg/Warray-bounds-4.c   FAILED??
> testsuite/gcc.dg/Warray-bounds-5.c
> testsuite/gcc.dg/Warray-bounds-6.c
> testsuite/gcc.dg/Warray-bounds-7.c   FAILED??
> testsuite/gcc.dg/Warray-bounds-8.c
>
> Couldn't NULL dereferences also be checked in tree-VRP to some extent?

Yes, but VRP assumes that once you dereference a pointer it is
not NULL - thus its optimistic analysis defeats the intent to
warn about NULL accesses ;)

> And about adding an opt-pass, do you mean about here (in passes.c)?
>
>  p = &all_regular_ipa_passes;
> +NEXT_PASS (pass_ipa_static_analysis);
>  NEXT_PASS (pass_ipa_whole_program_visibility);

No, I was thinking about

Index: passes.c
===================================================================
--- passes.c    (revision 170176)
+++ passes.c    (working copy)
@@ -796,6 +796,7 @@ init_optimization_passes (void)
   *p = NULL;

   p = &all_regular_ipa_passes;
+  NEXT_PASS (pass_ipa_static_analysis);
   NEXT_PASS (pass_ipa_whole_program_visibility);
   NEXT_PASS (pass_ipa_profile);
   NEXT_PASS (pass_ipa_cp);

At the point you show we are not yet in SSA form.  The above will
only reliably work at -O0 as otherwise early optimizations will have
taken place.

> Which passes do you think have an additional non-code-generating mode: value-numbering (tree-nrv? tree-ssa-sccvn, tree-ssa-pre?) or constant-propagation (tree-cp)?

There are none at the moment, but at least the SSA propagators
(tree-ssa-ccp.c, tree-ssa-copy.c) and the value-numberer
(tree-ssa-sccvn.c/tree-ssa-pre.c) would be easy to modify.

> Could these opt-stages be called earlier in the pass pipeline?

I would rather arrange for the workers to be able to be called from
the static analysis pass directly instead of trying to make them
"passes without code-gen".

Richard.


>
> Thanks and Best Regards
> /Fredrik
> ________________________________________
> From: Richard Guenther [richard.guenther@gmail.com]
> Sent: Sunday, February 13, 2011 10:54
> To: sarah@hederstierna.com
> Cc: gcc@gcc.gnu.org
> Subject: Re: Question about static code analysis features in GCC
>
> On Sun, Feb 13, 2011 at 2:34 AM, sarah@hederstierna.com
> <fredrik@hederstierna.com> wrote:
>> Hi
>>
>> I would like to have some advice regarding static code analysis and GCC.
>> I've just reviewed several tools like Klocwork, Coverity, CodeSonar and PolySpace.
>> These tools offer a lot of features and all tools seem to find different types of defects.
>> The tool that found most bugs on our code was Coverity, but it is also the most expensive tool.
>>
>> But basically I would most like just to find very "simple" basic errors like NULL-dereferences and buffer overruns.
>> I attach a small example file with some very obvious errors like NULL-dereferences and buffer overruns.
>>
>> As expected, this buggy file compiles fine with GCC, without any warnings at all:
>>
>>    gcc -o example example.c -W -Wall -Wextra
>>
>> I tried to add checking with mudflap:
>>
>>    gcc -fmudflap -o example example.c -W -Wall -Wextra -lmudflap
>>
>> Then I found all defects at run time, but I had to run the program, so I could not find all potential errors at compile time.
>> Also Valgrind could be used to check run-time bugs, but I'm not 100% sure I can cover all execution paths in my tests (I also tried gcov).
>>
>> I also analyzed my example file with CLANG, which found "uninitialized" issues and NULL-pointers, but not buffer overruns:
>>
>>    clang --analyze example.c
>>    example.c:7:3: warning: Dereference of null pointer loaded from variable 'a'
>>    example.c:41:3: warning: Undefined or garbage value returned to caller
>>
>> About NULL-checks and buffer-overruns, is there any possible path to get such checkers into a standard GCC, maybe just at some very limited level?
>> I've checked the "MyGCC" (http://mygcc.free.fr) patch on Graphite, but it has been rejected; could it be rewritten somehow as a standard opt_pass to just find NULL-derefs?
>>
>> I've also checked TreeHydra in the Mozilla project (https://developer.mozilla.org/en/Treehydra), which gives a JavaScript interface to GIMPLE.
>> Is the GCC 4.5.x plugin API recommended for implementing such features, and is performance okay if it is not a core opt-pass?
>>
>> I'm willing to put some free time into this matter if it turns out it's possible to add some tree SSA optimization pass that could find some limited set of errors.
>> For example, if some value is a constant NULL and is dereferenced, or if some element is accessed outside a constant-length buffer using a constant index.
>> What is your recommended path to move forward using GCC and basic static code analysis?
>
> It should be possible to fit static code analysis into GCC.  The most
> prominent issue is that GCC is a compiler mainly looking at optimization
> quality, and optimization can defeat static code analysis in some
> cases (such as aggressively using undefined behavior to do dead code
> elimination).  On the other hand optimization makes static analysis
> easier in some cases, and even more useful if issues in "really" dead
> code are removed.
>
> As a way to start I would suggest to restrict static analysis to -O0
> (no optimization), a suitable place to do such analysis is the first
> entry in the IPA pass pipeline (then you have the whole program in SSA,
> a callgraph built and unused functions removed - you also have
> always_inline functions inlined).  Something that can be done quite
> easily is have a mode for the standard SSA propagators (like constant
> propagation or even value-numbering) to only compute the propagation
> but do no modification to the program, if the results are then kept
> and not freed static analysis can use them.
>
> I don't see a good reason to not include a well-designed version
> following that route into GCC itself.  Note that compared to CLANG
> this analysis would reside in the middle-end, not the frontend - with
> the advantage that it can take a look at the whole program when
> building with link-time (non-)optimization and work cross-language.
>
> No, I'm not going to implement it - but a patch that just inserts a
> noop pass at the place I suggested should be < 50 lines of code.
>
> Richard.
>
>> About un-initialized values I found some additional info, it seems to be hard to solve...
>>  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18501
>>  http://lists.cs.uiuc.edu/pipermail/cfe-dev/2011-February/013170.html
>>
>> Thanks and Best Regards
>> /Fredrik
>>
>> --------------------------------------------------------------
>> #include <stdio.h>
>>
>> // Example1: null pointer de-reference
>> int f1(void)
>> {
>>  int* a = NULL;
>>  *a = 1;
>>  return *a;
>> }
>>
>> // Example2: buffer overrun global variable
>> char v2[1];
>> char* f2(void)
>> {
>>  v2[-1] = 1;
>>  v2[0]  = 1;
>>  v2[1]  = 1;
>>  return (char*)v2;
>> }
>>
>> // Example3: buffer overrun local variable
>> int f3(void)
>> {
>>  char v3[1];
>>  v3[-1] = 1;
>>  v3[0]  = 1;
>>  v3[1]  = 1;
>>  return v3[-1] + v3[0] + v3[1] + v3[2];
>> }
>>
>> // Example4: uninitialized memory access
>> int f4(void)
>> {
>>  char v4[1];
>>  return v4[0];
>> }
>>
>> // Examples NULL dereference and buffer overruns
>> int main(void)
>> {
>>  int t1 = f1();
>>  printf("test1 %d\n", t1);
>>
>>  void *t2 = f2();
>>  printf("test2 %p\n", t2);
>>
>>  int t3 = f3();
>>  printf("test3 %d\n", t3);
>>
>>  int t4 = f4();
>>  printf("test4 %d\n", t4);
>>
>>  return 0;
>> }
>>
>

