This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



RE: Question about static code analysis features in GCC


Hey Sarah,

Many array-bounds and format-string problems can already be found, especially with LTO, CLooG, loop unrolling, and -O3 enabled. Seeing across object-file boundaries, understanding loop bounds, and inlining aggressively allow GCC to warn about a lot of real-world vulnerabilities. When multiple IPA passes land in trunk, it should be even better.

What I think is missing is:

1) detection of double-free. There is already a function attribute called 'malloc', which marks an allocation function whose return value is never aliased by any other pointer. You could use that attribute, together with a new one ('free'), to track potential double-frees of values via VRP/IPA.
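
To make point 1 a bit more concrete, here is a rough sketch of the per-value state such a check might track. The 'malloc' attribute is real; the 'free' counterpart does not exist, so a comment stands in for it. The names buf_alloc, buf_release, ptr_state and on_release are made up for illustration and say nothing about how VRP/IPA would actually represent this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Existing GCC attribute: the returned pointer aliases no other pointer. */
__attribute__((malloc)) void *buf_alloc(size_t n) { return malloc(n); }

/* A 'free' attribute as proposed above would mark this function. It is
   hypothetical, so only this comment stands in for it. */
void buf_release(void *p) { free(p); }

/* Abstract state a checker could keep for each value from buf_alloc(). */
enum ptr_state { PTR_ALLOCATED, PTR_FREED };

/* Model one call to the release function on a tracked value.
   Returns 1 when the value was already freed (potential double-free). */
int on_release(enum ptr_state *state) {
    assert(state != NULL);
    int already_freed = (*state == PTR_FREED);
    *state = PTR_FREED;
    return already_freed;
}
```

Calling on_release() twice on the same tracked value returns 0 and then 1; the second result is where a warning would be emitted.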

2) the ability to annotate functions with the taint and filtering side effects they have on their parameters, in the style of the format() attribute. (I've been asking the PC-Lint people for this feature for some time.) You could make this even more generic and just add a new attribute that allows tagging and checking of arbitrary tags:
ssize_t recv(int sockfd, void *buf, size_t len, int flags) __attribute__ ((add_parameter_tag ("taint", 2)))
                                                           __attribute__ ((add_return_value_tag ("taint")));

int count_sql_rows_for(const char* name) __attribute__ ((disallow_parameter_tag ("taint", 1)));
void filter_sql_characters_from(char* name) __attribute__ ((removes_parameter_tag ("taint", 1)));

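then a program like this would only produce a warning once the filtering call is removed:
int main(void) {
  char name[20] = {0};
  recv(GLOBAL_SOCKET, name, sizeof(name) - 1, 0); // tags 'name' as "taint"; last byte stays NUL
  filter_sql_characters_from(name); // comment this line to get warning
  count_sql_rows_for(name);
  return 0;
}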
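
Since none of these attributes exist, here is a small model of the bookkeeping they imply, just to show the intended semantics: one set of string tags per tracked value, with add, remove, and disallow operations. Everything here (tag_set, tag_add, tag_remove, tag_violates, MAX_TAGS) is a hypothetical name for illustration, not a GCC interface.

```c
#include <assert.h>
#include <string.h>

#define MAX_TAGS 8  /* arbitrary capacity for this sketch */

/* Set of string tags attached to one tracked value. */
struct tag_set {
    const char *tags[MAX_TAGS];
    int count;
};

/* Index of 'tag' in the set, or -1 if it is absent. */
static int find_tag(const struct tag_set *s, const char *tag) {
    for (int i = 0; i < s->count; ++i)
        if (strcmp(s->tags[i], tag) == 0)
            return i;
    return -1;
}

/* Models add_parameter_tag / add_return_value_tag. */
void tag_add(struct tag_set *s, const char *tag) {
    assert(s != NULL && tag != NULL);
    if (find_tag(s, tag) < 0 && s->count < MAX_TAGS)
        s->tags[s->count++] = tag;
}

/* Models removes_parameter_tag. */
void tag_remove(struct tag_set *s, const char *tag) {
    int i = find_tag(s, tag);
    if (i >= 0)
        s->tags[i] = s->tags[--s->count];  /* move last entry into the gap */
}

/* Models disallow_parameter_tag: returns 1 if a warning would be issued. */
int tag_violates(const struct tag_set *s, const char *tag) {
    return find_tag(s, tag) >= 0;
}
```

Walking the example program through this model: recv() does tag_add(&name, "taint"), the filter does tag_remove(&name, "taint"), and count_sql_rows_for() asks tag_violates(&name, "taint"). Skip the remove step and the last call reports a violation.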

When I wrote my binary static analysis product, BugScan, we assumed that if a pointer was tainted, so were its contents. (This was especially necessary for collections like lists and vectors in Java and C++ binaries.) You may want to be more explicit about that, by having a recursively_add_parameter_tag() or some such that only applies to pointer parameters.
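
The difference between tagging a pointer and tagging everything it reaches could look roughly like this. It is only a sketch: struct value, taint_shallow, taint_recursive and the MAX_DEPTH cutoff are all invented for illustration, and a real analysis would follow points-to information rather than a single pointee link.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 16  /* arbitrary bound so cyclic structures terminate */

/* One tracked value; 'pointee' is NULL for non-pointer values. */
struct value {
    int tainted;
    struct value *pointee;
};

/* Shallow tagging: mark only the value itself. */
void taint_shallow(struct value *v) {
    assert(v != NULL);
    v->tainted = 1;
}

/* Recursive tagging: mark the value and everything reachable through it,
   up to MAX_DEPTH levels of indirection. */
void taint_recursive(struct value *v) {
    for (int depth = 0; v != NULL && depth < MAX_DEPTH; ++depth) {
        v->tainted = 1;
        v = v->pointee;
    }
}
```

With a chain a -> b -> c, taint_shallow(&a) leaves b and c untagged, while taint_recursive(&a) tags all three.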

3) detection of strings that lack NUL-termination. This one gets really complicated, especially for situations where strings are terminated properly and then become un-terminated.
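
A small runtime illustration of the "terminated, then un-terminated" case from point 3. The helper names is_terminated and loses_termination are made up; memcpy is used for the overwrite, but a strncpy() whose length equals the buffer size has the same effect.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf holds a NUL byte within its first n bytes. */
int is_terminated(const char *buf, size_t n) {
    assert(buf != NULL);
    return memchr(buf, '\0', n) != NULL;
}

/* Returns 1 if the buffer went from terminated to un-terminated. */
int loses_termination(void) {
    char buf[8];

    strcpy(buf, "abc");                     /* 3 chars plus NUL: terminated */
    int before = is_terminated(buf, sizeof buf);

    memcpy(buf, "12345678", sizeof buf);    /* fills all 8 bytes, no NUL */
    int after = is_terminated(buf, sizeof buf);

    return before && !after;
}
```

A static check would have to track the same fact per buffer at each program point, which is where the complexity comes from.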

4) detection of a loop that writes through a pointer and increments that pointer, where the loop is bounded by a tainted value. You'd have to add an extension to the loop unroller for that, and just check for the 'taint' tag on the loop bound.
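
For point 4, this is the kind of loop I mean, next to a version where the tainted bound is clamped first. The name copy_clamped is invented for this sketch, and it assumes src has at least n readable bytes.

```c
#include <assert.h>
#include <stddef.h>

/* The pattern a checker would flag, where n arrives from untrusted input
   and is the only bound on the loop:

       for (size_t i = 0; i < n; ++i)
           *p++ = src[i];

   The version below clamps n to the destination size before looping. */
size_t copy_clamped(char *dst, size_t dst_len, const char *src, size_t n) {
    assert(dst != NULL && src != NULL);
    if (n > dst_len)
        n = dst_len;            /* bound no longer depends on input alone */

    char *p = dst;
    for (size_t i = 0; i < n; ++i)
        *p++ = src[i];
    return n;                   /* number of bytes actually written */
}
```

In tag terms, the clamp is the point where the 'taint' tag could be dropped from the loop bound.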


Of course, you still run into temporal ordering issues, especially with globals, where the CFG ordering won't help.

But don't let that discourage you -- it would be great work to see done and commoditized, and would probably be better than most commercial analyzers as well ;)

Let me know if you need any more of my expertise in this area. I can't speak for GCC internals, though.


