GCC extension for protecting applications from format string attacks

Makoto Iwamura iwamura@pb.highway.ne.jp
Sat Mar 31 06:55:00 GMT 2001


We, Iwamura and Etoh, would like to make a GCC extension for protecting 
applications from format string attacks. The idea is as follows. Any 
suggestions and comments will be appreciated.

When the extension finds a call to a function that takes a variable number 
of arguments, it emits a series of bytes as a mark, followed by the number 
of arguments, into the machine code of the calling function. The emitted 
data does not affect the execution of the original calling function. 
Therefore, the compiled code remains portable to the same UNIX system.

The mark is used to decide whether the calling function was compiled by 
our GCC extension or not. If it was, your library can make use of the 
number of arguments. Otherwise, you can ignore the value. So you can also 
write a portable library. Our idea preserves the portability of both the 
calling and the called function.


Concretely, the machine code of a function that calls printf() becomes the 
following. The <mark> and <number of arguments> data do not affect the 
original behavior, because the jmp skips over them.

   .
   .
   .
call printf
jmp .L14
.byte <mark>
.long <number of arguments>
.L14:
   .
   .
   .

To get the number of arguments inside printf, you can put code like the 
following at the entry of the printf function. The GCC built-in function 
__builtin_return_address() is used to get the address of the "jmp .L14" 
instruction that follows the call instruction. The get_number_of_args 
function gives an example of how to find <mark> and read <number of 
arguments>.
Note that if printf() cannot find <mark>, the calling function does not 
carry a number of arguments.

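#include <string.h>     /* memcmp, memcpy */

typedef unsigned char byte;

/* Return the count stored after MARK, or -1 if MARK is not found. */
int get_number_of_args(const byte *code)
{
    static const byte mark[] = {MARK};
    int     num;
    for(int i = 0; i < SEARCH_RANGE; i++)
        if(!memcmp(&code[i],mark,sizeof(mark))) {
            memcpy(&num,&code[i+sizeof(mark)],sizeof(num)); /* may be unaligned */
            return num;
        }
    return -1;
}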

int printf(const char *fmt,...)
{
    int     num = get_number_of_args(__builtin_return_address(0));
                .
                .
                .
}
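
The lookup above can be tried without the compiler extension by running it 
over a hand-built byte buffer. The following is only a sketch under stated 
assumptions: MARK_BYTES, the SEARCH_RANGE value of 16, and the helper names 
find_arg_count and build_call_site are hypothetical, since the proposal does 
not fix them. The buffer imitates the bytes after a call instruction with 
either a 2-byte x86 short jmp (opcode 0xeb) or a 5-byte near jmp (opcode 
0xe9) in front of the mark.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical values: the post leaves MARK and SEARCH_RANGE undecided. */
static const unsigned char MARK_BYTES[] = { 0xde, 0xad, 0xbe, 0xef };
#define SEARCH_RANGE 16

/* Scan forward from ret_addr for the mark; return the count stored after
   it, or -1 when no mark is found within SEARCH_RANGE bytes. */
static int find_arg_count(const unsigned char *ret_addr)
{
    for (size_t i = 0; i < SEARCH_RANGE; i++) {
        if (memcmp(ret_addr + i, MARK_BYTES, sizeof MARK_BYTES) == 0) {
            int32_t n;
            /* The count is not necessarily aligned, so copy it out. */
            memcpy(&n, ret_addr + i + sizeof MARK_BYTES, sizeof n);
            return (int)n;
        }
    }
    return -1;
}

/* Build a fake "code after the call" image in buf: a jump of jmp_len
   bytes, then the mark, then a 32-bit count in host byte order. buf must
   hold at least SEARCH_RANGE + 8 bytes so the scan stays inside it. */
static void build_call_site(unsigned char *buf, size_t buf_len,
                            size_t jmp_len, int32_t nargs)
{
    memset(buf, 0x90, buf_len);                 /* fill with x86 NOPs */
    buf[0] = (jmp_len == 2) ? 0xeb : 0xe9;      /* short or near jmp opcode */
    memcpy(buf + jmp_len, MARK_BYTES, sizeof MARK_BYTES);
    memcpy(buf + jmp_len + sizeof MARK_BYTES, &nargs, sizeof nargs);
}
```

With a 32-byte buffer, build_call_site(buf, sizeof buf, 2, 3) followed by 
find_arg_count(buf) finds the count behind a short jmp, and the same call 
with jmp_len 5 finds it behind a near jmp. A buffer without the mark gives -1.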

If you implement printf() (and fprintf(), syslog(), etc.) so that it never 
accesses more than "num" arguments, you can protect applications from 
format string attacks. If we add a new built-in function to replace the 
call to get_number_of_args, you can get the number of arguments with only 
one added statement. 


We have several concerns about proceeding with the implementation. 

- jmp instructions have different lengths on different processors. We 
have to find the mark and the number of arguments correctly regardless of 
the processor. We are unsure what MARK and SEARCH_RANGE should be.
- Can we add a new built-in function to GCC?
- If we write a printf library that protects against format string attacks 
by stopping the execution of the program, we cannot prevent those attacks 
from being used for denial of service. Is there any idea how the program 
could continue execution after a format string attack is detected?

Any comments and suggestions will be appreciated. Thank you.

--
Makoto Iwamura <iwamura@muraoka.info.waseda.ac.jp>
Muraoka Laboratory, Dept. of Information & Computer Science
Graduate School of Science & Engineering, Waseda University

Hiroaki Etoh,  Tokyo Research Laboratory, IBM Japan


