This is the mail archive of the
mailing list for the GCC project.
Re: WPP capabilities in gcc
- From: Martin Sebor <msebor at redhat dot com>
- To: Shoham Peller <shohamp at gmail dot com>, Jonathan Wakely <jwakely dot gcc at gmail dot com>
- Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>, Guy Lewin <guy at lewin dot co dot il>
- Date: Mon, 18 May 2015 21:21:21 -0600
- Subject: Re: WPP capabilities in gcc
- Authentication-results: sourceware.org; auth=none
- References: <CABpZJmwNrv_6JDzhPxZMmOFCXexsJVm6XwNP-+eRwys62_hQmg at mail dot gmail dot com> <553C13D8 dot 7080000 at gmail dot com> <CAH6eHdR=Tfrhd5JKnp9-5zrV0J0khN8e2PTZugpSprONPJq5ew at mail dot gmail dot com> <CABpZJmy484mvEMVZYzLo+qQDBd24Y7g=sURuWGUGMRUSOicU8A at mail dot gmail dot com>
On 04/26/2015 11:47 AM, Shoham Peller wrote:
You are completely right, Jonathan. My apologies.
WPP is a tool I use every day in my work, so I assumed it was
well known.
Here is the Wikipedia page on WPP:
In short, WPP lets you put traces and logs in your C/C++ products and
ship them to customers without including sensitive tracing/logging
strings in your binary.
WPP runs during pre-compilation and replaces the string in each call
to a trace macro with an obfuscated string. So:
DoTrace("program started %d %s beginning", num, str);
is replaced with:
DoTrace("897192873 __LINE__ __FILE__ %d %s", num, str);
WPP also puts the original string in the Debug build's .pdb file,
alongside the debug symbols.
Later, the user can use a tool like EtwDataViewer (see the first
Google result for a screenshot) to "reverse" the WPP traces and recover
the program's original traces.
I hope my explanation is clear.
1. Can you think of a way to achieve this with gcc?
Until now we have used regular expressions to find the "DoTrace" calls
and replace the strings, but needless to say it's ugly.
Here's a simple (perhaps naive) way to do it without changing
the compiler:
1) Define a DoTrace macro that "annotates" each format string
with a special "tag" to make it reliably distinguishable
from all other strings in a program. For example:
#define DoTrace(fmt, ...) \
    fprintf (stderr, "@DoTrace@" fmt, __VA_ARGS__)
(This assumes that all tracing format strings are literals.)
2) Add a stage to your build system that searches the .rodata
section of each object (program or DSO) for occurrences of
strings that start with the unique "@DoTrace@" tag, replaces
each instance of such a string with its unique id while
retaining any formatting directives, and stores the mapping
from the modified format string to the replaced string in
some "other file."
I would be inclined to start by using the Binutils objcopy
command to extract the .rodata section, modifying it as
necessary using a script, and then putting the result back
into the object file. If scripting turned out to be too
clumsy, error-prone, or slow I would look at using the
Binutils BFD library instead.
3) Write a tool that, given the "other file" created in stage
(2), replaces the encoded format strings with the originals.
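
Stages 2 and 3 above could be sketched roughly as follows. This is only
an illustration under several assumptions, not a tested tool: it works
on a raw dump of the .rodata bytes already loaded into memory, it uses a
sequential number as the id, and the names trace_map, copy_directives,
encode_rodata and decode_format are all hypothetical. Each tagged string
is rewritten in place and padded with NULs so that the section keeps its
size and every other offset stays valid. A string whose replacement
would not fit is left untouched.

```c
#include <stdio.h>
#include <string.h>

#define TAG "@DoTrace@"                    /* tag added by the macro in step 1 */
#define CONVERSIONS "diouxXeEfFgGaAcspn"   /* printf conversion letters */

/* One record of the "other file": id -> original format string.
   The layout is hypothetical. */
struct trace_map {
    unsigned id;
    char original[256];
};

/* Copies each conversion directive of fmt (e.g. "%d", "%-5s") to out,
   preceded by a space.  Returns the bytes written, or (size_t)-1 if
   out is too small. */
static size_t copy_directives(const char *fmt, char *out, size_t cap)
{
    size_t n = 0;
    for (const char *p = fmt; *p != '\0'; ++p) {
        if (*p != '%')
            continue;
        if (p[1] == '%') {            /* "%%" is a literal percent sign */
            ++p;
            continue;
        }
        size_t len = strcspn(p + 1, CONVERSIONS);
        if (p[1 + len] == '\0')
            break;                    /* unterminated directive: stop */
        len += 2;                     /* count '%' and the conversion letter */
        if (n + 1 + len >= cap)
            return (size_t)-1;
        out[n++] = ' ';
        memcpy(out + n, p, len);
        n += len;
        p += len - 1;                 /* loop increment steps past the letter */
    }
    out[n] = '\0';
    return n;
}

/* Stage 2: rewrites every TAG-prefixed string in buf in place and
   records the mapping.  Returns the number of strings rewritten. */
static unsigned encode_rodata(char *buf, size_t len,
                              struct trace_map *map, unsigned max)
{
    const size_t taglen = strlen(TAG);
    unsigned count = 0;
    size_t i = 0;

    while (i + taglen < len && count < max) {
        if (memcmp(buf + i, TAG, taglen) != 0) {
            ++i;
            continue;
        }
        char *str = buf + i;
        char *end = memchr(str, '\0', len - i);
        if (end == NULL)
            break;                    /* unterminated string at end of buffer */
        size_t slen = (size_t)(end - str);
        i += slen + 1;                /* resume after this string either way */

        char repl[256];
        int n = snprintf(repl, sizeof repl, "%u", count);
        size_t d = copy_directives(str + taglen, repl + n,
                                   sizeof repl - (size_t)n);
        if (d == (size_t)-1 || (size_t)n + d > slen
            || slen - taglen >= sizeof map[count].original)
            continue;                 /* does not fit: leave string untouched */

        map[count].id = count;
        memcpy(map[count].original, str + taglen, slen - taglen + 1);
        memset(str, 0, slen);         /* NUL padding keeps the length fixed */
        memcpy(str, repl, (size_t)n + d);
        ++count;
    }
    return count;
}

/* Stage 3: maps an encoded format string back to the original, or
   returns NULL if the id is unknown. */
static const char *decode_format(const char *encoded,
                                 const struct trace_map *map, unsigned count)
{
    unsigned id;
    if (sscanf(encoded, "%u", &id) != 1)
        return NULL;
    for (unsigned k = 0; k < count; ++k)
        if (map[k].id == id)
            return map[k].original;
    return NULL;
}
```

The buffer could come from something like objcopy -O binary
--only-section=.rodata, and recent Binutils versions have an
--update-section option for writing it back; check what your version
supports. Writing the map to the "other file" and reading it back in the
decoding tool is left out here.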
We also thought about implementing a gcc plugin, but I think a lot
of people could benefit from having it built into gcc, instead of
having to load an .so file to enjoy this feature.
2. Do you think it's a good feature for gcc, and would you (the
maintainers) be willing to merge it after we implement it?
As others have already implied, I too suspect this is too
specialized to be of general interest or appropriate for
inclusion in a compiler.