This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Decompiler Project and Mailing List



   Please note that I'm crossposting this announcement to the developers'
lists that I think contain the people most likely to be interested in this
project.

   My little decompiler project (a sub-project of the Free Expression
Project) now has a mailing list, decompiler@free-expression.org.
To subscribe, send a message to majordomo@free-expression.org containing
the line:
subscribe decompiler

   I have also put the current code in my CVS repository, which you can
reach with
cvs -d :pserver:guest@se232.math.indiana.edu:/usr/local/cvsroot login
The password is "freeguest", and you only need to check out the decompiler
directory.

   In its current state, the code can translate an i386 executable into a
Scheme representation of its symbols, data, and disassembled code.  The
representation uses tagged lists, mainly because they were easy to
implement and are guaranteed to be portable to any Scheme you want (I'm
using GUILE); the disassembling logic is based on the objdump code.  I
also have a converter from i386 assembler to an RTL-like representation
(the main difference is that machine modes are replaced with a more
precise typing system), which works for a reasonable subset of i386
assembly: integer math, control flow, and regular bitwise operations, but
basically no floating point/MMX/SIMD conversion.
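
   As a rough illustration of the two stages described above (this is not
the project's code, which is written in Scheme), the sketch below models
tagged lists as Python tuples whose first element is a tag, and lowers two
integer instructions to an RTL-like form in which an explicit type tag
stands in for a machine mode.  Every tag name, the type name "int32", and
the functions typed and lower are hypothetical, and the sketch ignores the
condition flags that real i386 arithmetic instructions also set.

```python
# Hypothetical sketch: tagged tuples stand in for Scheme tagged lists.
# None of these tag names come from the decompiler's real output.

def typed(operand):
    """Attach an (assumed) 32-bit integer type to a register or immediate."""
    kind, value = operand
    if kind == "reg":
        return ("reg", "int32", value)
    if kind == "imm":
        return ("const", "int32", value)
    raise ValueError(f"unknown operand kind: {kind!r}")


def lower(insn):
    """Lower one disassembled instruction to an RTL-like tagged tuple."""
    tag, opcode, *operands = insn
    if tag != "insn":
        raise ValueError(f"not an instruction: {insn!r}")
    if opcode == "mov":
        dst, src = operands
        return ("set", typed(dst), typed(src))
    if opcode in ("add", "sub", "and", "or", "xor"):
        dst, src = operands
        # Two-operand i386 form: dst = dst OP src (flags are not modelled).
        return ("set", typed(dst),
                (opcode, "int32", typed(dst), typed(src)))
    raise NotImplementedError(f"unsupported opcode: {opcode}")


# Disassembled code as a list of tagged tuples: mov $5,%eax; add %ebx,%eax
program = [
    ("insn", "mov", ("reg", "eax"), ("imm", 5)),
    ("insn", "add", ("reg", "eax"), ("reg", "ebx")),
]
rtl = [lower(insn) for insn in program]
```

Carrying the type inside every node is one way to let a later pass refine
"int32" into something more precise without changing the tuple shape.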

   I'm now working on an abstract interpreter to build a control flow
graph (allowing for bizarre jumps between function contexts).  It is
derived from the compiler theory used for functional languages, since they
share with assembler the problem that code is easily treated as data.
After that, there will be loop detection; any gotos that are left will be
translated into a letrec structure (where tail-recursion optimization is
assumed), and then an attempt will be made to "de-tailize" them.  Also,
stack frames won't be explicitly recognized by the analysis; instead,
frames will be treated as a structure passed to functions through the bp
register.  I hope (assuming this works) that putting this general kind of
analysis in place will allow the detection of explicit stack constructs
(for example, look at the bison.simple parser) without any extra work.
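
   To make the goto-to-letrec step concrete: each label becomes a local
function, and each jump becomes a call in tail position.  The sketch below
is only an assumption about the general shape, not the project's code.
Python does not optimize tail calls, so each "jump" returns a thunk that a
small trampoline then runs; in Scheme the calls would be written directly
inside a letrec.  The example program (summing 1..n with a backward jump)
and the names trampoline, sum_to, entry, loop, and done are hypothetical.

```python
# Hypothetical goto-style input:
#   entry: acc = 0; i = n
#   loop:  if i == 0 goto done
#          acc += i; i -= 1; goto loop
#   done:  return acc

def trampoline(thunk):
    """Run tail calls iteratively: keep calling until the result is a value."""
    result = thunk
    while callable(result):
        result = result()
    return result


def sum_to(n):
    # Each label is a local function; each goto is a thunk returned from
    # tail position (the analogue of a tail call between letrec bindings).
    def entry():
        return lambda: loop(0, n)

    def loop(acc, i):
        if i == 0:
            return lambda: done(acc)
        return lambda: loop(acc + i, i - 1)

    def done(acc):
        return acc

    return trampoline(entry)


total = sum_to(10)
```

In this form, "de-tailizing" would amount to noticing that loop only ever
jumps to itself or to done, and rewriting it as an ordinary while loop.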

Lynn

PS My apologies to those who've read this before.


