Due to its modular design with front-end, middle-end and back-end, GCC supports a multitude of different programming languages (C, C++, Java, Fortran, Ada, etc.) and target architectures (x86, x86_64, SPARC, ia64, etc.). The purpose of this page and its subpages is to describe the structure of a sample front-end including its interaction with the middle-end.
Here we describe the front-end 'skeleton' code needed to add an entirely new language to the GNU Compiler Collection. Each front-end has to implement its own syntax parsing and semantic analysis, and translate the program into GCC's internal data structures. From there, GCC's middle-end performs optimizations and hands the result to the back-end, which produces optimized assembly code for whichever platform or architecture you are targeting. Your front-end code therefore doesn't have to worry about the target platform or architecture. Another nice thing to note: if you make use of the OpenMP lang-hooks, you can build a compiler for a parallel language, which is a useful feature when implementing programming languages for multi-core/SMP systems.
To follow this skeleton you should have good experience of C programming in a UNIX environment, as well as of GNU Make and the GNU autotools. Let's say we don't even have a language we want to implement yet: we just want to understand the skeleton code GCC needs in order to build a new front-end into the compiler collection, and then discuss what needs to be implemented in it.
Let's work from the latest release of gcc-core (which at the time of writing is gcc-core-4.4.0). Remember that you also need the MPFR (http://www.mpfr.org/mpfr-current/) and GMP (http://gmplib.org/) libraries, just as you would to compile GCC normally.
% wget -c http://${MIRROR}/releases/gcc-4.4.0/gcc-core-4.4.0.tar.gz
% tar xvf gcc-core-4.4.0.tar.gz
If you peek inside the extracted folder you will see a lot of folders such as lib{lang-name}/, gcc/, etc. What we are mainly interested in is the gcc/ folder. If you download the full GCC tarball you will see a lot of front-end folders there. Note: the Fortran front-end is a very good reference point, and it's very readable. So let's create a language front-end here.
% cd gcc/
% mkdir lang-name
This folder is where your work is going to reside. So let's start: what do we need?
% touch lang-name/{Make-lang.in,config-lang.in,README}
These are all the build files we need for the moment; then we generally create the sources:
% touch lang-name/{lang-name-lang.h,lang-name1.c}
The names don't have to be exactly these, of course; we are just following the structure of other front-ends. Now let's fill in some code.
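The file-creation steps above can be collected into one small script. This is only a minimal sketch: the language name `mylang` and the scratch directory standing in for the real gcc/ source directory are assumptions made for illustration, so substitute your own values.

```shell
#!/bin/sh
# Sketch: create the skeleton files for a new front-end.
# FE_NAME and GCC_DIR are illustrative assumptions, not names GCC requires.
set -eu

FE_NAME="mylang"
GCC_DIR="$(mktemp -d)"   # stand-in for the gcc/ directory of the source tree

FE_DIR="$GCC_DIR/$FE_NAME"
mkdir -p "$FE_DIR"

# Build files read by GCC's configure and Makefile, plus a README.
touch "$FE_DIR/Make-lang.in" "$FE_DIR/config-lang.in" "$FE_DIR/README"

# Front-end sources, named the way other front-ends name theirs.
touch "$FE_DIR/${FE_NAME}-lang.h" "$FE_DIR/${FE_NAME}1.c"

# Show what was created.
ls "$FE_DIR"
```

Pointing `GCC_DIR` at a real checkout instead of a temporary directory would create the same five empty files inside the source tree.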
Let's start with Make-lang.in.
It should look like:
# The name of the <LANG-NAME> compiler.
<LANG-NAME>_EXE = <lang-name>1$(exeext)
# The <LANG-NAME>-specific object files included in $(<LANG-NAME>_EXE).
<LANG-NAME>_OBJS = <lang-name>/<lang-name>1.o
# These hooks are used by the main GCC Makefile. Consult that
# Makefile for documentation.
<lang-name>.all.cross:
<lang-name>.start.encap: $(<LANG-NAME>_EXE)
<lang-name>.rest.encap:
<lang-name>.tags:
<lang-name>.install-common:
<lang-name>.install-man:
<lang-name>.install-info:
<lang-name>.dvi:
<lang-name>.html:
<lang-name>.uninstall:
<lang-name>.info:
<lang-name>.man:
<lang-name>.srcextra:
<lang-name>.srcman:
<lang-name>.srcinfo:
<lang-name>.mostlyclean:
	rm -f $(<LANG-NAME>_OBJS) $(<LANG-NAME>_EXE)
<lang-name>.clean:
<lang-name>.distclean:
<lang-name>.maintainer-clean:
<lang-name>.stage1:
<lang-name>.stage2:
<lang-name>.stage3:
<lang-name>.stage4:
<lang-name>.stageprofile:
<lang-name>.stagefeedback:
# <LANG-NAME> rules.
$(<LANG-NAME>_EXE): $(<LANG-NAME>_OBJS) $(BACKEND) $(LIBDEPS)
	$(CC) $(ALL_CFLAGS) $(LDFLAGS) -o $@ \
		$(<LANG-NAME>_OBJS) $(BACKEND) $(LIBS) attribs.o \
		prefix.o $(GMPLIBS)
# Dependencies
<lang-name>/<lang-name>1.o: <lang-name>/<lang-name>1.c $(CONFIG_H) coretypes.h debug.h \
	$(GGC_H) langhooks.h $(LANGHOOKS_DEF_H) $(SYSTEM_H) \
	gtype-<lang-name>.h gt-<lang-name>-<lang-name>1.h
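The listing uses `<lang-name>` and `<LANG-NAME>` as placeholders for the lower-case and upper-case forms of the language name, so filling them in can be done mechanically. The following is a rough sketch using sed; the name `mylang`, the file names, and the two-line excerpt standing in for the full template are all assumptions for illustration.

```shell
#!/bin/sh
# Sketch: instantiate a Make-lang.in template for one language.
# FE_NAME and the file names are illustrative assumptions.
set -eu

FE_NAME="mylang"
FE_UPPER=$(printf '%s' "$FE_NAME" | tr '[:lower:]' '[:upper:]')

WORK_DIR=$(mktemp -d)
TEMPLATE="$WORK_DIR/Make-lang.in.template"
OUTPUT="$WORK_DIR/Make-lang.in"

# A two-line excerpt of the template, standing in for the full listing.
# The quoted 'EOF' keeps $(exeext) literal so that make sees it unchanged.
cat > "$TEMPLATE" <<'EOF'
<LANG-NAME>_EXE = <lang-name>1$(exeext)
<LANG-NAME>_OBJS = <lang-name>/<lang-name>1.o
EOF

# sed matching is case-sensitive, so the two placeholders are
# substituted independently of each other.
sed -e "s/<LANG-NAME>/$FE_UPPER/g" \
    -e "s/<lang-name>/$FE_NAME/g" \
    "$TEMPLATE" > "$OUTPUT"

cat "$OUTPUT"
```

For `mylang` this turns the first line into `MYLANG_EXE = mylang1$(exeext)`. Keeping the template in a separate file means the same substitution can be rerun if you later rename the language.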
You should be able to copy and paste this into your own Make-lang.in and change <lang-name> and <LANG-NAME> to the name of your language. Next you will want to write your config-lang.in; the example here uses sbsh as the language name:
# Configure looks for the existence of this file to auto-config each language.
# We define several parameters used by configure:
# language - name of language as it would appear in $(LANGUAGES)
# compilers - value to add to $(COMPILERS)
# diff_excludes - files to ignore when building diffs between two versions.
language="sbsh"
#build_by_default="no"
compilers="sbsh1\$(exeext)"
stagestuff="sbsh1\$(exeext)"
gtfiles="\$(srcdir)/sbsh/sbsh1.c \$(srcdir)/sbsh/sbsh-lang.h"
The gtfiles variable is the interesting part here; more on this later, when we get into the code.
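Since config-lang.in is written as a fragment of shell variable assignments, one way to sanity-check it is to source it yourself and print what it defines. The sketch below assumes the sbsh example from above, copied into a scratch file; the scratch location and the variable names `WORK_DIR` and `FRAGMENT` are illustrative only.

```shell
#!/bin/sh
# Sketch: source a config-lang.in fragment and inspect what it defines.
set -eu

WORK_DIR=$(mktemp -d)
FRAGMENT="$WORK_DIR/config-lang.in"

# The sbsh example settings, written out verbatim.
cat > "$FRAGMENT" <<'EOF'
language="sbsh"
compilers="sbsh1\$(exeext)"
stagestuff="sbsh1\$(exeext)"
gtfiles="\$(srcdir)/sbsh/sbsh1.c \$(srcdir)/sbsh/sbsh-lang.h"
EOF

# Source the fragment into the current shell.
. "$FRAGMENT"

# The backslashes keep $(exeext) and $(srcdir) literal, so that make
# (not the shell) expands them later.
printf 'language:  %s\n' "$language"
printf 'compilers: %s\n' "$compilers"
for f in $gtfiles; do
    printf 'gtfile:    %s\n' "$f"
done
```

If a backslash is missing, the shell would try to run `exeext` or `srcdir` as a command when the file is sourced, so an error here is a useful early warning.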