This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: [pph] New branch for incremental C++ parsing

From: Lawrence Crowl <crowl at google dot com>
To: Benjamin Kosnik <bkoz at redhat dot com>
Cc: Diego Novillo <dnovillo at google dot com>, gcc at gcc dot gnu dot org
Date: Wed, 1 Dec 2010 16:15:16 -0800
Subject: Re: [pph] New branch for incremental C++ parsing
References: <20101102192101.GA14333@google.com> <20101201154116.3e95cc33@shotwell>

On 12/1/10, Benjamin Kosnik <bkoz@redhat.com> wrote:
> Hi Diego! Sorry to have missed this talk at the GCC Summit, this work
> looks interesting.
>
>> We have created a new branch for the incremental parsing work
>> that Lawrence and I described at the last GCC Summit
>> (http://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=get&target=IncrementalCompiler.pdf).
>>
>> To get the branch:
>>
>> $ svn co svn+ssh://gcc.gnu.org/svn/gcc/branches/pph
>
> I've been trying to use this, and have some basic usage questions for
> you.
>
> From your description in the email above, in particular:
>
> " The code currently implements a token cache on disk.  This is
> currently enabled with -fpth (for Pre-Tokenized Headers).  Each
> file in a translation unit gets its own .pth image. When a file
> is found unchanged wrt the .pth image, its tokens are
> instantiated out of the image instead of the text stream.
>
> This saves on average ~15% of compilation time on C++.  PTH
> images are factored, so a change in one file does require
> building the complete PTH image for the whole TU.

What this is supposed to mean is that changing one header does _not_
necessarily invalidate other PTH files, most of which may be used
as-is in the remainder of the TU.
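
As a minimal sketch of that factoring (not the branch's actual code): assume each header has its own cached image tagged with a stamp of the header it was built from. The names Stamps and stale_images are hypothetical, and the stamp is just an integer standing in for a timestamp or checksum. Only headers whose stamp changed, or that have no image yet, need a rebuild; every other image is reused as-is.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical model: header name -> stamp (timestamp or checksum) of the
// header text that the cached image was built from.
using Stamps = std::map<std::string, long>;

// Return the headers whose image must be (re)built: those with no cached
// image, or whose current stamp differs from the cached one.
// All other images can be reused unchanged.
std::vector<std::string> stale_images(const Stamps& cached,
                                      const Stamps& current) {
    std::vector<std::string> stale;
    for (const auto& entry : current) {
        auto it = cached.find(entry.first);
        if (it == cached.end() || it->second != entry.second)
            stale.push_back(entry.first);
    }
    return stale;
}
```

In this model, a second TU that includes a header the first TU never saw only adds one new image; the images already on disk stay valid.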

> Additionally,
> each PTH file is segmented into token hunks, each of which can be
> validated and applied separately.  This allows reusing the same
> PTH file in different translation units."
>
> However, in use I am having problems with this.
>
> For instance, take the following two files:
>
> 1.cc
> #include <vector>
> #include <iostream>
>
> int main()
> {
>   using namespace std;
>
>   vector<int> v;
>
>   for(unsigned int i = 0; i<100; ++i)
>     v.push_back(i);
>
>   cout << v[10] << endl;
>
>   return 0;
> }
>
>
> 2.cc
> #include <string>
> #include <iostream>
>
> int main()
> {
>   using namespace std;
>
>   string s("100 count vector");
>
>   cout << s << endl;
>
>   return 0;
> }
>
> To compile the first, I check out and build the branch. I use this like
> so:
>
> mkdir tmp; cd tmp
> g++ -fpth ../1.cc
>
> This seems fine. I end up with an a.out executable, and 89
> separate .pth files. The pth files are named like:
>
> _usr_include_ctype_h.pth
>
> or
>
> _mnt_share_bin_H_x86_64_gcc_pph_branch_20101201_bin____lib_gcc_x86_64_unknown_linux_gnu_4_6_0_____________include_c___4_6_0_x86_64_unknown_linux_gnu_bits_os_defines_h.pth
>
> This seems as expected.
>
> Now, I should be able to compile again, exact same compile line, and
> use the cache. Like:
>
> %g++ -fpth ../1.cc
>
> In file included from ../test_multi_1.cc:33554432:0:
> /usr/include/locale.h:179:2: error: #endif without #if
> In file included from ../test_multi_1.cc:33554432:0:
> /usr/include/sched.h:98:2: error: #endif without #if
> In file included from ../test_multi_1.cc:33554432:0:
> /usr/include/pthread.h:1119:2: error: #endif without #if
> In file included from ../test_multi_1.cc:33554432:0:
> /usr/include/unistd.h:938:2: error: #endif without #if
> In file included from ../test_multi_1.cc:33554432:0:
> /usr/include/wctype.h:7:2: error: #endif without #if
> In file included from ../test_multi_1.cc:33554432:0:
> /usr/include/wctype.h:285:2: error: #endif without #if
>
> This seems to be an error. This is supposed to work,  correct?

There were some merge problems when moving from 4.4 to trunk.
I get slightly different, but also failing, results.  Eventually,
it is supposed to work.

>
> Then, assuming this worked (re-using 1's images):
>
> %g++ -fpth ../2.cc
>
> Should in theory re-use 1's images and generate any new images that are
> necessary (according to the initial email. I understand results may
> vary at the moment.)
>
> This does not work atm, but gets errors similar to the #endif without
> #if errors above, but for a different file than locale.h.
>
> From discussing this with you via email, it looks like there are two
> options for -fpth, one which uses a timestamp (-fpth) and one that uses
> an md5 hash (-fpth-md5).
>
> -fpth  // use timestamp
> -fpth-md5 // use md5 of file
>
> Please note the -fpth-md5 does not do anything on the branch at the
> moment. (Ie using it means no .pth files are generated.)
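
The practical difference between the two validation modes can be sketched as follows. This is an illustration only, not the branch's implementation: SourceFile, CachedStamp and the helper functions are hypothetical, and std::hash stands in for MD5. A timestamp check rejects a file that was merely touched, while a content digest still accepts it.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical stand-ins for a header on disk and the stamp stored in its
// cached image.
struct SourceFile  { std::string text; long mtime; };
struct CachedStamp { long mtime; std::size_t digest; };

// std::hash is used here only as a placeholder for an MD5 digest.
std::size_t digest_of(const std::string& text) {
    return std::hash<std::string>{}(text);
}

// Record both kinds of stamp when the image is written.
CachedStamp make_stamp(const SourceFile& f) {
    return {f.mtime, digest_of(f.text)};
}

// Timestamp mode: the image is valid only if the modification time matches.
bool valid_by_timestamp(const SourceFile& f, const CachedStamp& s) {
    return f.mtime == s.mtime;
}

// Digest mode: the image is valid if the file contents hash to the same value.
bool valid_by_digest(const SourceFile& f, const CachedStamp& s) {
    return digest_of(f.text) == s.digest;
}
```

The trade-off is that the digest check has to read and hash the whole file, whereas the timestamp check needs only file metadata.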
>
> Anyway. This is from my initial use and probing. Is it worth filing
> these bugs in bugzilla for the pph branch, or is this branch kind of
> dead while you work on the thing-after-pph-branch?
>
> I'd like to start documenting this project/branch on the GCC wiki. At
> least the command options in gcc/c-family/c.opt, and have usage
> examples. You'd mentioned that this may use the incremental linker
> page, but as PPH/PTH is but one part of this I'm hoping to convince you
> to use a new page, say PrettyCachedHeader or PPHPTH or FECaching or
> something. Thoughts?

Diego's on vacation (or holiday) right now, so it might be a while
before he answers.

-- 
Lawrence Crowl
