This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: GSOC 2018 - Textual LTO dump tool project

From: Martin Jambor <mjambor at suse dot cz>
To: Hrishikesh Kulkarni <hrishikeshparag at gmail dot com>, gcc at gcc dot gnu dot org
Cc: mliska at suse dot cz
Date: Sun, 25 Feb 2018 10:46:38 +0100
Subject: Re: GSOC 2018 - Textual LTO dump tool project
Authentication-results: sourceware.org; auth=none
References: <CAL+0whT_S-U7-PJ9YdODuGx7pvX_vm12OZFWSBn3UHCzNqr-bQ@mail.gmail.com>

Hello Hrishikesh,

I apologize for replying to you this late; this has been a busy week
and now I am traveling.

On Mon, Feb 19 2018, Hrishikesh Kulkarni wrote:
> Hi,
>
> I am Hrishikesh Kulkarni currently studying as an undergrad student in
> Computer Engineering at Pune University, India. I find compilers quite
> interesting as a subject,  and would like to apply to GSoC to gain some
> understanding of how real-world compilers work. So far, I have managed to
> build gcc and perform some simple tweaks to the codebase. In particular, I
> would like to apply to the Textual LTO dump tool project.
>

I must say I am impressed by the research you have already done.
Nevertheless, please note that Ray Kim has also expressed interest in
the project.  Martin Liska will be the mentor, so I will let him drive
the selection process.  On the other hand, Ray also liked another
project, so maybe he will pick that and everyone will be happy.

> As far as I understand, the motivation for LTO framework was to enable
> cross file interprocedural optimizations, and for this purpose an ipa pass
> is divided into following three stages:
>
>    1. LGEN: The pass does a local analysis of the function and generates a
>    “summary”, i.e., the information relevant to the pass, and writes it to
>    the LTO object file.

A pass might do that, but the output of the whole stage is not just the
pass summaries, it also writes the function IL (the function gimple
statements, above all) to the object file.

>    2. WPA: The LTO object files are given as input to the linker, which then
>    invokes the lto1 frontend to perform global ipa analysis over the
>    call-graph and write optimized summaries to LTO object files
>    (partitioning). The global ipa analysis is done over summary and not the
>    actual function bodies.

Well... note that partitioning actually means dividing the whole
compiled program/library into chunks that are then compiled
independently in the LTRANS stage.  But you are basically right that WPA
does also do whole-program analysis based on summaries and then writes
its decisions to optimization summaries, yes.
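
To illustrate what "dividing into chunks" means, here is a toy Python
sketch.  It is not how GCC's partitioner works; the function name, the
size numbers and the greedy strategy are all assumptions made up for
the example.  It only shows a program being split into partitions that
could then be compiled independently.

```python
# Toy model only: names, sizes and the greedy strategy are hypothetical.
def partition(function_sizes, max_partition_size):
    """Greedily pack functions into partitions of bounded total size."""
    partitions, current, current_size = [], [], 0
    for name, size in function_sizes.items():
        # Start a new partition when the next function would not fit.
        if current and current_size + size > max_partition_size:
            partitions.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        partitions.append(current)
    return partitions


sizes = {"main": 40, "parse": 70, "emit": 30, "helper": 10}
parts = partition(sizes, max_partition_size=100)
print(parts)
```

Each inner list stands for one chunk that a separate LTRANS process
would pick up.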

>    3. LTRANS: The partitions are read back, and the function bodies are
>    reconstructed from summary and are then compiled to produce real object
>    files.

Function bodies and the summaries are distinct things.  The body
consists of gimple statements and all the associated stuff (such as
types, so a lot of stuff), whereas when we refer to summaries, we mean
small chunks of data that interprocedural optimizations such as inlining
or IPA-CP squirrel away because they cannot feasibly work on bodies of the
entire program.

But apart from this terminology issue, you are basically correct: at the
LTRANS stage, IPA passes apply transformations to the bodies according
to the optimization summary generated by the WPA phase.  And then all
normal, intra-procedural passes and code generation run.
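
The distinction between bodies and summaries can be sketched with a toy
Python model of the three stages.  Everything in it is hypothetical
(the function names, the "size" summary, the inline_limit threshold);
it is not GCC code, just a way to show that WPA looks only at the small
summaries while LTRANS applies the resulting decisions to the real
bodies.

```python
# Toy model of the three LTO stages; all names here are hypothetical.

def lgen(functions):
    """Per translation unit: store each body plus a small summary."""
    return {
        name: {"body": body, "summary": {"size": len(body)}}
        for name, body in functions.items()
    }


def wpa(objects, inline_limit):
    """Whole program: read only the summaries and record decisions."""
    decisions = {}
    for obj in objects:
        for name, entry in obj.items():
            decisions[name] = entry["summary"]["size"] <= inline_limit
    return decisions  # plays the role of the optimization summary


def ltrans(objects, decisions):
    """Per partition: apply the decisions to the actual bodies."""
    result = {}
    for obj in objects:
        for name, entry in obj.items():
            tag = "inline" if decisions[name] else "keep"
            result[name] = (tag, entry["body"])
    return result


unit_a = lgen({"small": ["ret"], "big": ["a", "b", "c", "d"]})
unit_b = lgen({"main": ["call small", "call big", "ret"]})
decisions = wpa([unit_a, unit_b], inline_limit=2)
compiled = ltrans([unit_a, unit_b], decisions)
print(decisions)
```

Note that wpa() never touches entry["body"]; that is the whole point of
having summaries.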

>
>
> If I understand correctly, the motivation for textual LTO dump tool is to
> easily analyze contents of LTO object file, similar to readelf or objdump ?

That is how I understand it too, but Martin may have some further uses
in mind.

>
> Assume that LTO object file contains in pureconst section: 0b0110 (0b for
> binary prefix) corresponding to values of fs->pure_const_state and
> fs->state_previously_known.
>
> If I understand correctly, the output of dump tool should then be:
>
> pure_const pass:
>
> pure_const_state = IPA_PURE (enum value of pure_const_state_e corresponding
> to 0b01)
>
> state_previously_known = IPA_NEITHER (enum value of pure_const_state_e
> corresponding to 0b10)
>
> Is this the expected output of the dump tool ?

I think the tool would have to do a bit more than just dump summaries of
IPA passes.  I tend to think that the task should also include dumping
gimple bodies (but we already do that in GCC and so it should be mostly
easy) and also types (which are merged as one of the first steps of
WPA, and interesting things happen during that merging).  And
perhaps quite a bit more.  Martin?
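
For the summary part alone, the quoted pure-const example could be
decoded roughly as in the following Python sketch.  It assumes the
layout from the quoted message (first field in the high two bits,
second field in the low two bits) and enum values CONST = 0, PURE = 1,
NEITHER = 2; the real on-disk layout is whatever the streamer writes
and may well differ, and dump_pure_const is a made-up name.

```python
from enum import IntEnum


class PureConstState(IntEnum):
    # Values assumed to match the quoted example (0b01 = PURE, 0b10 = NEITHER).
    IPA_CONST = 0
    IPA_PURE = 1
    IPA_NEITHER = 2


def dump_pure_const(bits):
    """Decode a 4-bit record: high two bits first, then low two bits."""
    pure_const_state = PureConstState((bits >> 2) & 0b11)
    state_previously_known = PureConstState(bits & 0b11)
    return (
        "pure_const pass:\n"
        f"  pure_const_state = {pure_const_state.name}\n"
        f"  state_previously_known = {state_previously_known.name}"
    )


text = dump_pure_const(0b0110)
print(text)
```

A field value of 0b11 has no enumerator in this sketch, so the
PureConstState() call would raise ValueError for it; a real tool would
need to decide how to report such input.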

>
> I am reasonably familiar working with C, C++ and python. My prior
> experience includes opportunities to work in areas of NLP. Some of my
> accomplishments in the area include presenting project VicharDhara- A
> thought Mapper that was selected among top five ideas in Accenture
> Innovation Challenge among 7000 nationwide entries. My paper on this topic
> won the best paper award in IEEE Conference ICCUBEA-2017. My previous work
> was focused on simple parsers, student psychology, thought process
> detection for team selection.

Interesting, congratulations.

>
> In the interim, I have been through a few docs on GCC and LTO [1][2][3] and
> am trying to write a toy ipa pass to better understand LTO/IPA
> infrastructure. 

Great, I believe that's exactly what my advice would be.

> I would be grateful for feedback on the textual LTO dump
> tool.

I hope that Martin will shed a bit more light on what output he
envisions the tool to have.  I will talk to him about it too when I get
back to the office (so maybe on Tuesday but probably on Wednesday).

Thanks,

Martin

>
> [1] http://www.ucw.cz/~hubicka/slides/labs2013.pdf
>
> [2] https://gcc.gnu.org/wiki/LinkTimeOptimization
>
> [3] https://gcc.gnu.org/onlinedocs/gccint/LTO-Overview.html
>
> My two recent publications are listed below:
>
> [A] Hrishikesh Kulkarni, "Contextual Data Representation Using Prime Number
> Route Mapping Method and Ontology" IEEE Conference, ICCUBEA, 2017
>
> [B] Hrishikesh Kulkarni, “Multi-Graph based Intent Hierarchy Generation to
> Determine Action Sequence”, Springer Conference, ICDECT, December 2017, Pune
>
> Thanks,
>
> Hrishikesh Kulkarni

Follow-Ups:
- Re: GSOC 2018 - Textual LTO dump tool project
  - From: Richard Biener
- Re: GSOC 2018 - Textual LTO dump tool project
  - From: Martin Liška

References:
- GSOC 2018 - Textual LTO dump tool project
  - From: Hrishikesh Kulkarni
