This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: GSOC Proposal

From: Richard Biener <rguenther at suse dot de>
To: nick <xerofoify at gmail dot com>
Cc: GCC Development <gcc at gcc dot gnu dot org>
Date: Mon, 8 Apr 2019 09:29:59 +0200 (CEST)
Subject: Re: GSOC Proposal
References: <4327491a-395b-bee0-145a-eddd8f64b0ba@gmail.com> <CAFiYyc0AbLoU6=OD6D0G=nRbtVSdujrQNBH-miNWiF5QW2kijA@mail.gmail.com> <f5ab7c50-d8a0-cdae-dc0c-218a90303f3b@gmail.com> <alpine.LSU.2.20.1903290959560.27537@zhemvz.fhfr.qr> <63e78666-ceca-94d8-9ac4-101130afab4c@gmail.com> <70316c90-b241-3f88-56d8-9e59f3eac0ee@gmail.com> <alpine.LSU.2.20.1904011153200.27537@zhemvz.fhfr.qr> <038afc5c-85fa-8d78-5b84-289207c7b17f@gmail.com> <alpine.LSU.2.20.1904011546340.27537@zhemvz.fhfr.qr> <ebe2bcc8-f4bb-83bd-3c41-b110f762d209@gmail.com> <alpine.LSU.2.20.1904031320420.27537@zhemvz.fhfr.qr> <1eaa53df-3ed9-2a5f-d6db-2a224ee9da01@gmail.com> <alpine.LSU.2.20.1904051223450.27537@zhemvz.fhfr.qr> <ec115fb3-ee31-b697-e370-1405ee14cabf@gmail.com> <DC08D4BB-8B93-4C6E-A9C7-2216AED0A563@suse.de> <bd907357-2598-9e63-0f51-e1b6130815b1@gmail.com>

On Sun, 7 Apr 2019, nick wrote:

> 
> 
> On 2019-04-07 5:31 a.m., Richard Biener wrote:
> > On April 5, 2019 6:11:15 PM GMT+02:00, nick <xerofoify@gmail.com> wrote:
> >>
> >>
> >> On 2019-04-05 6:25 a.m., Richard Biener wrote:
> >>> On Wed, 3 Apr 2019, nick wrote:
> >>>
> >>>>
> >>>>
> >>>> On 2019-04-03 7:30 a.m., Richard Biener wrote:
> >>>>> On Mon, 1 Apr 2019, nick wrote:
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 2019-04-01 9:47 a.m., Richard Biener wrote:
> >>>>>>> On Mon, 1 Apr 2019, nick wrote:
> >>>>>>>
> >>>>>>>> Well I'm talking about the shared roots of this garbage
> >> collector core state 
> >>>>>>>> data structure or just struct ggc_root_tab.
> >>>>>>>>
> >>>>>>>> But also this seems that this to be no longer shared globally if
> >> I'm not mistaken 
> >>>>>>>> or this:
> >>>>>>>> static vec<const_ggc_root_tab_t> extra_root_vec;
> >>>>>>>>
> >>>>>>>> Not sure after reading the code which is a bigger deal through
> >> so I wrote
> >>>>>>>> my proposal not just asking which is a better issue for not
> >> being thread
> >>>>>>>> safe. Sorry about that.
> >>>>>>>>
> >>>>>>>> As for the second question injection seems to not be the issue
> >> or outside
> >>>>>>>> callers but just internal so phase 3 or step 3 would now be:
> >>>>>>>> Find internal callers or users of x where x is one of the above
> >> rather
> >>>>>>>> than injecting outside callers. Which answers my second question
> >> about
> >>>>>>>> external callers being a issue still.
> >>>>>>>>
> >>>>>>>> Let me know which  of the two is a better issue:
> >>>>>>>> 1. struct ggc_root_tabs being shared
> >>>>>>>> 2.static vec<const_ggc_root_tab_t> extra_root_vec; as a shared
> >> heap or
> >>>>>>>> vector of root nodes for each type of allocation
> >>>>>>>>
> >>>>>>>> and I will gladly rewrite my proposal sections for that
> >>>>>>>> as needs to be reedited.
> >>>>>>>
> >>>>>>> I don't think working on the garbage collector as a separate
> >>>>>>> GSoC project is useful at this point.  Doing locking around
> >>>>>>> allocation seems like a good short-term solution and if that
> >>>>>>> turns out to be a performance issue for the threaded part
> >>>>>>> using per-thread freelists is likely an easy to deploy
> >>>>>>> solution.
> >>>>>>>
> >>>>>>> Richard.
> >>>>>>>
> >>>>>> I agree but we were discussing this:
> >>>>>> Or maybe a project to be more
> >>>>>> explicit about regions of the code that assume that the garbage-
> >>>>>> collector can't run within them?[3] (since the GC is state that
> >> would
> >>>>>> be shared by the threads).
> >>>>>
> >>>>> The process of collecting garbage is not the only issue (and that
> >>>>> very issue is easiest mitigated by collecting only at specific
> >>>>> points - which is what we do - and have those be serializing
> >> points).
> >>>>> The main issue is the underlying memory allocator (GCC uses memory
> >>>>> that is garbage collected plus regular heap memory).
> >>>>>
> >>>>>> In addition I moved my paper back to our discussion about garbage
> >> collector
> >>>>>> state with outside callers.Seems we really need to do something
> >> about
> >>>>>> my wording as the idea of my project in a nutshell was to figure
> >>>>>> out how to mark shared state by callers and inject it into the
> >>>>>> garbage collector letting it known that the state was not shared
> >> between
> >>>>>> threads or shared. Seems that was on the GSoc page and in our
> >> discussions the issue
> >>>>>> is marking outside code for shared state. If that's correct then
> >> my
> >>>>>> wording of outside callers is incorrect it should have been shared
> >>>>>> state between threads on outside callers to the garbage collector.
> >>>>>> If the state is that in your wording above then great as I
> >> understand
> >>>>>> where we are going and will gladly change my wording.
> >>>>>
> >>>>> I'm still not sure what you are shooting at, the above sentences do
> >>>>> not make any sense to me.
> >>>>>
> >>>>>> Also freelists don't work here as the state is shared at the
> >> caller's 
> >>>>>> end which would need two major issues:
> >>>>>> 1. Locking on nodes of the 
> >>>>>> freelists when two threads allocate at the same thing which can be
> >> a 
> >>>>>> problem if the shared state is shared a lot
> >>>>>> 2. Locking allocation with 
> >>>>>> large numbers of callers can starve threads
> >>>>>
> >>>>> First of all allocating memory from the GC pool is not the main
> >>>>> work of GIMPLE passes so simply serializing at allocation time
> >> might
> >>>>> work out.  Second free lists of course do work.  What you'd do is
> >>>>> have a fast path in allocation using a thread-local "free list"
> >>>>> which you can allocate from without taking any lock.  Maybe I
> >> should
> >>>>> explain "free list" since that term doesn't make too much sense in
> >>>>> a garbage collector world.  What I'd do is when a client thread
> >>>>> asks for memory of size N allocate M objects of that size but put
> >>>>> M - 1 on the client thread local "free list" to be allocated
> >> lock-free
> >>>>> from for the next M - 1 calls.  Note that garbage collected memory
> >>>>> objects are only handed out in fixed chunks (powers of two plus
> >>>>> a few special sizes) so you'd have one "free list" per chunk size
> >>>>> per thread.
> >>>>>
> >>>>> The collection itself (mark & sweep) would be fully serialized
> >> still
> >>>>> (and not return to any threads local "free list").
> >>>>>
> >>>>> ggc_free'd objects _might_ go to the threads "free list"s (yeah, we
> >>>>> _do_ have ggc_free ...).
> >>>>>
> >>>>> As said, I don't see GC or the memory allocator as sth interesting
> >>>>> to work on for parallelization until the basic setup works and it
> >>>>> proves to be a bottleneck.
> >>>>>
> >>>>>> Seems that working on the garbage collector itself isn't the issue
> >> but 
> >>>>>> the callers as I just figured out as related to your state idea.
> >> Let me 
> >>>>>> know if that's correct and if the wording change I mentioned is
> >> fine 
> >>>>>> with you as that's the state it seems that needs to be changed.
> >>>>>> Nick 
> >>>>>
> >>>>> Richard.
> >>>>>
> >>>>
> >>>> That's fine and it's my fault for not understanding you better. I
> >> was aware 
> >>>> of the expand_functions_all being taken for passes.c. However it
> >> seems
> >>>> two other issues are these sets as related to threads:
> >>>> 1.finalize_compilation_unit
> >>>> 2.and the ipa set of pass functions
> >>>>
> >>>> If I'm understanding it correctly number 1 seems to be a early
> >> version of
> >>>> expand_all_functions for the GENERIC representation if that's the
> >> case
> >>>> it really should be fixed. Not sure which is a better issue as both
> >>>> seem to have issues either at the GENERIC level or GIMPLE level with
> >> shared
> >>>> state.
> >>>>
> >>>> Let me know if this is better as it seems now that I really think
> >> about 
> >>>> it GIMPLE or GENERIC functions in passes.c are the main issue. 
> >>>>
> >>>> Sorry for the misunderstanding and hopefully one of functions listed
> >> is better
> >>>> for moving forward with my proposal,
> >>>
> >>> Sorry, but guessing at useful projects by skimming through GCC code
> >>> at this point isn't the way to go forward - this new "idea" lacks
> >>> both detail and understanding.  Please try to stick to one of the
> >>> suggested projects or do more thorough research in case you want
> >>> to work on a new project idea next year.
> >>>
> >>> Thanks,
> >>> Richard.
> >>>
> >>
> >> I was talking about cgraphunits.c and it seems that according to this:
> >> Parallelize compilation using threads. GCC currently has an awful lot
> >> of truly global state and even more per-pass global state which makes
> >> this somewhat hard. The idea is to avoid issues with global state as
> >> much as possible by partitioning the compilation pipeline in pieces
> >> that share as little global state as possible and ensure each thread
> >> only works in one of those partitions. The biggest roadblock will be
> >> the not thread-safe memory allocator of GCC garbage collector. The goal
> >> of this project is to have a compilation pipeline driven by a scheduler
> >> assigning functions to be optimized to the partitions in the pipeline.
> >> This project would be mentored by Richard Biener. Required skills
> >> include: C/C++, ability to analyze big complex code base,
> >> parallelization
> >>
> >> We are trying to create a rendering pipeline if I'm correct and it
> >> seems that the GENERIC level needs finalize_compilation_unit
> >> to be fixed like expand_all_functions at the GIMPLE. That's my point it
> >> still is within that project. Here is what I wrote
> >> as I figured out that it was shared state related to GENERIC passing to
> >> GIMPLE which is a bottleneck or would be in the 
> >> threaded pipeline.
> >>
> >> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit
> > 
> > The pre- post-IPA parts cannot be easily parallelized and that includes GENERIC to GIMPLE translation. This is why the project should focus on the much easier post-IPA and pre-RTL parts of the compilation pipeline since there interaction between functions is minimal. 
> > 
> > Richard. 
> > 
> >>
> >> Nick
> > 
> Richard,
> Thanks seems we finally got somewhere I rewrote it for that. Please just doubt check tomorrow before I submit later tomorrow
> as I would just want to make sure it's OK with you. Here is the link:
> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit
> 
> The only thing is if you want more detail in the project part as I just think it's
> enough but let me know if not.

It seems the project part lacks any specifics and those are to be
researched and discovered in the Mid may - Late May timeframe of
your Roadmap?

I do not understand what "Create optimized functions ... for avoiding
sharing of state ..." is about - what kind of functions do you intend
to write?  Often examples help.

Richard.

> Thanks for your patience,
> Nick

Follow-Ups:
- Re: GSOC Proposal
  - From: nick

References:
- Re: GSOC Proposal
  - From: Richard Biener
- Re: GSOC Proposal
  - From: nick
- Re: GSOC Proposal
  - From: Richard Biener
- Re: GSOC Proposal
  - From: nick
- Re: GSOC Proposal
  - From: Richard Biener
- Re: GSOC Proposal
  - From: nick
- Re: GSOC Proposal
  - From: Richard Biener
- Re: GSOC Proposal
  - From: nick
- Re: GSOC Proposal
  - From: Richard Biener
- Re: GSOC Proposal
  - From: nick

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]