This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: GSOC Proposal



On 2019-04-08 9:42 a.m., Richard Biener wrote:
> On Mon, 8 Apr 2019, nick wrote:
> 
>>
>>
>> On 2019-04-08 3:29 a.m., Richard Biener wrote:
>>> On Sun, 7 Apr 2019, nick wrote:
>>>
>>>>
>>>>
>>>> On 2019-04-07 5:31 a.m., Richard Biener wrote:
>>>>> On April 5, 2019 6:11:15 PM GMT+02:00, nick <xerofoify@gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> On 2019-04-05 6:25 a.m., Richard Biener wrote:
>>>>>>> On Wed, 3 Apr 2019, nick wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019-04-03 7:30 a.m., Richard Biener wrote:
>>>>>>>>> On Mon, 1 Apr 2019, nick wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2019-04-01 9:47 a.m., Richard Biener wrote:
>>>>>>>>>>> On Mon, 1 Apr 2019, nick wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Well I'm talking about the shared roots of this garbage
>>>>>> collector core state 
>>>>>>>>>>>> data structure or just struct ggc_root_tab.
>>>>>>>>>>>>
>>>>>>>>>>>> But also this seems that this to be no longer shared globally if
>>>>>> I'm not mistaken 
>>>>>>>>>>>> or this:
>>>>>>>>>>>> static vec<const_ggc_root_tab_t> extra_root_vec;
>>>>>>>>>>>>
>>>>>>>>>>>> Not sure after reading the code which is a bigger deal through
>>>>>> so I wrote
>>>>>>>>>>>> my proposal not just asking which is a better issue for not
>>>>>> being thread
>>>>>>>>>>>> safe. Sorry about that.
>>>>>>>>>>>>
>>>>>>>>>>>> As for the second question injection seems to not be the issue
>>>>>> or outside
>>>>>>>>>>>> callers but just internal so phase 3 or step 3 would now be:
>>>>>>>>>>>> Find internal callers or users of x where x is one of the above
>>>>>> rather
>>>>>>>>>>>> than injecting outside callers. Which answers my second question
>>>>>> about
>>>>>>>>>>>> external callers being a issue still.
>>>>>>>>>>>>
>>>>>>>>>>>> Let me know which  of the two is a better issue:
>>>>>>>>>>>> 1. struct ggc_root_tabs being shared
>>>>>>>>>>>> 2.static vec<const_ggc_root_tab_t> extra_root_vec; as a shared
>>>>>> heap or
>>>>>>>>>>>> vector of root nodes for each type of allocation
>>>>>>>>>>>>
>>>>>>>>>>>> and I will gladly rewrite my proposal sections for that
>>>>>>>>>>>> as needs to be reedited.
>>>>>>>>>>>
>>>>>>>>>>> I don't think working on the garbage collector as a separate
>>>>>>>>>>> GSoC project is useful at this point.  Doing locking around
>>>>>>>>>>> allocation seems like a good short-term solution and if that
>>>>>>>>>>> turns out to be a performance issue for the threaded part
>>>>>>>>>>> using per-thread freelists is likely an easy to deploy
>>>>>>>>>>> solution.
>>>>>>>>>>>
>>>>>>>>>>> Richard.
>>>>>>>>>>>
>>>>>>>>>> I agree but we were discussing this:
>>>>>>>>>> Or maybe a project to be more
>>>>>>>>>> explicit about regions of the code that assume that the garbage-
>>>>>>>>>> collector can't run within them?[3] (since the GC is state that
>>>>>> would
>>>>>>>>>> be shared by the threads).
>>>>>>>>>
>>>>>>>>> The process of collecting garbage is not the only issue (and that
>>>>>>>>> very issue is easiest mitigated by collecting only at specific
>>>>>>>>> points - which is what we do - and have those be serializing
>>>>>> points).
>>>>>>>>> The main issue is the underlying memory allocator (GCC uses memory
>>>>>>>>> that is garbage collected plus regular heap memory).
>>>>>>>>>
>>>>>>>>>> In addition I moved my paper back to our discussion about garbage
>>>>>> collector
>>>>>>>>>> state with outside callers.Seems we really need to do something
>>>>>> about
>>>>>>>>>> my wording as the idea of my project in a nutshell was to figure
>>>>>>>>>> out how to mark shared state by callers and inject it into the
>>>>>>>>>> garbage collector letting it known that the state was not shared
>>>>>> between
>>>>>>>>>> threads or shared. Seems that was on the GSoc page and in our
>>>>>> discussions the issue
>>>>>>>>>> is marking outside code for shared state. If that's correct then
>>>>>> my
>>>>>>>>>> wording of outside callers is incorrect it should have been shared
>>>>>>>>>> state between threads on outside callers to the garbage collector.
>>>>>>>>>> If the state is that in your wording above then great as I
>>>>>> understand
>>>>>>>>>> where we are going and will gladly change my wording.
>>>>>>>>>
>>>>>>>>> I'm still not sure what you are shooting at, the above sentences do
>>>>>>>>> not make any sense to me.
>>>>>>>>>
>>>>>>>>>> Also freelists don't work here as the state is shared at the
>>>>>> caller's 
>>>>>>>>>> end which would need two major issues:
>>>>>>>>>> 1. Locking on nodes of the 
>>>>>>>>>> freelists when two threads allocate at the same thing which can be
>>>>>> a 
>>>>>>>>>> problem if the shared state is shared a lot
>>>>>>>>>> 2. Locking allocation with 
>>>>>>>>>> large numbers of callers can starve threads
>>>>>>>>>
>>>>>>>>> First of all allocating memory from the GC pool is not the main
>>>>>>>>> work of GIMPLE passes so simply serializing at allocation time
>>>>>> might
>>>>>>>>> work out.  Second free lists of course do work.  What you'd do is
>>>>>>>>> have a fast path in allocation using a thread-local "free list"
>>>>>>>>> which you can allocate from without taking any lock.  Maybe I
>>>>>> should
>>>>>>>>> explain "free list" since that term doesn't make too much sense in
>>>>>>>>> a garbage collector world.  What I'd do is when a client thread
>>>>>>>>> asks for memory of size N allocate M objects of that size but put
>>>>>>>>> M - 1 on the client thread local "free list" to be allocated
>>>>>> lock-free
>>>>>>>>> from for the next M - 1 calls.  Note that garbage collected memory
>>>>>>>>> objects are only handed out in fixed chunks (powers of two plus
>>>>>>>>> a few special sizes) so you'd have one "free list" per chunk size
>>>>>>>>> per thread.
>>>>>>>>>
>>>>>>>>> The collection itself (mark & sweep) would be fully serialized
>>>>>> still
>>>>>>>>> (and not return to any threads local "free list").
>>>>>>>>>
>>>>>>>>> ggc_free'd objects _might_ go to the threads "free list"s (yeah, we
>>>>>>>>> _do_ have ggc_free ...).
>>>>>>>>>
>>>>>>>>> As said, I don't see GC or the memory allocator as sth interesting
>>>>>>>>> to work on for parallelization until the basic setup works and it
>>>>>>>>> proves to be a bottleneck.
>>>>>>>>>
>>>>>>>>>> Seems that working on the garbage collector itself isn't the issue
>>>>>> but 
>>>>>>>>>> the callers as I just figured out as related to your state idea.
>>>>>> Let me 
>>>>>>>>>> know if that's correct and if the wording change I mentioned is
>>>>>> fine 
>>>>>>>>>> with you as that's the state it seems that needs to be changed.
>>>>>>>>>> Nick 
>>>>>>>>>
>>>>>>>>> Richard.
>>>>>>>>>
>>>>>>>>
>>>>>>>> That's fine and it's my fault for not understanding you better. I
>>>>>> was aware 
>>>>>>>> of the expand_functions_all being taken for passes.c. However it
>>>>>> seems
>>>>>>>> two other issues are these sets as related to threads:
>>>>>>>> 1.finalize_compilation_unit
>>>>>>>> 2.and the ipa set of pass functions
>>>>>>>>
>>>>>>>> If I'm understanding it correctly number 1 seems to be a early
>>>>>> version of
>>>>>>>> expand_all_functions for the GENERIC representation if that's the
>>>>>> case
>>>>>>>> it really should be fixed. Not sure which is a better issue as both
>>>>>>>> seem to have issues either at the GENERIC level or GIMPLE level with
>>>>>> shared
>>>>>>>> state.
>>>>>>>>
>>>>>>>> Let me know if this is better as it seems now that I really think
>>>>>> about 
>>>>>>>> it GIMPLE or GENERIC functions in passes.c are the main issue. 
>>>>>>>>
>>>>>>>> Sorry for the misunderstanding and hopefully one of functions listed
>>>>>> is better
>>>>>>>> for moving forward with my proposal,
>>>>>>>
>>>>>>> Sorry, but guessing at useful projects by skimming through GCC code
>>>>>>> at this point isn't the way to go forward - this new "idea" lacks
>>>>>>> both detail and understanding.  Please try to stick to one of the
>>>>>>> suggested projects or do more thorough research in case you want
>>>>>>> to work on a new project idea next year.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Richard.
>>>>>>>
>>>>>>
>>>>>> I was talking about cgraphunits.c and it seems that according to this:
>>>>>> Parallelize compilation using threads. GCC currently has an awful lot
>>>>>> of truly global state and even more per-pass global state which makes
>>>>>> this somewhat hard. The idea is to avoid issues with global state as
>>>>>> much as possible by partitioning the compilation pipeline in pieces
>>>>>> that share as little global state as possible and ensure each thread
>>>>>> only works in one of those partitions. The biggest roadblock will be
>>>>>> the not thread-safe memory allocator of GCC garbage collector. The goal
>>>>>> of this project is to have a compilation pipeline driven by a scheduler
>>>>>> assigning functions to be optimized to the partitions in the pipeline.
>>>>>> This project would be mentored by Richard Biener. Required skills
>>>>>> include: C/C++, ability to analyze big complex code base,
>>>>>> parallelization
>>>>>>
>>>>>> We are trying to create a rendering pipeline if I'm correct and it
>>>>>> seems that the GENERIC level needs finalize_compilation_unit
>>>>>> to be fixed like expand_all_functions at the GIMPLE. That's my point it
>>>>>> still is within that project. Here is what I wrote
>>>>>> as I figured out that it was shared state related to GENERIC passing to
>>>>>> GIMPLE which is a bottleneck or would be in the 
>>>>>> threaded pipeline.
>>>>>>
>>>>>> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit
>>>>>
>>>>> The pre- post-IPA parts cannot be easily parallelized and that includes GENERIC to GIMPLE translation. This is why the project should focus on the much easier post-IPA and pre-RTL parts of the compilation pipeline since there interaction between functions is minimal. 
>>>>>
>>>>> Richard. 
>>>>>
>>>>>>
>>>>>> Nick
>>>>>
>>>> Richard,
>>>> Thanks seems we finally got somewhere I rewrote it for that. Please just doubt check tomorrow before I submit later tomorrow
>>>> as I would just want to make sure it's OK with you. Here is the link:
>>>> https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit
>>>>
>>>> The only thing is if you want more detail in the project part as I just think it's
>>>> enough but let me know if not.
>>>
>>> It seems the project part lacks any specifics and those are to be
>>> researched and discovered in the Mid may - Late May timeframe of
>>> your Roadmap?
>> Yes this is because after looking into ipa functions it would take
>> some time to really understand it not just at a surface level. It's
>> not the functions themselves but thinking about how the pipeline
>> would work in the POST IPA and pre RTL functions. 
>>
>>> I do not understand what "Create optimized functions ... for avoiding
>>> sharing of state ..." is about - what kind of functions do you intend
>>> to write?  Often examples help.
>>>
>> Fix functions at both the post IPA  and pre RTL levels for avoiding sharing of state and increasing 
>> throughput with multiple threads in the compiler is what I actually meant. There aren't any new 
>> functions it's optimizing the paths. So it was a small typo. Thanks for noticing it.
>>
>> I will submit either later today or early tomorrow so if there's anything else let me know, 
> 
> I think you will have to explain how you want to avoid sharing of state
> and at least vaguely _what_ global state you want to avoid.  The other
> proposal side-steps global state of individual GIMPLE passes by never
> executing the same GIMPLE pass twice at the same time for example,
> so it's the scheduler avoiding shared state.  What other global state
> do you have in mind?
> 
> Richard.
> 

If I understanding correctly there are two at the IPA late pass we can pass over the program
partition multiple times that would be fixed. For pre RTL seems cgraph_function_versioning
has issues with copying functions more than once and then doing the transformation on different
threads multiple times. 

If that's enough info I will just edit the roadmap, introduction and project parts to explain
the details.

Nick


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]