This is the mail archive of the
mailing list for the GCC project.
Re: Question on param MAX_PENDING_LIST_LENGTH in sched-deps
- From: "Bin.Cheng" <amker dot cheng at gmail dot com>
- To: Jeff Law <law at redhat dot com>
- Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Fri, 7 Nov 2014 11:02:03 +0800
- Subject: Re: Question on param MAX_PENDING_LIST_LENGTH in sched-deps
- Authentication-results: sourceware.org; auth=none
- References: <CAHFci2-r0DRCbU+dwEU-BgJgBVth7oFKEYHFA5ran+8gdk0ZdQ at mail dot gmail dot com> <545BFFF3 dot 6030409 at redhat dot com>
On Fri, Nov 7, 2014 at 7:10 AM, Jeff Law <email@example.com> wrote:
> On 11/04/14 20:29, Bin.Cheng wrote:
>> The parameter MAX_PENDING_LIST_LENGTH is set to 32 by default. It
>> seems to me the length of pending list can't be larger than 32. But
>> in sched-deps.c, below code is used:
>> /* Pending lists can't get larger with a readonly context. */
>> if (!deps->readonly
>> && ((deps->pending_read_list_length +
>> > MAX_PENDING_LIST_LENGTH))
>> Since we compares it using ">", the list can have 33 instructions at
>> most, which is inconsistent with the parameter name.
>> Well, it's some kind of nit picking.
> Yea, seems like a bit of a nit. Might be worth fixing as someone might size
> an array based on the param's value only to find out it can have an extra
Yes, I found this only because in some extreme case with lots of
consecutive stores, length of 33 breaks the store pairing one time in
the middle of instruction flow.
>> Another question. This parameter is introduced to prevent GCC from
>> running for too long time. I am not clear if the running time is a
>> quadratic function of pending list, or a quadratic function of
>> dependency nodes? If it's the latter, could we use the number of
>> dependency nodes as the parameter directly? Because in some cases
>> like memory initialization, there are more than 32 store instructions
>> in flow, but the dependency are actually very simple.
> I don't recall what part of the algorithm was quadratic, but it should be
> too hard to find out given the thread references the target & the testcase.
At least dependency analysis part is of quadratic complexity because
for each new memory access, it iterates over all insns in the list to
see if it depends on them. But this seems not the real problem, there
are comments in both source code and the patch you mentioned, that
perf issue is caused by too many dependency nodes created in this
Also this is introduced at time we only have low frequency/memory
machines, I bet it can be relax to a larger number without hurting
compilation time. The reason I didn't is because I can't get anything
good from it, seems it barely has any impact on performance.