This is the mail archive of the
mailing list for the GCC project.
Re: [patch] jump threading multiple paths that start from the same BB
- From: Jeff Law <law at redhat dot com>
- To: Aldy Hernandez <aldyh at redhat dot com>, Christophe Lyon <christophe dot lyon at linaro dot org>
- Cc: gcc Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 3 Jul 2018 18:16:36 -0600
- Subject: Re: [patch] jump threading multiple paths that start from the same BB
- References: <email@example.com> <firstname.lastname@example.org> <email@example.com> <CAKdteOYDiox7j=nnM-9jNNd-tiEqVF_KvoQ-B=quzQ=YL8VKpQ@mail.gmail.com> <firstname.lastname@example.org>
On 07/03/2018 03:31 AM, Aldy Hernandez wrote:
> On 07/02/2018 07:08 AM, Christophe Lyon wrote:
>>>> On 11/07/2017 10:33 AM, Aldy Hernandez wrote:
>>>>> While poking around in the backwards threader I noticed that we bail if
>>>>> we have already seen a starting BB.
>>>>> /* Do not jump-thread twice from the same block. */
>>>>> if (bitmap_bit_p (threaded_blocks, entry->src->index)
>>>>> This limitation discards paths that are sub-paths of paths that have
>>>>> already been threaded.
>>>>> The following patch scans the remaining to-be-threaded paths to identify
>>>>> if any of them start from the same point, and are thus sub-paths of the
>>>>> just-threaded path. By removing the common prefix of blocks in upcoming
>>>>> threadable paths, and then rewiring first non-common block
>>>>> appropriately, we expose new threading opportunities, since we are no
>>>>> longer starting from the same BB. We also simplify the would-be
>>>>> threaded paths, because we don't duplicate already duplicated paths.
>> I've noticed a regression on aarch64:
>> FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump thread3 "Jumps
>> threaded: 3"
>> very likely caused by this patch (appeared between 262282 and 262294)
> The test needs to be adjusted here.
> The long story is that the aarch64 IL is different at thread3 time in
> that it has 2 profitable sub-paths that can now be threaded with my
> patch. This is causing the threaded count to be 5 for aarch64, versus 3
> for x86 64. Previously we couldn't thread these in aarch64, so the
> backwards threader would bail.
> One can see the different threading opportunities by sticking
> debug_all_paths() at the top of thread_through_all_blocks(). You will
> notice that aarch64 has far more candidates to begin with. The IL on
> the x86 backend, has no paths that start on the same BB. The aarch64,
> on the other hand, has many to choose from:
> path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11,
> path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16,
> path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16,
> path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
> path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
> path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
> path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 17,
> path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 17,
> path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 19,
> Some of these prove unprofitable, but 2 more than before are profitable now.
> BTW, I see another threading related failure on aarch64 which is
> unrelated to my patch, and was previously there:
> FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump-not vrp2 "Jumps
> This is probably another IL incompatibility between architectures.
> Anyways... the attached path fixes the regression. I have added a note
> to the test explaining the IL differences. We really should rewrite all
> the threading tests (I am NOT volunteering ;-)).
> OK for trunk?
> * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust test because aarch64
> has a slightly different IL that provides more threading
WRT rewriting the tests. I'd certainly agree that we don't have the
right set of knobs to allow us to characterize the target nor do we have
the right dumping/scanning facilities to describe and query the CFG changes.
The fact that the IL changes so much across targets is a sign that
target dependency (probably BRANCH_COST) is twiddling the gimple we
generate. I strongly suspect we'd be a lot better off if we tackled the
BRANCH_COST problem first.
ps. That particular test is the test which led to the creation of the
backwards jump threader :-)