This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH GCC][5/5]Enable tree loop distribution at -O3 and above optimization levels.
On Fri, Jun 23, 2017 at 6:04 AM, Jeff Law <email@example.com> wrote:
> On 06/07/2017 02:07 AM, Bin.Cheng wrote:
>> On Tue, Jun 6, 2017 at 6:47 PM, Jeff Law <firstname.lastname@example.org> wrote:
>>> On 06/02/2017 05:52 AM, Bin Cheng wrote:
>>>> This patch enables -ftree-loop-distribution by default at -O3 and above optimization levels.
>>>> Bootstrapped and tested at O2/O3 on x86_64 and AArch64. Is it OK?
>>>> Note I don't have a strong opinion here and am fine with it being either accepted or rejected.
>>>> 2017-05-31 Bin Cheng <email@example.com>
>>>> * opts.c (default_options_table): Enable OPT_ftree_loop_distribution
>>>> for -O3 and above levels.
>>> I think the question is how does this generally impact the performance
>>> of the generated code and to a lesser degree compile-time.
>>> Do you have any performance data?
>> Hi Jeff,
>> At this stage of the patch, only hmmer is impacted, and it clearly
>> improves in my local run of spec2006 for x86_64 and AArch64. In the
>> long term, loop distribution is also one prerequisite transformation
>> to handle bwaves (at least). For these two impacted cases, it helps
>> to close the gap against ICC. I didn't check the compilation time
>> slowdown; we can restrict distribution to problems with a small
>> number of partitions if that's a problem.
> Just a note. I know you've iterated further with Richi -- I'm not
> objecting to the patch, nor was I ready to approve.
> Are you and Richi happy with this as-is or are you looking to submit
> something newer based on the conversation the two of you have had?
The patch series has been updated in various ways according to review
comments. For example, it limits compilation time by checking the
number of data references against MAX_DATAREFS_FOR_DATADEPS, and it
restores the data dependence cache. There are still two missing parts
I'd like to do as follow-up patches: one is loop nest distribution and
the other is a data-locality cost model (at least) for small cases.
Richi has now approved most patches except the last major one, but I
still need another iteration on some (approved) patches in order to
fix mistakes/typos introduced when I was separating the patch.