This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] Tiny predcom improvement (PR tree-optimization/59643)
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Richard Biener <rguenther at suse dot de>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 9 Jan 2014 17:01:05 -0800
- Subject: Re: [PATCH] Tiny predcom improvement (PR tree-optimization/59643)
- References: <20131231190433 dot GH892 at tucnak dot redhat dot com>
On Tue, Dec 31, 2013 at 11:04 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> As written in the PR, I've been looking at why llvm 3.[34] is so much faster
> on the Scimark2 SOR benchmark, and the reason is that its predictive commoning
> (or whatever it uses) doesn't give up on the inner loop, while our predcom
> unnecessarily gives up, because there are reads that could alias the write.
>
> This simple patch improves the benchmark by 42%. We already ignore
> unsuitable dependencies for read/read; the patch extends that to unsuitable
> dependencies for read/write by just putting the read (and anything in its
> component) into the bad component, which is ignored. pcom doesn't optimize
> away the writes and will keep the potentially aliasing reads unmodified as
> well. Without the patch we'd merge the two components, and since
> determine_offset fails between the two DRs, the whole merged
> component would always be unsuitable and thus ignored. With the patch
> we'll hopefully have some other reads with a known offset to the write
> and can optimize those, so the patch should always either handle what
> it did before or perhaps handle some more cases.
>
> The inner loop from the (public domain) benchmark is added in the two tests,
> one runtime test and one test checking whether pcom actually optimized it.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2013-12-31 Jakub Jelinek <jakub@redhat.com>
>
> PR tree-optimization/59643
> * tree-predcom.c (split_data_refs_to_components): If one dr is
> read and one write, determine_offset fails and the write isn't
> in the bad component, just put the read into the bad component.
>
This caused:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59745
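
For readers without the benchmark at hand, here is a rough sketch of the
kind of loop the quoted description is about. It is not taken from the
patch or its testcases: the function names, the parameter list and the
self-check are made up for illustration, and the loop body is only my
reading of the Scimark2 SOR kernel. The second function is a
hand-written approximation of what predictive commoning could do once
the possibly aliasing reads (Gim1[j], Gip1[j]) are set aside: the reads
of Gi[j-1], Gi[j] and Gi[j+1] have known offsets to the write Gi[j], so
they can be carried in scalars, while the write and the other two reads
stay as they are. What GCC actually emits may well differ.

```c
#include <assert.h>
#include <stddef.h>

/* Plain form of the inner loop (hypothetical name and signature).
   Gim1 and Gip1 may alias Gi as far as the compiler can tell.  */
void sor_row_plain(double *Gi, const double *Gim1, const double *Gip1,
                   size_t n, double omega)
{
    const double w4 = omega * 0.25;
    const double omw = 1.0 - omega;
    assert(Gi && Gim1 && Gip1);
    for (size_t j = 1; j + 1 < n; j++)
        Gi[j] = w4 * (Gim1[j] + Gip1[j] + Gi[j - 1] + Gi[j + 1])
                + omw * Gi[j];
}

/* Hand-applied commoning (illustration only, not compiler output).
   Only the Gi reads with a known offset to the store are reused;
   Gim1[j] and Gip1[j] are still loaded from memory every iteration
   and the store to Gi[j] is kept.  */
void sor_row_commoned(double *Gi, const double *Gim1, const double *Gip1,
                      size_t n, double omega)
{
    const double w4 = omega * 0.25;
    const double omw = 1.0 - omega;
    assert(Gi && Gim1 && Gip1);
    if (n < 3)
        return;                 /* loop would not run; skip hoisted loads */
    double prev = Gi[0];        /* value of Gi[j-1] */
    double cur = Gi[1];         /* value of Gi[j]   */
    for (size_t j = 1; j + 1 < n; j++) {
        double next = Gi[j + 1];
        double v = w4 * (Gim1[j] + Gip1[j] + prev + next) + omw * cur;
        Gi[j] = v;
        prev = v;               /* just stored, becomes Gi[j-1] */
        cur = next;             /* loaded above, becomes Gi[j]  */
    }
}

/* Returns 1 if both forms agree on a small grid, 0 otherwise.  */
int sor_self_check(void)
{
    enum { N = 8 };
    double a[3][N], b[3][N];
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < N; c++)
            a[r][c] = b[r][c] = (double)(r * N + c) * 0.5;
    sor_row_plain(a[1], a[0], a[2], N, 1.25);
    sor_row_commoned(b[1], b[0], b[2], N, 1.25);
    for (int c = 0; c < N; c++) {
        double d = a[1][c] - b[1][c];
        if (d < 0)
            d = -d;
        if (d > 1e-12)
            return 0;
    }
    return 1;
}
```

Calling sor_self_check() compares the two forms on three distinct rows.
Since the only store in the loop is the one to Gi[j], the two forms
should also agree when Gim1 or Gip1 point into Gi.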
--
H.J.