This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH, Loop optimizer]: Add logic to disable certain loop optimizations on pre-/post-loops
- From: Richard Guenther <richard dot guenther at gmail dot com>
- To: "Fang, Changpeng" <Changpeng dot Fang at amd dot com>
- Cc: Jack Howarth <howarth at bromo dot med dot uc dot edu>, Zdenek Dvorak <rakdver at kam dot mff dot cuni dot cz>, Xinliang David Li <davidxl at google dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 5 Jan 2011 12:13:52 +0100
- Subject: Re: [PATCH, Loop optimizer]: Add logic to disable certain loop optimizations on pre-/post-loops
- References: <20101214075629.GA10020@kam.mff.cuni.cz> <D4C76825A6780047854A11E93CDE84D004C1768401@SAUSEXMBP01.amd.com> <20101214210552.GA19633@kam.mff.cuni.cz> <AANLkTim241bdY_JeZc2z-eLy9WU1+=wTz355XyjDfQXW@mail.gmail.com> <D4C76825A6780047854A11E93CDE84D004C1768405@SAUSEXMBP01.amd.com> <AANLkTinzdo0Sjx2WxGC9F-50fyRjXL2vCRNtyvAKT17y@mail.gmail.com> <20101215092220.GA9872@kam.mff.cuni.cz> <D4C76825A6780047854A11E93CDE84D004C1768412@SAUSEXMBP01.amd.com> <20110104030426.GA28556@bromo.med.uc.edu> <D4C76825A6780047854A11E93CDE84D004C176842F@SAUSEXMBP01.amd.com>
On Tue, Jan 4, 2011 at 11:30 PM, Fang, Changpeng <Changpeng.Fang@amd.com> wrote:
> Hi,
>
> Thanks, Jack. Hopefully the 6% protein gain (for -m64) is not just noise.
>
> I updated the patch based on the current trunk (REV 168477).
>
> Is it OK to commit now?
The patch contains neither a testcase nor a reference to a bug, so it isn't
appropriate for stage4 (or even stage3). Also, using BB flags for this doesn't
sound quite right, as you'd have to make sure CFG manipulations properly
copy/unset this flag (which is likely the reason for the polyhedron
regressions seen).

That said, I don't like the patch too much anyway; it looks like a hack.
The proper fix is to finally go the way of preserving loop information and
putting such a flag in the loop information (or, well, just manually setting
the upper bound for the number of iterations therein).

Please postpone this for stage1 of GCC 4.7. Thanks.
Richard.
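
To illustrate the concern about BB flags raised above, here is a minimal
sketch in C. Everything in it is a hypothetical stand-in invented for
illustration (struct block, struct loop_info, the BB_FLAG_NONROLLING_HEADER
bit and the helper functions); none of it is GCC's basic_block, struct loop,
or CFG API. It only shows why a per-block bit can be lost when a block is
copied by code that does not know about the bit, while a per-loop property
survives a change of header block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT GCC's real data structures.  */
enum bb_flag
{
  BB_FLAG_NONROLLING_HEADER = 1 << 0   /* illustrative bit only */
};

struct block
{
  int flags;
};

struct loop_info
{
  struct block *header;
  int nonrolling;   /* property stored on the loop, not on a block */
};

/* A copy routine that is unaware of the new bit: the duplicate starts
   with clean flags, so the marking is silently lost.  */
static struct block
duplicate_block_naive (const struct block *src)
{
  struct block copy;
  assert (src != NULL);
  copy.flags = 0;
  return copy;
}

/* A copy routine that was taught to carry the bit over.  Every CFG
   manipulation would need equivalent care.  */
static struct block
duplicate_block_keep_flag (const struct block *src)
{
  struct block copy;
  assert (src != NULL);
  copy.flags = src->flags & BB_FLAG_NONROLLING_HEADER;
  return copy;
}

/* Query the per-block marking.  */
static int
header_marked_nonrolling (const struct block *b)
{
  assert (b != NULL);
  return (b->flags & BB_FLAG_NONROLLING_HEADER) != 0;
}

/* Replace the header of a loop.  The per-loop property is untouched,
   whatever flags the new header block carries.  */
static void
loop_set_header (struct loop_info *l, struct block *new_header)
{
  assert (l != NULL && new_header != NULL);
  l->header = new_header;
}
```

In this sketch, a block produced by duplicate_block_naive no longer reports
the marking, whereas a loop_info whose header is swapped with
loop_set_header still has nonrolling set. Whether this matches what went
wrong in the polyhedron runs is not established in this thread.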
> Thanks,
>
> Changpeng
>
>
>
> ________________________________________
> From: Jack Howarth [howarth@bromo.med.uc.edu]
> Sent: Monday, January 03, 2011 9:04 PM
> To: Fang, Changpeng
> Cc: Zdenek Dvorak; Richard Guenther; Xinliang David Li; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH, Loop optimizer]: Add logic to disable certain loop optimizations on pre-/post-loops
>
> On Fri, Dec 17, 2010 at 01:14:49AM -0600, Fang, Changpeng wrote:
>> Hi, Jack:
>>
>> Thanks for the testing.
>>
>> This patch is not supposed to slow down a program by 10% (rnflow and test_fpu).
>> It would be helpful if you can provide analysis why they are slowed down.
>
> Changpeng,
>    The corrected merge against gcc trunk of...
>
> Index: gcc/basic-block.h
> ===================================================================
> --- gcc/basic-block.h   (revision 168437)
> +++ gcc/basic-block.h   (working copy)
> @@ -247,11 +247,14 @@
>       Only used in cfgcleanup.c.  */
>    BB_NONTHREADABLE_BLOCK = 1 << 11,
>
> +  /* Set on blocks that are headers of non-rolling loops.  */
> +  BB_HEADER_OF_NONROLLING_LOOP = 1 << 12,
> +
>    /* Set on blocks that were modified in some way.  This bit is set in
>       df_set_bb_dirty, but not cleared by df_analyze, so it can be used
>       to test whether a block has been modified prior to a df_analyze
>       call.  */
> -  BB_MODIFIED = 1 << 12
> +  BB_MODIFIED = 1 << 13
>  };
>
>  /* Dummy flag for convenience in the hot/cold partitioning code.  */
>
> for the proposed patch from http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01344.html
> eliminated the performance regressions on x86_64-apple-darwin10. I now get...
>
> Compile Command : gfortran -ffast-math -funroll-loops -O3 %n.f90 -o %n
>
>                    Execution Time
> -m32
>                  stock       patched   %increase
> ac               10.59       10.59       0.0
> aermod           19.49       19.13      -1.8
> air               6.07        6.07       0.0
> capacita         44.60       44.61       0.0
> channel           1.98        1.98       0.0
> doduc            31.19       31.31       0.4
> fatigue           9.90       10.29       3.9
> gas_dyn           4.72        4.71      -0.2
> induct           13.93       13.93       0.0
> linpk            15.50       15.49      -0.1
> mdbx             11.28       11.26      -0.2
> nf               27.62       27.58      -0.1
> protein          38.70       38.60      -0.3
> rnflow           24.68       24.68       0.0
> test_fpu         10.13       10.13       0.0
> tfft              1.92        1.92       0.0
>
> Geometric Mean   12.06       12.08       0.2
> Execution Time
>
> -m64
>                  stock       patched   %increase
> ac                8.80        8.80       0.0
> aermod           17.34       17.17      -1.0
> air               5.48        5.52       0.7
> capacita         32.38       32.50       0.4
> channel           1.84        1.84       0.0
> doduc            26.50       26.52       0.1
> fatigue           8.35        8.33      -0.2
> gas_dyn           4.30        4.29      -0.2
> induct           12.83       12.83       0.0
> linpk            15.49       15.49       0.0
> mdbx             11.23       11.22      -0.1
> nf               30.21       30.16      -0.2
> protein          34.13       32.07      -6.0
> rnflow           23.18       23.19       0.0
> test_fpu          8.04        8.02      -0.2
> tfft              1.87        1.86      -0.5
>
> Geometric Mean   10.87       10.82      -0.5
> Execution Time
>
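
The %increase column in the tables above appears to be the relative change
in execution time from stock to patched, so a negative value means the
patched compiler produced faster code. That reading is an assumption, but it
is consistent with the listed figures (for example protein at -m64). A
minimal sketch in C, with a hypothetical helper name:

```c
#include <assert.h>

/* Percent change in execution time from the stock compiler to the
   patched one.  Negative values mean the patched build ran faster.
   The name percent_increase is illustrative, not from the thread.  */
static double
percent_increase (double stock, double patched)
{
  assert (stock > 0.0);
  return (patched - stock) / stock * 100.0;
}
```

For protein at -m64, percent_increase (34.13, 32.07) is roughly -6.0, which
matches the table entry.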