This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH] Tree unroll - Relaxing code size increase with O2


Hi,

    Currently, the tree unrolling pass (cunroll) does not allow any code
size growth at -O2.  Code size growth is permitted only if -O3 or
-funroll-loops/-fpeel-loops is used.  I have created a patch to allow
partial code size increase at -O2.  With -funroll-loops the maximum
allowed code growth is 100 unrolled insns.  For partial growth, I
experimented with various values of code growth and I have attached
SPEC 2006 performance numbers for code growth from 20 to 100 insns in
steps of 20.

   For this patch, I have set the partial code growth at -O2 to
40 insns (tunable via a param), where we get performance improvements
with minimal code size growth.  The perf. data shows good improvements in
a few benchmarks: h264, sjeng and bzip2 get >2% improvement.
calculix shows a big regression (4.5% on westmere), which I am
investigating along with the povray regression.
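
A rough way to picture the growth budgets discussed above is the sketch below. It is only a simplified model, not the code in the patch: the enum names and both functions are hypothetical, and the size estimate (one extra copy of the body per extra iteration) ignores the simplifications that unrolling can enable, which the real pass accounts for.

```c
#include <assert.h>
#include <stdbool.h>

/* Growth budgets in insns, taken from the numbers quoted in this mail.
   The enum names are hypothetical, not GCC identifiers.  */
enum growth_budget {
  BUDGET_O2_CURRENT = 0,     /* -O2 today: no growth allowed */
  BUDGET_O2_PATCHED = 40,    /* -O2 with this patch */
  BUDGET_UNROLL_LOOPS = 100  /* -funroll-loops */
};

/* Simplified estimate (an assumption): complete unrolling turns one copy
   of the loop body into trip_count copies.  */
static int
unroll_growth (int body_insns, int trip_count)
{
  assert (body_insns >= 0 && trip_count >= 1);
  return body_insns * (trip_count - 1);
}

/* True if the estimated growth fits within the given budget.  */
static bool
unroll_allowed (int body_insns, int trip_count, enum growth_budget budget)
{
  return unroll_growth (body_insns, trip_count) <= (int) budget;
}
```

Under this model, a 10-insn body with 4 iterations grows by 30 insns, so it would be rejected with a budget of 0 but accepted with a budget of 40; with 8 iterations it grows by 70 insns and only fits the 100-insn budget.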

   I also ran experiments with -ftree-vectorize turned on with -O2,
both in the baseline and with the partial unroll, to study the effect of
unrolling on vectorization.  Loop unrolling seems to benefit more
benchmarks when vectorization is turned on.

   I have attached the patch and pdfs of the perf. data and code size growth.

How to read the attached perf data:

There are two data files.

* spec_perf_O2_unroll.txt contains perf data using unrolling with
various code size growth at -O2.
* spec_perf_O2_vectorize_unroll.txt contains perf data using
unrolling with various code size growth at -O2 + -ftree-vectorize.

Each file contains perf. improvements and code size growth data.
Experiments were done on Ibis-sandybridge and Ikaria-westmere.

Here is a sample from the file (All perf. numbers are in %):

Unroll insns code growth          20      40      60      80     100
____________________________________________________________________
spec/2006/fp/C++/444.namd       -3.2   -0.13    -0.4   -0.57   -0.31

This data shows that namd regressed by 3.2% relative to the baseline
when code size growth was set to 20 insns, and by 0.57% when growth
was 80 insns.
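
To make the reading of a row concrete, here is a small sketch that scans the sample namd row and picks the growth setting with the best (largest) delta. The function and array names are hypothetical and are not part of the patch or the data files; the values are copied from the sample row above, with negative numbers meaning a regression against the baseline.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the largest delta, i.e. the biggest improvement
   or, if every entry is negative, the smallest regression.  */
static size_t
best_setting (const double *delta_pct, size_t n)
{
  assert (n > 0);
  size_t best = 0;
  for (size_t i = 1; i < n; i++)
    if (delta_pct[i] > delta_pct[best])
      best = i;
  return best;
}

/* Column headers and the 444.namd row from the sample above.  */
static const int growth_insns[] = { 20, 40, 60, 80, 100 };
static const double namd_delta_pct[] = { -3.2, -0.13, -0.4, -0.57, -0.31 };
```

For this one row, `growth_insns[best_setting (namd_delta_pct, 5)]` is 40, the column where namd regresses by only 0.13%. This says nothing about the other benchmarks, which have to be read from the attached files.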

   Please let me know what you think.

Thanks
Sri

Attachment: tree_loop_unroll_O2.txt
Description: Text document

Attachment: spec_perf_O2_unroll.txt
Description: Text document

Attachment: spec_perf_O2_vectorize_unroll.txt
Description: Text document

