This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: GIMPLE-level if-converter with scratchpads --- Results from SPEC2006 FP analysis done at Richard`s request {late July / early August} --- results from running all the SPEC2006 CPU FP tests again after adding "-march=native"
- From: Abe <abe_skolnik at yahoo dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>, Alan Lawrence <alan dot lawrence at arm dot com>, Sebastian Pop <sebpop at gmail dot com>
- Date: Tue, 25 Aug 2015 16:22:45 -0500
- Subject: Re: GIMPLE-level if-converter with scratchpads --- Results from SPEC2006 FP analysis done at Richard`s request {late July / early August} --- results from running all the SPEC2006 CPU FP tests again after adding "-march=native"
- Authentication-results: sourceware.org; auth=none
- References: <55CBF39A dot 6020109 at yahoo dot com> <CAFiYyc1_VquBaSTQsHvzu3p0LbBXRhCG6d7uqnXD+G3zhS2w2g at mail dot gmail dot com>
Dear all,
I have rerun the SPEC2006 CPU FP tests after adding "-march=native". Unfortunately, the results are not
very good for the new if-converter. I believe this is because the CPU in question [details below] "only"
has first-generation AVX and, from what I've been told, at least AVX2 is needed for scatter/gather and/or
masked loads/stores, and possibly even AVX512 [the 3rd generation]. As I have written before, in my opinion
the new converter would be better than the old one if enough time and effort were spent on it,
especially the time and effort to make it not add unneeded indirections.
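
To illustrate what an added indirection can look like, here is a minimal C-level sketch of the scratchpad idea named in the subject line. It is an assumption-laden illustration, not the pass's actual output: the real converter works on GIMPLE, and the function names and the `scratchpad` variable are hypothetical. The first function is a loop with a conditional store; the second is a hand-written equivalent in which the store always executes but its address is selected between the real destination and a dummy location.

```c
#include <stddef.h>

/* Original form: the store to a[i] happens only when c[i] is nonzero,
   so the loop body contains a branch. */
void cond_store_branchy(int *a, const int *b, const int *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (c[i])
            a[i] = b[i];
}

/* Hand-written approximation of scratchpad-style if-conversion
   (hypothetical names; not the compiler's real output).  The store is
   unconditional, but its target is chosen per iteration. */
void cond_store_scratchpad(int *a, const int *b, const int *c, size_t n)
{
    int scratchpad = 0;  /* absorbs the stores that must not reach a[] */
    for (size_t i = 0; i < n; i++) {
        int *dst = c[i] ? &a[i] : &scratchpad;  /* the added indirection */
        *dst = b[i];
    }
}
```

Both functions leave `a` in the same state. In the second form the branch is gone, but the store goes through a pointer that can differ from one iteration to the next, which is the kind of access that is hard to vectorize without scatter or masked-store support.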
First, I will give the totals. Then, I'll give the CPU details to make clearer what "-march=native"
did [or at least should have done]. Then, I'll give the per-subtest numbers that Richard requested.
For concision, I will use "Richard's check-in" to refer to the GCC I built from Richard's check-in dated July 10 2015
with Git SHA "cb791e75379bc0c8b10bd13bcb24305c36fd504b" and "git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@225652".
[my reason for rebasing the relevant Git check-out to that point: quoting Richard's check-in message:
"PR tree-optimization/66823
* tree-if-conv.c (memrefs_read_or_written_unconditionally): Fix inverted predicate."]
All the compilations were done with "-Ofast". Each result below is an integer: the number of loops that were vectorized.
Regards,
Abe
Richard's check-in
[i.e. *_old_* converter]
no if-conversion-specific flags
-------------------------------
8374
Richard's check-in
[i.e. *_old_* converter]
"-ftree-loop-if-convert" but NOT "-ftree-loop-if-convert-stores"
----------------------------------------------------------------
8374
Richard's check-in
[i.e. *_old_* converter]
both "-ftree-loop-if-convert" AND "-ftree-loop-if-convert-stores"
-----------------------------------------------------------------
8388
----
patched version of Richard's check-in
[i.e. *_new_* converter]
no if-conversion-specific flags
-------------------------------------
8275
patched version of Richard's check-in
[i.e. *_new_* converter]
"-ftree-loop-if-convert" but NOT "-ftree-loop-if-convert-stores"
----------------------------------------------------------------
8275
patched version of Richard's check-in
[i.e. *_new_* converter]
both "-ftree-loop-if-convert" AND "-ftree-loop-if-convert-stores"
-----------------------------------------------------------------
8275
CPU [from "/proc/cpuinfo"]
--------------------------
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
stepping : 7
microcode : 0x710
cpu MHz : 2499.902
cache size : 15360 KB
physical id : 0
siblings : 12
core id : 0
cpu cores : 6
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx
est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt
tsc_deadline_timer aes xsave avx lahf_lm ida arat xsaveopt pln pts dtherm
tpr_shadow vnmi flexpriority ept vpid
bogomips : 4999.80
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
[similarly for the cores numbered 1...23]
kernel: 3.13.0-57-generic #95-Ubuntu SMP Fri Jun 19 09:28:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
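
The claim that this CPU has only first-generation AVX can be checked against the flags line quoted above. The following is a small sketch, assuming a whole-token match on a space-separated flags string. The helper `has_flag` and the constant `cpu_flags_excerpt` are hypothetical names introduced for illustration, and the excerpt is only the part of the quoted flags line around "avx".

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if `name` occurs in `flags` as a whole whitespace-separated
   token, else 0.  A plain substring search is not enough, because
   "avx" is a prefix of "avx2" and "xsave" is a prefix of "xsaveopt". */
int has_flag(const char *flags, const char *name)
{
    size_t len = strlen(name);
    const char *p = flags;

    if (len == 0)
        return 0;
    while ((p = strstr(p, name)) != NULL) {
        int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t' || p[-1] == '\n';
        int ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\t' || p[len] == '\n';
        if (starts && ends)
            return 1;
        p++;  /* keep searching after a partial match */
    }
    return 0;
}

/* Excerpt copied from the /proc/cpuinfo flags quoted above. */
static const char cpu_flags_excerpt[] =
    "sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx "
    "lahf_lm ida arat xsaveopt";
```

On this excerpt, `has_flag(cpu_flags_excerpt, "avx")` is true while the same call with "avx2" or "avx512f" is false, which matches the description of the machine given in the message.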
Richard's check-in
[i.e. *_old_* converter]
no if-conversion-specific flags
-------------------------------
410.bwaves: 13
416.gamess: 3837
433.milc: 7
434.zeusmp: 138
435.gromacs: 172
436.cactusADM: 261
437.leslie3d: 92
444.namd: 0
450.soplex: 1
454.calculix: 436
459.GemsFDTD: 275
465.tonto: 943
470.lbm: 0
481.wrf: 2141
482.sphinx3: 58
998.specrand: 0
Richard's check-in
[i.e. *_old_* converter]
"-ftree-loop-if-convert" but NOT "-ftree-loop-if-convert-stores"
----------------------------------------------------------------
410.bwaves: 13
416.gamess: 3837
433.milc: 7
434.zeusmp: 138
435.gromacs: 172
436.cactusADM: 261
437.leslie3d: 92
444.namd: 0
450.soplex: 1
454.calculix: 436
459.GemsFDTD: 275
465.tonto: 943
470.lbm: 0
481.wrf: 2141
482.sphinx3: 58
998.specrand: 0
Richard's check-in
[i.e. *_old_* converter]
both "-ftree-loop-if-convert" AND "-ftree-loop-if-convert-stores"
-----------------------------------------------------------------
410.bwaves: 13
416.gamess: 3850
433.milc: 7
434.zeusmp: 138
435.gromacs: 173
436.cactusADM: 261
437.leslie3d: 92
444.namd: 0
450.soplex: 1
454.calculix: 436
459.GemsFDTD: 275
465.tonto: 943
470.lbm: 0
481.wrf: 2141
482.sphinx3: 58
998.specrand: 0
----
patched version of Richard's check-in
[i.e. *_new_* converter]
no if-conversion-specific flags
-------------------------------------
410.bwaves: 13
416.gamess: 3804
433.milc: 7
434.zeusmp: 136
435.gromacs: 173
436.cactusADM: 261
437.leslie3d: 92
444.namd: 0
450.soplex: 1
454.calculix: 436
459.GemsFDTD: 275
465.tonto: 943
470.lbm: 0
481.wrf: 2079
482.sphinx3: 55
998.specrand: 0
patched version of Richard's check-in
[i.e. *_new_* converter]
"-ftree-loop-if-convert" but NOT "-ftree-loop-if-convert-stores"
----------------------------------------------------------------
410.bwaves: 13
416.gamess: 3804
433.milc: 7
434.zeusmp: 136
435.gromacs: 173
436.cactusADM: 261
437.leslie3d: 92
444.namd: 0
450.soplex: 1
454.calculix: 436
459.GemsFDTD: 275
465.tonto: 943
470.lbm: 0
481.wrf: 2079
482.sphinx3: 55
998.specrand: 0
patched version of Richard's check-in
[i.e. *_new_* converter]
both "-ftree-loop-if-convert" AND "-ftree-loop-if-convert-stores"
-----------------------------------------------------------------
410.bwaves: 13
416.gamess: 3804
433.milc: 7
434.zeusmp: 136
435.gromacs: 173
436.cactusADM: 261
437.leslie3d: 92
444.namd: 0
450.soplex: 1
454.calculix: 436
459.GemsFDTD: 275
465.tonto: 943
470.lbm: 0
481.wrf: 2079
482.sphinx3: 55
998.specrand: 0
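
To cross-check the totals against the per-subtest numbers, the sketch below sums the counts for the "no if-conversion-specific flags" configuration and counts how many benchmarks differ between the two compilers. The numbers are copied from the tables above; the struct and function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* One row per SPEC2006 FP benchmark, taken from the two
   "no if-conversion-specific flags" tables above. */
struct bench_count {
    const char *name;
    int old_count;  /* Richard's check-in, old converter */
    int new_count;  /* patched version, new converter */
};

static const struct bench_count counts[] = {
    {"410.bwaves", 13, 13},     {"416.gamess", 3837, 3804},
    {"433.milc", 7, 7},         {"434.zeusmp", 138, 136},
    {"435.gromacs", 172, 173},  {"436.cactusADM", 261, 261},
    {"437.leslie3d", 92, 92},   {"444.namd", 0, 0},
    {"450.soplex", 1, 1},       {"454.calculix", 436, 436},
    {"459.GemsFDTD", 275, 275}, {"465.tonto", 943, 943},
    {"470.lbm", 0, 0},          {"481.wrf", 2141, 2079},
    {"482.sphinx3", 58, 55},    {"998.specrand", 0, 0},
};
static const size_t n_counts = sizeof counts / sizeof counts[0];

/* Sum of vectorized-loop counts for the old converter. */
int total_old(void)
{
    int sum = 0;
    for (size_t i = 0; i < n_counts; i++)
        sum += counts[i].old_count;
    return sum;
}

/* Sum of vectorized-loop counts for the new converter. */
int total_new(void)
{
    int sum = 0;
    for (size_t i = 0; i < n_counts; i++)
        sum += counts[i].new_count;
    return sum;
}

/* Number of benchmarks whose count differs between the two compilers. */
size_t n_changed(void)
{
    size_t changed = 0;
    for (size_t i = 0; i < n_counts; i++)
        if (counts[i].old_count != counts[i].new_count)
            changed++;
    return changed;
}
```

With these rows, `total_old()` and `total_new()` come to 8374 and 8275, matching the totals reported earlier. Five benchmarks differ, and 481.wrf (62 fewer) and 416.gamess (33 fewer) account for most of the 99-loop gap, while 435.gromacs gains one loop.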