This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: 27% regression of gcc 4.3 performance on cpu2k6/calculix


Hi!
I created a test to reproduce the issue with cpu2006/454.calculix;
see the attachment. The file e_c3d.f contains a cut-down subroutine from
calculix, and tr535.f is the main entry point of the test. You can use
go-script as a reference for how I got these results. The find_stall.pl
script finds the problematic instruction combinations.

The problem is that the new compiler generates a read instruction right
after a write to the same memory location. See the dumps below.

This is the inner loop near line 42, as generated by the rev. 119759 compiler:
.L13:
.LBB22:
	.loc 1 42 0
	movapd	%xmm2, %xmm0
	leaq	(%rdx,%rbx), %rax
	.loc 1 38 0
	addl	$1, %edi
	addq	$24, %rdx
	.loc 1 42 0
	mulsd	72(%rcx), %xmm0
	.loc 1 38 0
	addq	$72, %rcx
	cmpl	$4, %edi
	.loc 1 42 0
	mulsd	%xmm3, %xmm0
	mulsd	-8(%rax,%r9,8), %xmm0
	mulsd	%xmm4, %xmm0
	addsd	%xmm0, %xmm1
	.loc 1 38 0
	jne	.L13
	
This is the code for line 42, as generated by the rev. 119760 compiler:
.L13:
.LBB23:
	.loc 1 42 0
	movsd	72(%rdx), %xmm0
	movq	80(%rsp), %rax
	addq	$72, %rdx
	mulsd	-8(%r9,%r15,8), %xmm0
	addq	%rdi, %rax
	addq	$24, %rdi
	.loc 1 38 0
	cmpq	$72, %rdi
	.loc 1 42 0
	mulsd	-8(%r11,%r14,8), %xmm0
	mulsd	-8(%rax,%r13,8), %xmm0
	movq	440(%rsp), %rax
	mulsd	(%rax), %xmm0
	addsd	(%rsi,%r10,8), %xmm0     <-|
	movsd	%xmm0, (%rsi,%r10,8)    <-+- problems
	.loc 1 38 0
	jne	.L13
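
To make the difference between the two dumps easier to see, here is a minimal C sketch of the two loop shapes. It is not the Fortran from e_c3d.f: the function names are hypothetical, and `volatile` is used only as an assumption to force a load and a store of the accumulator on every iteration, which mimics the addsd/movsd pair on (%rsi,%r10,8) in the rev. 119760 code. The second function keeps the running sum in a local variable, as the rev. 119759 code does with %xmm1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration of the rev. 119760 loop shape:
 * the accumulator lives in memory, so every iteration loads it,
 * adds to it, and stores it back. The volatile qualifier is only
 * there to keep the compiler from promoting it to a register. */
void accumulate_in_memory(volatile double *acc,
                          const double *a, const double *b, size_t n)
{
    assert(acc != NULL);
    for (size_t i = 0; i < n; i++)
        *acc += a[i] * b[i];    /* load *acc, add, store *acc */
}

/* Hypothetical illustration of the rev. 119759 loop shape:
 * the running sum stays in a local (a register in practice)
 * and is written back to memory once, after the loop. */
void accumulate_in_register(double *acc,
                            const double *a, const double *b, size_t n)
{
    assert(acc != NULL);
    double sum = *acc;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];     /* no memory traffic for the sum */
    *acc = sum;                 /* single store at the end */
}
```

Both functions compute the same value. Timing them over long arrays on the target machine is one way to check how much the per-iteration load/store dependency costs there.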



My output is:
real    0m3.781s
user    0m3.776s
sys     0m0.004s

real    0m5.956s
user    0m5.948s
sys     0m0.004s
hey... we are going
hey... we are going
Line 31
       addsd   (%rsi,%r10,8), %xmm0
       movsd   %xmm0, (%rsi,%r10,8)

Line 42
       addsd   (%rsi,%r10,8), %xmm0
       movsd   %xmm0, (%rsi,%r10,8)

Feel free to ask if you have any problems reproducing this.

-Vladimir


------
From: Grigory Zagorodnev <grigory_zagorodnev at linux dot intel dot com>
To: gcc at gcc dot gnu dot org, dnovillo at redhat dot com
Cc: "H. J. Lu" <hjl at lucon dot org>
Date: Mon, 15 Jan 2007 17:59:31 +0300
Subject: 27% regression of gcc 4.3 performance on cpu2k6/calculix

Hi!
A large regression in gcc 4.3 performance has been detected on the
cpu2006/454.calculix benchmark at the -O2 optimization level on
x86_64-redhat-linux.

The regression is caused by the mem-ssa merge of 12/12/2006 (revision 119760).
http://gcc.gnu.org/viewcvs?view=rev&revision=119760


PS: I'm trying to get a small reproducer - Grigory

Attachment: test_calculix.tar.bz2
Description: BZip2 compressed data

