This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: GCC4.3.4 downside against GCC3.4.4 on mips?

From: "Amker.Cheng" <amker dot cheng at gmail dot com>
To: gcc at gcc dot gnu dot org
Cc: Václav Haisman <v dot haisman at sh dot cvut dot cz>, Martin Guy <martinwguy at gmail dot com>, Andrew Haley <aph at redhat dot com>
Date: Thu, 27 May 2010 18:33:04 +0800
Subject: Re: GCC4.3.4 downside against GCC3.4.4 on mips?
References: <AANLkTin4axeTjBjWaSmakG2um2MzAqy_tCLpDHzUlTnb@mail.gmail.com> <3ada5fe2e4b15ae3fefa95747429b841@shell.sh.cvut.cz>

> Posting some random numbers without a test-case and precise command line
> parameters for both compilers makes the numbers useless, IMHO. You also
> only mention instruction counts. Have you actually benchmarked the
> resulting code? CPUs are complicated and what you might perceive as worse
> code might actually be superior thanks to scheduling and internal CPU
> parallelism etc.

Thanks for the reminder.
After some investigation, I can demonstrate the issue with the following
piece of code:
-------------------------------------begin here-------------------
extern int *p[5];

# define REAL_RADIX_2            24
# define REAL_MUL_2(x, y)        (((long long)(x) * (long long)(y)) >> REAL_RADIX_2)


void func(int *b1, int *b2)
{
  int c0 = p[3][0];
  int c1 = p[3][1];

  b2[0x18] = b1[0x18] + b1[0x1B];
  b2[0x1B] = REAL_MUL_2((b1[0x18] - b1[0x1B]) , c0);

  b2[0x19] = b1[0x19] + b1[0x1A];
  b2[0x1A] = REAL_MUL_2((b1[0x19] - b1[0x1A]) , c1);

  b2[0x1C] = b1[0x1C] + b1[0x1F];
  b2[0x1F] = REAL_MUL_2((b1[0x1F] - b1[0x1C]) , c0);

  b2[0x1D] = b1[0x1D] + b1[0x1E];
  b2[0x1E] = REAL_MUL_2((b1[0x1E] - b1[0x1D]) , c1);
}
-------------------------------------cut here-------------------

It seems GCC 4.3.4 always expands the long long multiplication into
three 32-bit multiplications (mult, madd and multu), like
-------------------------------------begin here-------------------
#  b2[0x1A] = REAL_MUL_2((b1[0x19] - b1[0x1A]) , c1);

	lw	$6,104($4)
	lw	$2,100($4)
	subu	$2,$2,$6
	mult	$11,$2
	sra	$6,$2,31
	madd	$6,$9
	mflo	$6
	multu	$2,$9
	mfhi	$3
	addu	$3,$6,$3
	sll	$6,$3,8
	mflo	$2
	srl	$7,$2,24
	or	$7,$6,$7
	sw	$7,104($5)
-------------------------------------cut here-------------------

while GCC 3.4.4 treats the long long multiplication like a simple int
multiplication and generates only one mult insn for each statement, like
-------------------------------------begin here-------------------
#  b2[0x1A] = REAL_MUL_2((b1[0x19] - b1[0x1A]) , c1);

	lw	$2,100($4)
	lw	$7,104($4)
	subu	$3,$2,$7
	mult	$3,$9
	mflo	$6
	mfhi	$25
	srl	$15,$6,24
	sll	$24,$25,8
	or	$14,$15,$24
	sw	$14,104($5)
-------------------------------------cut here-------------------

In my understanding, it is not necessary to use three mult insns to implement
the long long multiplication here, since both operands are converted from int.
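
To spell out that reasoning, here is a small C sketch (only an illustration,
not GCC's actual expansion). It computes bits 24..55 of the 64-bit product in
two ways: from one signed 32x32->64 multiply, as in the 3.4.4 output, and from
the generic three-multiply scheme, where the high words of the operands are
just the sign bits. The function names real_mul_2_single, real_mul_2_three and
check_same are made up for this example, and the result is returned as raw
uint32_t bits to avoid implementation-defined signed shifts.

```c
#include <assert.h>
#include <stdint.h>

#define REAL_RADIX_2 24

/* One signed 32x32->64 multiply, then combine the two result words
   (roughly: mult, mflo, mfhi, srl, sll, or). */
uint32_t real_mul_2_single(int32_t x, int32_t y)
{
    uint64_t prod = (uint64_t)((int64_t)x * (int64_t)y);
    uint32_t lo = (uint32_t)prod;          /* low word of the product  */
    uint32_t hi = (uint32_t)(prod >> 32);  /* high word of the product */
    return (lo >> REAL_RADIX_2) | (hi << (32 - REAL_RADIX_2));
}

/* Generic 64x64 multiply built from 32-bit halves: one unsigned widening
   multiply plus two cross products added into the high word. */
uint32_t real_mul_2_three(int32_t x, int32_t y)
{
    uint32_t x_lo = (uint32_t)x;
    uint32_t y_lo = (uint32_t)y;
    /* Because x and y are sign-extended ints, the high words are
       either all zeros or all ones. */
    uint32_t x_hi = x < 0 ? 0xFFFFFFFFu : 0u;
    uint32_t y_hi = y < 0 ? 0xFFFFFFFFu : 0u;

    uint64_t low = (uint64_t)x_lo * y_lo;        /* unsigned 32x32->64 */
    uint32_t cross = x_hi * y_lo + x_lo * y_hi;  /* low 32 bits only   */
    uint32_t lo = (uint32_t)low;
    uint32_t hi = (uint32_t)(low >> 32) + cross;
    return (lo >> REAL_RADIX_2) | (hi << (32 - REAL_RADIX_2));
}

/* Both strategies should agree for any pair of ints. */
void check_same(int32_t x, int32_t y)
{
    assert(real_mul_2_single(x, y) == real_mul_2_three(x, y));
}
```

Calling check_same() with any pair of int values should pass, which is why
I would expect a single mult to be enough for this source.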

And as before, the compile options are "-march=mips32r2 -O3".

Thanks.

-- 
Best Regards.

Follow-Ups:
- Re: GCC4.3.4 downside against GCC3.4.4 on mips?
  - From: Paolo Bonzini

References:
- GCC4.3.4 downside against GCC3.4.4 on mips?
  - From: Amker.Cheng
- Re: GCC4.3.4 downside against GCC3.4.4 on mips?
  - From: Václav Haisman
