This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: GCC4.3.4 downside against GCC3.4.4 on mips?


> Posting some random numbers without a test-case and precise command line
> parameters for both compilers makes the numbers useless, IMHO. You also
> only mention instruction counts. Have you actually benchmarked the
> resulting code? CPUs are complicated and what you might perceive as worse
> code might actually be superior thanks to scheduling and internal CPU
> parallelism etc.

Thanks for the reminder.
After some investigation, I can demonstrate the issue with the following
piece of code:
-------------------------------------begin here-------------------
extern int *p[5];

# define REAL_RADIX_2            24
# define REAL_MUL_2(x, y)        (((long long)(x) * (long long)(y)) >> REAL_RADIX_2)


void func(int *b1, int *b2)
{
  int c0 = p[3][0];
  int c1 = p[3][1];

  b2[0x18] = b1[0x18] + b1[0x1B];
  b2[0x1B] = REAL_MUL_2((b1[0x18] - b1[0x1B]) , c0);

  b2[0x19] = b1[0x19] + b1[0x1A];
  b2[0x1A] = REAL_MUL_2((b1[0x19] - b1[0x1A]) , c1);

  b2[0x1C] = b1[0x1C] + b1[0x1F];
  b2[0x1F] = REAL_MUL_2((b1[0x1F] - b1[0x1C]) , c0);

  b2[0x1D] = b1[0x1D] + b1[0x1E];
  b2[0x1E] = REAL_MUL_2((b1[0x1E] - b1[0x1D]) , c1);
}
-------------------------------------cut here-------------------

It seems GCC 4.3.4 always expands the long long multiplication into
three 32-bit multiply insns (mult, madd and multu), like
-------------------------------------begin here-------------------
#  b2[0x1A] = REAL_MUL_2((b1[0x19] - b1[0x1A]) , c1);

	lw	$6,104($4)
	lw	$2,100($4)
	subu	$2,$2,$6
	mult	$11,$2
	sra	$6,$2,31
	madd	$6,$9
	mflo	$6
	multu	$2,$9
	mfhi	$3
	addu	$3,$6,$3
	sll	$6,$3,8
	mflo	$2
	srl	$7,$2,24
	or	$7,$6,$7
	sw	$7,104($5)
-------------------------------------cut here-------------------

while GCC 3.4.4 treats the long long multiplication just like a simple
one and generates only one mult insn for each statement, like
-------------------------------------begin here-------------------
#  b2[0x1A] = REAL_MUL_2((b1[0x19] - b1[0x1A]) , c1);

	lw	$2,100($4)
	lw	$7,104($4)
	subu	$3,$2,$7
	mult	$3,$9
	mflo	$6
	mfhi	$25
	srl	$15,$6,24
	sll	$24,$25,8
	or	$14,$15,$24
	sw	$14,104($5)
-------------------------------------cut here-------------------
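For reference, the three-multiply sequence in the GCC 4.3.4 output looks like the generic decomposition of a 64x64 -> 64 multiplication into 32-bit pieces. The sketch below is my reading of that sequence, not something taken from the compiler sources. The name mul64_generic is hypothetical, and it assumes that $11 holds the sign word of c1.

```c
#include <stdint.h>

/* Hypothetical helper: low 64 bits of a 64x64 product, built from
 * 32-bit halves the way the three-multiply sequence appears to do it. */
uint64_t mul64_generic(int64_t a, int64_t b)
{
    uint32_t a_lo = (uint32_t)a;
    uint32_t a_hi = (uint32_t)((uint64_t)a >> 32);
    uint32_t b_lo = (uint32_t)b;
    uint32_t b_hi = (uint32_t)((uint64_t)b >> 32);

    /* multu: full unsigned 32x32 -> 64 product of the low halves */
    uint64_t low = (uint64_t)a_lo * b_lo;

    /* mult + madd: the two cross terms; only their low 32 bits
     * contribute to the 64-bit result, so wrap-around is fine */
    uint32_t cross = a_hi * b_lo + a_lo * b_hi;

    /* mfhi + addu: add the cross terms into the high word */
    return low + ((uint64_t)cross << 32);
}
```

When a and b are sign-extended ints, a_hi and b_hi are only copies of the sign bit, which is why the cross terms can be folded into a single signed multiply.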

As I understand it, three mult insns are not needed to implement this
long long multiplication, since both operands are converted from int.
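To illustrate the point, here is a minimal sketch that compares the macro expression with a single signed 32x32 -> 64 multiply, followed by the same hi/lo recombination that the GCC 3.4.4 output uses. The function names real_mul_ref and real_mul_single are hypothetical. The sketch assumes a 32-bit int, an arithmetic right shift for negative values, and wrap-around on narrowing conversions, all of which hold for GCC.

```c
#include <stdint.h>

#define REAL_RADIX_2 24

/* Reference: the expression as written in the REAL_MUL_2 macro. */
int real_mul_ref(int x, int y)
{
    return (int)(((long long)x * (long long)y) >> REAL_RADIX_2);
}

/* One widening signed multiply, then hi/lo recombined with shifts. */
int real_mul_single(int32_t x, int32_t y)
{
    int64_t prod = (int64_t)x * y;                   /* mult */
    uint32_t lo = (uint32_t)prod;                    /* mflo */
    uint32_t hi = (uint32_t)((uint64_t)prod >> 32);  /* mfhi */

    /* srl / sll / or: bits 24..55 of the product */
    return (int32_t)((lo >> REAL_RADIX_2) | (hi << (32 - REAL_RADIX_2)));
}
```

Both functions should agree for every pair of int inputs, because the product of two 32-bit values always fits in 64 bits.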

As before, the compile options are "-march=mips32r2 -O3".

Thanks.

-- 
Best Regards.

