This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



PowerPC suboptimal "add with carry" optimization


Noticed that gcc 4.3.4 doesn't optimize "add with carry" properly:

static u32
add32carry(u32 sum, u32 x)
{
  u32 z = sum + x;
  if (sum + x < x)
    z++;
  return z;
}
Becomes:
add32carry:
	add 3,3,4
	subfc 0,4,3
	subfe 0,0,0
	subfc 0,0,3
	mr 3,0
Instead of:
	addc 3,3,4
	addze 3,3

This slows down the Internet checksum significantly.
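
For comparison, here is a minimal sketch of the same end-around-carry add written with a 64-bit intermediate, which makes the carry explicit as the upper half of the sum. The name add32carry_wide and the uint32_t typedef for u32 are assumptions for illustration, not from the original code, and nothing here claims that gcc 4.3.4 generates addc/addze for this form.

```c
#include <stdint.h>

typedef uint32_t u32;   /* assumed definition of u32 */

/* The form from the post: detect the carry by comparing against x. */
static u32 add32carry(u32 sum, u32 x)
{
    u32 z = sum + x;
    if (z < x)          /* unsigned wrap-around means a carry occurred */
        z++;
    return z;
}

/* Hypothetical alternative: compute the sum in 64 bits and add the
 * carry (bit 32) back into the low word. */
static u32 add32carry_wide(u32 sum, u32 x)
{
    uint64_t wide = (uint64_t)sum + x;
    return (u32)wide + (u32)(wide >> 32);
}
```

Both functions return the same value for every input: when the carry is set, the low word is at most 0xFFFFFFFE, so adding the carry back cannot overflow again.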

Also, doing this in a loop can be optimized further. The loop below could become the assembly that follows it:

for(;len; --len)
   sum = add32carry(sum, *++buf);


	addic 3, 3, 0 /* clear carry */
.L31:
	lwzu 0,4(9)
	adde 3, 3, 0 /* add with carry */
	bdnz .L31

	addze 3, 3 /* add in final carry */
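
The idea in the loop above, letting carries ride along and folding them in once at the end, can be sketched in portable C roughly as follows. This is only an illustration under assumptions: the function names are hypothetical, u32 is assumed to be uint32_t, the buffer is indexed from its first word rather than with the pre-increment used above, and the deferred version assumes fewer than 2^32 words so the 64-bit accumulator cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;   /* assumed definition of u32 */

/* Reference: fold the carry back in after every word, as add32carry does. */
static u32 sum_words_stepwise(const u32 *buf, size_t len)
{
    u32 sum = 0;
    for (size_t i = 0; i < len; i++) {
        u32 z = sum + buf[i];
        if (z < buf[i])     /* wrap-around means a carry occurred */
            z++;
        sum = z;
    }
    return sum;
}

/* Deferred: collect carries in the upper half of a 64-bit accumulator
 * and fold them into the low word only after the loop. */
static u32 sum_words_deferred(const u32 *buf, size_t len)
{
    uint64_t acc = 0;
    for (size_t i = 0; i < len; i++)
        acc += buf[i];
    acc = (acc & 0xffffffffu) + (acc >> 32);   /* fold carries in */
    acc = (acc & 0xffffffffu) + (acc >> 32);   /* fold the carry of the fold */
    return (u32)acc;
}
```

For the words 0xFFFFFFFF, 1, 0x80000000, 0x80000000, 5 both functions return 7. Whether a given compiler turns either form into an adde loop is a separate question that this sketch does not answer.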

