This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: 2.95, x86: severe performance problems with short arithmetic


John Wehle wrote:
> > Looks like it would suffice to rip out the entire if block - but there
> > was a reason it was put there in the first place, right?
> 
> ix86 word instructions generally require a prefix which makes the
> resulting instruction larger and complicates decoding.  ix86 word
> instructions also can cause partial register stalls on P6 class
> processors (as you have apparently noticed :-).  The various
> ix86 patterns in i386.md attempt to avoid word instructions
> when possible so as to hopefully produce better code; however, at
> times they just make the situation worse.  Consider the following:
> 
>   1) A 16 bit write to a register immediately followed by a 32 bit read
>      of the register.  This will cause a stall.  Converting the 16 bit
>      write to a 32 bit write avoids the stall, however this may cause a
>      stall with an earlier instruction (if I recall correctly).

Does this include something like this?

	movzbw %dl, %ax
	addl %eax, %esi

It certainly sounds like it, and would explain why such a tiny
difference turns into a 2x performance loss.
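
For illustration only, here is a minimal C sketch of the kind of source
that could plausibly lead to a 16-bit write followed by a 32-bit read of
the same register.  The function name sum_bytes and its shape are
hypothetical and not taken from the code under discussion, and I have
not checked that 2.95 actually emits the movzbw/addl pair for it.

```c
/* Hypothetical example: the name and structure are assumptions made
   for illustration, not the code from the original report. */
int sum_bytes(const unsigned char *buf, int n)
{
    int sum = 0;
    int i;

    for (i = 0; i < n; i++) {
        /* The byte is first widened to a 16-bit value ... */
        unsigned short w = buf[i];
        /* ... and then added into a 32-bit accumulator, so the
           compiler may read the full register right after writing
           only its low 16 bits. */
        sum += w;
    }
    return sum;
}
```

Declaring w as unsigned int instead would let the widening be a plain
32-bit zero extension, which is the sort of change item 1) describes.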

...
> Really the code needs to do more analysis and scheduling to properly handle
> the issue of avoiding the prefix and partial register stalls.  
> Any takers? :-)

It appears to have already been done, in the new_ia32_branch.  I was
hoping for a quick fix that could go into 2.95.1, but I can wait for
2.96 or 3.0 or whatever.

zw

