This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: 2.95, x86: severe performance problems with short arithmetic
- To: John Wehle <john@feith.com>
- Subject: Re: 2.95, x86: severe performance problems with short arithmetic
- From: Jeffrey A Law <law@cygnus.com>
- Date: Tue, 10 Aug 1999 23:00:53 -0600
- cc: zack@bitmover.com, gcc@gcc.gnu.org
- Reply-To: law@cygnus.com
In message <199908101943.PAA01276@jwlab.FEITH.COM> you write:
> processors (as you have apparently noticed :-). The various
> ix86 patterns in i386.md attempt to avoid word instructions
> when possible in the hope of producing better code; however, at
> times they just make the situation worse. Consider the following:
>
> 1) A 16-bit write to a register immediately followed by a 32-bit read
> of the register. This will cause a stall. Converting the 16-bit
> write to a 32-bit write avoids the stall; however, this may cause a
> stall with an earlier instruction (if I recall correctly).
>
> 2) A 16-bit write to a register, followed by several other instructions
> which don't reference the register, followed by a 32-bit read of the
> register. This will not stall.
>
> Really the code needs to do more analysis and scheduling to properly handle
> the issue of avoiding the prefix and partial register stalls. Any takers?
> :-)
Good summary. I think the key thing to remember is that the
transformations which promote an operation from 8/16 bits to 32 bits will
sometimes generate slower code and sometimes generate faster code.
Thus, just ripping the code out is the wrong approach. While it will help
Zack's problem, it is just as likely to cause someone else's example to crawl.
Instead we need to do more analysis to determine when it is profitable to
promote from 8/16-bit operations to 32-bit operations.
jeff
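
For readers following the thread, here is a minimal C sketch of the kind of source that raises the question. It is not taken from the discussion: the function names `sum_short` and `sum_promoted` are hypothetical, and which instructions a given GCC version actually emits for either one depends on the target and flags. The first function keeps a 16-bit accumulator, which a compiler may implement with 16-bit register writes whose result is later read as a 32-bit value; the second does the arithmetic at `int` width and narrows once at the end, which is the source-level analogue of the promotion being debated.

```c
/* Hypothetical illustration only; not code from the GCC sources. */

/* 16-bit accumulator: each update may become a 16-bit register write.
   If the result is then used as an int, that is a 32-bit read of the
   same register, the pattern described in case 1) above. */
short sum_short(const short *v, int n)
{
    short acc = 0;
    int i;

    for (i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

/* Same computation carried out at int width and narrowed once.
   Assumption: the int sum does not overflow. The final conversion to
   short is implementation-defined for out-of-range values; GCC reduces
   it modulo 2^16, which matches the wrap-around of sum_short. */
short sum_promoted(const short *v, int n)
{
    int acc = 0;
    int i;

    for (i = 0; i < n; i++)
        acc += v[i];
    return (short)acc;
}
```

Comparing the assembly from `gcc -O2 -S` for the two functions is one way to see whether a particular compiler version promotes the 16-bit form on its own, and timing both on the target processor shows whether promotion helps or hurts there.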