This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/18019] [4.0 Regression] -march=pentium4 generates word fetch instead of byte fetch


------- Additional Comments From roger at eyesopen dot com  2004-11-09 18:10 -------
I believe that the bug is a latent problem that was only exposed by my patch.
The *movqi_1 (and *movhi_1) pattern(s) have two attributes, "type" and "mode".
The mode attribute indicates the RTL machine mode the move should be done in,
QImode or SImode, and the type attribute is either IMOVX or IMOV, indicating a
move with or without extension respectively.  My patch tweaked the type
attribute so that IMOV is chosen when optimizing for size, as it is smaller on
x86 than IMOVX.  The independent underlying problem is that, on partial
register stall machines, the mode is set to SImode even when the alignment is
unknown, which is incorrect.

IMOVX  SImode   -> movzbl
IMOVX  QImode   -> movzbl
IMOV   SImode   -> movl
IMOV   QImode   -> movb

This issue was hidden prior to my patch because, fortuitously, even if the mode
was incorrectly SImode, an IMOVX instruction would only read a single byte.
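
To make the interaction concrete, the table can be modelled as a small lookup
together with the number of bytes each instruction fetches from a memory
source.  This is only an illustrative sketch, not GCC source; the names
MNEMONIC, BYTES_READ and select_insn are hypothetical.

```python
# Illustrative model of the type/mode table; not GCC source.
# Keys are (type attribute, mode attribute); values are the x86 mnemonic.
MNEMONIC = {
    ("IMOVX", "SImode"): "movzbl",
    ("IMOVX", "QImode"): "movzbl",
    ("IMOV", "SImode"): "movl",
    ("IMOV", "QImode"): "movb",
}

# Bytes fetched from a memory source operand by each instruction.
BYTES_READ = {"movzbl": 1, "movb": 1, "movl": 4}


def select_insn(type_attr, mode_attr):
    """Return (mnemonic, bytes read from memory) for a byte-sized load."""
    mnemonic = MNEMONIC[(type_attr, mode_attr)]
    return mnemonic, BYTES_READ[mnemonic]


# Print every combination so the one wide fetch stands out.
for key in MNEMONIC:
    print(key, "->", select_insn(*key))
```

In this model only the IMOV/SImode combination reads more than one byte, which
would explain why the wrong mode went unnoticed while IMOVX was still being
selected.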

The issue appears to be when to use SImode and when QImode.  Clearly, partial
register stall machines get a performance benefit from using SImode, but it
looks like there's no attempt to check the alignment of the access when making
this optimization.
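
As a rough sketch of what an alignment-aware choice might look like (this is
not how GCC is written; choose_mode and its parameters are hypothetical, and
it assumes that a proven 4-byte alignment is enough to make the wider read
harmless):

```python
def choose_mode(partial_reg_stall, known_align_bytes):
    """Pick the mode for a byte load.

    partial_reg_stall: True if the target suffers partial register stalls.
    known_align_bytes: proven alignment of the address in bytes,
                       or None when nothing is known (assumed convention).
    """
    # Only widen to SImode when the target benefits from it AND the
    # address is known to be at least word aligned.
    if (partial_reg_stall
            and known_align_bytes is not None
            and known_align_bytes >= 4):
        return "SImode"
    # Otherwise stay with the narrow, always-safe byte access.
    return "QImode"
```

With unknown alignment the function falls back to QImode, so the byte fetch is
kept whichever type attribute is selected.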

I'm happy to revert my patch; clearly a minor code size tweak is much less
important than a correctness issue.  But I'm trying to convince myself that
reverting it isn't just papering over the more fundamental problem in mode
selection.

Any help/advice/opinions would be greatly appreciated.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18019

