This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/49263] SH Target: underutilized "TST #imm, R0" instruction


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263

--- Comment #7 from Oleg Endo <oleg.endo@t-online.de> 2011-10-09 23:34:45 UTC ---
(In reply to comment #6)

> Yep, maintenance burden but I don't mean ack/nak for anything.
> If it's enough fruitful, we should take that route.  When it
> gives 5% improvement in the usual working set like as CSiBE,
> hundreds lines would be OK, but if it's ~0.5% or less, it doesn't
> look worth to add many patterns for that.
> 
> > Isn't there a way to tell the combine pass not to do so, but instead first look
> > deeper at what is in the MD?
> 
> I don't know how to do it cleanly.
> 

I've tried out a couple of things and got some CSiBE numbers based on 
trunk rev 179430. Unfortunately, these are only code size comparisons; 
I have no run time performance numbers. The tests were compiled with 
-ml -m4-single -Os -mnomacsave -mpretend-cmove -mfused-madd -freg-struct-return

Option 1)
  Use many (~10) patterns in the MD and some cost calculation tuning.
  The last patch required some adaptation, because the combine pass 
  started trying to match things slightly differently. I've also 
  noticed that it requires a special case for one pattern on SH4A...

  size of all modules: 2916390 -> 2909026    -7364 / -0.252504 %
  avg difference over all modules: -409.111111 / -0.273887 %
  max: compiler       22808 -> 22812           +4 / +0.017538 %
  min: libpng-1.2.5   99120 -> 97804        -1316 / -1.327684 %
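
For anyone who wants to re-check the totals, the delta and percentage columns are plain arithmetic on the before/after sizes. Below is a minimal sketch in C; the function names are made up for illustration and are not part of CSiBE or of any patch. It only reproduces the per-row numbers. The "avg difference" percentage is an average over the individual modules and cannot be derived from the totals alone.

```c
#include <stdio.h>

/* Hypothetical helper: absolute size change in bytes.
   A negative result means the code got smaller. */
long size_delta(long before, long after)
{
    return after - before;
}

/* Hypothetical helper: size change relative to the original size,
   expressed in percent. */
double size_delta_percent(long before, long after)
{
    return 100.0 * (double)(after - before) / (double)before;
}

/* Prints one row in roughly the layout used in the listings above. */
void print_size_change(const char *name, long before, long after)
{
    printf("%-16s %ld -> %ld   %+ld / %+f %%\n", name, before, after,
           size_delta(before, after), size_delta_percent(before, after));
}
```

Calling print_size_change("libpng-1.2.5", 99120, 97804) should give the "min" row of Option 1, up to column spacing.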

Option 2)
  I've added another combine pass that has the make_compound_operation
  function turned off. The make_compound_operation function is used to
  produce zero_extract patterns. In the regular combine pass, if the
  resulting "simplified" pattern does not match anything in the MD,
  combine reverts the transformation and proceeds with the next insn,
  so it never tries to match the tst #imm pattern in the MD.
  With this option only ~5 patterns seem to be required, plus a small
  extension of the cost calculation.

  size of all modules:  2916390 -> 2909170    -7220 / -0.247566 %
  avg difference over all modules: -401.111111 / -0.254423 %
  max: compiler       22808 -> 22812           +4 / +0.017538 %
  min: libpng-1.2.5   99120 -> 97836        -1284 / -1.295400 %
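
To illustrate why the zero_extract form gets in the way: both functions below compute the same predicate, but only the first one has the "and with a constant mask, compare against zero" shape that maps directly onto tst #imm, R0 (which takes an 8-bit zero-extended immediate). This is just a C model of the two shapes, assuming little-endian bit numbering and pos + len <= 32 with len < 32. It is not actual RTL, and the function names are invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mask covering 'len' contiguous bits starting at
   bit 'pos' (bit 0 = least significant). Assumes len < 32. */
uint32_t field_mask(unsigned pos, unsigned len)
{
    return ((UINT32_C(1) << len) - 1u) << pos;
}

/* Shape A: plain AND with a constant mask, then test for zero.
   This is the shape a "tst #imm, R0" pattern would want to see. */
bool bits_clear_and(uint32_t x, uint32_t mask)
{
    return (x & mask) == 0;
}

/* Shape B: a C model of the zero_extract form. The bit field is
   shifted down and masked first, then compared against zero. */
bool bits_clear_extract(uint32_t x, unsigned pos, unsigned len)
{
    uint32_t field = (x >> pos) & ((UINT32_C(1) << len) - 1u);
    return field == 0;
}
```

For a contiguous field, bits_clear_extract(x, pos, len) and bits_clear_and(x, field_mask(pos, len)) always agree, so nothing is lost semantically. The trouble is only that the MD patterns have to be written against whichever shape combine decides to present.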

Not so spectacular on average. The gain depends heavily on the type of
software being compiled, but the change does affect quite a lot of files
in CSiBE.

Option 2 seems more robust, even if it is slightly less effective. What do
you think?

