This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/49263] SH Target: underutilized "TST #imm, R0" instruction


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263

Oleg Endo <oleg.endo@t-online.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #24412|0                           |1
        is obsolete|                            |

--- Comment #11 from Oleg Endo <oleg.endo@t-online.de> 2011-10-13 22:54:54 UTC ---
Created attachment 25491
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25491
Proposed patch including test case


> > 3) only zero_extract special cases
> 
> looks to be dominant.

Yes.  I've briefly looked through the test sources.  A popular use case
for bit test instructions seems to be single bit tests, which is basically
what the patch adds.
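
For illustration, single bit tests of this kind might look like the
following in C.  This is only a sketch: the function names are made up and
are not taken from the test case in the patch, and whether a given form
actually ends up as "TST #imm, R0" depends on which patterns the combine
pass manages to match.

```c
/* Hypothetical examples of single bit tests; not from the patch.  */

/* Returns 1 if bit 3 of X is set, written as a mask compare.  */
int
bit3_is_set (unsigned int x)
{
  return (x & (1u << 3)) != 0;
}

/* Returns 1 if bit 3 of X is clear.  */
int
bit3_is_clear (unsigned int x)
{
  return (x & (1u << 3)) == 0;
}

/* The same bit, written as shift and mask.  This is the kind of
   source form that may be turned into a bit extraction (an
   assumption here, it depends on the target and the pass).  */
int
bit3_extract (unsigned int x)
{
  return (x >> 3) & 1;
}
```

The mask for bit 3 is 8, which fits into the 8-bit immediate of the TST
instruction.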

> I see.  I also expect that the experts have some idea for
> this issue.

Hm .. http://gcc.gnu.org/ml/gcc/2011-10/msg00189.html
Eric pointed me to the i386 back-end.  Unfortunately, what I found there is
where I originally started...

;; Combine likes to form bit extractions for some tests.  Humor it.

I.e. it is also coded against the behavior of the combine pass with a bunch
of pattern variations.  I guess that's the way it's supposed to be done then :T

> I don't think that it's too much.  Those numbers can be easily
> collected for CSiBE.  If your patterns are named, you could
> simply add "-dap -save-temps" to the compiler options that are
> specified when running CSiBE's create-config and then get
> the occurrences of testsi_6, for example, with something like
>   grep "testsi_6" `find . -name "*.s" -print` | wc -l
> after running the CSiBE size test.

Ah, right!  With the attached latest patch applied to trunk rev 179778
the numbers for
  "-ml -m4-single -Os -mnomacsave -mpretend-cmove -mfused-madd -freg-struct-return"
look something like this:

tstsi_t: 1391
tsthi_t: 4
tstqi_t: 23
tstqi_t_zero: 667
tstsi_t_and_not: 598
tstsi_t_zero_extract_eq: 70
tstsi_t_zero_extract_xor: 923

Notice that the split contributes to the tstsi_t number.
Also, the 3 patterns 
  tstsi_t_zero_extract_xor
  tstsi_t_zero_extract_subreg_xor_little
  tstsi_t_zero_extract_subreg_xor_big

are basically one and the same. On SH4A the subreg variants are required,
because tstsi_t_zero_extract_xor will never match.

I've also added a special case to sh_rtx_costs to detect at least the tstsi_t
pattern.  However, the other patterns are not really covered by that, and the
combine pass calculates their cost as the sum of all the operations in the
pattern.  I guess the selection of the test instruction could be encouraged a
bit more by a more accurate cost calculation, but my feeling is that it won't
change a lot.
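
To illustrate the difference between a summed cost and a special-cased
cost, here is a toy model in C.  It is not GCC code and not what
sh_rtx_costs does: the expression tree, the names (toy_expr, toy_cost_sum,
toy_cost_special) and the unit costs are all made up for this sketch.

```c
#include <stddef.h>

/* Toy expression tree; NOT GCC's rtx.  All names are hypothetical.  */
enum toy_code { TOY_REG, TOY_CONST, TOY_AND, TOY_EQ };

struct toy_expr
{
  enum toy_code code;
  long value;                   /* Only meaningful for TOY_CONST.  */
  const struct toy_expr *op0;
  const struct toy_expr *op1;
};

/* Naive cost: one unit per operation, operand costs summed recursively.
   Registers and constants are treated as free.  */
int
toy_cost_sum (const struct toy_expr *e)
{
  if (e == NULL || e->code == TOY_REG || e->code == TOY_CONST)
    return 0;
  return 1 + toy_cost_sum (e->op0) + toy_cost_sum (e->op1);
}

/* Returns nonzero if E has the shape (eq (and ...) (const 0)).  */
static int
toy_is_tst (const struct toy_expr *e)
{
  return e != NULL
         && e->code == TOY_EQ
         && e->op0 != NULL && e->op0->code == TOY_AND
         && e->op1 != NULL && e->op1->code == TOY_CONST
         && e->op1->value == 0;
}

/* Special-cased cost: the whole test shape counts as one instruction,
   everything else falls back to the summed cost.  */
int
toy_cost_special (const struct toy_expr *e)
{
  if (toy_is_tst (e))
    return 1;
  return toy_cost_sum (e);
}
```

In this model the summed cost charges the compare and the AND separately,
while the special case charges the whole test as a single unit.  Only
shapes that are explicitly recognized get the cheaper cost.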


Cheers,
Oleg

