This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug target/49263] SH Target: underutilized "TST #imm, R0" instruction


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263

--- Comment #5 from Oleg Endo <oleg.endo@t-online.de> 2011-06-26 22:30:05 UTC ---
> Yes, that peephole doesn't catch all the patterns which could
> make tst #imm8,r0 use.  Perhaps it would be a good idea to get
> numbers for the test like CSiBE test with the vanilla and new
> insns/peepholes patched compilers.  Something covers 80% of
> the possible cases in the usual working set, it would be enough
> successful for such a micro-optimization, I guess.

I'd also like to see some numbers on those micro-improvements. I'll have a look
at CSiBE.
Anyway, why not just add all the currently known-to-work cases? What are your
concerns regarding that? I can imagine that it is a maintenance burden to keep
all those definitions and special cases in the MD up-to-date (bit rot etc). Do
you have anything other than that in mind? 

It seems that others have been trying to solve the same problem in a very
similar way: http://patchwork.ozlabs.org/patch/58832/ ;)

I've figured that the following pattern works quite well for this particular
case. It seems to catch all the bit patterns. (sh_tst_const_ok and
sh_zero_extract_val are added functions in sh.c)

(define_insn_and_split "*tstsi3"
  [(set (reg:SI T_REG)
    (zero_extract:SI (match_operand:SI 0 "arith_reg_operand" "")
            (match_operand:SI 1 "const_int_operand")
            (match_operand:SI 2 "const_int_operand")))]
  "TARGET_SH1 && sh_tst_const_ok (sh_zero_extract_val (operands[1],
operands[2]))"
  "#"
  "&& 1"
  [(const_int 0)]
  "{
    int tstval = sh_zero_extract_val (operands[1], operands[2]);
    emit_insn (gen_tstsi (operands[0], GEN_INT (tstval)));
    DONE;
  }"
)

...however, the problem with that one is that the T bit is inverted afterwards,
and thus the following branch logic (or whatever) gets inverted as well. One
option could be to emit some T bit inversion insn after gen_tstsi and then
optimize it away later on with e.g. a peephole (?), but I haven't tried that
yet.

The actual "problem" is that the combine pass first sees the andsi, eq, ...
insns and correctly matches them to those in the MD. But instead of continuing
to match patterns from the MD it expands the and insn into something like
zero_extract, which in turn will never be combined back into the tst insn.
Isn't there a way to tell the combine pass not to do so, but instead first look
deeper at what is in the MD?


Regarding the peephole:

> (satisfies_constraint_I08 (operands[1])
>       || satisfies_constraint_K08 (operands[1])

I guess this might generate wrong code for e.g. "if (x & -2)". When x has any
bits[31:1] set this must return true. The code after the peephole optimization
will look only at the lower 8 bits and would possibly return false for x =
0xFF000000, which is wrong. So it should be satisfies_constraint_K08 only,
shouldn't it?


> 
> Cost patch looks fine to me.  Could you propose it as a separate
> patch on gcc-patches list with an appropriate ChangeLog entry?
> When proposing it, please refer how you've tested it.  Also
> the numbers got with the patch are highly welcome.
> 
> BTW, do you have FSF copyright assignment for your GCC work?
> Although the cost patch itself is essentially several lines which
> doesn't require copyright assignment, the other changes you've
> proposed clearly require the paper work, I think.

I'll contact you directly regarding that.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]