[PATCH] Fix PR target/28946
Richard Earnshaw
rearnsha@arm.com
Tue Sep 5 14:32:00 GMT 2006
On Tue, 2006-09-05 at 14:20, Roger Sayle wrote:
> On Tue, 5 Sep 2006, Uros Bizjak wrote:
> > 2006-09-06 Uros Bizjak <uros@kss-loka.si>
> >
> > PR target/28946
> > * combine.c (try_combine): Force PARALLEL of comparison and
> > arithmetic insn even if arithmetic result is not used.
> >
> > * gcc.target/i386/pr28946.c: New test.
>
> I was going to point out that a generic change to combine like this
> really needs more testing than C & C++ on x86, especially during
> stage 3. However, from your latest comments in the bugzilla PR it
> looks like you've already discovered an issue with the use of "and"
> vs "test".
>
> Taking a small step backwards, perhaps we're missing a completely
> different optimization here... if (((unsigned)x >> 5) != 0) could
> probably be better expanded/transformed as if ((unsigned) x >= 32),
> especially on pentium-4s where the cost of a shift is significant.
>
> A less intrusive patch/workaround for the 4.0 and 4.1 branches might
> be to add a peephole2 to recognize the "shrl $foo, reg; testl reg, reg"
> sequence and simplify it. Less than ideal, but unlikely to change
> anything other than the affected code.
>
> However if we stick with a combine solution, this might be one of
> those instances where we need to attempt to recognize the combination
> directly (to catch testl or a mythical shift-compare, ARM?), and if
> that fails, try again with a parallel containing the original SET.

GCC for ARM already generates a near-optimal sequence for this, namely:

fct:
        movs    r0, r0, lsr #5
        beq     .L2
        b       fct1
.L2:
        b       fct2

which uses the following pattern in the MD file

(define_insn "*shiftsi3_compare0_scratch"
  [(set (reg:CC_NOOV CC_REGNUM)
        (compare:CC_NOOV (match_operator:SI 3 "shift_operator"
                          [(match_operand:SI 1 "s_register_operand" "r")
                           (match_operand:SI 2 "arm_rhs_operand" "rM")])
                         (const_int 0)))
   (clobber (match_scratch:SI 0 "=r"))]
  "TARGET_ARM"
  "mov%?s\\t%0, %1%S3"

so I'm not sure why the x86 can't do something similar.

I'm concerned about trying to convert this to a comparison with a
constant. Non-small constants are very expensive to generate in Thumb
state.
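
For reference, the equivalence behind the proposed conversion can be
checked with a small C sketch. The function names here are made up for
illustration and are not taken from the PR's test case; this only shows
that the two source-level forms agree, not what any particular target
should emit.

```c
#include <limits.h>

/* Shift form, as written in the quoted expression. */
static int shift_form(unsigned x) { return (x >> 5) != 0; }

/* Comparison form proposed as the alternative expansion. */
static int compare_form(unsigned x) { return x >= 32u; }

/* Returns 1 if both forms agree on every x in [0, limit) and on a
   few values near the top of the unsigned range, 0 otherwise. */
static int forms_agree(unsigned limit)
{
    const unsigned high[] = { UINT_MAX / 2u + 1u, UINT_MAX - 1u, UINT_MAX };
    unsigned x;
    unsigned i;

    for (x = 0; x < limit; x++)
        if (shift_form(x) != compare_form(x))
            return 0;

    for (i = 0; i < sizeof high / sizeof high[0]; i++)
        if (shift_form(high[i]) != compare_form(high[i]))
            return 0;

    return 1;
}
```

For a general shift count n (less than the width of the type) the
constant in the comparison form is 1u << n, which is where the cost of
materialising the constant comes in.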