This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] Fix PR target/28946
- From: Roger Sayle <roger at eyesopen dot com>
- To: "H. J. Lu" <hjl at lucon dot org>
- Cc: Uros Bizjak <ubizjak at gmail dot com>, <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 5 Sep 2006 10:19:23 -0600 (MDT)
- Subject: Re: [PATCH] Fix PR target/28946
On Tue, 5 Sep 2006, H. J. Lu wrote:
> As I have pointed out in the bug report, some recent processors need
> the extra "testl %eax, %eax" here
> shrl $5, %eax
> testl %eax, %eax
> to avoid partial flag register stall since a shift instruction may
> not set flag register since shift count may be 0.
I think we're converging on a backend solution that should allow the
use or omission of the "testl" (after shifts by constants) to be x86
family specific. For example, when optimizing for size, it should be
reasonable to omit "testl", provided the shift count is always known
to be non-zero. One would hope that the i386 backend never currently
emits a shift with an immediate bit count operand of zero, but stranger
things have been known to happen. Hence, if the solution is a new
pattern like the one the ARM backend has, or a new peephole2, it can be
guarded with "optimize_size || !TARGET_PARTIAL_REG_STALL" (or whatever).
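
As a rough illustration of the guard described above, here is a minimal C sketch. It is not GCC code: the struct, its fields and the function name are hypothetical stand-ins for the backend's tuning state, and the condition simply mirrors "optimize_size || !TARGET_PARTIAL_REG_STALL" plus the requirement that the shift count is a known non-zero constant. The masking with 31 assumes a 32-bit operand, where x86 only uses the low five bits of the count.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the backend's tuning state (not GCC's own types). */
struct tuning {
    bool optimize_size;      /* models optimize_size */
    bool partial_reg_stall;  /* models TARGET_PARTIAL_REG_STALL */
};

/* Returns true when the "testl" after a 32-bit shift could be omitted.
   Assumption: a shift whose effective count is zero leaves the flags
   unchanged, so the flags are only usable when the count is a known
   non-zero constant. */
static bool can_omit_test_after_shift(const struct tuning *t,
                                      bool count_is_constant,
                                      unsigned count)
{
    if (!count_is_constant || (count & 31u) == 0)
        return false;
    return t->optimize_size || !t->partial_reg_stall;
}
```

With this sketch, "shrl $5, %eax" would qualify when optimizing for size or when the target does not stall, while a shift by a variable count never would.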
I'd personally also like to see these shift-comparisons canonicalized to
simple comparisons against constants, but that may require some buy-in
and support from the ARM folks and other potentially affected backends.
Perhaps something for 4.3, and not for fixing this 4.0/4.1 regression.
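
To make the canonicalization idea concrete, here is a small C sketch under the assumption that the comparison in question is an unsigned logical right shift tested against zero, as in the "shrl $5, %eax" sequence quoted above. The function names are made up for illustration. For unsigned x and 0 < n < 32, (x >> n) != 0 holds exactly when x >= (1u << n), so the shift can be replaced by a compare against a constant.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shift-comparison form: is anything left after shifting out n low bits? */
static bool shifted_nonzero(uint32_t x, unsigned n)
{
    return (x >> n) != 0;
}

/* Canonical form: a plain unsigned comparison against the constant 2^n. */
static bool canonical_nonzero(uint32_t x, unsigned n)
{
    return x >= (UINT32_C(1) << n);
}

/* Counts disagreements between the two forms around every power-of-two
   boundary, for shift counts 1..31. */
static unsigned count_mismatches(void)
{
    unsigned bad = 0;
    for (unsigned n = 1; n < 32; n++) {
        uint32_t edge = UINT32_C(1) << n;
        uint32_t samples[] = { 0, 1, edge - 1, edge, edge + 1, UINT32_MAX };
        for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
            if (shifted_nonzero(samples[i], n) != canonical_nonzero(samples[i], n))
                bad++;
    }
    return bad;
}
```

The equivalence does not carry over unchanged to arithmetic shifts of signed values, which is one reason such a change would need checking per backend.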