This is the mail archive of the
mailing list for the GCC project.
Re: Serious performance regression -- some tree optimizer questions
- From: Ulrich Weigand <Ulrich dot Weigand at de dot ibm dot com>
- To: Zdenek Dvorak <rakdver at atrey dot karlin dot mff dot cuni dot cz>
- Cc: Zdenek Dvorak <dvorakz at suse dot de>, gcc at gcc dot gnu dot org, Michael Matz <matz at suse dot de>
- Date: Wed, 12 Jan 2005 19:12:45 +0100
- Subject: Re: Serious performance regression -- some tree optimizer questions
Zdenek Dvorak <email@example.com> wrote on 01/03/2005
> > I've added a hack to get_computation_at to *also* call
> > strip_offset (just like get_computation_cost_at now does
> > when patch #7 is applied), and now I'm getting good code ...
> this is a bit dangerous -- I am fairly sure that strip_offset is
> sometimes wrong. This is OK in the current use (when it is used just
> inside the heuristics to choose the right candidates), but using it
> for code generation would almost surely lead to miscompilations.
> I will try to come up with some solution for the problem.
There was one last IV selection problem w.r.t. the IV used for
the comparison at the end of the loop; may_eliminate_iv did not
allow an unsigned candidate to be used instead of the original
(signed) variable. Therefore, IV selection chose a candidate
incremented at the end of the loop instead of one incremented
before the exit test; this caused a new basic block to be
inserted, which in turn prevented some subsequent optimizations.
The patch #3 from your mail, however, together with a change to
may_eliminate_iv to actually pass the OEP_IGNORE_SIGNEDNESS flag
to operand_equal_p, fixes this problem.
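
To make the signed/unsigned point concrete, here is a rough C sketch of the two loop shapes involved. It is not GCC source and not the mgrid code; the function names sum_signed and sum_unsigned are made up for illustration, and the unsigned form is only equivalent under the stated assumption that the trip count is positive.

```c
#include <stddef.h>

/* Hypothetical source-level loop: the induction variable i is a
   signed int, as in the original (signed) variable in the mail. */
long sum_signed(const int *a, int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* The same loop driven by an unsigned candidate.  Assumption: n > 0
   inside the loop, so the signed and unsigned counters take the same
   values.  The counter is incremented before the exit test, so the
   test and the back edge can share a single block. */
long sum_unsigned(const int *a, int n)
{
    long sum = 0;
    if (n <= 0)
        return 0;                 /* guard replaces the initial i < n check */
    unsigned int u = 0;
    do {
        sum += a[u];
        u++;                      /* increment before the exit test */
    } while (u != (unsigned int) n);
    return sum;
}
```

Both functions return the same value for any valid array and n; the second form is the kind of exit test an unsigned candidate allows once the comparison is permitted to ignore signedness.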
I'm now running current CVS head together with the following patches:
- your patch #3 (introduces OEP_IGNORE_SIGNEDNESS)
- my hack to use OEP_IGNORE_SIGNEDNESS in may_eliminate_iv
- your patch #7 (introduces strip_offset)
- my hack to use strip_offset in get_computation_at
- your latest IV sign-extension patch
This compiler now generates absolutely perfect code for the mgrid
hot spot on 64-bit s390x; I couldn't write better assembler by hand ;-)
Together with Dan Berlin's value handle backsubstitution patch I'm
also getting perfect code on 31-bit s390.
I haven't looked at other test cases yet, but I'd assume the ivopt
changes significantly benefit other code as well ...
Do you think it will be possible to still get some/most of these
changes into 4.0?
Mit freundlichen Gruessen / Best Regards
Dr. Ulrich Weigand
Linux for S/390 Design & Development
IBM Deutschland Entwicklung GmbH, Schoenaicher Str. 220, 71032 Boeblingen
Phone: +49-7031/16-3727 --- Email: Ulrich.Weigand@de.ibm.com