This is the mail archive of the mailing list for the GCC project.

[RFC] Exploiting dual mode operation, implementation.

Following our message (
regarding exploiting dual mode operations, we enclose an overview of the
implementation of the algorithm.

Main entry point: se_elimination()
The optimization is called from passes.c right before the first gcse
pass.

Phase 1.a - prepare for combine
Implemented in se_prepare_for_combine().
In this phase the (redundant) sign extensions are generated
before the uses.
A.  All the uses and defs that are relevant for the optimization
are marked using: mark_all_legal_defs() and mark_all_legal_uses().
B.  All the webs are generated. Only webs whose uses and defs are all
marked as relevant are handled from this point on.
C.  Sign extension instructions are generated before relevant uses and
stored in a splay tree for the following phase using:
D.  The definition's pattern is transformed and the new sign extension
instruction is stored in a splay tree for the combine phase or in a
virtual array for the cleanup phase, according to the definition type,
using: se_gen_defs_se()
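
Step B can be pictured with a small union-find sketch. This is only an
illustration under assumptions, not the code in se.c: the names
webs_init(), web_find(), web_union() and web_is_relevant(), the flat
integer ids for defs and uses, and the MAX_REFS limit are all
hypothetical. Each def and use starts in its own web, a def is joined
with every use it reaches, and a web stays relevant only while every
member is marked relevant.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_REFS 64              /* illustrative limit, not from se.c */

static int  parent[MAX_REFS];    /* union-find parent links */
static bool relevant[MAX_REFS];  /* valid at web roots: all members marked? */

/* Put every def/use (numbered 0..n-1) in its own web and record the
   hypothetical "marked as relevant" flag computed in step A.  */
void webs_init(int n, const bool *marks)
{
    assert(n >= 0 && n <= MAX_REFS);
    for (int i = 0; i < n; i++) {
        parent[i] = i;
        relevant[i] = marks[i];
    }
}

/* Return the representative of the web containing x.  */
int web_find(int x)
{
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];   /* path halving */
        x = parent[x];
    }
    return x;
}

/* Join the web of a def with the web of a use it reaches.  The merged
   web is relevant only if both halves were.  */
void web_union(int def, int use)
{
    int a = web_find(def);
    int b = web_find(use);
    if (a == b)
        return;
    parent[b] = a;
    relevant[a] = relevant[a] && relevant[b];
}

/* True if every def and use in ref's web was marked relevant.  */
bool web_is_relevant(int ref)
{
    return relevant[web_find(ref)];
}
```

With this sketch, a single unmarked use is enough to drop its whole web
from further processing, which matches the rule stated in step B.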

Phases 1.b - combine
       2.a - prepare for PRE
These phases are applied to each definition and each use independently.
At this point definitions are not actually combined. Their combine
information is only used to prepare correctly for PRE.
A.  For each relevant use, the function
se_combine_and_prepare_use_for_PRE() tries to combine one, two or three
sign extension instructions with it using the function se_try_combine().
If the combine succeeds, the sign extension instruction(s) are deleted
and the combined instruction replaces the original use.
B.  For each relevant definition, the function se_prepare_def_for_PRE()
tries to combine the sign extension instruction with it using the
function se_try_combine(). The sign extension is removed only if the
combine fails.
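
As a rough picture of what "combine succeeds" means for a definition,
here is a toy lookup in C. It is not se_try_combine(), whose interface
is not shown in this message, and the real pass works on RTL rather than
on mnemonics. The struct toy_rule table and toy_try_combine() are
hypothetical. The two rules use PowerPC64 mnemonics: a zero-extending
load followed by a sign extension of the same width can be expressed as
one sign-extending ("algebraic") load.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One toy combine rule: def followed by ext equals combined.  */
struct toy_rule {
    const char *def;       /* defining instruction */
    const char *ext;       /* sign extension applied to its result */
    const char *combined;  /* single instruction doing both */
};

static const struct toy_rule toy_rules[] = {
    { "lwz", "extsw", "lwa" },  /* load word, then extend word */
    { "lhz", "extsh", "lha" },  /* load halfword, then extend halfword */
};

/* Return the combined mnemonic, or NULL if no rule matches (the
   "combine fails" case).  */
const char *toy_try_combine(const char *def, const char *ext)
{
    assert(def != NULL && ext != NULL);
    for (size_t i = 0; i < sizeof toy_rules / sizeof toy_rules[0]; i++) {
        if (strcmp(toy_rules[i].def, def) == 0
            && strcmp(toy_rules[i].ext, ext) == 0)
            return toy_rules[i].combined;
    }
    return NULL;
}
```

In the terms used above, a non-NULL result corresponds to a definition
whose sign extension is kept for the later phases, and a NULL result to
one whose sign extension is removed.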

Phase 2.b - PRE
This phase is the regular gcse pass, unchanged.

Phase 3 - definition combine and cleanup
This phase is called from passes.c right after the first pass of gcse
(inside rest_of_handle_gcse()).
Implemented in se_cleanup().
A.  se_combine_def_after_PRE() is called for each definition that was
successfully combined before the PRE. It tries to combine it with the
sign extension instruction again and if it succeeds it replaces the sign
extension instruction with the new combined instruction.
B.  All the dummy sign extension instructions that were inserted during
the optimization are removed.
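
One way to see why an inserted sign extension can be redundant: sign
extension is idempotent, so extending a value that is already sign
extended changes nothing. The C sketch below models a 32-bit to 64-bit
extension on a plain 64-bit register value. The helper names sext32(),
sext32_is_redundant() and check_idempotent() are hypothetical and do not
come from se.c.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 32 bits of a 64-bit register value.  Written
   with unsigned arithmetic to avoid implementation-defined casts.  */
int64_t sext32(uint64_t reg)
{
    uint64_t low = reg & UINT64_C(0xFFFFFFFF);
    return (int64_t)(low ^ UINT64_C(0x80000000)) - INT64_C(0x80000000);
}

/* True if reg already holds a sign-extended 32-bit value, so that an
   extra extension would be a no-op.  */
int sext32_is_redundant(uint64_t reg)
{
    return (uint64_t)sext32(reg) == reg;
}

/* Extending twice gives the same result as extending once.  */
void check_idempotent(uint64_t reg)
{
    assert(sext32((uint64_t)sext32(reg)) == sext32(reg));
}
```

For example, sext32_is_redundant() holds for a register containing -7
as a 64-bit value, but not for one containing 0x80000000 with the upper
32 bits clear.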

There are a couple of problematic issues that should be mentioned:

1. The handling of notes is not optimal. All the notes that may be
   incorrect after the optimization are removed. This could ruin some
   opportunities for other optimizations. A better solution would be to
   update them instead, but this is quite complicated.
2. The optimization uses the df.c module for reaching definitions.
   This analysis is expensive. One of the tests in the testsuite
   (gcc.c-torture/compile/20001226-1.c) fails only because of a timeout
   during compilation.
3. It is hard to check the correctness of the optimization because
   mainline does not pass bootstrap with the -m64 flag.
   If anyone has succeeded in bootstrapping mainline with the -m64 flag,
   please let me know.

Currently, on POWER4 the optimization passes:
1. Regression tests (except one, see above).
2. All the benchmarks that the original mainline passes with the
   -m64 flag.

Comments are welcome.

Attachment: se.c
Description: Binary data

Attachment: changed_files_diff.txt
Description: Text document
