This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] New pass to partition single function into multiple (resubmission)
- From: Ralf Wildenhues <Ralf dot Wildenhues at gmx dot de>
- To: Revital1 Eres <ERES at il dot ibm dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Sun, 7 Jun 2009 09:34:33 +0200
- Subject: Re: [PATCH] New pass to partition single function into multiple (resubmission)
- References: <OF92C32BEE.9C242D7E-ONC22575AD.003A0103-C22575AD.003A28EE@il.ibm.com> <571f6b510905050509oe83494dyc5d06a45175666bf@mail.gmail.com> <OFF8D89692.8B5697D5-ONC22575C9.00270131-C22575C9.004BCF7C@il.ibm.com>
Hello,
a few random typo nits:
* Revital1 Eres wrote on Tue, Jun 02, 2009 at 03:48:03PM CEST:
> --- doc/invoke.texi (revision 148013)
> +++ doc/invoke.texi (working copy)
> @@ -7236,6 +7244,16 @@ You will not be able to use @code{gprof}
> specify this option and you may have problems with debugging if
> you specify both this option and @option{-g}.
>
> +@item -fpartition-functions-into-sections=@var{n}
> +@opindex fpartition-functions-into-sections
> +Partition function into sections according to a threshold which indicates
> +the maximum number of bytes each section can contain. This transformation
> +is aimed for targets with limited local store which may not suffice to
> +hold large functions. To overcome this the function is partitioned into
s/this/& limitation,/
> +several sections which can be swapped while running. This may result
> +in an undefined behavior when using an @code{asm} statement.
s/ an / /
> +@option{-ffunction-sections} flag is implied implicitly when setting this flag.
s/ implicitly / /
> @item -fbranch-target-load-optimize
> @opindex fbranch-target-load-optimize
> Perform branch target register load optimization before prologue / epilogue
> --- bb-reorder.c (revision 148013)
> +++ bb-reorder.c (working copy)
> +/* Information regarding insn.
> + The instruction size could be dynamic (it depends on how the
> + targetm.bb_partitioning.estimate_instruction_size() is implemented).
> + However, the sizes of the instructions are caclulated once and cached
calculated
> + for later use. This should be good enough for the instruction size
> + estimation. */
> +struct insn_aux
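For readers skimming the quoted comment: the "calculated once and cached" idea can be illustrated roughly as below. This is only a sketch under assumptions. The real struct insn_aux is elided in the quote, so `insn_size_cache`, `cached_insn_size`, `size_estimator` and `demo_estimate` are hypothetical names, and instructions are identified by a plain integer index rather than an rtx.

```c
#include <stdbool.h>

/* Hypothetical per-instruction record; the fields of the real
   struct insn_aux are not shown in the quoted patch.  */
struct insn_size_cache
{
  unsigned long size;  /* Cached size estimate in bytes.  */
  bool valid;          /* True once SIZE has been computed.  */
};

/* Hypothetical estimator type standing in for the target hook.  */
typedef unsigned long (*size_estimator) (int insn_id);

/* Counts how often the demo estimator below is called.  */
static int estimate_calls;

/* Made-up estimator used only for illustration.  */
static unsigned long
demo_estimate (int insn_id)
{
  estimate_calls++;
  return 4 + (unsigned long) insn_id;
}

/* Return the size of instruction INSN_ID, calling ESTIMATE only the
   first time the instruction is queried.  */
static unsigned long
cached_insn_size (struct insn_size_cache *cache, int insn_id,
                  size_estimator estimate)
{
  if (!cache[insn_id].valid)
    {
      cache[insn_id].size = estimate (insn_id);
      cache[insn_id].valid = true;
    }
  return cache[insn_id].size;
}
```

Querying the same instruction twice through `cached_insn_size` calls the estimator once. That is the property the comment relies on when it says a possibly dynamic estimate is "good enough" after caching.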
> +static bool
> +start_new_section_for_loop (basic_block bb,
> + unsigned HOST_WIDE_INT last_section_size)
> +{
> + struct loop_info_def *cur_loop;
> + unsigned i;
> +
> + if (fbb_data[bb->index].loops_info_list == NULL)
> + return false;
> +
> + if (last_section_size == 0)
> + return false;
> +
> + /* The loops are sorted in loops_info_list according to the loop size
> + (in bytes); where the loop with the samllest layout appears first.
smallest
> +/* Create sections for the current function. Return the edges that
> + cross between sections in CROSSING_EDGES array which is of size
> + N_CROSSING_EDGES so they could be fixed later.
> +
> + The partitioning is done by traversing the basic-blocks according to
> + their order in the code layout. When a new basic-block is encountered
> + a decision is made whether to add it to the last section, split it
> + or add it to a new section.
> +
> + Here is a short description of the decision process:
> +
> + Start a new section if one of the following conditions exist:
> +
> + a) if adding the basic-block to the last section causes the last
basic block
> + section to exceed the max section size and it's size is less
its
> + than max section size.
> + b) if a loop starts at this basic-block and the loop can not
basic block
> + fully be inserted into the last section and it's size is less
its
> + than max section size.
than the
> + c) the basic block hotness property is different from it's
> + previous basic-block's hotness property.
> + d) there is a machine-specific reasons. (e.g., the
> + basic-block starts a sequence that should reside in a single
> + section; such that its size is less than the section size
> + threshold).
> +
> + If the above conditions are not true split the basic-block
> + if adding the basic-block to the last section causes the last
> + section to exceed the max section size.
> + Otherwise add it to the last section. */
> +
> +static void
> +create_sections (void)
> +{
> + /* Do not add the basic-block to the last section. This is due
basic block
> + to the possible reasons:
, due to one of these reasons:
> + 1) the addition of the basic block to last section will
> + cause the last section to exceed the max section size.
> + 2) the basic-block starts a loop that can not fully be
basic block
cannot
> + inserted to the last section without exceeding the max
s/ to / into /
> + section size and it's size is less than the max section size.
its
> + 3) The hotness properaty of this basic-block is different
property
basic block
> + then it's previous.
than that of the previous
or:
than that of its predecessor
> + 4) There is a machine-specific reasons. */
reason
> + /* Start a new section if one of following conditions
> + are fulfilled:
> + 1) the last section is not empty and the bb size is less than
> + the section size threshold.
> + 2) the basic-block starts a loop that its size is less than
basic block
s/its size/has a size that/
> + the section size threshold.
> + 3) The hotness property of this basic block is different then
> + the previous.
see above
> + 4) There is a machine-specific reasons. */
reason
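As a reading aid for the create_sections comments quoted above, the decision process might be sketched as follows. This is an illustration under assumptions, not the code from the patch: `bb_summary`, `placement` and `decide_placement` are hypothetical names, the machine-specific reason is reduced to a boolean flag, and the estimated section overhead is ignored.

```c
#include <stdbool.h>

/* Hypothetical summary of one basic block; not a structure from the patch.  */
struct bb_summary
{
  unsigned long size;       /* Estimated size of the block in bytes.  */
  unsigned long loop_size;  /* Size of a loop starting here; 0 if none.  */
  bool hot;                 /* Hotness property of the block.  */
  bool target_reason;       /* Stand-in for a machine-specific reason.  */
};

enum placement { ADD_TO_LAST, START_NEW, SPLIT_BLOCK };

/* Decide what to do with BB given the size of the last section
   (LAST_SIZE), the hotness of the previous block (PREV_HOT) and the
   section size threshold (MAX_SIZE).  */
static enum placement
decide_placement (const struct bb_summary *bb, unsigned long last_size,
                  bool prev_hot, unsigned long max_size)
{
  bool bb_overflows = last_size + bb->size > max_size;
  bool loop_overflows = bb->loop_size != 0
                        && last_size + bb->loop_size > max_size;

  /* a) The block does not fit, but fits in a section of its own.  */
  if (last_size != 0 && bb_overflows && bb->size < max_size)
    return START_NEW;
  /* b) A loop starting here does not fit, but fits in its own section.  */
  if (last_size != 0 && loop_overflows && bb->loop_size < max_size)
    return START_NEW;
  /* c) Hotness changes relative to the previous block.  */
  if (bb->hot != prev_hot)
    return START_NEW;
  /* d) Machine-specific reason.  */
  if (bb->target_reason)
    return START_NEW;
  /* No reason for a new section: split only if the block overflows.  */
  if (bb_overflows)
    return SPLIT_BLOCK;
  return ADD_TO_LAST;
}
```

With a threshold of 100 bytes and 80 bytes already in the last section, a 30-byte block starts a new section, while a 150-byte block is split because it cannot fit in any single section.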
> +static void
> +record_loops_boundaries (void)
> +{
> + /* We assume that if two loops are disjoint they can not interleave
> + and if they are not disjoint one is completely contained in the
> + other. This assumption help us to avoid the case of interleaved
helps
s/ to / /
> + loop intervals (which is somewhat unlikely to happen in practice).
> + TODO: calculate all the loops's size in one pass. */
loop sizes
or:
all sizes of all loops
> + FOR_EACH_LOOP (li, loop, 0)
> + {
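The assumption stated in the record_loops_boundaries comment (two loops are either disjoint or one fully contains the other) can be expressed as a small check on layout intervals. The sketch below assumes each loop occupies a half-open byte range in the code layout. `loop_interval` and the helper names are hypothetical and do not appear in the patch.

```c
#include <stdbool.h>

/* Hypothetical half-open interval [start, end) that a loop occupies
   in the code layout, in bytes.  */
struct loop_interval
{
  unsigned long start;
  unsigned long end;
};

/* True if A and B share no bytes.  */
static bool
intervals_disjoint (const struct loop_interval *a,
                    const struct loop_interval *b)
{
  return a->end <= b->start || b->end <= a->start;
}

/* True if INNER lies completely inside OUTER.  */
static bool
interval_contains (const struct loop_interval *outer,
                   const struct loop_interval *inner)
{
  return outer->start <= inner->start && inner->end <= outer->end;
}

/* True if A and B satisfy the assumption: they are disjoint, or one
   is completely contained in the other.  Interleaved intervals fail.  */
static bool
intervals_well_nested (const struct loop_interval *a,
                       const struct loop_interval *b)
{
  return intervals_disjoint (a, b)
         || interval_contains (a, b)
         || interval_contains (b, a);
}
```

Intervals [0, 100) and [20, 40) pass the check (nested), as do [0, 100) and [100, 130) (disjoint). Intervals [0, 100) and [90, 110) interleave and fail, which is the case the comment treats as unlikely in practice.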
> +static void
> +get_estimate_section_overhead (void)
> {
> + /* The machine depndent pass could add extra instructions
dependent
> + as a result of the new branches. */
> + estimate_section_overhead =
> + targetm.bb_partitioning.estimate_section_overhead ();
> --- config/spu/spu.c (revision 148013)
> +++ config/spu/spu.c (working copy)
> +/* Return the size of instruction INSN in bytes. Take into account the
> + size of extra machine depndent instructions that can be added
dependent
> + as a result of insn (like branch-hints for branch instructions).
> + Called when partitioning a function into sections. */
> +static unsigned HOST_WIDE_INT
> +spu_estimate_instruction_size (rtx insn)
> --- gcc.dg/section/prof/section-prof-1.c (revision 0)
> +++ gcc.dg/section/prof/section-prof-1.c (revision 0)
> +/* Should be Vectorized. Fixed misaligment in the inner-loop. */
misalignment
> +/* Not vectorized because we can't determine the inner-loop bound. */
> --- gcc.dg/section/section-size-2.c (revision 0)
> +++ gcc.dg/section/section-size-2.c (revision 0)
> +/* Should be Vectorized. Fixed misaligment in the inner-loop. */
misalignment
> +/* Not vectorized because we can't determine the inner-loop bound. */
Cheers,
Ralf