This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
RE: Turning off unrolling to certain loops
Hello,
I faced a similar issue a while ago. I filed a bug report
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36712) In the end,
I implemented a simple tree-level unrolling pass in our port
which uses all the existing infrastructure. It works quite well for
our purpose, but I hesitated to submit the patch because it contains
our not-very-elegannt #prgama unroll implementation.
The following two functions do the unrolling. I insert the pass
just after the loop_prefetch pass (4.4)
Cheers,
Bingfeng Mei
/* Perfect unrolling of a loop */
static void tree_unroll_perfect_loop (struct loop *loop, unsigned factor,
edge exit)
{
sbitmap wont_exit;
edge e;
bool ok;
unsigned i;
VEC (edge, heap) *to_remove = NULL;
/* Unroll the loop and remove the exits in all iterations except for the
last one. */
wont_exit = sbitmap_alloc (factor);
sbitmap_ones (wont_exit);
RESET_BIT (wont_exit, factor - 1);
ok = gimple_duplicate_loop_to_header_edge
(loop, loop_latch_edge (loop), factor - 1,
wont_exit, exit, &to_remove, DLTHE_FLAG_UPDATE_FREQ);
free (wont_exit);
gcc_assert (ok);
for (i = 0; VEC_iterate (edge, to_remove, i, e); i++)
{
ok = remove_path (e);
gcc_assert (ok);
}
VEC_free (edge, heap, to_remove);
update_ssa (TODO_update_ssa);
#ifdef ENABLE_CHECKING
verify_flow_info ();
verify_dominators (CDI_DOMINATORS);
verify_loop_structure ();
verify_loop_closed_ssa ();
#endif
}
/* Go through all the loops:
1. Determine unrolling factor
2. Unroll loops in different conditions
-- perfect loop: no extra copy of original loop
-- other loops: the original version of loops to execute the remaining iterations
*/
static unsigned int rest_of_tree_unroll (void)
{
loop_iterator li;
struct loop *loop;
unsigned ninsns, unroll_factor;
HOST_WIDE_INT est_niter;
struct tree_niter_desc desc;
bool unrolled = false;
initialize_original_copy_tables ();
/* Scan the loops, inner ones first. */
FOR_EACH_LOOP (li, loop, LI_FROM_INNERMOST)
{
est_niter = estimated_loop_iterations_int (loop, false);
ninsns = tree_num_loop_insns (loop, &eni_size_weights);
unroll_factor = determine_unroll_factor (loop, ninsns, &desc, est_niter);
if (unroll_factor != 1)
{
tree niters = number_of_exit_cond_executions(loop);
bool perfect_unrolling = false;
if(niters != NULL_TREE && niters!= chrec_dont_know && TREE_CODE(niters) == INTEGER_CST){
int num_iters = tree_low_cst(niters, 1);
if((num_iters % unroll_factor) == 0)
perfect_unrolling = true;
}
/* If no. of iterations can be divided by unrolling factor, we have perfect unrolling */
if(perfect_unrolling){
tree_unroll_perfect_loop(loop, unroll_factor, single_dom_exit(loop));
}
else{
tree_unroll_loop (loop, unroll_factor, single_dom_exit (loop), &desc);
}
unrolled = true;
}
}
free_original_copy_tables ();
/* Need to rebuild call graph due if new function calls are created due to
loop unrolling
FIXME: rebuild cgraph will lose some information like reason of not inlining*/
if(unrolled)
rebuild_cgraph_edges();
/*debug_cgraph();*/
return 0;
}
> -----Original Message-----
> From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On
> Behalf Of Jean Christophe Beyler
> Sent: 14 October 2009 19:29
> To: Zdenek Dvorak
> Cc: gcc@gcc.gnu.org
> Subject: Re: Turning off unrolling to certain loops
>
> Ok, I've actually gone a different route. Instead of waiting for the
> middle end to perform this, I've directly modified the parser stage to
> unroll the loop directly there.
>
> Basically, I take the parser of the for and modify how it adds the
> various statements. Telling it to, instead of doing in the
> c_finish_loop :
>
> if (body)
> add_stmt (body);
> if (clab)
> add_stmt (build1 (LABEL_EXPR, void_type_node, clab));
> if (incr)
> add_stmt (incr);
> ...
>
> I tell it to add multiple copies of body and incr and the at the end
> add in the loop the rest of it. I've also added support to remove
> further unrolling to these modified loops and will be handling the
> "No-unroll" pragma. I then let the rest of the optimization passes,
> fuse the incrementations together if possible, etc.
>
> The initial results are quite good and seem to work and
> produce good code.
>
> Currently, there are two possibilities :
>
> - If the loop is not in the form we want, for example:
>
> for (;i<n;)
> {
> ...
> }
>
> Do we still unroll even though we have to trust the user that the
> number of unrolling will not break the semantics ?
>
> To handle this, I am adding warnings that will appear if the loop is
> anything but :
>
> for (i=C1; i < C2; i ++)
> {
> ...
> }
>
> Later on, once this is thoroughly tested, I will allow :
>
> for (i=C1; fct (i, C2); i = fct2 (i))
>
>
> where fct is any comparison function with only i and C2,
> fct2 is a incrementation/decrementation calculation using i.
>
>
> Any comments ? Concerns ? Questions ?
> Thanks in advance,
> Jc
>
> On Thu, Oct 8, 2009 at 12:22 PM, Jean Christophe Beyler
> <jean.christophe.beyler@gmail.com> wrote:
> > Hi,
> >
> >> such an epilogue is needed when the # of iterations is not
> known in the
> >> compile time; it should be fairly easy to modify the
> unrolling not to
> >> emit it when it is not necessary,
> >
> > Agreed, that is why I was surprised to see this in my
> simple example.
> > It seems to me that the whole unrolling process has been made to, on
> > purpose, have this epilogue in place.
> >
> > In the case where the unrolling would be perfect (ie. there would be
> > no epilogue), the calculation of the max bound of the
> unrolled version
> > is always done to have this epilogue (if you have 4
> iterations and ask
> > to unroll twice, it will actually change the max bound to 3,
> > therefore, having one iteration of the unrolled version and 2
> > iterations of the original...). I am currently looking at
> the code of
> > tree_transform_and_unroll_loop to figure out how to change this and
> > not have an epilogue in my cases.
> >
> > Jc
> >
>
>