This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

RE: Issue with _Cilk_for



> -----Original Message-----
> From: Jakub Jelinek [mailto:jakub@redhat.com]
> Sent: Friday, January 31, 2014 11:04 AM
> To: Iyer, Balaji V
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: Issue with _Cilk_for
> 
> On Fri, Jan 31, 2014 at 04:42:57PM +0100, Jakub Jelinek wrote:
> > Can you explain why you emit anything in between the parallel and
> _Cilk_for?
> > I don't see why it should be needed.
> > Depending on what the Cilk+ standard allows you to do (can array.begin
> > () be evaluated multiple times or not, and if not, can you invoke say
> > a copy constructor on it), you should just emit an EXPR_STMT that
> > initializes an artificial scalar (say get_temp_regvar created) to
> > (array.end () - array.begin ()) before you add the parallel, or, if
> > array.begin () can't be evaluated multiple times, then construct some
> > temporary before the parallel with array.begin () as initializer and
> > then do the subtraction between array.end () and that temporary.
> >
> > Then the parallel, with _Cilk_for immediately in it and just another
> > temporary scalar as loop iterator there, and only inside of _Cilk_for
> > declare the iterator var and construct (if it has been declared in
> > _Cilk_for, otherwise just initialize it) to, depending on what Cilk+
> > requires, either to array.begin () and then operator+ it to the
> > corresponding value, or construct already to array.begin () + scalariv.
> 
> Actually, small correction, the iterator variable likely should be constructed
> before the parallel and just marked as firstprivate on the parallel, then you
> even don't need any special temporary.
> 
> So for:
> 
> _Cilk_for (vector<int>::iterator iter = array.begin(); iter != array.end(); iter++)
> {
>   if (*iter == 6)
>     *iter = 13;
> }
> 
> emit basically:
>   {
>     Iterator iter;
>     Iterator::Iterator(&iter, array.begin());
>     ptrdiff_t tmpcount = Iterator::operator-(array.end(), &iter);
>     ptrdiff_t tmpiv;
>     ptrdiff_t tmplast = 0;
>     #pragma omp parallel firstprivate(iter, tmpcount, tmplast) private(tmpiv)
>     {
>       _Cilk_for(tmpiv = 0; tmpiv < tmpcount; tmpiv++)
>       {
>         Iterator::operator+(&iter, tmpiv - tmplast);
>         tmplast = tmpiv;
>         {
>           if (*iter == 6)
>             *iter = 13;
>         }
>       }
>     }
>   }
> 

I tried to do this. The issue is that you had me model it like this:

#pragma omp parallel for 
_Cilk_for (...)
{
	Body
}

Now, the compiler will do something like this:

#pragma omp parallel
{
	#pragma omp for
	{
		_Cilk_for (...)
			Body
	}
}

By the time the compiler starts evaluating and breaking up the _Cilk_for's parts, we are already inside the scope of the #pragma omp parallel. If I try to pop the scope there, it causes scoping issues with #pragma omp parallel. Fixing this would require a substantial rewrite of the existing OMP code to port it for _Cilk_for. It can be done, but it would take significant time that I don't think I have, since we are close to the end of stage 3.

But if I put the parallel around the body instead, the body is transplanted into the child function and everything I need for _Cilk_for is done. Yes, this does not make sense to an OMP expert. I am nowhere close to being an expert, and even I had a hard time wrapping my head around it.
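[Editor's note: a rough sketch of the shape described here, reconstructed from the description above rather than taken from the actual patch:]

```
_Cilk_for (vector<int>::iterator iter = array.begin();
           iter != array.end(); iter++)
{
    #pragma omp parallel  /* parallel wraps only the loop body */
    {
        Body
    }
}
```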

We can look into doing what you want, but in the meantime, can you accept this patch with the model I have used, so that the feature can make it into 4.9?

Thanks,

Balaji V. Iyer.



> 	Jakub

