[PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++)

Iyer, Balaji V balaji.v.iyer@intel.com
Fri Oct 18 21:19:00 GMT 2013


Hi Jeff,
	Please see my comments below. Also, I am adding all these changes to the files as you requested in my local directory. Should I send you an updated patch along the way?

Thanks,

Balaji V. Iyer.

> -----Original Message-----
> From: Jeff Law [mailto:law@redhat.com]
> Sent: Thursday, October 17, 2013 5:30 PM
> To: Iyer, Balaji V; rth@redhat.com; Jason Merrill (jason@redhat.com); Aldy
> Hernandez (aldyh@redhat.com)
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++)
> 
> On 09/11/13 12:18, Iyer, Balaji V wrote:
> > Hello Everyone, Couple weeks back, I had submitted a patch for review
> > that will implement Cilk keywords (_Cilk_spawn and _Cilk_sync) into
> > the C compiler. I recently finished C++ implementation also. In this
> > email, I am attaching 2 patches: 1 for C (and the common parts for C
> > and C++) and 1 for C++. The C++ Changelog is labelled
> > cp-ChangeLog.cilkplus and the other one is just ChangeLog.cilkplus.
> > There isn't much changes in the C patch. Only noticeable changes would
> > be moving functions to the common parts so that C++ can use them.
> >
> > It passes all the tests and does not affect  (by affect I mean fail a
> > passing test or pass a failing one) any of the other tests in the
> > testsuite directory.
> More random comments, questions & things to fix:
> 
> In cilk_common.c we have:
> +/* Returns the value in structure FRAME pointed by the FIELD_NUMBER
> +   (e.g. X.y).
> +   FIELD_NUMBER is an index to the structure FRAME_PTR.  For details
> +   about these fields, refer to cilk_trees structure in cilk.h and
> +   cilk_init_builtins function  in this file.  Returns a TREE that is
> the type
> +   of the field represented by FIELD_NUMBER.  */
> +
> +tree
> +cilk_dot (tree frame, int field_number, bool volatil) {
> +  tree field = cilk_trees[field_number];
> +  field = fold_build3 (COMPONENT_REF, TREE_TYPE (field), frame, field,
> +                      NULL_TREE);
> +  TREE_THIS_VOLATILE (field) = volatil;
> +  return field;
> +}
> 
> There's no description of what the VOLATIL argument does.  Similarly for
> cilk_arrow.  From reading other code it appears to have the usual meaning of
> volatile, but it's best to be explicit.

The main reason why I made it volatile (as expressed by the volatil bool variable) is that I want to make sure these values aren't optimized by the compiler and the value is fetched from memory on every access. I have added an explanation to the header comment.

> 
> In add_field:
> 
> 
> +/* This function will add FIELD of type TYPE to a defined built-in
> +   structure.  */
> +
> +static tree
> +add_field (const char *name, tree type, tree fields) {
> +  tree  t = get_identifier (name);
> +  tree field = build_decl (BUILTINS_LOCATION, FIELD_DECL, t, type);
> +  TREE_CHAIN (field) = fields;
> +  return field;
> +}
> NAME is not documented.  Nit: Just one space between the type specifier and
> the variable name  on this line:
> 

Fixed both.

> tree  t = get_identifier (name);
> 
> 
> Presumably add_field is called well in advance of layout_decl :-)
> 

Yes.

> install_builtin doesn't document CODE or PUBLISH in its comment header.
> 

Fixed.

> 
> Presumably users never concern themselves with the cilkrts_pedigree structure,
> it's entirely hidden, right?  So there's no way for a user to ask for an array of
> these things, right?
> 

A user can ignore pedigrees if they have no use for them (most don't), but __cilkrts_pedigree is specifically intended for read-only access by the user.  A pedigree is a way of identifying where a parallel application is in the Directed Acyclic Graph that can be used to represent it. Regardless of the schedule chosen by the runtime, the pedigree will always be the same for the same point in the calculation. The runtime creates and maintains this structure, but does not use it internally for anything (unless the record/replay logic is turned on).  The pedigree is accessed using a public function, __cilkrts_get_pedigree(), in the cilk_api.h header and can be used, for example, to create deterministic parallel random number generators.

> 
> I note you set DECL_ALIGN to BIGGEST_ALIGNMENT, which seems excessive.
> Why not compute the natural alignment for that structure and use that?
> Similarly for the cilkrts_stack_frame, except that it seems to use
> PREFERRED_STACK_BOUNDARY.  Is there some reason why each shouldn't be
> aligned on whatever boundary the target would normally align those structures?
> 

Fixed, I now let the compiler set these values

> For the "ctx" array, presumably it's some kind of context.  Where does the size
> of that array come from?
> 

Yes, it is context. Here is a link to the abi.h that describes fields in this structure: http://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libcilkrts/include/internal/abi.h;h=8f64b1bc5df8a0dad18cc14bb4e2cb51a60433e9;hb=7977b69ea6b3bb535aa85a757aee3d11d37d7ff0

> More generally is there any way we can ensure those structures are consistent
> between the runtime and the compiler?  How often have those structures
> changed over time and how did y'all deal with the changes?
> 

These structures don't change often and whenever they do (i.e. fields get added into it) it is kind of a big deal.  So far, when we have changed - once in the past 3 yrs. or so - we have allowed backward compatibility.

> 
> Presumably get_frame_arg is only used on calls into the cilk runtime, right and
> are all generated by GCC, and should always have the right number of
> arguments, should always be compatible, etc?!?  If so, then those sanity checks
> should probably be asserts.  Else it seems fine.
> 

Fixed.

> 
> Comment for cilk_detach indicates it returns const0_rtx, but signature has
> "void" for the return type.
> 

Fixed. The comment was an artifact of an old implementation.

> I'm assuming the code in expand_cilk_sync actually implements the pseudocode.
> I didn't verify every hunk of tree you built.
> 

Yes, it implements the equivalent pseudo code described in the function.

> I'd been wondering about concurrent access to various structures in here.  x86
> has a fairly easy-to-use memory model.  Has any thought been given to what
> would need to change to support the weaker memory models such as are found
> on PPC?
> 

We've given it some thought, but have neither the manpower nor charter to port Cilk to non-Intel architectures.  We've done a trial port to ARM to prove it could be done, but we ran fib (a small example) once and declared victory.  We're sure that more work needs to be done here and would welcome changes to the runtime to support other architectures.  In particular, most or all uses of assembly-language instructions should be replaced by compiler intrinsics and memory barriers probably need to be added for architectures that have a more relaxed memory model than the x86.  Our hope is that, once Cilk Plus was added to gcc, that members of the community would help us port the runtime to other architectures.

> More tomorrow I hope.
> 
> jeff



More information about the Gcc-patches mailing list