[patch] cilkplus array notation for C (clean, independent patchset, take 1)

Fri Mar 22 22:04:00 GMT 2013

Attached, please find a patch that should be applied to the head of cilkplus-merge. 
This patch is generated by diffing with the following hash: 4f4932be8230284919d197cccb4b10201f82a0b3

This patch also adds documentation about the built-in reduction funtions of array notations that Aldy pointed out in a previous email.

I have not fixed all the issues below (the big one that is left is the bultin function representation that Joseph Pointed out). I have fixed most of the other issues. All the things I have fixed are marked by "FIXED!"

Is this Ok to commit into cilkplus-merge?

Thanks,

Balaji V. Iyer.

> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
> owner@gcc.gnu.org] On Behalf Of Joseph S. Myers
> Sent: Thursday, March 21, 2013 11:56 AM
> To: Aldy Hernandez
> Cc: Iyer, Balaji V; gcc-patches
> Subject: Re: [patch] cilkplus array notation for C (clean, independent patchset,
> take 1)
> 
> On Wed, 20 Mar 2013, Aldy Hernandez wrote:
> 
> > Joseph, folks, et al... How does this look?
> 
> This review largely deals with coding style (interpreted broadly).  I'll review more
> of the substance separately later; reposting with fixes for all the accumulated
> issues is probably a good idea anyway, to avoid the same issues coming up
> repeatedly.
> 
> > 	* c-common.c (c_define_builtins): When cilkplus is enabled, the
> > 	function array_notation_init_builtins() is called.
> 
> Don't use () after a function name when referring to the function.

FIXED!

> 
> > diff --git a/gcc/c-family/array-notation-common.c
> > b/gcc/c-family/array-notation-common.c
> 
> > +int extract_sec_implicit_index_arg (location_t, tree); bool
> > +is_sec_implicit_index_fn (tree); void array_notation_init_builtins
> > +(void);

FIXED!

> 
> Non-static function declarations like this should not be inside a .c file.
> If these functions are used outside this file, there should be an associated
> header that declares them; include it in the .c file.  If only used inside the .c file
> that defines them, make them static (and topologically sort static functions
> inside a source file so that forward static declarations are only needed for cases
> of recursion).
> 
> > +/* Mark the FNDECL as cold, meaning that the function specified by FNDECL
> is
> > +   not run as is.  */
> 
> The cold attribute means unlikely to be executed rather than "not run as is".
> Maybe "not run as is" is what's relevant here, but I'm not clear why this attribute
> would be useful for built-in functions at all - the documentation suggests it's
> only relevant when a user defines a function themselves, and affects the code
> generated for that function, so wouldn't be relevant at all for built-in functions.
> 
> > +void
> > +array_notation_init_builtins (void)
> 
> Other built-in functions use various .def files (builtins.def and the files it includes)
> to avoid lots of repetitive code like this - can you integrate this with that
> mechanism?  If you do so, then you should be able to avoid (or massively
> simplify) functions like:
> 
> > +/* Returns true if the function call specified in FUNC_NAME is
> > +   __sec_implicit_index.  */
> > +
> > +bool
> > +is_sec_implicit_index_fn (tree func_name)
> 
> because code can use the BUILT_IN_* enum values to test whether a particular
> function is in use - which is certainly cleaner than using strcmp against the
> function name.
> 
> > +/* Returns the first and only argument for FN, which should be a
> > +   sec_implicit_index function.  FN's location in the source file is is
> > +   indicated by LOCATION.  */
> > +
> > +int
> > +extract_sec_implicit_index_arg (location_t location, tree fn) {
> > +  tree fn_arg;
> > +  HOST_WIDE_INT return_int = 0;
> > +  if (!fn)
> > +    return -1;
> 
> Why the random check for a NULL argument?  If a NULL argument is valid
> (meaning that it makes the code cleaner to allow such arguments rather than
> making sure the function isn't called with them), this should be documented in
> the comment above the function; otherwise, if such an argument isn't valid,
> there is no need to check for it.

I always tend to check for a null pointer before I access the fields in the structure. In this case it is unnecessary. In some cases (e.g. find_rank) there is a good chance a null pointer will be passed into the function and we need to check that and reject those.

> 
> You declare return_int as HOST_WIDE_INT, but it only receives a value cast to
> int, and is used only to store a value returned as int.  Either use int consistently,
> or HOST_WIDE_INT consistently, but I don't see a reason to use both.

FIXED!

> 
> > +  if (TREE_CODE (fn) == CALL_EXPR)
> > +    {
> > +      fn_arg = CALL_EXPR_ARG (fn, 0);
> > +      if (really_constant_p (fn_arg))
> 
> I don't think really_constant_p is what's wanted;
> <http://software.intel.com/sites/default/files/m/4/e/7/3/1/40297-
> Intel_Cilk_plus_lang_spec_2.htm>
> says "The argument shall be an integer constant expression.", and such
> expressions always appear in the C front end as INTEGER_CST.  So you can just
> check for INTEGER_CST.

What about C++? This function is shared by both C and C++.

> 
> Now a subtlety here is that the function argument will have been folded by this
> point, meaning that cases that aren't integer constant expressions in C standard
> terms will be wrongly allowed (both by the original code and by a version
> checking against INTEGER_CST).  In such cases, the way to get things checked
> correctly is to use a keyword rather than a built-in function - as with
> __builtin_choose_expr or __builtin_shuffle, for example.  Since this operation
> seems special in ways that built-in functions generally aren't, that seems
> reasonable anyway.  So the code parsing this keyword would check that the
> argument is an INTEGER_CST, of integer type (since INTEGER_CSTs can have
> pointer type in GCC), like that for __builtin_choose_expr does.  It would then
> quite likely create its own tree code for the operation, rather than using a
> CALL_EXPR at all.  (It would need to manage converting to int, given how the
> specification defines things in terms of a prototype for type int - so e.g. a
> constant 1ULL << 32 would act like 0 if int is 32 bits, under the present
> specification.)
> 
> The specification doesn't seem very clear on to what extent the __sec_*
> operations must act like functions (what happens if someone puts parentheses
> around the __sec_* name, for example - that wouldn't work with the keyword
> approach).  So the specification should be clarified there, but I think saying the
> __sec_* operations are syntactically special, like keywords, is more appropriate
> than requiring other uses to work.
> 
> > +	return_int = (int) int_cst_value (fn_arg);
> > +      else
> > +	{
> > +	  if (location == UNKNOWN_LOCATION && EXPR_HAS_LOCATION (fn))
> > +	    location = EXPR_LOCATION (fn);
> > +	  error_at (location, "__sec_implicit_index parameter must be a "
> > +		    "constant integer expression");
> 
> The term is "integer constant expression" not "constant integer expression".

FIXED!

> 
> > diff --git a/gcc/c/c-array-notation.c b/gcc/c/c-array-notation.c
> 
> > +void replace_array_notations (tree *, bool, vec<tree, va_gc> *,
> > +			      vec<tree, va_gc> *);
> > +void find_rank (tree, bool, size_t *); void
> > +extract_array_notation_exprs (tree, bool, vec<tree, va_gc> **); tree
> > +fix_conditional_array_notations (tree); struct c_expr
> > +fix_array_notation_expr (location_t, enum tree_code,
> > +				       struct c_expr);
> > +bool is_builtin_array_notation_fn (tree func_name, an_reduce_type
> > +*type); static tree fix_builtin_array_notation_fn (tree
> > +an_builtin_fn, tree *new_var); bool contains_array_notation_expr
> > +(tree expr); tree expand_array_notation_exprs (tree t);
> 
> As before, forward declarations inside .c files should only be for static functions
> with recursive calls to themselves, not for non-static functions or static
> functions not involved in recursion.

FIXED!

> 
> > +struct inv_list
> > +{
> > +  vec<tree, va_gc> *list_values;
> > +  vec<tree, va_gc> *replacement;
> > +};
> 
> Comment on this type explaining what it's for.

FIXED!

> 
> > +/* Returns the rank of ARRAY through the *RANK.  The user can specify
> whether
> > +   (s)he wants to step into array_notation-specific builtin functions
> > +   (specified by the IGNORE_BUILTIN_FN).
> 
> The wording seems awkward; "Set *RANK to the rank of ARRAY, ignoring array-
> notation-specific built-in functions if IGNORE_BUILTIN_FN." would be better.

Yes, I agree with your wording. Thanks! and FIXED!

> 
> > +void
> > +find_rank (tree array, bool ignore_builtin_fn, size_t *rank) {
> > +  tree ii_tree;
> > +  size_t current_rank = 0, ii = 0;
> > +  an_reduce_type dummy_type = REDUCE_UNKNOWN;
> > +  if (!array)
> > +    return;
> 
> As before, avoid random checks for NULL parameters unless there is an actual
> reason to allow them and the comments document that they are allowed and
> what the semantics are in that case.  In general, explain what ARRAY is - an
> expression?

This check is necessary. Find rank can get a NULL pointer and that must be caught and rejected.

> 
> Is *RANK always set by this function?  Make clear in the comment above the
> function whether it is, and whether the initial value of *RANK before the
> function is called is of any significance.  I note that
> 
> > +  else if (TREE_CODE (array) == ARRAY_NOTATION_REF)
> > +    {
> > +      for (ii_tree = array;
> > +	   ii_tree && TREE_CODE (ii_tree) == ARRAY_NOTATION_REF;
> > +	   ii_tree = ARRAY_NOTATION_ARRAY (ii_tree))
> > +	current_rank++;
> > +      if (*rank == 0)
> > +	*rank = current_rank;
> 
> does appear to look at the value before the function has set it, implying that the
> original value of *RANK *does* mean something.

FIXED! I looked this way due to an artifact of a previous implementation. It looks a little bit more cleaner now.

> 
> > +      if (TREE_CODE (array) == CALL_EXPR)
> > +	{
> > +	  tree func_name = CALL_EXPR_FN (array);
> > +	  if (TREE_CODE (func_name) == ADDR_EXPR)
> > +	    if (!ignore_builtin_fn)
> > +	      if (is_builtin_array_notation_fn (func_name, &dummy_type))
> > +		/* If it is a builtin function, then we know it returns a
> > +		   scalar.  */
> > +		return;
> > +	  if (TREE_CODE (TREE_OPERAND (array, 0)) == INTEGER_CST)
> > +	    {
> > +	      int length = TREE_INT_CST_LOW (TREE_OPERAND (array, 0));
> > +	      for (ii = 0; ii < (size_t) length; ii++)
> 
> TREE_INT_CST_LOW returns unsigned HOST_WIDE_INT.  There should be no
> need for converting twice, first to int and then to size_t.  And rather than
> depending on implementation default of CALL_EXPR, call_expr_nargs would be
> a better way to calculate the length.
> 
> > +		find_rank (TREE_OPERAND (array, ii), ignore_builtin_fn, rank);
> 
> But actually, you're dealing with a CALL_EXPR here.  So you should be able to
> use existing iterators over CALL_EXPR arguments (e.g.
> FOR_EACH_CALL_EXPR_ARG) rather than explicitly using the number of
> arguments at all.  Doing so, and separately checking CALL_EXPR_FN, and
> CALL_EXPR_STATIC_CHAIN if applicable, seems cleaner than depending on low-
> level details of the sequence of operands to a CALL_EXPR.

FIXED! (I used your suggestion and used FOR_EACH_CALL_EXPR_ARG)

> 
> > +/* Extracts all the array notations specified in NODE and stores them in a
> > +   dynamic tree array of ARRAY_LIST whose size is stored in *LIST_SIZE.  The
> > +   user can specify if (s)he wants to ignore the array notations inside the
> > +   array-notation specific builtin functions (by setting IGNORE_BUILTIN_FN to
> > +   true).  */
> > +
> > +void
> > +extract_array_notation_exprs (tree node, bool ignore_builtin_fn,
> > +			      vec<tree, va_gc> **array_list)
> 
> There's no argument LIST_SIZE, so the comment needs updating.  Again, the
> wording about "The user" is awkward; the comment should directly define the
> semantics of the function in terms of its argument, without reference to "The
> user".

FIXED! Reworded the comment.

> 
> > +{
> > +  size_t ii = 0;
> > +  an_reduce_type dummy_type = REDUCE_UNKNOWN;
> > +
> > +  if (!node)
> > +    return;
> 
> Again, check for NULL argument without any mention in the comment that such
> arguments are valid; remove unless there is a reason to make them valid.
> 
> > +  else if (TREE_CODE (node) == TREE_LIST)
> 
> What's NODE?  My first guess would have been an expression, but if a TREE_LIST
> is possible that's clearly not the answer, so explain in the comment above the
> function what NODE is.  (If a TREE_LIST is being used within expressions to store
> something specific to array notation, don't do so - TREE_LIST is deprecated,
> existing uses should be phased out in favour of more specific and less memory-
> hungry datastructures and new uses should not be added.)

FIXED! What is replacing tree-list? I have used tree-list in my later patches and in my Cilk Plus branch.

> 
> > +      if (TREE_CODE (TREE_OPERAND (node, 0)) == INTEGER_CST)
> > +	{
> > +	  int length = TREE_INT_CST_LOW (TREE_OPERAND (node, 0));
> > +
> > +	  for (ii = 0; ii < (size_t) length; ii++)
> > +	    extract_array_notation_exprs
> > +	      (TREE_OPERAND (node, ii), ignore_builtin_fn, array_list);
> 
> Same problems as before with an iterator over CALL_EXPR that should avoid
> depending on low-level details of how CALL_EXPR is implemented, and excess
> integer type conversions.

FIXED!

> 
> > +/* Replaces all occurances of array notations in tree ORIG that matches the
> > +   ones in LIST with the one in ARRAY_OPERAND.  The size of list and
> > +   ARRAY_OPERAND is ARRAY_SIZE.  For example, ARRAY_OPERAND[x] for
> some index
> > +   'x' will have the equivalent ARRAY_REF for the ARRAY_NOTATION_REF
> specified
> > +   in LIST[x].   The  user can specify if (s)he wants to ignore the array
> > +   notations inside the array-notation specific builtin functions (using the
> > +   bool variable IGNORE_BUILTIN_FN).  */
> 
> Again, avoid "The user".

FIXED!

> 
> > +void
> > +replace_array_notations (tree *orig, bool ignore_builtin_fn,
> > +			 vec<tree, va_gc> *list,
> > +			 vec<tree, va_gc> *array_operand)
> > +{
> > +  size_t ii = 0;
> > +  tree node = NULL_TREE, node_replacement = NULL_TREE;
> > +  an_reduce_type dummy_type = REDUCE_UNKNOWN;
> > +
> > +  if (vec_safe_length (list) == 0 || !*orig)
> > +    return;

This ORIG must be checked for NULL because you will step this function even if the node is NULL. This is the spot to check that.

> 
> Again, avoid checks for NULL or document that NULL arguments are valid if
> there's a good reason.  Generally, document what sort of thing ORIG is.
> 
> > +      if (TREE_CODE (TREE_OPERAND (*orig, 0)) == INTEGER_CST)
> > +	{
> > +	  int length = TREE_INT_CST_LOW (TREE_OPERAND (*orig, 0));
> > +	  for (ii = 0; ii < (size_t) length; ii++)
> > +	    replace_array_notations
> > +	      (&TREE_OPERAND (*orig, ii), ignore_builtin_fn, list,
> > +	       array_operand);
> 
> Again, better CALL_EXPR iterators.

FIXED!

> 
> > +/* This function will find all the scalar expressions in *TP and push it in
> > +   DATA struct, typecasted to (void *).  If *WALK_SUBTREES is set to 0 then
> > +   we have do not go into the *TP's subtrees.  */
> 
> Rather than "This function will", just "Find ..." (and say "Returns NULL_TREE." or
> something like that - presumably the return type is so it can be passed to
> walk_tree).
> 
> > +/* Replaces all the scalar expressions in *NODE. */
> > +
> > +tree
> > +replace_invariant_exprs (tree *node)
> 
> Comment needs to explain the semantics of the return value.

FIXED! 

> 
> > +tree
> > +build_array_notation_expr (location_t location, tree lhs, tree lhs_origtype,
> > +			   enum tree_code modifycode, location_t rhs_loc,
> > +			   tree rhs, tree rhs_origtype)
> 
> > +    }
> > +
> > +
> > +
> > +  for (ii = 0; ii < lhs_rank; ii++)
> 
> Excess blank lines in middle of function.  Generally there shouldn't be two or
> more consecutive blank lines inside a function (if you want to have different
> sizes of blanks to split up levels of structure in the function, that suggests the
> function is too big and should be split up into separate functions).

FIXED!

> 
> > +	     TREE_TYPE (lhs_var[ii]));
> > +
> > +	}
> 
> This location for a blank line doesn't make sense.

FIXED!

> 
> > +  /* The following statements will do the following:
> > +   * <if_stmt_label>: (in order from outermost to innermost)
> > +   *                  if (cond_expr) then go to body_label
> > +   *                  else                go to exit_label
> > +   * <body_label>:
> > +   *                  array expression
> > +   *
> > +   *                  (the increment, goto and exit_label goes from innermost to
> > +   *                   outermost).
> > +   *                  ii++ and jj++
> > +   *                  go to if_stmt_label
> > +   * <exit_label>:
> > +   *                  <REST OF CODE>
> > +   */
> 
> Comments should not have an initial "*" on each line.

FIXED!

> 
> > +/* Encloses the conditional statement passed in STMT with a loop around it
> > +   and replaces the condition in STMT with a ARRAY_REF tree-node to the
> array.
> > +   The condition must have a ARRAY_NOTATION_REF tree.  */
> > +
> > +static tree
> > +fix_conditional_array_notations_1 (tree stmt)
> 
> Comment should explain return value semantics.

FIXED!

> 
> > +			   TREE_TYPE (array_var[ii]));
> > +
> > +    }
> 
> Another stray blank line.  Check the patch generally for stray blank lines
> immediately before a '}', I don't think they ever make sense, but I may have
> missed some.

I think I have caught all of it. I apologize if I missed any.

> 
> > +  // XDELETEVEC (array_var);
> 
> I don't think this sort of commented-out code should be added.  If you're
> deliberately not doing something that a reader might expect to be done, have a
> comment explaining *why* you're not doing it, not just commented-out code to
> do it.

FIXED!

> 
> > +      error_at (location, "__sec_reduce_min_ind or __sec_reduce_max_ind
> cannot"
> > +		" have arrays with dimension greater than 1.");
> 
> Diagnostics don't end with ".".
> 
> > +    default:
> > +      gcc_unreachable ();  /* You should not reach here.  */
> 
> No need for comments like this that just repeat the plain semantics of the C
> code.  There's nothing else a call to gcc_unreachable could possibly mean; such
> a comment is of no more use than "i++; /* Add 1 to i.  */".

FIXED!

> 
> > +/* Returns true of FUNC_NAME is a builtin array notation function.  The type
> of
> > +   function is returned in *TYPE.  */
> 
> "true if", not "true of".
> 

FIXED! Sorry for this stupid mistake.

> > +bool
> > +is_builtin_array_notation_fn (tree func_name, an_reduce_type *type) {
> > +  const char *function_name = NULL;
> > +
> > +  if (!func_name)
> > +    return false;
> 
> Another unexplained test for a NULL argument.  Again, explain what sort of
> things FUNC_NAME may be.  (This is another function that should be using
> BUILT_IN_* enum values rather than strcmp, if you rework how the built-in
> functions are implemented.)
> 
> > +/* Returns true of EXPR (and its subtrees) contain
> > +ARRAY_NOTATION_EXPR node.  */
> 
> "true if", again.

FIXED!

> 
> > +/* Replaces array notations in void function call arguments in ARG with loop
> and
> > +   tree-node ARRAY_REF and returns that value in a tree node variable called
> > +   LOOP.  */
> 
> LOOP is not an argument to this function, so it doesn't make sense to refer to it
> in the comment.  I suspect " in a tree node variable called LOOP" should simply
> be removed.

FIXED!

> 
> > +  if (TREE_CODE (arg) == CALL_EXPR
> > +      && is_builtin_array_notation_fn (CALL_EXPR_FN (arg), &an_type))
> > +    {
> > +      loop = fix_builtin_array_notation_fn (arg, &new_var);
> > +      /* We are ignoring the new var because either the user does not want to
> > +	 capture it OR he is using sec_reduce_mutating function.  */
> 
> In general I think "the user" comments should be avoided though this one is a bit
> less awkward than those defining function semantics by reference to "the user".
> 
> > +/* Walks through tree node T and find all the call-statments that do not
> return
> > +   anything and fix up any array notations they may carry.  */
> > +
> > +tree
> > +expand_array_notation_exprs (tree t)
> 
> Comment should document return value.

FIXED!

> 
> > +{
> > +  if (!t || !contains_array_notation_expr (t))
> > +    return t;
> 
> Another check for NULL without a comment saying NULL is a valid argument.

This function also can receive a null pointer.

> 
> > +/* Returns array notation expression for the array base ARRAY of type TYPE,
> > +   with start index, length and stride given by START_INDEX, LENGTH and
> STRIDE,
> > +   respectively.  */
> > +
> > +tree
> > +build_array_notation_ref (location_t loc, tree array, tree start_index,
> > +			  tree length, tree stride, tree type) {
> > +  tree array_ntn_tree = NULL_TREE;
> > +  size_t stride_rank = 0, length_rank = 0, start_rank = 0;
> > +
> > +  if (!TREE_TYPE (start_index) || !INTEGRAL_TYPE_P (TREE_TYPE
> > + (start_index)))
> 
> I'd expect the argument would have to be an expression, so would always have
> a TYPE and the !TREE_TYPE (start_index) check should be unnecessary.
> If it is needed, explain further in the comment at the start of the function.
> Likewise for other checks for NULL types in this function.

FIXED! But Stride could easily be NULL, (e.g. array[0:10]) and in this case we have put one in. So, I have left that check in.
> 
> > +    {
> > +      error_at (loc,
> > +		"start-index of array notation triplet is not an integer.");
> 
> Diagnostic should not end with ".".

FIXED!

> 
> > +      error_at (loc, "length of array notation triplet is not an
> > + integer.");
> 
> Likewise.

FIXED!

> 
> > +      error_at (loc, "stride of array notation triplet is not an
> > + integer.");
> 
> Likewise.

FIXED!

> 
> > +      error_at (loc, "rank of an array notation triplet's start-index is not "
> > +		"zero.");
> 
> Likewise.

FIXED!

> 
> > +      error_at (loc, "rank of an array notation triplet's length is
> > + not zero.");
> 
> Likewise.

FIXED!

> 
> > +      error_at (loc, "rank of array notation triplet's stride is not
> > + zero.");
> 
> Likewise.

FIXED!

> 
> That's a coding style review of the first half or so of the patch, more later....

Thanks for reviewing my code!

-Balaji V. Iyer.

> 
> --
> Joseph S. Myers
> joseph@codesourcery.com
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: diff.txt
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20130322/9a4d8ab9/attachment.txt>