[patch] cilkplus array notation for C (clean, independent patchset, take 1)
Iyer, Balaji V
balaji.v.iyer@intel.com
Fri Mar 22 22:04:00 GMT 2013
Attached, please find a patch that should be applied to the head of cilkplus-merge.
This patch is generated by diffing with the following hash: 4f4932be8230284919d197cccb4b10201f82a0b3
This patch also adds documentation about the built-in reduction funtions of array notations that Aldy pointed out in a previous email.
I have not fixed all the issues below (the big one that is left is the bultin function representation that Joseph Pointed out). I have fixed most of the other issues. All the things I have fixed are marked by "FIXED!"
Is this Ok to commit into cilkplus-merge?
Thanks,
Balaji V. Iyer.
> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
> owner@gcc.gnu.org] On Behalf Of Joseph S. Myers
> Sent: Thursday, March 21, 2013 11:56 AM
> To: Aldy Hernandez
> Cc: Iyer, Balaji V; gcc-patches
> Subject: Re: [patch] cilkplus array notation for C (clean, independent patchset,
> take 1)
>
> On Wed, 20 Mar 2013, Aldy Hernandez wrote:
>
> > Joseph, folks, et al... How does this look?
>
> This review largely deals with coding style (interpreted broadly). I'll review more
> of the substance separately later; reposting with fixes for all the accumulated
> issues is probably a good idea anyway, to avoid the same issues coming up
> repeatedly.
>
> > * c-common.c (c_define_builtins): When cilkplus is enabled, the
> > function array_notation_init_builtins() is called.
>
> Don't use () after a function name when referring to the function.
FIXED!
>
> > diff --git a/gcc/c-family/array-notation-common.c
> > b/gcc/c-family/array-notation-common.c
>
> > +int extract_sec_implicit_index_arg (location_t, tree); bool
> > +is_sec_implicit_index_fn (tree); void array_notation_init_builtins
> > +(void);
FIXED!
>
> Non-static function declarations like this should not be inside a .c file.
> If these functions are used outside this file, there should be an associated
> header that declares them; include it in the .c file. If only used inside the .c file
> that defines them, make them static (and topologically sort static functions
> inside a source file so that forward static declarations are only needed for cases
> of recursion).
>
> > +/* Mark the FNDECL as cold, meaning that the function specified by FNDECL
> is
> > + not run as is. */
>
> The cold attribute means unlikely to be executed rather than "not run as is".
> Maybe "not run as is" is what's relevant here, but I'm not clear why this attribute
> would be useful for built-in functions at all - the documentation suggests it's
> only relevant when a user defines a function themselves, and affects the code
> generated for that function, so wouldn't be relevant at all for built-in functions.
>
> > +void
> > +array_notation_init_builtins (void)
>
> Other built-in functions use various .def files (builtins.def and the files it includes)
> to avoid lots of repetitive code like this - can you integrate this with that
> mechanism? If you do so, then you should be able to avoid (or massively
> simplify) functions like:
>
> > +/* Returns true if the function call specified in FUNC_NAME is
> > + __sec_implicit_index. */
> > +
> > +bool
> > +is_sec_implicit_index_fn (tree func_name)
>
> because code can use the BUILT_IN_* enum values to test whether a particular
> function is in use - which is certainly cleaner than using strcmp against the
> function name.
>
> > +/* Returns the first and only argument for FN, which should be a
> > + sec_implicit_index function. FN's location in the source file is is
> > + indicated by LOCATION. */
> > +
> > +int
> > +extract_sec_implicit_index_arg (location_t location, tree fn) {
> > + tree fn_arg;
> > + HOST_WIDE_INT return_int = 0;
> > + if (!fn)
> > + return -1;
>
> Why the random check for a NULL argument? If a NULL argument is valid
> (meaning that it makes the code cleaner to allow such arguments rather than
> making sure the function isn't called with them), this should be documented in
> the comment above the function; otherwise, if such an argument isn't valid,
> there is no need to check for it.
I always tend to check for a null pointer before I access the fields in the structure. In this case it is unnecessary. In some cases (e.g. find_rank) there is a good chance a null pointer will be passed into the function and we need to check that and reject those.
>
> You declare return_int as HOST_WIDE_INT, but it only receives a value cast to
> int, and is used only to store a value returned as int. Either use int consistently,
> or HOST_WIDE_INT consistently, but I don't see a reason to use both.
FIXED!
>
> > + if (TREE_CODE (fn) == CALL_EXPR)
> > + {
> > + fn_arg = CALL_EXPR_ARG (fn, 0);
> > + if (really_constant_p (fn_arg))
>
> I don't think really_constant_p is what's wanted;
> <http://software.intel.com/sites/default/files/m/4/e/7/3/1/40297-
> Intel_Cilk_plus_lang_spec_2.htm>
> says "The argument shall be an integer constant expression.", and such
> expressions always appear in the C front end as INTEGER_CST. So you can just
> check for INTEGER_CST.
What about C++? This function is shared by both C and C++.
>
> Now a subtlety here is that the function argument will have been folded by this
> point, meaning that cases that aren't integer constant expressions in C standard
> terms will be wrongly allowed (both by the original code and by a version
> checking against INTEGER_CST). In such cases, the way to get things checked
> correctly is to use a keyword rather than a built-in function - as with
> __builtin_choose_expr or __builtin_shuffle, for example. Since this operation
> seems special in ways that built-in functions generally aren't, that seems
> reasonable anyway. So the code parsing this keyword would check that the
> argument is an INTEGER_CST, of integer type (since INTEGER_CSTs can have
> pointer type in GCC), like that for __builtin_choose_expr does. It would then
> quite likely create its own tree code for the operation, rather than using a
> CALL_EXPR at all. (It would need to manage converting to int, given how the
> specification defines things in terms of a prototype for type int - so e.g. a
> constant 1ULL << 32 would act like 0 if int is 32 bits, under the present
> specification.)
>
> The specification doesn't seem very clear on to what extent the __sec_*
> operations must act like functions (what happens if someone puts parentheses
> around the __sec_* name, for example - that wouldn't work with the keyword
> approach). So the specification should be clarified there, but I think saying the
> __sec_* operations are syntactically special, like keywords, is more appropriate
> than requiring other uses to work.
>
> > + return_int = (int) int_cst_value (fn_arg);
> > + else
> > + {
> > + if (location == UNKNOWN_LOCATION && EXPR_HAS_LOCATION (fn))
> > + location = EXPR_LOCATION (fn);
> > + error_at (location, "__sec_implicit_index parameter must be a "
> > + "constant integer expression");
>
> The term is "integer constant expression" not "constant integer expression".
FIXED!
>
> > diff --git a/gcc/c/c-array-notation.c b/gcc/c/c-array-notation.c
>
> > +void replace_array_notations (tree *, bool, vec<tree, va_gc> *,
> > + vec<tree, va_gc> *);
> > +void find_rank (tree, bool, size_t *); void
> > +extract_array_notation_exprs (tree, bool, vec<tree, va_gc> **); tree
> > +fix_conditional_array_notations (tree); struct c_expr
> > +fix_array_notation_expr (location_t, enum tree_code,
> > + struct c_expr);
> > +bool is_builtin_array_notation_fn (tree func_name, an_reduce_type
> > +*type); static tree fix_builtin_array_notation_fn (tree
> > +an_builtin_fn, tree *new_var); bool contains_array_notation_expr
> > +(tree expr); tree expand_array_notation_exprs (tree t);
>
> As before, forward declarations inside .c files should only be for static functions
> with recursive calls to themselves, not for non-static functions or static
> functions not involved in recursion.
FIXED!
>
> > +struct inv_list
> > +{
> > + vec<tree, va_gc> *list_values;
> > + vec<tree, va_gc> *replacement;
> > +};
>
> Comment on this type explaining what it's for.
FIXED!
>
> > +/* Returns the rank of ARRAY through the *RANK. The user can specify
> whether
> > + (s)he wants to step into array_notation-specific builtin functions
> > + (specified by the IGNORE_BUILTIN_FN).
>
> The wording seems awkward; "Set *RANK to the rank of ARRAY, ignoring array-
> notation-specific built-in functions if IGNORE_BUILTIN_FN." would be better.
Yes, I agree with your wording. Thanks! and FIXED!
>
> > +void
> > +find_rank (tree array, bool ignore_builtin_fn, size_t *rank) {
> > + tree ii_tree;
> > + size_t current_rank = 0, ii = 0;
> > + an_reduce_type dummy_type = REDUCE_UNKNOWN;
> > + if (!array)
> > + return;
>
> As before, avoid random checks for NULL parameters unless there is an actual
> reason to allow them and the comments document that they are allowed and
> what the semantics are in that case. In general, explain what ARRAY is - an
> expression?
This check is necessary. Find rank can get a NULL pointer and that must be caught and rejected.
>
> Is *RANK always set by this function? Make clear in the comment above the
> function whether it is, and whether the initial value of *RANK before the
> function is called is of any significance. I note that
>
> > + else if (TREE_CODE (array) == ARRAY_NOTATION_REF)
> > + {
> > + for (ii_tree = array;
> > + ii_tree && TREE_CODE (ii_tree) == ARRAY_NOTATION_REF;
> > + ii_tree = ARRAY_NOTATION_ARRAY (ii_tree))
> > + current_rank++;
> > + if (*rank == 0)
> > + *rank = current_rank;
>
> does appear to look at the value before the function has set it, implying that the
> original value of *RANK *does* mean something.
FIXED! I looked this way due to an artifact of a previous implementation. It looks a little bit more cleaner now.
>
> > + if (TREE_CODE (array) == CALL_EXPR)
> > + {
> > + tree func_name = CALL_EXPR_FN (array);
> > + if (TREE_CODE (func_name) == ADDR_EXPR)
> > + if (!ignore_builtin_fn)
> > + if (is_builtin_array_notation_fn (func_name, &dummy_type))
> > + /* If it is a builtin function, then we know it returns a
> > + scalar. */
> > + return;
> > + if (TREE_CODE (TREE_OPERAND (array, 0)) == INTEGER_CST)
> > + {
> > + int length = TREE_INT_CST_LOW (TREE_OPERAND (array, 0));
> > + for (ii = 0; ii < (size_t) length; ii++)
>
> TREE_INT_CST_LOW returns unsigned HOST_WIDE_INT. There should be no
> need for converting twice, first to int and then to size_t. And rather than
> depending on implementation default of CALL_EXPR, call_expr_nargs would be
> a better way to calculate the length.
>
> > + find_rank (TREE_OPERAND (array, ii), ignore_builtin_fn, rank);
>
> But actually, you're dealing with a CALL_EXPR here. So you should be able to
> use existing iterators over CALL_EXPR arguments (e.g.
> FOR_EACH_CALL_EXPR_ARG) rather than explicitly using the number of
> arguments at all. Doing so, and separately checking CALL_EXPR_FN, and
> CALL_EXPR_STATIC_CHAIN if applicable, seems cleaner than depending on low-
> level details of the sequence of operands to a CALL_EXPR.
FIXED! (I used your suggestion and used FOR_EACH_CALL_EXPR_ARG)
>
> > +/* Extracts all the array notations specified in NODE and stores them in a
> > + dynamic tree array of ARRAY_LIST whose size is stored in *LIST_SIZE. The
> > + user can specify if (s)he wants to ignore the array notations inside the
> > + array-notation specific builtin functions (by setting IGNORE_BUILTIN_FN to
> > + true). */
> > +
> > +void
> > +extract_array_notation_exprs (tree node, bool ignore_builtin_fn,
> > + vec<tree, va_gc> **array_list)
>
> There's no argument LIST_SIZE, so the comment needs updating. Again, the
> wording about "The user" is awkward; the comment should directly define the
> semantics of the function in terms of its argument, without reference to "The
> user".
FIXED! Reworded the comment.
>
> > +{
> > + size_t ii = 0;
> > + an_reduce_type dummy_type = REDUCE_UNKNOWN;
> > +
> > + if (!node)
> > + return;
>
> Again, check for NULL argument without any mention in the comment that such
> arguments are valid; remove unless there is a reason to make them valid.
>
> > + else if (TREE_CODE (node) == TREE_LIST)
>
> What's NODE? My first guess would have been an expression, but if a TREE_LIST
> is possible that's clearly not the answer, so explain in the comment above the
> function what NODE is. (If a TREE_LIST is being used within expressions to store
> something specific to array notation, don't do so - TREE_LIST is deprecated,
> existing uses should be phased out in favour of more specific and less memory-
> hungry datastructures and new uses should not be added.)
FIXED! What is replacing tree-list? I have used tree-list in my later patches and in my Cilk Plus branch.
>
> > + if (TREE_CODE (TREE_OPERAND (node, 0)) == INTEGER_CST)
> > + {
> > + int length = TREE_INT_CST_LOW (TREE_OPERAND (node, 0));
> > +
> > + for (ii = 0; ii < (size_t) length; ii++)
> > + extract_array_notation_exprs
> > + (TREE_OPERAND (node, ii), ignore_builtin_fn, array_list);
>
> Same problems as before with an iterator over CALL_EXPR that should avoid
> depending on low-level details of how CALL_EXPR is implemented, and excess
> integer type conversions.
FIXED!
>
> > +/* Replaces all occurances of array notations in tree ORIG that matches the
> > + ones in LIST with the one in ARRAY_OPERAND. The size of list and
> > + ARRAY_OPERAND is ARRAY_SIZE. For example, ARRAY_OPERAND[x] for
> some index
> > + 'x' will have the equivalent ARRAY_REF for the ARRAY_NOTATION_REF
> specified
> > + in LIST[x]. The user can specify if (s)he wants to ignore the array
> > + notations inside the array-notation specific builtin functions (using the
> > + bool variable IGNORE_BUILTIN_FN). */
>
> Again, avoid "The user".
FIXED!
>
> > +void
> > +replace_array_notations (tree *orig, bool ignore_builtin_fn,
> > + vec<tree, va_gc> *list,
> > + vec<tree, va_gc> *array_operand)
> > +{
> > + size_t ii = 0;
> > + tree node = NULL_TREE, node_replacement = NULL_TREE;
> > + an_reduce_type dummy_type = REDUCE_UNKNOWN;
> > +
> > + if (vec_safe_length (list) == 0 || !*orig)
> > + return;
This ORIG must be checked for NULL because you will step this function even if the node is NULL. This is the spot to check that.
>
> Again, avoid checks for NULL or document that NULL arguments are valid if
> there's a good reason. Generally, document what sort of thing ORIG is.
>
> > + if (TREE_CODE (TREE_OPERAND (*orig, 0)) == INTEGER_CST)
> > + {
> > + int length = TREE_INT_CST_LOW (TREE_OPERAND (*orig, 0));
> > + for (ii = 0; ii < (size_t) length; ii++)
> > + replace_array_notations
> > + (&TREE_OPERAND (*orig, ii), ignore_builtin_fn, list,
> > + array_operand);
>
> Again, better CALL_EXPR iterators.
FIXED!
>
> > +/* This function will find all the scalar expressions in *TP and push it in
> > + DATA struct, typecasted to (void *). If *WALK_SUBTREES is set to 0 then
> > + we have do not go into the *TP's subtrees. */
>
> Rather than "This function will", just "Find ..." (and say "Returns NULL_TREE." or
> something like that - presumably the return type is so it can be passed to
> walk_tree).
>
> > +/* Replaces all the scalar expressions in *NODE. */
> > +
> > +tree
> > +replace_invariant_exprs (tree *node)
>
> Comment needs to explain the semantics of the return value.
FIXED!
>
> > +tree
> > +build_array_notation_expr (location_t location, tree lhs, tree lhs_origtype,
> > + enum tree_code modifycode, location_t rhs_loc,
> > + tree rhs, tree rhs_origtype)
>
> > + }
> > +
> > +
> > +
> > + for (ii = 0; ii < lhs_rank; ii++)
>
> Excess blank lines in middle of function. Generally there shouldn't be two or
> more consecutive blank lines inside a function (if you want to have different
> sizes of blanks to split up levels of structure in the function, that suggests the
> function is too big and should be split up into separate functions).
FIXED!
>
> > + TREE_TYPE (lhs_var[ii]));
> > +
> > + }
>
> This location for a blank line doesn't make sense.
FIXED!
>
> > + /* The following statements will do the following:
> > + * <if_stmt_label>: (in order from outermost to innermost)
> > + * if (cond_expr) then go to body_label
> > + * else go to exit_label
> > + * <body_label>:
> > + * array expression
> > + *
> > + * (the increment, goto and exit_label goes from innermost to
> > + * outermost).
> > + * ii++ and jj++
> > + * go to if_stmt_label
> > + * <exit_label>:
> > + * <REST OF CODE>
> > + */
>
> Comments should not have an initial "*" on each line.
FIXED!
>
> > +/* Encloses the conditional statement passed in STMT with a loop around it
> > + and replaces the condition in STMT with a ARRAY_REF tree-node to the
> array.
> > + The condition must have a ARRAY_NOTATION_REF tree. */
> > +
> > +static tree
> > +fix_conditional_array_notations_1 (tree stmt)
>
> Comment should explain return value semantics.
FIXED!
>
> > + TREE_TYPE (array_var[ii]));
> > +
> > + }
>
> Another stray blank line. Check the patch generally for stray blank lines
> immediately before a '}', I don't think they ever make sense, but I may have
> missed some.
I think I have caught all of it. I apologize if I missed any.
>
> > + // XDELETEVEC (array_var);
>
> I don't think this sort of commented-out code should be added. If you're
> deliberately not doing something that a reader might expect to be done, have a
> comment explaining *why* you're not doing it, not just commented-out code to
> do it.
FIXED!
>
> > + error_at (location, "__sec_reduce_min_ind or __sec_reduce_max_ind
> cannot"
> > + " have arrays with dimension greater than 1.");
>
> Diagnostics don't end with ".".
>
> > + default:
> > + gcc_unreachable (); /* You should not reach here. */
>
> No need for comments like this that just repeat the plain semantics of the C
> code. There's nothing else a call to gcc_unreachable could possibly mean; such
> a comment is of no more use than "i++; /* Add 1 to i. */".
FIXED!
>
> > +/* Returns true of FUNC_NAME is a builtin array notation function. The type
> of
> > + function is returned in *TYPE. */
>
> "true if", not "true of".
>
FIXED! Sorry for this stupid mistake.
> > +bool
> > +is_builtin_array_notation_fn (tree func_name, an_reduce_type *type) {
> > + const char *function_name = NULL;
> > +
> > + if (!func_name)
> > + return false;
>
> Another unexplained test for a NULL argument. Again, explain what sort of
> things FUNC_NAME may be. (This is another function that should be using
> BUILT_IN_* enum values rather than strcmp, if you rework how the built-in
> functions are implemented.)
>
> > +/* Returns true of EXPR (and its subtrees) contain
> > +ARRAY_NOTATION_EXPR node. */
>
> "true if", again.
FIXED!
>
> > +/* Replaces array notations in void function call arguments in ARG with loop
> and
> > + tree-node ARRAY_REF and returns that value in a tree node variable called
> > + LOOP. */
>
> LOOP is not an argument to this function, so it doesn't make sense to refer to it
> in the comment. I suspect " in a tree node variable called LOOP" should simply
> be removed.
FIXED!
>
> > + if (TREE_CODE (arg) == CALL_EXPR
> > + && is_builtin_array_notation_fn (CALL_EXPR_FN (arg), &an_type))
> > + {
> > + loop = fix_builtin_array_notation_fn (arg, &new_var);
> > + /* We are ignoring the new var because either the user does not want to
> > + capture it OR he is using sec_reduce_mutating function. */
>
> In general I think "the user" comments should be avoided though this one is a bit
> less awkward than those defining function semantics by reference to "the user".
>
> > +/* Walks through tree node T and find all the call-statments that do not
> return
> > + anything and fix up any array notations they may carry. */
> > +
> > +tree
> > +expand_array_notation_exprs (tree t)
>
> Comment should document return value.
FIXED!
>
> > +{
> > + if (!t || !contains_array_notation_expr (t))
> > + return t;
>
> Another check for NULL without a comment saying NULL is a valid argument.
This function also can receive a null pointer.
>
> > +/* Returns array notation expression for the array base ARRAY of type TYPE,
> > + with start index, length and stride given by START_INDEX, LENGTH and
> STRIDE,
> > + respectively. */
> > +
> > +tree
> > +build_array_notation_ref (location_t loc, tree array, tree start_index,
> > + tree length, tree stride, tree type) {
> > + tree array_ntn_tree = NULL_TREE;
> > + size_t stride_rank = 0, length_rank = 0, start_rank = 0;
> > +
> > + if (!TREE_TYPE (start_index) || !INTEGRAL_TYPE_P (TREE_TYPE
> > + (start_index)))
>
> I'd expect the argument would have to be an expression, so would always have
> a TYPE and the !TREE_TYPE (start_index) check should be unnecessary.
> If it is needed, explain further in the comment at the start of the function.
> Likewise for other checks for NULL types in this function.
FIXED! But Stride could easily be NULL, (e.g. array[0:10]) and in this case we have put one in. So, I have left that check in.
>
> > + {
> > + error_at (loc,
> > + "start-index of array notation triplet is not an integer.");
>
> Diagnostic should not end with ".".
FIXED!
>
> > + error_at (loc, "length of array notation triplet is not an
> > + integer.");
>
> Likewise.
FIXED!
>
> > + error_at (loc, "stride of array notation triplet is not an
> > + integer.");
>
> Likewise.
FIXED!
>
> > + error_at (loc, "rank of an array notation triplet's start-index is not "
> > + "zero.");
>
> Likewise.
FIXED!
>
> > + error_at (loc, "rank of an array notation triplet's length is
> > + not zero.");
>
> Likewise.
FIXED!
>
> > + error_at (loc, "rank of array notation triplet's stride is not
> > + zero.");
>
> Likewise.
FIXED!
>
> That's a coding style review of the first half or so of the patch, more later....
Thanks for reviewing my code!
-Balaji V. Iyer.
>
> --
> Joseph S. Myers
> joseph@codesourcery.com
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: diff.txt
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20130322/9a4d8ab9/attachment.txt>
More information about the Gcc-patches
mailing list