[PATCH] Describe coding conventions surrounding "auto"

Mon May 18 16:37:36 GMT 2020

On 5/16/20 4:43 AM, Richard Sandiford wrote:
> Sorry for the empty subject line earlier...
> 
> Jason Merrill <jason@redhat.com> writes:
>> On Fri, May 15, 2020 at 9:47 PM Martin Sebor <msebor@gmail.com> wrote:
>>
>>> On 5/15/20 8:08 AM, Richard Sandiford wrote:
>>>>> Those are all good examples.  Mind putting that into a patch
>>>>> for the coding conventions?
>>>> How's this?  I added "new" expressions as another example of the
>>>> first category.
>>>>
>>>> I'm sure I've missed other good uses, but we can always add to the
>>>> list later if necessary.
>>>>
>>>> Thanks,
>>>> Richard
>>>>
>>>>
>>>> 0001-Describe-coding-conventions-surrounding-auto.patch
>>>>
>>>>   From 10b27e367de0fa9d5bf91544385401cdcbdb8c00 Mon Sep 17 00:00:00 2001
>>>> From: Richard Sandiford<richard.sandiford@arm.com>
>>>> Date: Fri, 15 May 2020 14:58:46 +0100
>>>> Subject: [PATCH] Describe coding conventions surrounding "auto"
>>>>
>>>> ---
>>>>    htdocs/codingconventions.html | 53 +++++++++++++++++++++++++++++++++++
>>>>    htdocs/codingrationale.html   | 17 +++++++++++
>>>>    2 files changed, 70 insertions(+)
>>>>
>>>> diff --git a/htdocs/codingconventions.html
>>> b/htdocs/codingconventions.html
>>>> index f4732ef6..ae49fb91 100644
>>>> --- a/htdocs/codingconventions.html
>>>> +++ b/htdocs/codingconventions.html
>>>> @@ -51,6 +51,7 @@ the conventions separately from any other changes to
>>> the code.</p>
>>>>        <li><a href="#Cxx_Language">Language Use</a>
>>>>            <ul>
>>>>            <li><a href="#Variable">Variable Definitions</a></li>
>>>> +        <li><a href="#Auto">Use of <code>auto</code></a></li>
>>>>            <li><a href="#Struct_Use">Struct Definitions</a></li>
>>>>            <li><a href="#Class_Use">Class Definitions</a></li>
>>>>            <li><a href="#Constructors">Constructors and
>>> Destructors</a></li>
>>>> @@ -884,6 +885,58 @@ Variables may be simultaneously defined and tested
>>> in control expressions.
>>>>    <a href="codingrationale.html#variables">Rationale and Discussion</a>
>>>>    </p>
>>>>
>>>> +<h4 id="Auto">Use of <code>auto</code></h4>
>>>> +
>>>> +<p><code>auto</code> should be used in the following circumstances:
>>>> +<ul>
>>>> +  <li><p>when the expression gives the C++ type explicitly.  For
>>> example</p>
>>>> +
>>>> +    <blockquote>
>>>> +<pre>if (<b>auto *</b>table = dyn_cast <<b>rtx_jump_table_data
>>> *</b>> (insn))                 // OK
>>>> +  ...
>>>> +if (rtx_jump_table_data *table = dyn_cast <rtx_jump_table_data *>
>>> (insn))  // Avoid
>>>> +  ...
>>>> +<b>auto *</b>map = new <b>hash_map <tree, size_t></b>;
>>>          // OK
>>>> +hash_map <tree, size_t> *map = new hash_map <tree, size_t>;
>>> // Avoid</pre></blockquote>
>>>> +
>>>> +    <p>This rule does not apply to abbreviated type names embedded in
>>>> +    an identifier, such as the result of <code>tree_to_shwi</code>.</p>
>>>> +  </li>
>>>> +  <li>
>>>> +    <p>when the expression simply provides an alternative view of an
>>> object
>>>> +    and is bound to a read-only temporary.  For example:</p>
>>>> +
>>>> +    <blockquote>
>>>> +<pre><b>auto</b> wioff = <b>wi::to_wide (off);</b>         // OK
>>>> +wide_int wioff = wi::to_wide (off);     // Avoid if wioff is read-only
>>>> +<b>auto</b> minus1 = <b>std::shwi (-1, prec);</b>     // OK
>>>> +wide_int minus1 = std::shwi (-1, prec); // Avoid if minus1 is
>>> read-only</pre></blockquote>
>>>> +
>>>> +    <p>In principle this rule applies to other views of an object too,
>>>> +    such as a reversed view of a list, or a sequential view of a
>>>> +    <code>hash_set</code>.  It does not apply to general
>>> temporaries.</p>
>>>> +  </li>
>>>> +  <li>
>>>> +    <p>the type of an iterator.  For example:</p>
>>>> +
>>>> +    <blockquote>
>>>> +<pre><b>auto</b> it = <b>std::find (names.begin (), names.end (),
>>> needle)</b>;        // OK
>>>> +vector <name_map>::iterator it = std::find (names.begin (),
>>>> +                                            names.end (), needle); //
>>> Avoid</pre></blockquote>
>>>> +  </li>
>>>> +  <li>
>>>> +    <p>the type of a lambda expression.  For example:</p>
>>>> +
>>>> +    <blockquote>
>>>> +<pre><b>auto</b> f = <b>[] (int x) { return x + 1; }</b>; //
>>> OK</pre></blockquote>
>>>> +  </li>
>>>> +</ul></p>
>>>> +
>>>> +<p><code>auto</code> should <b>not</b> be used in other contexts.</p>
>>>
>>> This seems like a severe (and unnecessary) restriction...
>>>
>>>> +
>>>> +<p>
>>>> +<a href="codingrationale.html#auto">Rationale and Discussion</a>
>>>> +</p>
>>>>
>>>>    <h4 id="Struct_Use">Struct Definitions</h4>
>>>>
>>>> diff --git a/htdocs/codingrationale.html b/htdocs/codingrationale.html
>>>> index 0b44f1da..a919023c 100644
>>>> --- a/htdocs/codingrationale.html
>>>> +++ b/htdocs/codingrationale.html
>>>> @@ -50,6 +50,23 @@ if (info *q = get_any_available_info ()) {
>>>>    }
>>>>    </code></pre></blockquote>
>>>>
>>>> +<h4 id="auto">Use of <code>auto</code></h4>
>>>> +
>>>> +<p>The reason for preferring <code>auto</code> in expressions like:
>>>> +<blockquote><pre>auto wioff = wi::to_wide (off);</pre></blockquote>
>>>> +is that using the natural type of the expression is more efficient than
>>>> +converting it to types like <code>wide_int</code>.</p>
>>>> +
>>>> +<p>The reason for excluding other uses of <code>auto</code> is that
>>>> +in most other cases the type carries useful information.  For example:
>>>> +<blockquote><pre>for (const std::pair <const char *, tree>
>>> &elt : indirect_pool)
>>>> +  ...</pre></blockquote>
>>>> +makes it obvious that <code>elt</code> is a pair and gives the types of
>>>> +<code>elt.first</code> and <code>elt.second</code>.  In contrast:
>>>> +<blockquote><pre>for (const auto &elt : indirect_pool)
>>>> +  ...</pre></blockquote>
>>>> +gives no immediate indication what <code>elt</code> is or what can
>>>> +be done with it.</p>
>>>
>>> ...there are countless constructs in C++ 98 as well in C where there
>>> is no such indication yet we don't (and very well can't) try to avoid
>>> using them.  Examples include macros, members of structures defined
>>> far away from the point of their use, results of ordinary function
>>> calls, results of overloaded functions or templates, default function
>>> arguments, default template parameters, etc.
>>>
>>> By way of a random example from genrecog.c:
>>>
>>>           int_set::iterator end
>>>          = std::set_union (trans1->labels.begin (), trans1->labels.end (),
>>>                            combined->begin (), combined->end (),
>>>                            next->begin ());
>>>
>>> There is no immediate indication precisely what type int_set::iterator
>>> is.  All we can tell is that that it's some sort of an iterator, and
>>> that should be good enough.  It lets us (even forces us to) write code
>>> that satisfies the requirements of the abstraction (whatever it happens
>>> to be), and avoid tying it closely to the implementation.  That's
>>> a good thing.
> 
> Do you mean that this example should or shouldn't use "auto"?
> Iterators are included in the list above, so the idea was that using
> "auto" would be the recommended style here.

I meant it as a general example where the exact type isn't (and isn't
supposed to be) apparent to the caller because it's an implementation
detail.

Similarly, if an API defines a typedef string_tree_pair for
std::pair<const char*, tree>, it's the typedef that's meant to
be used in preference to what it expands to.

>>> Unless there is a sound technical reason for avoiding it (e.g.,
>>> unacceptable inefficiency or known safety problems) I'd say leave
>>> it to everyone's judgment what convenience features to use. If
>>> something turns out to be a problem we'll deal with it then.
> 
> But using "auto" is never going to be an efficiency concern,
> and probably not a safety concern.  So in the case of "auto",
> using that principle would basically come down to "when to use
> auto is purely a judgement call".
> 
> I don't see how we can get consistency with that kind of approach.
> Or is the argument that we're (or I'm :-)) worrying too much about
> consistency and we should just go with the flow?
> 
> If we do treat it as a pure judgement call, the problem then is:
> who's judgement matters most here?  The author's or the reviewer's?
> Should the reviewer respect the choice of the author even if they
> don't personally agree with it, given that there are no technical
> issues at stake?

When no technical concerns are at stake contributors should be free
to use the language as they feel is appropriate.  The fewer hurdles
we put in place the more time we will be able to focus on getting
the many technical details right, and the more fun it will be to
contribute.  If a consensus emerges that some uses are generally
best avoided then it might be appropriate to reflect it in
the coding conventions.  But I'd hope that wouldn't happen before
we've had time to gain experience with it.

Martin

> 
>> I agree with this.  If using 'auto' makes the code harder to read, that's a
>> good comment for code review, but it's hard to write a general rule for
>> this sort of thing.
> 
> I think the downsides of a pure "it's up to the reviewer" approach are:
> 
> - Every contributor and reviewer has their own opinions.  The point of
>    coding standards is to aim for consistency in a large project despite
>    these various and often conflicting opinions.
> 
> - If it is (or gives the impression of being) purely down to the
>    individual opinions of the reviewer, things can get awkward.
>    E.g. -- and maybe I shouldn't use a personal example like this --
>    I remember Martin S has complained a few times when, in code review,
>    Jeff has asked for changes to match existing GCC style (e.g. to use
>    upper case instead of CamelCase for enum values).  In all the examples
>    I saw, Jeff was asking for what I understood the GCC style to be.
>    But Martin gave the impression of being quite annoyed at having to
>    make these changes, talking in the "enum case" about having to learn
>    individual reviewers' preferences.
> 
>    I think that kind of reaction is inevitable if we don't try to agree
>    common standards.  If things aren't written down, requests from reviewers
>    not to use a feature can seem arbitrary or whimsical even when they're not.
>    The same goes for requests to follow conventions that have built up
>    over time without being written down.  The temptation can be to push
>    back against code review comments that aren't purely technical in nature.
> 
>    If we take this approach, then over time (because of the different
>    opinions of different contributors) there is likely to be code in the
>    codebase that uses "auto" in the way that any given person wants to
>    use it.  It will seem even more arbitrary to reject the use of "auto"
>    in one context if it has already been used elsewhere in another
>    similar context.  So over time, we'll either get more review friction
>    or come to an understanding that the use of "auto" isn't a suitable
>    comment for code review after all.
> 
>    I think the latter is more likely.  It can even be framed positively:
>    reviewers should respect the choice of author to use "auto" whenever
>    they like, and not impose their own opinions.
> 
>    But while individual patches are one source of hard-to-read code,
>    another is when different styles build up over time.  And that's what
>    I'd really like to avoid.
> 
>    Having a consistent style in itself makes code easier to read.
>    It also makes it easier to modify, because there's less guesswork
>    involved.
> 
> Like I say, this list isn't supposed to be fixed for all time.
> We can add to it whenever a good missing case crops up.  And the
> template one that you (Jason) mentioned in the earlier message is
> definitely a good one :-)
> 
> And to be clear, this is in no way an "anti-auto" drive.  It's purely
> about taking one feature that is (rightly) going to be used often
> and trying to add some consistency and predictability around it.
> 
> Thanks,
> Richard
>