[PATCH] Describe coding conventions surrounding "auto"

Mon May 18 22:51:50 GMT 2020

On 5/18/20 12:02 PM, Richard Sandiford wrote:
> Martin Sebor <msebor@gmail.com> writes:
>> On 5/16/20 4:43 AM, Richard Sandiford wrote:
>>> Sorry for the empty subject line earlier...
>>>
>>> Jason Merrill <jason@redhat.com> writes:
>>>> On Fri, May 15, 2020 at 9:47 PM Martin Sebor <msebor@gmail.com> wrote:
>>>>
>>>>> On 5/15/20 8:08 AM, Richard Sandiford wrote:
>>>>>>> Those are all good examples.  Mind putting that into a patch
>>>>>>> for the coding conventions?
>>>>>> How's this?  I added "new" expressions as another example of the
>>>>>> first category.
>>>>>>
>>>>>> I'm sure I've missed other good uses, but we can always add to the
>>>>>> list later if necessary.
>>>>>>
>>>>>> Thanks,
>>>>>> Richard
>>>>>>
>>>>>>
>>>>>> 0001-Describe-coding-conventions-surrounding-auto.patch
>>>>>>
>>>>>>    From 10b27e367de0fa9d5bf91544385401cdcbdb8c00 Mon Sep 17 00:00:00 2001
>>>>>> From: Richard Sandiford<richard.sandiford@arm.com>
>>>>>> Date: Fri, 15 May 2020 14:58:46 +0100
>>>>>> Subject: [PATCH] Describe coding conventions surrounding "auto"
>>>>>>
>>>>>> ---
>>>>>>     htdocs/codingconventions.html | 53 +++++++++++++++++++++++++++++++++++
>>>>>>     htdocs/codingrationale.html   | 17 +++++++++++
>>>>>>     2 files changed, 70 insertions(+)
>>>>>>
>>>>>> diff --git a/htdocs/codingconventions.html
>>>>> b/htdocs/codingconventions.html
>>>>>> index f4732ef6..ae49fb91 100644
>>>>>> --- a/htdocs/codingconventions.html
>>>>>> +++ b/htdocs/codingconventions.html
>>>>>> @@ -51,6 +51,7 @@ the conventions separately from any other changes to
>>>>> the code.</p>
>>>>>>         <li><a href="#Cxx_Language">Language Use</a>
>>>>>>             <ul>
>>>>>>             <li><a href="#Variable">Variable Definitions</a></li>
>>>>>> +        <li><a href="#Auto">Use of <code>auto</code></a></li>
>>>>>>             <li><a href="#Struct_Use">Struct Definitions</a></li>
>>>>>>             <li><a href="#Class_Use">Class Definitions</a></li>
>>>>>>             <li><a href="#Constructors">Constructors and
>>>>> Destructors</a></li>
>>>>>> @@ -884,6 +885,58 @@ Variables may be simultaneously defined and tested
>>>>> in control expressions.
>>>>>>     <a href="codingrationale.html#variables">Rationale and Discussion</a>
>>>>>>     </p>
>>>>>>
>>>>>> +<h4 id="Auto">Use of <code>auto</code></h4>
>>>>>> +
>>>>>> +<p><code>auto</code> should be used in the following circumstances:
>>>>>> +<ul>
>>>>>> +  <li><p>when the expression gives the C++ type explicitly.  For
>>>>> example</p>
>>>>>> +
>>>>>> +    <blockquote>
>>>>>> +<pre>if (<b>auto *</b>table = dyn_cast <<b>rtx_jump_table_data
>>>>> *</b>> (insn))                 // OK
>>>>>> +  ...
>>>>>> +if (rtx_jump_table_data *table = dyn_cast <rtx_jump_table_data *>
>>>>> (insn))  // Avoid
>>>>>> +  ...
>>>>>> +<b>auto *</b>map = new <b>hash_map <tree, size_t></b>;
>>>>>           // OK
>>>>>> +hash_map <tree, size_t> *map = new hash_map <tree, size_t>;
>>>>> // Avoid</pre></blockquote>
>>>>>> +
>>>>>> +    <p>This rule does not apply to abbreviated type names embedded in
>>>>>> +    an identifier, such as the result of <code>tree_to_shwi</code>.</p>
>>>>>> +  </li>
>>>>>> +  <li>
>>>>>> +    <p>when the expression simply provides an alternative view of an
>>>>> object
>>>>>> +    and is bound to a read-only temporary.  For example:</p>
>>>>>> +
>>>>>> +    <blockquote>
>>>>>> +<pre><b>auto</b> wioff = <b>wi::to_wide (off);</b>         // OK
>>>>>> +wide_int wioff = wi::to_wide (off);     // Avoid if wioff is read-only
>>>>>> +<b>auto</b> minus1 = <b>std::shwi (-1, prec);</b>     // OK
>>>>>> +wide_int minus1 = std::shwi (-1, prec); // Avoid if minus1 is
>>>>> read-only</pre></blockquote>
>>>>>> +
>>>>>> +    <p>In principle this rule applies to other views of an object too,
>>>>>> +    such as a reversed view of a list, or a sequential view of a
>>>>>> +    <code>hash_set</code>.  It does not apply to general
>>>>> temporaries.</p>
>>>>>> +  </li>
>>>>>> +  <li>
>>>>>> +    <p>the type of an iterator.  For example:</p>
>>>>>> +
>>>>>> +    <blockquote>
>>>>>> +<pre><b>auto</b> it = <b>std::find (names.begin (), names.end (),
>>>>> needle)</b>;        // OK
>>>>>> +vector <name_map>::iterator it = std::find (names.begin (),
>>>>>> +                                            names.end (), needle); //
>>>>> Avoid</pre></blockquote>
>>>>>> +  </li>
>>>>>> +  <li>
>>>>>> +    <p>the type of a lambda expression.  For example:</p>
>>>>>> +
>>>>>> +    <blockquote>
>>>>>> +<pre><b>auto</b> f = <b>[] (int x) { return x + 1; }</b>; //
>>>>> OK</pre></blockquote>
>>>>>> +  </li>
>>>>>> +</ul></p>
>>>>>> +
>>>>>> +<p><code>auto</code> should <b>not</b> be used in other contexts.</p>
>>>>>
>>>>> This seems like a severe (and unnecessary) restriction...
>>>>>
>>>>>> +
>>>>>> +<p>
>>>>>> +<a href="codingrationale.html#auto">Rationale and Discussion</a>
>>>>>> +</p>
>>>>>>
>>>>>>     <h4 id="Struct_Use">Struct Definitions</h4>
>>>>>>
>>>>>> diff --git a/htdocs/codingrationale.html b/htdocs/codingrationale.html
>>>>>> index 0b44f1da..a919023c 100644
>>>>>> --- a/htdocs/codingrationale.html
>>>>>> +++ b/htdocs/codingrationale.html
>>>>>> @@ -50,6 +50,23 @@ if (info *q = get_any_available_info ()) {
>>>>>>     }
>>>>>>     </code></pre></blockquote>
>>>>>>
>>>>>> +<h4 id="auto">Use of <code>auto</code></h4>
>>>>>> +
>>>>>> +<p>The reason for preferring <code>auto</code> in expressions like:
>>>>>> +<blockquote><pre>auto wioff = wi::to_wide (off);</pre></blockquote>
>>>>>> +is that using the natural type of the expression is more efficient than
>>>>>> +converting it to types like <code>wide_int</code>.</p>
>>>>>> +
>>>>>> +<p>The reason for excluding other uses of <code>auto</code> is that
>>>>>> +in most other cases the type carries useful information.  For example:
>>>>>> +<blockquote><pre>for (const std::pair <const char *, tree>
>>>>> &elt : indirect_pool)
>>>>>> +  ...</pre></blockquote>
>>>>>> +makes it obvious that <code>elt</code> is a pair and gives the types of
>>>>>> +<code>elt.first</code> and <code>elt.second</code>.  In contrast:
>>>>>> +<blockquote><pre>for (const auto &elt : indirect_pool)
>>>>>> +  ...</pre></blockquote>
>>>>>> +gives no immediate indication what <code>elt</code> is or what can
>>>>>> +be done with it.</p>
>>>>>
>>>>> ...there are countless constructs in C++ 98 as well in C where there
>>>>> is no such indication yet we don't (and very well can't) try to avoid
>>>>> using them.  Examples include macros, members of structures defined
>>>>> far away from the point of their use, results of ordinary function
>>>>> calls, results of overloaded functions or templates, default function
>>>>> arguments, default template parameters, etc.
>>>>>
>>>>> By way of a random example from genrecog.c:
>>>>>
>>>>>            int_set::iterator end
>>>>>           = std::set_union (trans1->labels.begin (), trans1->labels.end (),
>>>>>                             combined->begin (), combined->end (),
>>>>>                             next->begin ());
>>>>>
>>>>> There is no immediate indication precisely what type int_set::iterator
>>>>> is.  All we can tell is that that it's some sort of an iterator, and
>>>>> that should be good enough.  It lets us (even forces us to) write code
>>>>> that satisfies the requirements of the abstraction (whatever it happens
>>>>> to be), and avoid tying it closely to the implementation.  That's
>>>>> a good thing.
>>>
>>> Do you mean that this example should or shouldn't use "auto"?
>>> Iterators are included in the list above, so the idea was that using
>>> "auto" would be the recommended style here.
>>
>> I meant it as a general example where the exact type isn't (and isn't
>> supposed to be) apparent to the caller because it's an implementation
>> detail.
> 
> But like I say, the proposal was that this example should use "auto",
> and it sounds like you might agree.  In that case I don't think the
> example really helps make the decision about whether the coding
> standards are useful or not.
> 
> Do you have other examples that you think would be better written
> using "auto" that aren't covered by the list, and aren't covered
> by Jason's point about template-dependent types?

I think it applies any time mentioning the exact type isn't
necessary, not just because it might introduce a dependency on
details that the code doesn't need, but also because it's more
verbose than the alternative.  The for range loop mentioned
upthread is an example.

But auto is just one of a number of new core language features
new in C++.  Unlike some other C++ features(*), I don't think
it's easily misused, at least not to such an extent to justify
adding a guideline for us to use safely, consistently, or simply
in good taste.

Martin

[*] I could see value in guidelines around rvalue references
or deleted and defaulted functions, for instance, since those
are easily used in dangerous ways.

PS Herb Sutter has a nice article (as usual) where he summarizes
commonly raised concerns with auto and his thoughts on them:
https://herbsutter.com/2013/08/12/gotw-94-solution-aaa-style-almost-always-auto/

>> Similarly, if an API defines a typedef string_tree_pair for
>> std::pair<const char*, tree>, it's the typedef that's meant to
>> be used in preference to what it expands to.
> 
> Using the typedef would be fine.  string_tree_pair describes
> the type pretty well: it's still obvious that "elt" is a pair,
> and what "elt.first" and "elt.second" are.  Typedefs that don't
> describe the type well are bad typedefs. :-)
> 
> The distinction is more between whether "auto" should be used or whether
> an explicit type should be used, rather than between different ways of
> writing the explicit type.  I agree it'd be worth adding "string_tree_pair"
> to the rationale too, to make it clearer that this isn't about expanding
> the typedefs as far they'll go.
> 
>>>>> Unless there is a sound technical reason for avoiding it (e.g.,
>>>>> unacceptable inefficiency or known safety problems) I'd say leave
>>>>> it to everyone's judgment what convenience features to use. If
>>>>> something turns out to be a problem we'll deal with it then.
>>>
>>> But using "auto" is never going to be an efficiency concern,
>>> and probably not a safety concern.  So in the case of "auto",
>>> using that principle would basically come down to "when to use
>>> auto is purely a judgement call".
>>>
>>> I don't see how we can get consistency with that kind of approach.
>>> Or is the argument that we're (or I'm :-)) worrying too much about
>>> consistency and we should just go with the flow?
>>>
>>> If we do treat it as a pure judgement call, the problem then is:
>>> who's judgement matters most here?  The author's or the reviewer's?
>>> Should the reviewer respect the choice of the author even if they
>>> don't personally agree with it, given that there are no technical
>>> issues at stake?
>>
>> When no technical concerns are at stake contributors should be free
>> to use the language as they feel is appropriate.  The fewer hurdles
>> we put in place the more time we will be able to focus on getting
>> the many technical details right, and the more fun it will be to
>> contribute.
> 
> I agree that's a self-consistent approach.  So I think at this point
> three options have been suggested:
> 
> (1) Try to add coding conventions around when "auto" should and
>      shouldn't be used.
> 
> (2) Be broadly accepting of "auto", but reject cases that seem
>      hard to read during code review.
> 
> (3) Allow "auto" to be used anywhere that a contributor thinks is
>      appropriate.  Since the decision isn't usually a technical one,
>      reviewers would be encouraged to respect the author's choice and
>      be discouraged from asking for a different choice.
> 
> Personally my preference order is (1), (3) and (2).  I think (2)
> is the worst of both worlds: it wouldn't give a consistent codebase,
> because whether something is seen as hard to read would vary based
> on the people involved.  And it wouldn't give predictability because
> contributors would only know whether a use of "auto" is acceptable
> by submitting a patch and seeing what the reaction is.
> 
> (3) also wouldn't give a consistent codebase, but it would give more
> predictable reviews.
> 
>> If a consensus emerges that some uses are generally
>> best avoided then it might be appropriate to reflect it in
>> the coding conventions.  But I'd hope that wouldn't happen before
>> we've had time to gain experience with it.
> 
> I think the difference here is between whether we start with a list
> of acceptable uses and expand it with experience, or whether we start
> by assuming all uses are acceptable and restrict it with experience.
> 
> The reason I think the former is better is that we're starting with
> a codebase that doesn't use "auto" at all.  So when the switch is
> flipped and the code is C++11, we'll have a C++11 codebase that never
> uses "auto".  Given that starting point, it seems more natural to list
> cases in which "auto" should be used (as a change to the status quo)
> rather than those in which it shouldn't.
> 
> But besides Jason's point about template-dependent types, I think the
> objections have been to the idea of having coding conventions around
> this in principle, rather than to the actual list.  So at this point
> I'm not sure whether the proposed list would actually stop someone
> from using "auto" in the way they'd typically want to use it,
> and if so, what those use cases are.
> 
> Like I say, the list was only supposed to be a starting point, based on
> my guess at what would be broadly acceptable.  So if the chosen cases
> are themselves a sticking point, suggestions for additions or modifications
> are definitely welcome.
> 
> And the list can be expanded later if new uses crop up.  That's still
> an improvement on how things were until now, where "auto" couldn't be
> used at all.
> 
> Thanks,
> Richard
>