This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation


On 16/04/15 17:01, David Malcolm wrote:
Attached is a work-in-progress patch for a new
   -Wmisleading-indentation
warning I've been experimenting with, for GCC 6.

It sounds very cool...

(D) tabs vs spaces.  This is probably the biggest can of worms.

I would suggest to be very conservative when warning. In case of doubt, do not warn.

An issue here is how to determine (i), or if it's OK to default to 8 and
have a command-line option (param?) to override it? (though what about,
say, each header file?)

In case of a potential warning, you could detect if there is a mix of tab and spaces for the lines affected that may make the warning bogus, and not warn in that case. That is,

<6 spaces>if (flagC)
<1 tab><space>foo ();
<1 tab><space>bar ();

warns but

<6 spaces>if (flagC)
<8 spaces>foo ();
<2   tabs>bar ();

does not because you don't know how many spaces those 2 tabs take.

Also, in order to detect whether something is a tab or a space, why not simply re-open and seek? The input file is probably already in memory. See how diagnostic_show_locus in diagnostics.c gets lines from given locations.

Not even Emacs, which is a GNU package, can get tabs vs. spaces right and compile-goto-error goes to the wrong column. That seems a can of worms not worth opening.


+
+   This is fed three things by the frontend:
+
+   (A) a series of location_t by the frontend's tokenizer,
+   corresponding to the locations of the token stream.  It uses
+   this to model a stack of "visual blocks", corresponding to
+   indentation levels within the source code.

I don't understand why you need more than three, one for the conditional statement, another for the solo statement and another for the next-statement. I also don't get why you need to pass the whole block and new_stmt to visual_block, don't the locations returned by EXPR_LOCATION suffice?

Given that, the C++ tentative parsing should not be a problem since there is no tentative parsing (or there should not be) of if/else/while, and you only care about the first token when parsing the other two statements.

@@ -236,6 +236,9 @@ c_lex_one_token (c_parser *parser, c_token *token)
   token->keyword = RID_MAX;
   token->pragma_kind = PRAGMA_NONE;

+  if (vis_parser)
+    vis_parser->on_token (token->location);
+

Why you need to save every location? Why not just do this when the parser detects an if/else/while? Then,

@@ -5081,7 +5086,11 @@ c_parser_if_body (c_parser *parser, bool *if_p)
   else if (c_parser_next_token_is (parser, CPP_OPEN_BRACE))
     add_stmt (c_parser_compound_statement (parser));
   else
-    c_parser_statement_after_labels (parser);
+    {
+      c_parser_statement_after_labels (parser);
+      if (vis_parser)
+	vis_parser->on_solo_stmt (block, if_loc);
+    }
   return c_end_compound_stmt (body_loc, block, flag_isoc99);
 }

You can simply here peek the location of the next token before and after c_parser_statement_after_labels (parser). Then warn if they are equal or if they are on the same line, no? You may need to skip pragmas.


+   Note that we can't simply use statement locations; for example, in:
+     if (flag)
+       x = 1;
+   the "if (flag)"'s location is at the open-paren, and that of the
+   assignment "x = 1;" is at the equals-sign, so any attempt to use
+   statement locations can be fooled by varying the spacing:

Aren't these bugs (or missing features) in the FE locations? I know that the location of the assignment is at '=' because we don't have (yet) locations for constants or variables (all binary expressions should have 3 locations!).

In any case, don't you have the location of the token that starts the statements? This is all you need as far as I understand.

And why the 'if' points to the open paren? One can easily find the open-paren location if needed by seeking in the input buffer.


+	  /* FIXME: what about entirely empty lines???
+	     Presumably they simply don't get tokens.  */

Empty lines no, but preprocessor directives do.

Cheers,

Manuel.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]