This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation

From: Manuel LÃpez-IbÃÃez <lopezibanez at gmail dot com>
To: David Malcolm <dmalcolm at redhat dot com>, gcc-patches at gcc dot gnu dot org
Date: Thu, 16 Apr 2015 20:29:05 +0200
Subject: Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation
Authentication-results: sourceware.org; auth=none
References: <1429196485 dot 32584 dot 46 dot camel at surprise>

On 16/04/15 17:01, David Malcolm wrote:

Attached is a work-in-progress patch for a new
   -Wmisleading-indentation
warning I've been experimenting with, for GCC 6.


It sounds very cool...

(D) tabs vs spaces.  This is probably the biggest can of worms.

I would suggest to be very conservative when warning. In case of doubt, do notwarn.

An issue here is how to determine (i), or if it's OK to default to 8 and
have a command-line option (param?) to override it? (though what about,
say, each header file?)

In case of a potential warning, you could detect if there is a mix of tab andspaces for the lines affected that may make the warning bogus, and not warn inthat case. That is,


<6 spaces>if (flagC)
<1 tab><space>foo ();
<1 tab><space>bar ();

warns but

<6 spaces>if (flagC)
<8 spaces>foo ();
<2   tabs>bar ();

does not because you don't know how many spaces those 2 tabs take.

Also, in order to detect whether something is a tab or a space, why not simplyre-open and seek? The input file is probably already in memory. See howdiagnostic_show_locus in diagnostics.c gets lines from given locations.

Not even Emacs, which is a GNU package, can get tabs vs. spaces right andcompile-goto-error goes to the wrong column. That seems a can of worms notworth opening.



+
+   This is fed three things by the frontend:
+
+   (A) a series of location_t by the frontend's tokenizer,
+   corresponding to the locations of the token stream.  It uses
+   this to model a stack of "visual blocks", corresponding to
+   indentation levels within the source code.

I don't understand why you need more than three, one for the conditionalstatement, another for the solo statement and another for the next-statement.I also don't get why you need to pass the whole block and new_stmt tovisual_block, don't the locations returned by EXPR_LOCATION suffice?

Given that, the C++ tentative parsing should not be a problem since there is notentative parsing (or there should not be) of if/else/while, and you only careabout the first token when parsing the other two statements.


@@ -236,6 +236,9 @@ c_lex_one_token (c_parser *parser, c_token *token)
   token->keyword = RID_MAX;
   token->pragma_kind = PRAGMA_NONE;

+  if (vis_parser)
+    vis_parser->on_token (token->location);
+

Why you need to save every location? Why not just do this when the parserdetects an if/else/while? Then,


@@ -5081,7 +5086,11 @@ c_parser_if_body (c_parser *parser, bool *if_p)
   else if (c_parser_next_token_is (parser, CPP_OPEN_BRACE))
     add_stmt (c_parser_compound_statement (parser));
   else
-    c_parser_statement_after_labels (parser);
+    {
+      c_parser_statement_after_labels (parser);
+      if (vis_parser)
+	vis_parser->on_solo_stmt (block, if_loc);
+    }
   return c_end_compound_stmt (body_loc, block, flag_isoc99);
 }

You can simply here peek the location of the next token before and afterc_parser_statement_after_labels (parser). Then warn if they are equal or ifthey are on the same line, no? You may need to skip pragmas.



+   Note that we can't simply use statement locations; for example, in:
+     if (flag)
+       x = 1;
+   the "if (flag)"'s location is at the open-paren, and that of the
+   assignment "x = 1;" is at the equals-sign, so any attempt to use
+   statement locations can be fooled by varying the spacing:

Aren't these bugs (or missing features) in the FE locations? I know that thelocation of the assignment is at '=' because we don't have (yet) locations forconstants or variables (all binary expressions should have 3 locations!).

In any case, don't you have the location of the token that starts thestatements? This is all you need as far as I understand.

And why the 'if' points to the open paren? One can easily find the open-parenlocation if needed by seeking in the input buffer.



+	  /* FIXME: what about entirely empty lines???
+	     Presumably they simply don't get tokens.  */

Empty lines no, but preprocessor directives do.

Cheers,

Manuel.

References:
- [RFC stage 1] Proposed new warning: -Wmisleading-indentation
  - From: David Malcolm

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]