[PATCH] v2: C/C++: don't suggest implementation names as spelling fixes (PR c/83236)

David Malcolm dmalcolm@redhat.com
Sat Dec 2 00:07:00 GMT 2017


On Fri, 2017-12-01 at 22:56 +0100, Jakub Jelinek wrote:
> On Fri, Dec 01, 2017 at 04:48:20PM -0500, David Malcolm wrote:
> > PR c/83236 reports an issue where the C FE unhelpfully suggests the
> > use
> > of glibc's private "__ino_t" type when it fails to recognize
> > "ino_t":
> > 
> > $ cat > test.c <<EOF
> > #include <sys/stat.h>
> > ino_t inode;
> > EOF
> > $ gcc -std=c89 -fsyntax-only test.c
> > test.c:2:1: error: unknown type name 'ino_t'; did you mean
> > '__ino_t'?
> >  ino_t inode;
> >  ^~~~~
> >  __ino_t
> > 
> > This patch updates the C/C++ FEs suggestions for unrecognized
> > identifiers
> > so that they don't suggest names that are reserved for use by the
> > implementation i.e. those that begin with an underscore and either
> > an
> > uppercase letter or another underscore.
> > 
> > However, it allows built-in macros that match this pattern to be
> > suggested, since it's useful to be able to suggest __FILE__,
> > __LINE__
> > etc.  Other macros *are* filtered.
> > 
> > One wart with the patch: the existing macro-handling spellcheck
> > code
> > is in spellcheck-tree.c, and needs to call the the new function
> > "name_reserved_for_implementation_p", however the latter relates to
> > the C family of FEs.
> > Perhaps I should move all of the the macro-handling stuff in
> > spellcheck-tree.h/c to e.g. a new c-family/c-spellcheck.h/c as a
> > first step?
> > 
> > Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.
> > 
> > OK for trunk?
> > 
> > gcc/c/ChangeLog:
> > 	PR c/83236
> > 	* c-decl.c (lookup_name_fuzzy): Don't suggest names that are
> > 	reserved for use by the implementation.
> > 
> > gcc/cp/ChangeLog:
> > 	PR c/83236
> > 	* name-lookup.c (consider_binding_level): Don't suggest names
> > that
> > 	are reserved for use by the implementation.
> > 
> > gcc/ChangeLog:
> > 	PR c/83236
> > 	* spellcheck-tree.c (name_reserved_for_implementation_p): New
> > 	function.
> > 	(should_suggest_as_macro_p): New function.
> > 	(find_closest_macro_cpp_cb): Move the check for NT_MACRO to
> > 	should_suggest_as_macro_p and call it.
> > 	(selftest::test_name_reserved_for_implementation_p): New
> > function.
> > 	(selftest::spellcheck_tree_c_tests): Call it.
> > 	* spellcheck-tree.h (name_reserved_for_implementation_p): New
> > 	decl.
> > 
> > gcc/testsuite/ChangeLog:
> > 	PR c/83236
> > 	* c-c++-common/spellcheck-reserved.c: New test case.
> > ---
> >  gcc/c/c-decl.c                                   |  5 +++
> >  gcc/cp/name-lookup.c                             | 18 +++++++---
> >  gcc/spellcheck-tree.c                            | 46
> > +++++++++++++++++++++++-
> >  gcc/spellcheck-tree.h                            |  2 ++
> >  gcc/testsuite/c-c++-common/spellcheck-reserved.c | 25
> > +++++++++++++
> >  5 files changed, 91 insertions(+), 5 deletions(-)
> >  create mode 100644 gcc/testsuite/c-c++-common/spellcheck-
> > reserved.c
> > 
> > diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
> > index 56c63d8..dfd136d 100644
> > --- a/gcc/c/c-decl.c
> > +++ b/gcc/c/c-decl.c
> > @@ -4041,6 +4041,11 @@ lookup_name_fuzzy (tree name, enum
> > lookup_name_fuzzy_kind kind, location_t loc)
> >  	if (TREE_CODE (binding->decl) == FUNCTION_DECL)
> >  	  if (C_DECL_IMPLICIT (binding->decl))
> >  	    continue;
> > +	/* Don't suggest names that are reserved for use by the
> > +	   implementation.  */
> > +	if (name_reserved_for_implementation_p
> > +	    (IDENTIFIER_POINTER (binding->id)))
> 
> Can't you use a temporary to avoid wrapping line between function
> name and ( ?

Fixed.

> More importantly, does this mean if I mistype __builtin_strtchr it
> won't suggest __builtin_strrchr?  Would be nice if the filtering
> of the names reserved for implementation isn't done if the
> name being looked up is reserved for implementation.

Good idea, thanks.

Here's an updated version of the patch.

Changed in v2:
* don't filter suggestions if the name name being looked up
  is itself reserved for implementation
* fix wrapping in c-decl.c's lookup_name_fuzzy
* name-lookup.c (consider_binding_level): rename new variable from "name"
  to "suggestion" to avoid shadowing a param
* spellcheck-tree.c (test_name_reserved_for_implementation_p): Add more
  test coverage ("_" and "__")

One additional wart I noticed writing the testase is that the
C and C++ frontends offer different suggestions for "__builtin_strtchr".
C recomends:
  __builtin_strchr
whereas C++ recommends:
  __builtin_strrchr

The reason is that the C FE visits the builtins in order of builtins.def,
whereas C++ visits them in the reverse order.

Both have the same edit distance, and so the first "winner" in
best_match varies between FEs.

This is a pre-existing issue, though (not sure if it warrants a PR).

Bootstrap&regrtest in progress; OK if it passes?

As before, the other wart is that the existing macro-handling
spellcheck code is in spellcheck-tree.c, and needs to call the the
new function "name_reserved_for_implementation_p", however the latter
relates to the C family of FEs.
Perhaps I should move all of the the macro-handling stuff in
spellcheck-tree.h/c to e.g. a new c-family/c-spellcheck.h/c as a
first step?

gcc/c/ChangeLog:
	PR c/83236
	* c-decl.c (lookup_name_fuzzy): Don't suggest names that are
	reserved for use by the implementation.

gcc/cp/ChangeLog:
	PR c/83236
	* name-lookup.c (consider_binding_level): Don't suggest names that
	are reserved for use by the implementation.

gcc/ChangeLog:
	PR c/83236
	* spellcheck-tree.c (name_reserved_for_implementation_p): New
	function.
	(should_suggest_as_macro_p): New function.
	(find_closest_macro_cpp_cb): Move the check for NT_MACRO to
	should_suggest_as_macro_p and call it.
	(selftest::test_name_reserved_for_implementation_p): New function.
	(selftest::spellcheck_tree_c_tests): Call it.
	* spellcheck-tree.h (name_reserved_for_implementation_p): New
	decl.

gcc/testsuite/ChangeLog:
	PR c/83236
	* c-c++-common/spellcheck-reserved.c: New test case.
---
 gcc/c/c-decl.c                                   | 11 ++++++
 gcc/cp/name-lookup.c                             | 24 ++++++++++--
 gcc/spellcheck-tree.c                            | 48 +++++++++++++++++++++++-
 gcc/spellcheck-tree.h                            |  2 +
 gcc/testsuite/c-c++-common/spellcheck-reserved.c | 35 +++++++++++++++++
 5 files changed, 115 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/spellcheck-reserved.c

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 56c63d8..eaa6d55 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -4026,6 +4026,9 @@ lookup_name_fuzzy (tree name, enum lookup_name_fuzzy_kind kind, location_t loc)
 						  IDENTIFIER_POINTER (name),
 						  header_hint));
 
+  bool name_reserved
+    = name_reserved_for_implementation_p (IDENTIFIER_POINTER (name));
+
   best_match<tree, tree> bm (name);
 
   /* Look within currently valid scopes.  */
@@ -4041,6 +4044,14 @@ lookup_name_fuzzy (tree name, enum lookup_name_fuzzy_kind kind, location_t loc)
 	if (TREE_CODE (binding->decl) == FUNCTION_DECL)
 	  if (C_DECL_IMPLICIT (binding->decl))
 	    continue;
+	/* Don't suggest names that are reserved for use by the
+	   implementation, unless NAME itself is reserved.  */
+	if (!name_reserved)
+	  {
+	    const char *suggestion_str = IDENTIFIER_POINTER (binding->id);
+	    if (name_reserved_for_implementation_p (suggestion_str))
+	      continue;
+	  }
 	switch (kind)
 	  {
 	  case FUZZY_LOOKUP_TYPENAME:
diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 9f65c4d..f5581b7 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -5613,6 +5613,9 @@ consider_binding_level (tree name, best_match <tree, const char *> &bm,
 	  bm.consider (IDENTIFIER_POINTER (best_matching_field));
       }
 
+  bool name_reserved
+    = name_reserved_for_implementation_p (IDENTIFIER_POINTER (name));
+
   for (tree t = lvl->names; t; t = TREE_CHAIN (t))
     {
       tree d = t;
@@ -5633,10 +5636,23 @@ consider_binding_level (tree name, best_match <tree, const char *> &bm,
 	  && DECL_ANTICIPATED (d))
 	continue;
 
-      if (tree name = DECL_NAME (d))
-	/* Ignore internal names with spaces in them.  */
-	if (!strchr (IDENTIFIER_POINTER (name), ' '))
-	  bm.consider (IDENTIFIER_POINTER (name));
+      tree suggestion = DECL_NAME (d);
+      if (!suggestion)
+	continue;
+
+      const char *suggestion_str = IDENTIFIER_POINTER (suggestion);
+
+      /* Ignore internal names with spaces in them.  */
+      if (strchr (suggestion_str, ' '))
+	continue;
+
+      /* Don't suggest names that are reserved for use by the
+	 implementation, unless NAME itself is reserved.  */
+      if (name_reserved_for_implementation_p (suggestion_str)
+	  && !name_reserved)
+	continue;
+
+      bm.consider (suggestion_str);
     }
 }
 
diff --git a/gcc/spellcheck-tree.c b/gcc/spellcheck-tree.c
index 56740b9..310f8d2 100644
--- a/gcc/spellcheck-tree.c
+++ b/gcc/spellcheck-tree.c
@@ -66,6 +66,36 @@ find_closest_identifier (tree target, const auto_vec<tree> *candidates)
   return bm.get_best_meaningful_candidate ();
 }
 
+/* Return true iff STR begin with an underscore and either an uppercase
+   letter or another underscore, and is thus, for C and C++, reserved for
+   use by the implementation.  */
+
+bool
+name_reserved_for_implementation_p (const char *str)
+{
+  if (str[0] != '_')
+    return false;
+  return (str[1] == '_' || ISUPPER(str[1]));
+}
+
+/* Return true iff HASHNODE is a macro that should be offered as a
+   suggestion for a misspelling.  */
+
+static bool
+should_suggest_as_macro_p (cpp_hashnode *hashnode)
+{
+  if (hashnode->type != NT_MACRO)
+    return false;
+
+  /* Don't suggest names reserved for the implementation, but do suggest the builtin
+     macros such as __FILE__, __LINE__ etc.  */
+  if (name_reserved_for_implementation_p ((const char *)hashnode->ident.str)
+      && !(hashnode->flags & NODE_BUILTIN))
+    return false;
+
+  return true;
+}
+
 /* A callback for cpp_forall_identifiers, for use by best_macro_match's ctor.
    Process HASHNODE and update the best_macro_match instance pointed to be
    USER_DATA.  */
@@ -74,7 +104,7 @@ static int
 find_closest_macro_cpp_cb (cpp_reader *, cpp_hashnode *hashnode,
 			   void *user_data)
 {
-  if (hashnode->type != NT_MACRO)
+  if (!should_suggest_as_macro_p (hashnode))
     return 1;
 
   best_macro_match *bmm = (best_macro_match *)user_data;
@@ -131,12 +161,28 @@ test_find_closest_identifier ()
 				      &candidates));
 }
 
+/* Verify that name_reserved_for_implementation_p is sane.  */
+
+static void
+test_name_reserved_for_implementation_p ()
+{
+  ASSERT_FALSE (name_reserved_for_implementation_p (""));
+  ASSERT_FALSE (name_reserved_for_implementation_p ("foo"));
+  ASSERT_FALSE (name_reserved_for_implementation_p ("_"));
+  ASSERT_FALSE (name_reserved_for_implementation_p ("_foo"));
+  ASSERT_FALSE (name_reserved_for_implementation_p ("_42"));
+  ASSERT_TRUE (name_reserved_for_implementation_p ("_Foo"));
+  ASSERT_TRUE (name_reserved_for_implementation_p ("__"));
+  ASSERT_TRUE (name_reserved_for_implementation_p ("__foo"));
+}
+
 /* Run all of the selftests within this file.  */
 
 void
 spellcheck_tree_c_tests ()
 {
   test_find_closest_identifier ();
+  test_name_reserved_for_implementation_p ();
 }
 
 } // namespace selftest
diff --git a/gcc/spellcheck-tree.h b/gcc/spellcheck-tree.h
index eecfd1a..851ea22 100644
--- a/gcc/spellcheck-tree.h
+++ b/gcc/spellcheck-tree.h
@@ -74,4 +74,6 @@ class best_macro_match : public best_match<tree, cpp_hashnode *>
 		    cpp_reader *reader);
 };
 
+extern bool name_reserved_for_implementation_p (const char *str);
+
 #endif  /* GCC_SPELLCHECK_TREE_H  */
diff --git a/gcc/testsuite/c-c++-common/spellcheck-reserved.c b/gcc/testsuite/c-c++-common/spellcheck-reserved.c
new file mode 100644
index 0000000..08630ed
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/spellcheck-reserved.c
@@ -0,0 +1,35 @@
+/* Verify that we don't offer names that are reserved for the
+   implementation (PR c/83236).  */
+/* { dg-options "-nostdinc" } */
+
+/* Example of an identifier with a leading double-underscore.
+   We shouldn't offer '__ino_t' as a suggestion for an unknown 'ino_t'.  */
+
+typedef unsigned long int __ino_t;
+ino_t inode; /* { dg-error "did you mean 'int'" } */
+
+
+/* Example of a typedef with a leading double-underscore.  */
+
+typedef unsigned char __u_char;
+u_char ch; /* { dg-error "did you mean 'char'" } */
+
+
+/* Example of a macro with a leading double-underscore.  */
+
+# define __SOME_MACRO	int
+
+SOME_MACRO foo; /* { dg-bogus "__SOME_MACRO" } */
+/* { dg-error "'SOME_MACRO'" "" { target *-*-* } .-1 } */
+
+
+/* If the misspelled name itself matches the "reserved" pattern, then
+   it's OK to suggest such names.  */
+
+void test (const char *buf, char ch)
+{
+  __builtin_strtchr (buf, ch); /* { dg-line misspelled_reserved } */
+  /* { dg-warning "did you mean '__builtin_strchr'" "" { target c } misspelled_reserved } */
+  /* { dg-error "not declared"  "" { target c++ } misspelled_reserved } */
+  /* { dg-message "'__builtin_strrchr'"  "" { target c++ } misspelled_reserved } */
+}
-- 
1.8.5.3



More information about the Gcc-patches mailing list