This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] Better error recovery for merge-conflict markers (v5)


On Wed, 2015-12-09 at 18:44 +0100, Bernd Schmidt wrote:
> On 12/09/2015 05:58 PM, David Malcolm wrote:
> > On Wed, 2015-11-04 at 14:56 +0100, Bernd Schmidt wrote:
> >>
> >> This seems like fairly low impact but also low cost, so I'm fine with it
> >> in principle. I wonder whether the length of the marker is the same
> >> across all versions of patch (and VC tools)?
> >
> > It's hardcoded for GNU patch:
> [...]
>  >From what I can tell, Perforce is the outlier here.
>
> Thanks for checking all that.
>
> >> Just thinking out loud - I guess it would be too much to hope for to
> >> share lexers between frontends so that we need only one copy of this?
> >
> > Probably :(
>
> Someone slap sense into me, I just thought of deriving C and C++ parsers
> from a common base class... (no this is not a suggestion for this patch).
>
> > Would a better wording be:
> >
> > extern short some_var; /* This line would lead to a warning due to the
> >                            duplicate name, but it is skipped when handling
> >                            the conflict marker.  */
>
> I think so, yes.
>
> > That said, it's not clear they're always at the beginning of a line;
> > this bazaar bug indicates that CVS (and bazaar) can emit them
> > mid-line:
> >    https://bugs.launchpad.net/bzr/+bug/36399
>
> Ok. CVS I think we shouldn't worry about, and it looks like this is one
> particular bug/corner case where the conflict end marker is the last
> thing in the file. I think on the whole it's best to check for beginning
> of the line as you've done.
>
> > Wording-wise, should it be "merge conflict marker", rather
> > than "patch conflict marker"?
> >
> > Clang spells it:
> > "error: version control conflict marker in file"
> > http://blog.llvm.org/2010/04/amazing-feats-of-clang-error-recovery.html#merge_conflicts
>
> Yeah, if another compiler has a similar/identical diagnostic I think we
> should just copy that unless there's a very good reason not to.
>
> > Rebased on top of r231445 (from yesterday).
> > Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.
> > Adds 82 new PASSes to g++.sum and 27 new PASSes to gcc.sum.
> >
> > OK for trunk?
>
> I'm inclined to say yes since it was originally submitted in time and
> it's hard to imagine how the change could be risky (I'll point out right
> away that there are one or two other patches in the queue that were also
> submitted in time which I feel should not be considered for gcc-6 at
> this point due to risk).
>
> Let's wait until the end of the week for objections, commit then.

I got thinking about what we'd have to do to support Perforce-style
markers, and began to find my token-matching approach to be a little
clunky (in conjunction with reading Martin's observations on
c_parser_peek_nth_token).

Here's a reimplementation of the patch which takes a much simpler
approach, and avoids the need to touch the C lexer: check that we're
not in a macro expansion and then read in the source line, and
textually compare against the various possible conflict markers.
This adds the requirement that the source file be readable, so it
won't detect conflict markers in a .i file from -save-temps, but
that seems no great loss compared to the simpler, more flexible
implementation.  We're about to emit an error at the line, so
this shouldn't add any extra file access for the default case
of printing the source line after the error.

Is this approach preferable, or should I just go with the
v4 approach?

Other changes:
- Updated wording to match clang's
  "error: version control conflict marker in file"
and replaced "patch conflict marker" with "conflict marker" in the code
and names of test cases.  (I have a version of the v4 patch
with those changes).
- Renamed local "loc" to "marker_loc" to clarify things.

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.

Sorry to prolong this; thanks for your patience.

gcc/c-family/ChangeLog:
	* c-common.h (conflict_marker_at_location_p): New prototype.
	* c-lex.c (conflict_marker_within_line_p): New function.
	(conflict_marker_at_location_p): New function.

gcc/c/ChangeLog:
	* c-parser.c (c_parser_error): Detect conflict markers and report
	them as such.

gcc/cp/ChangeLog:
	* parser.c (cp_parser_error): Detect conflict markers and report
	them as such.

gcc/testsuite/ChangeLog:
	* c-c++-common/conflict-markers-1.c: New testcase.
	* c-c++-common/conflict-markers-2.c: Likewise.
	* c-c++-common/conflict-markers-3.c: Likewise.
	* c-c++-common/conflict-markers-4.c: Likewise.
	* c-c++-common/conflict-markers-5.c: Likewise.
	* c-c++-common/conflict-markers-6.c: Likewise.
	* c-c++-common/conflict-markers-7.c: Likewise.
	* c-c++-common/conflict-markers-8.c: Likewise.
	* c-c++-common/conflict-markers-9.c: Likewise.
	* c-c++-common/conflict-markers-10.c: Likewise.
	* c-c++-common/conflict-markers-11.c: Likewise.
	* g++.dg/conflict-markers-1.C: Likewise.
---
 gcc/c-family/c-common.h                          |  4 ++
 gcc/c-family/c-lex.c                             | 67 ++++++++++++++++++++++++
 gcc/c/c-parser.c                                 | 14 +++++
 gcc/cp/parser.c                                  | 13 +++++
 gcc/testsuite/c-c++-common/conflict-markers-1.c  |  9 ++++
 gcc/testsuite/c-c++-common/conflict-markers-10.c | 23 ++++++++
 gcc/testsuite/c-c++-common/conflict-markers-11.c | 14 +++++
 gcc/testsuite/c-c++-common/conflict-markers-2.c  |  2 +
 gcc/testsuite/c-c++-common/conflict-markers-3.c  | 11 ++++
 gcc/testsuite/c-c++-common/conflict-markers-4.c  | 11 ++++
 gcc/testsuite/c-c++-common/conflict-markers-5.c  | 11 ++++
 gcc/testsuite/c-c++-common/conflict-markers-6.c  | 38 ++++++++++++++
 gcc/testsuite/c-c++-common/conflict-markers-7.c  |  6 +++
 gcc/testsuite/c-c++-common/conflict-markers-8.c  |  4 ++
 gcc/testsuite/c-c++-common/conflict-markers-9.c  |  8 +++
 gcc/testsuite/g++.dg/conflict-markers-1.C        | 13 +++++
 16 files changed, 248 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-1.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-10.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-11.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-2.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-3.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-4.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-5.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-6.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-7.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-8.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-9.c
 create mode 100644 gcc/testsuite/g++.dg/conflict-markers-1.C

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index ef64e6b..519c3fa 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1089,6 +1089,10 @@ extern void c_genericize (tree);
 extern int c_gimplify_expr (tree *, gimple_seq *, gimple_seq *);
 extern tree c_build_bind_expr (location_t, tree, tree);
 
+/* In c-lex.c  */
+extern bool conflict_marker_at_location_p (location_t start_loc,
+					   location_t *out_loc);
+
 /* In c-pch.c  */
 extern void pch_init (void);
 extern void pch_cpp_save_state (void);
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index 9c86ba7..dd598c6 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -1263,3 +1263,70 @@ lex_charconst (const cpp_token *token)
 
   return value;
 }
+
+/* Determine if LINE begins with a conflict marker.
+   If so, write to *OUT_LEN with the length of the marker.  */
+
+static bool
+conflict_marker_within_line_p (const char *line, int *out_len)
+{
+  if (0 == strncmp (line, "<<<<<<<", 7))
+    {
+      *out_len = 7;
+      return true;
+    }
+  if (0 == strncmp (line, "=======", 7))
+    {
+      *out_len = 7;
+      return true;
+    }
+  if (0 == strncmp (line, ">>>>>>>", 7))
+    {
+      *out_len = 7;
+      return true;
+    }
+
+  return false;
+}
+
+/* Helper function for c_parser_error and cp_parser_error.
+   Given an error occurring at a "<<", ">>" or "==" token at START_LOC,
+   determine if this is a conflict marker.
+   If it returns true, *OUT_LOC is written to with the location/range
+   of the marker.  */
+
+bool
+conflict_marker_at_location_p (location_t start_loc, location_t *out_loc)
+{
+  /* It must not be in a macro expansion.  */
+  if (linemap_location_from_macro_expansion_p (line_table, start_loc))
+    return false;
+
+  /* It must be at the start of the line.  */
+  expanded_location exploc = expand_location (start_loc);
+  if (exploc.column != 1)
+    return false;
+
+  const char *line = location_get_source_line (exploc.file, exploc.line,
+					       NULL);
+  /* If we can't read the source file (e.g. when handling a .i file)
+     don't attempt to detect conflict markers.  */
+  if (!line)
+    return false;
+
+  /* Does "line" start with a conflict marker?  */
+  int marker_len;
+  if (!conflict_marker_within_line_p (line, &marker_len))
+    return false;
+
+  /* We have a conflict marker.  Construct a location of the form:
+       <<<<<<<
+       ^~~~~~~
+     with start == caret, finishing at the end of the marker.  */
+  location_t finish_loc
+    = linemap_position_for_loc_and_offset (line_table, start_loc,
+					   marker_len - 1);
+  *out_loc = make_location (start_loc, start_loc, finish_loc);
+
+  return true;
+}
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 124c30b..5014c8d 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -850,6 +850,20 @@ c_parser_error (c_parser *parser, const char *gmsgid)
   parser->error = true;
   if (!gmsgid)
     return;
+
+  /* If this is actually a conflict marker, report it as such.  */
+  if (token->type == CPP_LSHIFT
+      || token->type == CPP_RSHIFT
+      || token->type == CPP_EQ_EQ)
+    {
+      location_t marker_loc;
+      if (conflict_marker_at_location_p (token->location, &marker_loc))
+	{
+	  error_at (marker_loc, "version control conflict marker in file");
+	  return;
+	}
+    }
+
   /* This diagnostic makes more sense if it is tagged to the line of
      the token we just peeked at.  */
   c_parser_set_source_position_from_token (token);
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index a420cf1..8bdaed7 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -2713,6 +2713,19 @@ cp_parser_error (cp_parser* parser, const char* gmsgid)
 	  return;
 	}
 
+      /* If this is actually a conflict marker, report it as such.  */
+      if (token->type == CPP_LSHIFT
+	  || token->type == CPP_RSHIFT
+	  || token->type == CPP_EQ_EQ)
+	{
+	  location_t marker_loc;
+	  if (conflict_marker_at_location_p (token->location, &marker_loc))
+	    {
+	      error_at (marker_loc, "version control conflict marker in file");
+	      return;
+	    }
+	}
+
       c_parse_error (gmsgid,
 		     /* Because c_parser_error does not understand
 			CPP_KEYWORD, keywords are treated like
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-1.c b/gcc/testsuite/c-c++-common/conflict-markers-1.c
new file mode 100644
index 0000000..752f146
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-1.c
@@ -0,0 +1,9 @@
+int p;
+
+<<<<<<< HEAD /* { dg-error "conflict marker" } */
+extern int some_var;
+=======      /* { dg-error "conflict marker" } */
+extern short some_var; /* this line would lead to a warning */
+>>>>>>> Some commit message  /* { dg-error "conflict marker" } */
+
+int q;
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-10.c b/gcc/testsuite/c-c++-common/conflict-markers-10.c
new file mode 100644
index 0000000..279406f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-10.c
@@ -0,0 +1,23 @@
+/* { dg-options "-fdiagnostics-show-caret" } */
+
+<<<<<<< HEAD /* { dg-error "conflict marker" } */
+/* { dg-begin-multiline-output "" }
+ <<<<<<< HEAD
+ ^~~~~~~
+   { dg-end-multiline-output "" } */
+
+extern int some_var;
+
+=======      /* { dg-error "conflict marker" } */
+/* { dg-begin-multiline-output "" }
+ =======
+ ^~~~~~~
+   { dg-end-multiline-output "" } */
+
+extern short some_var; /* this line would lead to a warning */
+
+>>>>>>> Some commit message  /* { dg-error "conflict marker" } */
+/* { dg-begin-multiline-output "" }
+ >>>>>>>
+ ^~~~~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-11.c b/gcc/testsuite/c-c++-common/conflict-markers-11.c
new file mode 100644
index 0000000..8771453
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-11.c
@@ -0,0 +1,14 @@
+/* Verify that we only report conflict markers at the start of lines.  */
+int p;
+
+ <<<<<<< HEAD /* { dg-error "expected identifier|expected unqualified-id" } */
+
+int q;
+
+ =======      /* { dg-error "expected identifier|expected unqualified-id" } */
+
+int r;
+
+ >>>>>>> Some commit message  /* { dg-error "expected identifier|expected unqualified-id" } */
+
+int s;
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-2.c b/gcc/testsuite/c-c++-common/conflict-markers-2.c
new file mode 100644
index 0000000..f06d043
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-2.c
@@ -0,0 +1,2 @@
+/* This should not be flagged as a conflict marker.  */
+const char *msg = "<<<<<<< ";
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-3.c b/gcc/testsuite/c-c++-common/conflict-markers-3.c
new file mode 100644
index 0000000..f149ecc
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-3.c
@@ -0,0 +1,11 @@
+/* Ensure we can handle unterminated conflict markers.  */
+
+int p;
+
+<<<<<<< HEAD  /* { dg-error "conflict marker" } */
+
+int q;
+
+<<<<<<< HEAD  /* { dg-error "conflict marker" } */
+
+int r;
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-4.c b/gcc/testsuite/c-c++-common/conflict-markers-4.c
new file mode 100644
index 0000000..a3c53ea
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-4.c
@@ -0,0 +1,11 @@
+/* Ensure we can handle mismatched conflict markers.  */
+
+int p;
+
+>>>>>>> Some commit message  /* { dg-error "conflict marker" } */
+
+int q;
+
+>>>>>>> Some other commit message  /* { dg-error "conflict marker" } */
+
+int r;
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-5.c b/gcc/testsuite/c-c++-common/conflict-markers-5.c
new file mode 100644
index 0000000..b55c9c3
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-5.c
@@ -0,0 +1,11 @@
+/* Ensure we can handle mismatched conflict markers.  */
+
+int p;
+
+=======  /* { dg-error "conflict marker" } */
+
+int q;
+
+=======  /* { dg-error "conflict marker" } */
+
+int r;
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-6.c b/gcc/testsuite/c-c++-common/conflict-markers-6.c
new file mode 100644
index 0000000..081e289
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-6.c
@@ -0,0 +1,38 @@
+/* Branch coverage of conflict marker detection:
+   none of these should be reported as conflict markers.  */
+
+int a0;
+
+<< HEAD  /* { dg-error "expected" } */
+
+int a1;
+
+<<<< HEAD  /* { dg-error "expected" } */
+
+int a2;
+
+<<<<<< HEAD  /* { dg-error "expected" } */
+
+int b0;
+
+== HEAD  /* { dg-error "expected" } */
+
+int b1;
+
+==== HEAD  /* { dg-error "expected" } */
+
+int b2;
+
+====== HEAD  /* { dg-error "expected" } */
+
+int c0;
+
+>> HEAD  /* { dg-error "expected" } */
+
+int c1;
+
+>>>> HEAD  /* { dg-error "expected" } */
+
+int c2;
+
+>>>>>> HEAD  /* { dg-error "expected" } */
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-7.c b/gcc/testsuite/c-c++-common/conflict-markers-7.c
new file mode 100644
index 0000000..e68f84d
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-7.c
@@ -0,0 +1,6 @@
+/* It's valid to stringize the "<<<<<<<"; don't
+   report it as a conflict marker.  */
+#define str(s) #s
+const char *s = str(
+<<<<<<<
+);
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-8.c b/gcc/testsuite/c-c++-common/conflict-markers-8.c
new file mode 100644
index 0000000..be2e121
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-8.c
@@ -0,0 +1,4 @@
+/* A macro that's never expanded shouldn't be reported as a
+   conflict marker.  */
+#define foo \
+<<<<<<<
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-9.c b/gcc/testsuite/c-c++-common/conflict-markers-9.c
new file mode 100644
index 0000000..5c1e663
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-9.c
@@ -0,0 +1,8 @@
+/* It's valid to have
+<<<<<<<
+   inside both
+   comments (as above), and within string literals.  */
+const char *s = "\
+<<<<<<<";
+
+/* The above shouldn't be reported as errors.  */
diff --git a/gcc/testsuite/g++.dg/conflict-markers-1.C b/gcc/testsuite/g++.dg/conflict-markers-1.C
new file mode 100644
index 0000000..ae19193
--- /dev/null
+++ b/gcc/testsuite/g++.dg/conflict-markers-1.C
@@ -0,0 +1,13 @@
+/* Ensure that we don't complain about patch conflict markers on
+   valid template argument lists, valid in C++11 onwards.  */
+// { dg-options "-std=c++11" }
+
+template <typename T>
+struct foo
+{
+  T t;
+};
+
+foo <foo <foo <foo <foo <foo <foo <int
+>>>>>>> f;
+// The above line is valid C++11, and isn't a patch conflict marker
-- 
1.8.5.3


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]