[PATCH] diagnostics: Add options to control the column units [PR49973] [PR86904]

Lewis Hyatt lhyatt@gmail.com
Fri May 8 19:35:25 GMT 2020


On Fri, Jan 31, 2020 at 03:31:59PM -0500, David Malcolm wrote:
> On Fri, 2020-01-31 at 14:31 -0500, Lewis Hyatt wrote:
> > Hello-
> > 
> > Here is the second patch that I mentioned when I submitted the other
> > related patch (which is awaiting review):
> > https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01626.html.
> 
> Sorry about that; I'm v. busy with analyzer bugs right now.
> 
> > This second patch is based on top of the first one and it closes out
> > PR49973 and PR86904 by adding the new option
> > -fdiagnostics-column-unit=[display|byte]. This allows specifying
> > whether columns are output as simple byte counts (the current
> > behavior), or as display columns including handling multibyte
> > characters and tabs. The patch makes display columns the new default.
> > Additionally, a second new option -fdiagnostics-column-origin is added,
> > which allows making the column 0-based (or N-based for any N) instead
> > of 1-based. The default remains at 1-based as it is now.
> > 
> > A number of testcases were explicitly testing for the old behavior, so
> > I have updated them to test for the new behavior instead, since the
> > column number adjusted for tabs is more natural to test for, and
> > matches what editors typically show (give or take 1 for the origin
> > convention).
> > 
> > One other testcase (go.dg/arrayclear.go) was a bit of an oddity. It
> > failed after this patch, although it doesn't test for any column
> > numbers. It turned out that this test checks for identical error text
> > on two different lines. When the column units are changed to display
> > columns, the column of the second error happens to match the line of
> > the first one. dejagnu then misinterprets the second error as if it
> > matched the location of the first one (it doesn't distinguish whether
> > it checks for the line number or the column number in the output). I
> > added a comment to the test explaining the situation; since adding the
> > comment has the side effect of making the first line number no longer
> > match the second column number, it also makes the test pass again.
> > 
> > It wasn't quite clear to me whether this change was appropriate for
> > GCC 10 or not at this point. We discussed it a couple months ago here:
> > https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02171.html. Either way,
> > I hope it isn't a problem that I submitted the patch for review now,
> > whether it will end up in 10 or 11. Please let me know what's normally
> > expected? Thanks!
> 
> Thanks Lewis.
> 
> This patch looks very promising, but should wait until gcc 11; we're
> trying to stabilize gcc 10 right now (I'm knee-deep in analyzer
> bug-fixing, so I don't want to add any more diagnostics changes).
>

Hi Dave-

Well, GCC 10 has been out for a whole day, so I thought I would bug you with
this patch again now :). To summarize, I previously sent this in two separate
parts.

Part 1: https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg01626.html
Part 2: https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg02108.html

Part 1 added the support for converting tabs to spaces when outputting
diagnostics. Part 2 added the new options -fdiagnostics-column-unit and
-fdiagnostics-column-origin to control whether the column number is printed
in display or byte units. Together they resolve both PR49973 and PR86904.
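
To illustrate the difference between the two units, here is a minimal sketch.
It is not the code from the patch: sketch_display_column is a hypothetical
helper, and it assumes every UTF-8 sequence occupies one display column
(ignoring wide and zero-width characters) and that a tab advances to the next
multiple of the tab stop.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration; not GCC's implementation.
// Return the 1-based display column of the byte at 1-based BYTE_COL in LINE.
int
sketch_display_column (const std::string &line, int byte_col, int tabstop)
{
  assert (tabstop > 0 && byte_col >= 1);
  int cols = 0;  // display columns used by the bytes before BYTE_COL
  for (int i = 0; i < byte_col - 1 && i < (int) line.size (); ++i)
    {
      const unsigned char c = line[i];
      if (c == '\t')
        cols += tabstop - cols % tabstop;  // jump to the next tab stop
      else if ((c & 0xC0) != 0x80)         // count lead bytes only, so a
        ++cols;                            // multibyte character adds 1
    }
  return cols + 1;
}
```

With a tab stop of 8, the 'x' in "\tx" is byte column 2 but display column 9
under these assumptions, which is what an editor would typically show.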

You provided me with feedback on part 2, which is quoted below with some
notes interspersed. The new version of the patch incorporates all of your
suggestions. Part 1 has not changed other than resolving some trivial rebase
conflicts. The two patches touch nearly disjoint sets of files and are
logically linked together, so I thought it would be simpler if I just sent
one combined patch now. If you prefer them to be separated as before, please
let me know and I can send them that way as well.

Bootstrap and regression tests were done on x86-64 Linux for all languages.
Tests look good:

type          before   after
FAIL              96      96
PASS          474637  475097
UNSUPPORTED    11607   11607
UNTESTED         195     195
XFAIL           1816    1816
XPASS             36      36

> 
> > gcc/ChangeLog:
> > 
> > 2020-01-31  Lewis Hyatt  <lhyatt@gmail.com>
> >
> 
> Please reference the PRs here
> 
> [...]
> 
> > gcc/testsuite/ChangeLog:
> > 
> > 2020-01-31  Lewis Hyatt  <lhyatt@gmail.com>
> 
> Likewise here.
> 
> [...]
>

Done.

> > diff --git a/gcc/common.opt b/gcc/common.opt
> > index 630c380bd6a..657985450c2 100644
> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -1309,6 +1309,14 @@ Enum(diagnostic_url_rule) String(always) Value(DIAGNOSTICS_URL_YES)
> >  EnumValue
> >  Enum(diagnostic_url_rule) String(auto) Value(DIAGNOSTICS_URL_AUTO)
> >  
> > +fdiagnostics-column-unit=
> > +Common Joined RejectNegative Enum(diagnostics_column_unit)
> > +-fdiagnostics-column-unit=[display|byte]	Select units for column numbers.
> Should this line mention the default?
>

Done.

> > +fdiagnostics-column-origin=
> > +Common Joined RejectNegative UInteger
> > +-fdiagnostics-column-origin=<number>	Set the number of the first column.  Default 1-based.
> 
> These new options should be documented in gcc/doc/invoke.texi.
> 
> [...]
>

Done.

> > @@ -43,21 +44,23 @@ static json::array *cur_children_array;
> >  /* Generate a JSON object for LOC.  */
> >  
> >  json::value *
> > -json_from_expanded_location (location_t loc)
> > +json_from_expanded_location (diagnostic_context *context, location_t loc)
> >  {
> >    expanded_location exploc = expand_location (loc);
> >    json::object *result = new json::object ();
> >    if (exploc.file)
> >      result->set ("file", new json::string (exploc.file));
> >    result->set ("line", new json::integer_number (exploc.line));
> > -  result->set ("column", new json::integer_number (exploc.column));
> > +  const int col = diagnostic_converted_column (context, exploc);
> > +  result->set ("column", new json::integer_number (col));
> 
> I wonder if the JSON output format should show *both* values: perhaps
> add fields "byte-column" and "display-column", and retain the field
> "column", which would follow -fdiagnostics-column-unit?
> 
> [...]
>

Done. Adjusted the docs for JSON output as well.
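
As a rough illustration of the resulting shape of a location object, here is a
sketch. The function sketch_location_json is hypothetical; the patch builds the
object with GCC's json:: classes rather than by string concatenation, and the
sketch assumes the columns passed in have already been adjusted for the column
origin.

```cpp
#include <cassert>
#include <string>

// Hypothetical illustration only; not the code from the patch.
std::string
sketch_location_json (int line, int byte_col, int display_col,
                      bool unit_is_display)
{
  assert (line >= 0);
  // "column" follows -fdiagnostics-column-unit; the other two fields are
  // always present so that consumers can pick whichever unit they need.
  const int col = unit_is_display ? display_col : byte_col;
  return "{\"line\": " + std::to_string (line)
         + ", \"display-column\": " + std::to_string (display_col)
         + ", \"byte-column\": " + std::to_string (byte_col)
         + ", \"column\": " + std::to_string (col) + "}";
}
```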

> > @@ -219,6 +220,8 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)
> >    context->min_margin_width = 0;
> >    context->show_ruler_p = false;
> >    context->parseable_fixits_p = false;
> > +  context->column_unit = DIAGNOSTICS_COLUMN_UNIT_DISPLAY;
> > +  context->column_adj = 0;
> 
> I'm not sure, but I think I prefer it if we store the column origin
> instead, rather than an offset relative to an origin of 1.
> 
> [...]
> 
> > @@ -338,8 +341,37 @@ diagnostic_get_color_for_kind (diagnostic_t kind)
> >    return diagnostic_kind_color[kind];
> >  }
> >  
> > +/* Given an expanded_location, convert the column (which is in 1-based bytes)
> > +   to the requested units and origin.  Return -1 if the column is
> > +   invalid (<= 0).  */
> > +int
> > +diagnostic_converted_column (diagnostic_context *context, expanded_location s)
> > +{
> > +  if (s.column <= 0)
> > +    return -1;
> > +
> > +  int col;
> 
> ...so this would be one_based_col.
> 
> > +  switch (context->column_unit)
> > +    {
> > +    case DIAGNOSTICS_COLUMN_UNIT_DISPLAY:
> > +      col = location_compute_display_column (s);
> > +      break;
> > +
> > +    case DIAGNOSTICS_COLUMN_UNIT_BYTE:
> > +      col = s.column;
> > +      break;
> > +
> > +    default:
> > +      gcc_unreachable ();
> > +    }
> > +
> > +  return col + context->column_adj;
> 
> ...and this would be (I think):
> 
>      return context->column_origin + one_based_col - 1;
> 
> It would be doing the -1 each time, but maybe it's conceptually clearer?
> I'm not sure.
>

Sure, done.
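
For reference, the conversion you suggested amounts to something like the
following standalone sketch. The names here are illustrative stand-ins (the
real function takes a diagnostic_context and an expanded_location), and the
display column is passed in rather than computed.

```cpp
#include <cassert>

// Illustrative stand-in for enum diagnostics_column_unit.
enum sketch_column_unit { SKETCH_UNIT_DISPLAY, SKETCH_UNIT_BYTE };

// Convert a 1-based column to the requested unit and origin.  Return -1 if
// the byte column is invalid (<= 0), as diagnostic_converted_column does.
int
sketch_converted_column (sketch_column_unit unit, int column_origin,
                         int byte_col, int display_col)
{
  assert (column_origin >= 0);  // the option is declared UInteger
  if (byte_col <= 0)
    return -1;
  const int one_based_col
    = unit == SKETCH_UNIT_DISPLAY ? display_col : byte_col;
  return column_origin + one_based_col - 1;
}
```

So with -fdiagnostics-column-origin=0, a location at byte column 2 is reported
as column 1 in byte units.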

> [...]
> 
> > @@ -882,8 +930,10 @@ print_parseable_fixits (pretty_printer *pp, rich_location *richloc)
> >        location_t next_loc = hint->get_next_loc ();
> >        expanded_location next_exploc = expand_location (next_loc);
> >        pp_printf (pp, ":{%i:%i-%i:%i}:",
> > -		 start_exploc.line, start_exploc.column,
> > -		 next_exploc.line, next_exploc.column);
> > +		 start_exploc.line,
> > +		 diagnostic_converted_column (context, start_exploc),
> > +		 next_exploc.line,
> > +		 diagnostic_converted_column (context, next_exploc));
> >        print_escaped_string (pp, hint->get_string ());
> >        pp_newline (pp);
> >      }
> 
> If we're going to change the output of parseable fixits, that takes us away
> from bug-for-bug-compatibility with clang in this area.
> 
> That should be documented, at least.
>

I didn't mean to do anything controversial here; I assumed this should change
for consistency and didn't realize it needed to match an existing standard.
I have removed this part of the patch for now and can send it separately if
there's a desire to change it.

> [...]
> 
> There's selftest coverage which is good; it would be good to *also*
> have a few simple DejaGnu-based tests, showing the explicit use of both
> units, and trying some offset values, with some lines with tabs, some
> with spaces (if nothing else to verify that the option-parsing is wired
> up correctly).
>

Done.

> I'm nit-picking - apart from the lack of docs, this looks very
> promising.  But as I said earlier, this should wait until gcc 11.
> 
> Thanks
> Dave
> 

Thanks again for your time!

-Lewis
-------------- next part --------------
gcc/ChangeLog:

2020-05-08  Lewis Hyatt  <lhyatt@gmail.com>

	PR preprocessor/49973
	PR other/86904
	* common.opt: Handle -ftabstop here instead of in c-family
	options.  Add -fdiagnostics-column-unit= and
	-fdiagnostics-column-origin= options.
	* opts.c (common_handle_option): Handle the new options.
	* diagnostic-format-json.cc (json_from_expanded_location): Add
	diagnostic_context argument.  Use it to convert column numbers as per
	the new options.
	(json_from_location_range): Likewise.
	(json_from_fixit_hint): Likewise.
	(json_end_diagnostic): Pass the new context argument to helper
	functions above.  Add "column-origin" field to the output.
	(test_unknown_location): Add the new context argument to calls to
	helper functions.
	(test_bad_endpoints): Likewise.
	* diagnostic-show-locus.c (struct line_bounds): Clarify that the
	units are now always display columns.  Rename members accordingly.
	Add constructor.
	(layout::print_source_line): Add support for tab expansion.
	(layout::print_annotation_line): Adapt to struct line_bounds changes.
	(layout::print_line): Likewise.
	(test_layout_x_offset_display_tab): New selftest.
	(test_one_liner_colorized_utf8): Likewise.
	(test_tab_expansion): Likewise.
	(test_diagnostic_show_locus_one_liner_utf8): Call the new tests.
	(diagnostic_show_locus_c_tests): Likewise.
	* diagnostic.c (diagnostic_initialize): Initialize new column_unit and
	column_origin members.
	(diagnostic_converted_column): New function.
	(maybe_line_and_column): Be willing to output a column of 0.
	(diagnostic_get_location_text): Convert column number as per the new
	options.
	(diagnostic_report_current_module): Likewise.
	(assert_location_text): Add origin and column_unit arguments for
	testing the new functionality.
	(test_diagnostic_get_location_text): Test the new functionality.
	* diagnostic.h (enum diagnostics_column_unit): New enum.
	(struct diagnostic_context): Add members for the new options.
	(diagnostic_converted_column): Declare.
	(json_from_expanded_location): Add new context argument.
	* doc/invoke.texi: Document the new options.
	* input.h (location_compute_display_column): Add tabstop argument.
	* input.c (location_compute_display_column): Likewise.
	(test_cpp_utf8): Add selftests for tab expansion.
	* tree-diagnostic-path.cc (default_tree_make_json_for_path): Pass the
	new context argument to json_from_expanded_location().

gcc/c-family/ChangeLog:

2020-05-08  Lewis Hyatt  <lhyatt@gmail.com>

	PR other/86904
	* c-indentation.c (should_warn_for_misleading_indentation): Get
	global tabstop from the new source.
	* c-opts.c (c_common_handle_option): Remove handling of -ftabstop, which
	is now a common option.
	* c.opt: Likewise.

gcc/testsuite/ChangeLog:

2020-05-08  Lewis Hyatt  <lhyatt@gmail.com>

	PR preprocessor/49973
	PR other/86904
	* c-c++-common/Wmisleading-indentation-3.c: Adjust expected output
	for new defaults.
	* c-c++-common/Wmisleading-indentation.c: Likewise.
	* c-c++-common/diagnostic-format-json-1.c: Likewise.
	* c-c++-common/diagnostic-format-json-2.c: Likewise.
	* c-c++-common/diagnostic-format-json-3.c: Likewise.
	* c-c++-common/diagnostic-format-json-4.c: Likewise.
	* c-c++-common/diagnostic-format-json-5.c: Likewise.
	* c-c++-common/missing-close-symbol.c: Likewise.
	* g++.dg/diagnostic/bad-binary-ops.C: Likewise.
	* g++.dg/parse/error4.C: Likewise.
	* g++.old-deja/g++.brendan/crash11.C: Likewise.
	* g++.old-deja/g++.pt/overload2.C: Likewise.
	* g++.old-deja/g++.robertl/eb109.C: Likewise.
	* gcc.dg/analyzer/malloc-paths-9.c: Likewise.
	* gcc.dg/bad-binary-ops.c: Likewise.
	* gcc.dg/format/branch-1.c: Likewise.
	* gcc.dg/format/pr79210.c: Likewise.
	* gcc.dg/plugin/diagnostic-test-expressions-1.c: Likewise.
	* gcc.dg/plugin/diagnostic-test-string-literals-1.c: Likewise.
	* gcc.dg/redecl-4.c: Likewise.
	* gfortran.dg/diagnostic-format-json-1.F90: Likewise.
	* gfortran.dg/diagnostic-format-json-2.F90: Likewise.
	* gfortran.dg/diagnostic-format-json-3.F90: Likewise.
	* go.dg/arrayclear.go: Add a comment explaining why adding a
	comment was necessary to work around a dejagnu bug.
	* c-c++-common/diagnostic-units-1.c: New test.
	* c-c++-common/diagnostic-units-2.c: New test.
	* c-c++-common/diagnostic-units-3.c: New test.
	* c-c++-common/diagnostic-units-4.c: New test.
	* c-c++-common/diagnostic-units-5.c: New test.
	* c-c++-common/diagnostic-units-6.c: New test.
	* c-c++-common/diagnostic-units-7.c: New test.
	* c-c++-common/diagnostic-units-8.c: New test.

libcpp/ChangeLog:

2020-05-08  Lewis Hyatt  <lhyatt@gmail.com>

	PR preprocessor/49973
	PR other/86904
	* include/cpplib.h (struct cpp_options): Remove support for -ftabstop,
	which is now handled by cpp_set_tabstop ().
	(class cpp_display_width_computation): New class.
	(cpp_byte_column_to_display_column): Add optional tabstop argument.
	(cpp_display_width): Likewise.
	(cpp_display_column_to_byte_column): Likewise.
	(cpp_set_tabstop): New function.
	(cpp_get_tabstop): Likewise.
	* charset.c (global_tabstop): New static variable.
	(cpp_set_tabstop): New function to access global_tabstop.
	(cpp_get_tabstop): Likewise.
	(cpp_display_width_computation::cpp_display_width_computation): New
	function.
	(compute_next_display_width): Remove and implement this
	functionality in a new function...
	(cpp_display_width_computation::process_next_codepoint): ...here.
	(cpp_display_width_computation::advance_display_cols): New function.
	(cpp_byte_column_to_display_column): Add tabstop argument.
	Reimplement in terms of class cpp_display_width_computation.
	(cpp_display_column_to_byte_column): Likewise.
	* init.c (cpp_create_reader): Remove handling of -ftabstop, which is now
	handled via cpp_set_tabstop().
-------------- next part --------------
commit 080d5f5ac4c18c5b8dd5d4fdd43034624e4f55a9
Author: Lewis Hyatt <lhyatt@gmail.com>
Date:   Fri Jan 17 17:53:58 2020 -0500

    diagnostics: Support conversion of tabs to spaces [PR49973] [PR86904]

diff --git a/gcc/c-family/c-indentation.c b/gcc/c-family/c-indentation.c
index 9fba3bcc67c..fa4739c47a9 100644
--- a/gcc/c-family/c-indentation.c
+++ b/gcc/c-family/c-indentation.c
@@ -299,7 +299,7 @@ should_warn_for_misleading_indentation (const token_indent_info &guard_tinfo,
   expanded_location next_stmt_exploc = expand_location (next_stmt_loc);
   expanded_location guard_exploc = expand_location (guard_loc);
 
-  const unsigned int tab_width = cpp_opts->tabstop;
+  const unsigned int tab_width = cpp_get_tabstop ();
 
   /* They must be in the same file.  */
   if (next_stmt_exploc.file != body_exploc.file)
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 58ba0948e79..cddf1e28e1d 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -504,12 +504,6 @@ c_common_handle_option (size_t scode, const char *arg, HOST_WIDE_INT value,
 	cpp_opts->track_macro_expansion = 2;
       break;
 
-    case OPT_ftabstop_:
-      /* It is documented that we silently ignore silly values.  */
-      if (value >= 1 && value <= 100)
-	cpp_opts->tabstop = value;
-      break;
-
     case OPT_fexec_charset_:
       cpp_opts->narrow_charset = arg;
       break;
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index c49da99d395..dbdb78e0ad3 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1876,10 +1876,6 @@ Enum(strong_eval_order) String(some) Value(1)
 EnumValue
 Enum(strong_eval_order) String(all) Value(2)
 
-ftabstop=
-C ObjC C++ ObjC++ Joined RejectNegative UInteger
--ftabstop=<number>	Distance between tab stops for column reporting.
-
 ftemplate-backtrace-limit=
 C++ ObjC++ Joined RejectNegative UInteger Var(template_backtrace_limit) Init(10)
 Set the maximum number of template instantiation notes for a single warning or error.
diff --git a/gcc/common.opt b/gcc/common.opt
index 30d05734d16..e3c62a3e7ea 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1321,6 +1321,14 @@ Enum(diagnostic_url_rule) String(always) Value(DIAGNOSTICS_URL_YES)
 EnumValue
 Enum(diagnostic_url_rule) String(auto) Value(DIAGNOSTICS_URL_AUTO)
 
+fdiagnostics-column-unit=
+Common Joined RejectNegative Enum(diagnostics_column_unit)
+-fdiagnostics-column-unit=[display|byte]	Select whether column numbers are output as display columns (default) or raw bytes.
+
+fdiagnostics-column-origin=
+Common Joined RejectNegative UInteger
+-fdiagnostics-column-origin=<number>	Set the number of the first column.  The default is 1-based as per GNU style, but some utilities may expect 0-based, for example.
+
 fdiagnostics-format=
 Common Joined RejectNegative Enum(diagnostics_output_format)
 -fdiagnostics-format=[text|json]	Select output format.
@@ -1329,6 +1337,15 @@ Common Joined RejectNegative Enum(diagnostics_output_format)
 SourceInclude
 diagnostic.h
 
+Enum
+Name(diagnostics_column_unit) Type(int)
+
+EnumValue
+Enum(diagnostics_column_unit) String(display) Value(DIAGNOSTICS_COLUMN_UNIT_DISPLAY)
+
+EnumValue
+Enum(diagnostics_column_unit) String(byte) Value(DIAGNOSTICS_COLUMN_UNIT_BYTE)
+
 Enum
 Name(diagnostics_output_format) Type(int)
 
@@ -1358,6 +1375,10 @@ fdiagnostics-path-format=
 Common Joined RejectNegative Var(flag_diagnostics_path_format) Enum(diagnostic_path_format) Init(DPF_INLINE_EVENTS)
 Specify how to print any control-flow path associated with a diagnostic.
 
+ftabstop=
+Common Joined RejectNegative UInteger
+-ftabstop=<number>      Distance between tab stops for column reporting.
+
 Enum
 Name(diagnostic_path_format) Type(int)
 
diff --git a/gcc/diagnostic-format-json.cc b/gcc/diagnostic-format-json.cc
index 7bda5c4ba83..465c42fdfde 100644
--- a/gcc/diagnostic-format-json.cc
+++ b/gcc/diagnostic-format-json.cc
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "diagnostic.h"
+#include "selftest-diagnostic.h"
 #include "diagnostic-metadata.h"
 #include "json.h"
 #include "selftest.h"
@@ -43,21 +44,43 @@ static json::array *cur_children_array;
 /* Generate a JSON object for LOC.  */
 
 json::value *
-json_from_expanded_location (location_t loc)
+json_from_expanded_location (diagnostic_context *context, location_t loc)
 {
   expanded_location exploc = expand_location (loc);
   json::object *result = new json::object ();
   if (exploc.file)
     result->set ("file", new json::string (exploc.file));
   result->set ("line", new json::integer_number (exploc.line));
-  result->set ("column", new json::integer_number (exploc.column));
+
+  const enum diagnostics_column_unit orig_unit = context->column_unit;
+  struct
+  {
+    const char *name;
+    enum diagnostics_column_unit unit;
+  } column_fields[] = {
+    {"display-column", DIAGNOSTICS_COLUMN_UNIT_DISPLAY},
+    {"byte-column", DIAGNOSTICS_COLUMN_UNIT_BYTE}
+  };
+  int the_column = INT_MIN;
+  for (int i = 0; i != sizeof column_fields / sizeof (*column_fields); ++i)
+    {
+      context->column_unit = column_fields[i].unit;
+      const int col = diagnostic_converted_column (context, exploc);
+      result->set (column_fields[i].name, new json::integer_number (col));
+      if (column_fields[i].unit == orig_unit)
+	the_column = col;
+    }
+  gcc_assert (the_column != INT_MIN);
+  result->set ("column", new json::integer_number (the_column));
+  context->column_unit = orig_unit;
   return result;
 }
 
 /* Generate a JSON object for LOC_RANGE.  */
 
 static json::object *
-json_from_location_range (const location_range *loc_range, unsigned range_idx)
+json_from_location_range (diagnostic_context *context,
+			  const location_range *loc_range, unsigned range_idx)
 {
   location_t caret_loc = get_pure_location (loc_range->m_loc);
 
@@ -68,13 +91,13 @@ json_from_location_range (const location_range *loc_range, unsigned range_idx)
   location_t finish_loc = get_finish (loc_range->m_loc);
 
   json::object *result = new json::object ();
-  result->set ("caret", json_from_expanded_location (caret_loc));
+  result->set ("caret", json_from_expanded_location (context, caret_loc));
   if (start_loc != caret_loc
       && start_loc != UNKNOWN_LOCATION)
-    result->set ("start", json_from_expanded_location (start_loc));
+    result->set ("start", json_from_expanded_location (context, start_loc));
   if (finish_loc != caret_loc
       && finish_loc != UNKNOWN_LOCATION)
-    result->set ("finish", json_from_expanded_location (finish_loc));
+    result->set ("finish", json_from_expanded_location (context, finish_loc));
 
   if (loc_range->m_label)
     {
@@ -91,14 +114,14 @@ json_from_location_range (const location_range *loc_range, unsigned range_idx)
 /* Generate a JSON object for HINT.  */
 
 static json::object *
-json_from_fixit_hint (const fixit_hint *hint)
+json_from_fixit_hint (diagnostic_context *context, const fixit_hint *hint)
 {
   json::object *fixit_obj = new json::object ();
 
   location_t start_loc = hint->get_start_loc ();
-  fixit_obj->set ("start", json_from_expanded_location (start_loc));
+  fixit_obj->set ("start", json_from_expanded_location (context, start_loc));
   location_t next_loc = hint->get_next_loc ();
-  fixit_obj->set ("next", json_from_expanded_location (next_loc));
+  fixit_obj->set ("next", json_from_expanded_location (context, next_loc));
   fixit_obj->set ("string", new json::string (hint->get_string ()));
 
   return fixit_obj;
@@ -190,11 +213,13 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,
   else
     {
       /* Otherwise, make diag_obj be the top-level object within the group;
-	 add a "children" array.  */
+	 add a "children" array and record the column origin.  */
       toplevel_array->append (diag_obj);
       cur_group = diag_obj;
       cur_children_array = new json::array ();
       diag_obj->set ("children", cur_children_array);
+      diag_obj->set ("column-origin",
+		     new json::integer_number (context->column_origin));
     }
 
   const rich_location *richloc = diagnostic->richloc;
@@ -205,7 +230,7 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,
   for (unsigned int i = 0; i < richloc->get_num_locations (); i++)
     {
       const location_range *loc_range = richloc->get_range (i);
-      json::object *loc_obj = json_from_location_range (loc_range, i);
+      json::object *loc_obj = json_from_location_range (context, loc_range, i);
       if (loc_obj)
 	loc_array->append (loc_obj);
     }
@@ -217,7 +242,7 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,
       for (unsigned int i = 0; i < richloc->get_num_fixit_hints (); i++)
 	{
 	  const fixit_hint *hint = richloc->get_fixit_hint (i);
-	  json::object *fixit_obj = json_from_fixit_hint (hint);
+	  json::object *fixit_obj = json_from_fixit_hint (context, hint);
 	  fixit_array->append (fixit_obj);
 	}
     }
@@ -320,7 +345,8 @@ namespace selftest {
 static void
 test_unknown_location ()
 {
-  delete json_from_expanded_location (UNKNOWN_LOCATION);
+  test_diagnostic_context dc;
+  delete json_from_expanded_location (&dc, UNKNOWN_LOCATION);
 }
 
 /* Verify that we gracefully handle attempts to serialize bad
@@ -338,7 +364,8 @@ test_bad_endpoints ()
   loc_range.m_range_display_kind = SHOW_RANGE_WITH_CARET;
   loc_range.m_label = NULL;
 
-  json::object *obj = json_from_location_range (&loc_range, 0);
+  test_diagnostic_context dc;
+  json::object *obj = json_from_location_range (&dc, &loc_range, 0);
   /* We should have a "caret" value, but no "start" or "finish" values.  */
   ASSERT_TRUE (obj != NULL);
   ASSERT_TRUE (obj->get ("caret") != NULL);
diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 4618b4edb7d..8a34e30c4c7 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -226,22 +226,18 @@ class layout_range
 
 /* A struct for use by layout::print_source_line for telling
    layout::print_annotation_line the extents of the source line that
-   it printed, so that underlines can be clipped appropriately.  */
+   it printed, so that underlines can be clipped appropriately.  Units
+   are 1-based display columns.  */
 
 struct line_bounds
 {
-  int m_first_non_ws;
-  int m_last_non_ws;
+  int m_first_non_ws_disp_col;
+  int m_last_non_ws_disp_col;
 
-  void convert_to_display_cols (char_span line)
+  line_bounds ()
   {
-    m_first_non_ws = cpp_byte_column_to_display_column (line.get_buffer (),
-							line.length (),
-							m_first_non_ws);
-
-    m_last_non_ws = cpp_byte_column_to_display_column (line.get_buffer (),
-						       line.length (),
-						       m_last_non_ws);
+    m_first_non_ws_disp_col = INT_MAX;
+    m_last_non_ws_disp_col = 0;
   }
 };
 
@@ -351,8 +347,8 @@ class layout
  private:
   bool will_show_line_p (linenum_type row) const;
   void print_leading_fixits (linenum_type row);
-  void print_source_line (linenum_type row, const char *line, int line_bytes,
-			  line_bounds *lbounds_out);
+  line_bounds print_source_line (linenum_type row, const char *line,
+				 int line_bytes);
   bool should_print_annotation_line_p (linenum_type row) const;
   void start_annotation_line (char margin_char = ' ') const;
   void print_annotation_line (linenum_type row, const line_bounds lbounds);
@@ -1445,16 +1441,13 @@ layout::calculate_x_offset_display ()
 }
 
 /* Print line ROW of source code, potentially colorized at any ranges, and
-   populate *LBOUNDS_OUT.
-   LINE is the source line (not necessarily 0-terminated) and LINE_BYTES
-   is its length in bytes.
-   This function deals only with byte offsets, not display columns, so
-   m_x_offset_display must be converted from display to byte units.  In
-   particular, LINE_BYTES and LBOUNDS_OUT are in bytes.  */
+   return the line bounds.  LINE is the source line (not necessarily
+   0-terminated) and LINE_BYTES is its length in bytes.  In order to handle both
+   colorization and tab expansion, this function tracks the line position in
+   both byte and display column units.  */
 
-void
-layout::print_source_line (linenum_type row, const char *line, int line_bytes,
-			   line_bounds *lbounds_out)
+line_bounds
+layout::print_source_line (linenum_type row, const char *line, int line_bytes)
 {
   m_colorizer.set_normal_text ();
 
@@ -1469,30 +1462,29 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,
   else
     pp_space (m_pp);
 
-  /* We will stop printing the source line at any trailing whitespace, and start
-     printing it as per m_x_offset_display.  */
+  /* We will stop printing the source line at any trailing whitespace.  */
   line_bytes = get_line_bytes_without_trailing_whitespace (line,
 							   line_bytes);
-  int x_offset_bytes = 0;
-  if (m_x_offset_display)
-    {
-      x_offset_bytes = cpp_display_column_to_byte_column (line, line_bytes,
-							  m_x_offset_display);
-      /* In case the leading portion of the line that will be skipped over ends
-	 with a character with wcwidth > 1, then it is possible we skipped too
-	 much, so account for that by padding with spaces.  */
-      const int overage
-	= cpp_byte_column_to_display_column (line, line_bytes, x_offset_bytes)
-	- m_x_offset_display;
-      for (int column = 0; column < overage; ++column)
-	pp_space (m_pp);
-      line += x_offset_bytes;
-    }
 
-  /* Print the line.  */
-  int first_non_ws = INT_MAX;
-  int last_non_ws = 0;
-  for (int col_byte = 1 + x_offset_bytes; col_byte <= line_bytes; col_byte++)
+  /* This object helps to keep track of which display column we are at, which is
+     necessary for computing the line bounds in display units, for doing
+     tab expansion, and for implementing m_x_offset_display.  */
+  cpp_display_width_computation dw (line, line_bytes);
+
+  /* Skip the first m_x_offset_display display columns.  In case the leading
+     portion that will be skipped ends with a character with wcwidth > 1, then
+     it is possible we skipped too much, so account for that by padding with
+     spaces.  Note that this does the right thing too in case a tab was the last
+     character to be skipped over; the tab is effectively replaced by the
+     correct number of trailing spaces needed to offset by the desired number of
+     display columns.  */
+  for (int skipped_display_cols = dw.advance_display_cols (m_x_offset_display);
+       skipped_display_cols > m_x_offset_display; --skipped_display_cols)
+    pp_space (m_pp);
+
+  /* Print the line and compute the line_bounds.  */
+  line_bounds lbounds;
+  while (!dw.done ())
     {
       /* Assuming colorization is enabled for the caret and underline
 	 characters, we may also colorize the associated characters
@@ -1510,7 +1502,8 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,
 	{
 	  bool in_range_p;
 	  point_state state;
-	  in_range_p = get_state_at_point (row, col_byte,
+	  const int start_byte_col = dw.bytes_processed () + 1;
+	  in_range_p = get_state_at_point (row, start_byte_col,
 					   0, INT_MAX,
 					   CU_BYTES,
 					   &state);
@@ -1519,22 +1512,44 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,
 	  else
 	    m_colorizer.set_normal_text ();
 	}
-      char c = *line;
-      if (c == '\0' || c == '\t' || c == '\r')
-	c = ' ';
-      if (c != ' ')
+
+      /* Get the display width of the next character to be output, expanding
+	 tabs and replacing some control bytes with spaces as necessary.  */
+      const char *c = dw.next_byte ();
+      const int start_disp_col = dw.display_cols_processed () + 1;
+      const int this_display_width = dw.process_next_codepoint ();
+      if (*c == '\t')
+	{
+	  /* The returned display width is the number of spaces into which the
+	     tab should be expanded.  */
+	  for (int i = 0; i != this_display_width; ++i)
+	    pp_space (m_pp);
+	  continue;
+	}
+      if (*c == '\0' || *c == '\r')
+	{
+	  /* cpp_wcwidth() promises to return 1 for all control bytes, and we
+	     want to output these as a single space too, so this case is
+	     actually the same as the '\t' case.  */
+	  gcc_assert (this_display_width == 1);
+	  pp_space (m_pp);
+	  continue;
+	}
+
+      /* We have a (possibly multibyte) character to output; update the line
+	 bounds if it is not whitespace.  */
+      if (*c != ' ')
 	{
-	  last_non_ws = col_byte;
-	  if (first_non_ws == INT_MAX)
-	    first_non_ws = col_byte;
+	  lbounds.m_last_non_ws_disp_col = dw.display_cols_processed ();
+	  if (lbounds.m_first_non_ws_disp_col == INT_MAX)
+	    lbounds.m_first_non_ws_disp_col = start_disp_col;
 	}
-      pp_character (m_pp, c);
-      line++;
+
+      /* Output the character.  */
+      while (c != dw.next_byte ()) pp_character (m_pp, *c++);
     }
   print_newline ();
-
-  lbounds_out->m_first_non_ws = first_non_ws;
-  lbounds_out->m_last_non_ws = last_non_ws;
+  return lbounds;
 }
 
 /* Determine if we should print an annotation line for ROW.
@@ -1576,14 +1591,13 @@ layout::start_annotation_line (char margin_char) const
 }
 
 /* Print a line consisting of the caret/underlines for the given
-   source line.  This function works with display columns, rather than byte
-   counts; in particular, LBOUNDS should be in display column units.  */
+   source line.  */
 
 void
 layout::print_annotation_line (linenum_type row, const line_bounds lbounds)
 {
   int x_bound = get_x_bound_for_row (row, m_exploc.m_display_col,
-				     lbounds.m_last_non_ws);
+				     lbounds.m_last_non_ws_disp_col);
 
   start_annotation_line ();
   pp_space (m_pp);
@@ -1593,8 +1607,8 @@ layout::print_annotation_line (linenum_type row, const line_bounds lbounds)
       bool in_range_p;
       point_state state;
       in_range_p = get_state_at_point (row, column,
-				       lbounds.m_first_non_ws,
-				       lbounds.m_last_non_ws,
+				       lbounds.m_first_non_ws_disp_col,
+				       lbounds.m_last_non_ws_disp_col,
 				       CU_DISPLAY_COLS,
 				       &state);
       if (in_range_p)
@@ -2499,15 +2513,11 @@ layout::print_line (linenum_type row)
   if (!line)
     return;
 
-  line_bounds lbounds;
   print_leading_fixits (row);
-  print_source_line (row, line.get_buffer (), line.length (), &lbounds);
+  const line_bounds lbounds
+    = print_source_line (row, line.get_buffer (), line.length ());
   if (should_print_annotation_line_p (row))
-    {
-      if (lbounds.m_first_non_ws != INT_MAX)
-	lbounds.convert_to_display_cols (line);
-      print_annotation_line (row, lbounds);
-    }
+    print_annotation_line (row, lbounds);
   if (m_show_labels_p)
     print_any_labels (row);
   print_trailing_fixits (row);
@@ -2774,6 +2784,114 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)
 
 }
 
+static void
+test_layout_x_offset_display_tab (const line_table_case &case_)
+{
+  const char *content
+    = "This line is very long, so that we can use it to test the logic for "
+      "clipping long lines.  Also this: `\t' is a tab that occupies 1 byte and "
+      "a variable number of display columns, starting at column #103.\n";
+
+  /* Number of bytes in the line, subtracting one to remove the newline.  */
+  const int line_bytes = strlen (content) - 1;
+
+  /* The column where the tab begins.  Byte or display is the same as there are
+     no multibyte characters earlier on the line.  */
+  const int tab_col = 103;
+
+  /* Effective extra size of the tab beyond what a single space would have taken
+     up, indexed by tabstop.  */
+  static const int num_tabstops = 11;
+  int extra_width[num_tabstops];
+  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
+    {
+      const int this_tab_size = tabstop - (tab_col - 1) % tabstop;
+      extra_width[tabstop] = this_tab_size - 1;
+    }
+  /* Example of this calculation: if tabstop is 10, the tab starting at column
+     #103 has to expand into 8 spaces, covering columns 103-110, so that the
+     next character is at column #111.  So it takes up 7 more columns than
+     a space would have taken up.  */
+  ASSERT_EQ (7, extra_width[10]);
+
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
+  line_table_test ltt (case_);
+
+  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+
+  location_t line_end = linemap_position_for_column (line_table, line_bytes);
+
+  /* Don't attempt to run the tests if column data might be unavailable.  */
+  if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
+    return;
+
+  /* Check that cpp_display_width handles the tabs as expected.  */
+  char_span lspan = location_get_source_line (tmp.get_filename (), 1);
+  ASSERT_EQ ('\t', *(lspan.get_buffer () + (tab_col - 1)));
+  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
+    {
+      ASSERT_EQ (line_bytes + extra_width[tabstop],
+		 cpp_display_width (lspan.get_buffer (), lspan.length (),
+				    tabstop));
+      ASSERT_EQ (line_bytes + extra_width[tabstop],
+		 location_compute_display_column (expand_location (line_end),
+						  tabstop));
+    }
+
+  /* Check that the tab is expanded to the expected number of spaces.  */
+  const int global_tabstop = cpp_get_tabstop ();
+  rich_location richloc (line_table,
+			 linemap_position_for_column (line_table,
+						      tab_col + 1));
+  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
+    {
+      cpp_set_tabstop (tabstop);
+      test_diagnostic_context dc;
+      layout test_layout (&dc, &richloc, DK_ERROR);
+      test_layout.print_line (1);
+      const char *out = pp_formatted_text (dc.printer);
+      ASSERT_EQ (NULL, strchr (out, '\t'));
+      const char *left_quote = strchr (out, '`');
+      const char *right_quote = strchr (out, '\'');
+      ASSERT_NE (NULL, left_quote);
+      ASSERT_NE (NULL, right_quote);
+      ASSERT_EQ (right_quote - left_quote, extra_width[tabstop] + 2);
+    }
+
+  /* Check that the line is offset properly and that the tab is broken up
+     into the expected number of spaces when it is the last character skipped
+     over.  */
+  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
+    {
+      cpp_set_tabstop (tabstop);
+      test_diagnostic_context dc;
+      static const int small_width = 24;
+      dc.caret_max_width = small_width - 4;
+      dc.min_margin_width = test_left_margin - test_linenum_sep + 1;
+      dc.show_line_numbers_p = true;
+      layout test_layout (&dc, &richloc, DK_ERROR);
+      test_layout.print_line (1);
+
+      /* We have arranged things so that two columns will be printed before
+	 the caret.  If the tab results in more than one space, this should
+	 produce two spaces in the output; otherwise, it will be a single space
+	 preceded by the opening quote before the tab character.  */
+      const char *output1
+	= "   1 |   ' is a tab that occupies 1 byte and a variable number of "
+	  "display columns, starting at column #103.\n"
+	  "     |   ^\n\n";
+      const char *output2
+	= "   1 | ` ' is a tab that occupies 1 byte and a variable number of "
+	  "display columns, starting at column #103.\n"
+	  "     |   ^\n\n";
+      const char *expected_output = (extra_width[tabstop] ? output1 : output2);
+      ASSERT_STREQ (expected_output, pp_formatted_text (dc.printer));
+    }
+
+  cpp_set_tabstop (global_tabstop);
+}
+
+
 /* Verify that diagnostic_show_locus works sanely on UNKNOWN_LOCATION.  */
 
 static void
@@ -3854,6 +3972,27 @@ test_one_liner_labels_utf8 ()
   }
 }
 
+/* Make sure that colorization codes don't interrupt a multibyte
+   sequence, which would corrupt it.  */
+static void
+test_one_liner_colorized_utf8 ()
+{
+  test_diagnostic_context dc;
+  dc.colorize_source_p = true;
+  diagnostic_color_init (&dc, DIAGNOSTICS_COLOR_YES);
+  const location_t pi = linemap_position_for_column (line_table, 12);
+  rich_location richloc (line_table, pi);
+  diagnostic_show_locus (&dc, &richloc, DK_ERROR);
+
+  /* In order to avoid having the test depend on exactly how the colorization
+     was effected, just confirm there are two pi characters in the output.  */
+  const char *result = pp_formatted_text (dc.printer);
+  const char *null_term = result + strlen (result);
+  const char *first_pi = strstr (result, "\xcf\x80");
+  ASSERT_TRUE (first_pi && first_pi <= null_term - 2);
+  ASSERT_STR_CONTAINS (first_pi + 2, "\xcf\x80");
+}
+
 /* Run the various one-liner tests.  */
 
 static void
@@ -3900,6 +4039,7 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)
   test_one_liner_many_fixits_1_utf8 ();
   test_one_liner_many_fixits_2_utf8 ();
   test_one_liner_labels_utf8 ();
+  test_one_liner_colorized_utf8 ();
 }
 
 /* Verify that gcc_rich_location::add_location_if_nearby works.  */
@@ -4955,6 +5095,68 @@ test_fixit_deletion_affecting_newline (const line_table_case &case_)
 		pp_formatted_text (dc.printer));
 }
 
+static void
+test_tab_expansion (const line_table_case &case_)
+{
+  /* Set up the tabstop to be sure it is 8.  */
+  const int global_tabstop = cpp_get_tabstop ();
+  cpp_set_tabstop (8);
+
+  /* Create a tempfile and write some text to it.  This example uses a tabstop
+     of 8, as the column numbers attempt to indicate:
+
+    .....................000.01111111111.22222333333  display
+    .....................123.90123456789.56789012345  columns  */
+  const char *content = "  \t   This: `\t' is a tab.\n";
+  /* ....................000 00000011111 11111222222  byte
+     ....................123 45678901234 56789012345  columns  */
+
+  const int first_non_ws_byte_col = 7;
+  const int right_quote_byte_col = 15;
+  const int last_byte_col = 25;
+  ASSERT_EQ (35, cpp_display_width (content, last_byte_col));
+
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
+  line_table_test ltt (case_);
+  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+
+  /* Don't attempt to run the tests if column data might be unavailable.  */
+  location_t line_end = linemap_position_for_column (line_table, last_byte_col);
+  if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
+    return;
+
+  /* Check that the leading whitespace with mixed tabs and spaces is expanded
+     into 11 spaces.  Recall that print_line() also puts one space before
+     everything, for 12 in total.  */
+  {
+    test_diagnostic_context dc;
+    rich_location richloc (line_table,
+			   linemap_position_for_column (line_table,
+							first_non_ws_byte_col));
+    layout test_layout (&dc, &richloc, DK_ERROR);
+    test_layout.print_line (1);
+    ASSERT_STREQ ("            This: `      ' is a tab.\n"
+		  "            ^\n",
+		  pp_formatted_text (dc.printer));
+  }
+
+  /* Confirm the display width was tracked correctly across the internal tab
+     as well.  */
+  {
+    test_diagnostic_context dc;
+    rich_location richloc (line_table,
+			   linemap_position_for_column (line_table,
+							right_quote_byte_col));
+    layout test_layout (&dc, &richloc, DK_ERROR);
+    test_layout.print_line (1);
+    ASSERT_STREQ ("            This: `      ' is a tab.\n"
+		  "                         ^\n",
+		  pp_formatted_text (dc.printer));
+  }
+
+  cpp_set_tabstop (global_tabstop);
+}
+
 /* Verify that line numbers are correctly printed for the case of
    a multiline range in which the width of the line numbers changes
    (e.g. from "9" to "10").  */
@@ -5012,6 +5214,7 @@ diagnostic_show_locus_c_tests ()
   test_layout_range_for_multiple_lines ();
 
   for_each_line_table_case (test_layout_x_offset_display_utf8);
+  for_each_line_table_case (test_layout_x_offset_display_tab);
 
   test_get_line_bytes_without_trailing_whitespace ();
 
@@ -5029,6 +5232,7 @@ diagnostic_show_locus_c_tests ()
   for_each_line_table_case (test_fixit_insert_containing_newline_2);
   for_each_line_table_case (test_fixit_replace_containing_newline);
   for_each_line_table_case (test_fixit_deletion_affecting_newline);
+  for_each_line_table_case (test_tab_expansion);
 
   test_line_numbers_multiline_range ();
 }
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index ed52bc03d17..120c3258540 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "selftest.h"
 #include "selftest-diagnostic.h"
 #include "opts.h"
+#include "cpplib.h"
 
 #ifdef HAVE_TERMIOS_H
 # include <termios.h>
@@ -219,6 +220,8 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)
   context->min_margin_width = 0;
   context->show_ruler_p = false;
   context->parseable_fixits_p = false;
+  context->column_unit = DIAGNOSTICS_COLUMN_UNIT_DISPLAY;
+  context->column_origin = 1;
   context->edit_context_ptr = NULL;
   context->diagnostic_group_nesting_depth = 0;
   context->diagnostic_group_emission_count = 0;
@@ -353,8 +356,37 @@ diagnostic_get_color_for_kind (diagnostic_t kind)
   return diagnostic_kind_color[kind];
 }
 
+/* Given an expanded_location, convert the column (which is in 1-based bytes)
+   to the requested units and origin.  Return -1 if the column is
+   invalid (<= 0).  */
+int
+diagnostic_converted_column (diagnostic_context *context, expanded_location s)
+{
+  if (s.column <= 0)
+    return -1;
+
+  int one_based_col;
+  switch (context->column_unit)
+    {
+    case DIAGNOSTICS_COLUMN_UNIT_DISPLAY:
+      one_based_col = location_compute_display_column (s);
+      break;
+
+    case DIAGNOSTICS_COLUMN_UNIT_BYTE:
+      one_based_col = s.column;
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  return one_based_col + (context->column_origin - 1);
+}
+
 /* Return a formatted line and column ':%line:%column'.  Elided if
-   zero.  The result is a statically allocated buffer.  */
+   line == 0 or col < 0.  (A column of 0 may be valid due to the
+   -fdiagnostics-column-origin option.)
+   The result is a statically allocated buffer.  */
 
 static const char *
 maybe_line_and_column (int line, int col)
@@ -363,8 +395,9 @@ maybe_line_and_column (int line, int col)
 
   if (line)
     {
-      size_t l = snprintf (result, sizeof (result),
-			   col ? ":%d:%d" : ":%d", line, col);
+      size_t l
+	= snprintf (result, sizeof (result),
+		    col >= 0 ? ":%d:%d" : ":%d", line, col);
       gcc_checking_assert (l < sizeof (result));
     }
   else
@@ -383,8 +416,14 @@ diagnostic_get_location_text (diagnostic_context *context,
   const char *locus_cs = colorize_start (pp_show_color (pp), "locus");
   const char *locus_ce = colorize_stop (pp_show_color (pp));
   const char *file = s.file ? s.file : progname;
-  int line = strcmp (file, N_("<built-in>")) ? s.line : 0;
-  int col = context->show_column ? s.column : 0;
+  int line = 0;
+  int col = -1;
+  if (strcmp (file, N_("<built-in>")))
+    {
+      line = s.line;
+      if (context->show_column)
+	col = diagnostic_converted_column (context, s);
+    }
 
   const char *line_col = maybe_line_and_column (line, col);
   return build_message_string ("%s%s%s:%s", locus_cs, file,
@@ -650,14 +689,20 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
       if (! MAIN_FILE_P (map))
 	{
 	  bool first = true;
+	  expanded_location s = {};
 	  do
 	    {
 	      where = linemap_included_from (map);
 	      map = linemap_included_from_linemap (line_table, map);
-	      const char *line_col
-		= maybe_line_and_column (SOURCE_LINE (map, where),
-					 first && context->show_column
-					 ? SOURCE_COLUMN (map, where) : 0);
+	      s.file = LINEMAP_FILE (map);
+	      s.line = SOURCE_LINE (map, where);
+	      int col = -1;
+	      if (first && context->show_column)
+		{
+		  s.column = SOURCE_COLUMN (map, where);
+		  col = diagnostic_converted_column (context, s);
+		}
+	      const char *line_col = maybe_line_and_column (s.line, col);
 	      static const char *const msgs[] =
 		{
 		 N_("In file included from"),
@@ -666,7 +711,7 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
 	      unsigned index = !first;
 	      pp_verbatim (context->printer, "%s%s %r%s%s%R",
 			   first ? "" : ",\n", _(msgs[index]),
-			   "locus", LINEMAP_FILE (map), line_col);
+			   "locus", s.file, line_col);
 	      first = false;
 	    }
 	  while (! MAIN_FILE_P (map));
@@ -2042,10 +2087,15 @@ test_print_parseable_fixits_replace ()
 static void
 assert_location_text (const char *expected_loc_text,
 		      const char *filename, int line, int column,
-		      bool show_column)
+		      bool show_column,
+		      int origin = 1,
+		      enum diagnostics_column_unit column_unit
+			= DIAGNOSTICS_COLUMN_UNIT_BYTE)
 {
   test_diagnostic_context dc;
   dc.show_column = show_column;
+  dc.column_unit = column_unit;
+  dc.column_origin = origin;
 
   expanded_location xloc;
   xloc.file = filename;
@@ -2069,7 +2119,10 @@ test_diagnostic_get_location_text ()
   assert_location_text ("PROGNAME:", NULL, 0, 0, true);
   assert_location_text ("<built-in>:", "<built-in>", 42, 10, true);
   assert_location_text ("foo.c:42:10:", "foo.c", 42, 10, true);
-  assert_location_text ("foo.c:42:", "foo.c", 42, 0, true);
+  assert_location_text ("foo.c:42:9:", "foo.c", 42, 10, true, 0);
+  assert_location_text ("foo.c:42:1010:", "foo.c", 42, 10, true, 1001);
+  for (int origin = 0; origin != 2; ++origin)
+    assert_location_text ("foo.c:42:", "foo.c", 42, 0, true, origin);
   assert_location_text ("foo.c:", "foo.c", 0, 10, true);
   assert_location_text ("foo.c:42:", "foo.c", 42, 10, false);
   assert_location_text ("foo.c:", "foo.c", 0, 10, false);
@@ -2077,6 +2130,39 @@ test_diagnostic_get_location_text ()
   maybe_line_and_column (INT_MAX, INT_MAX);
   maybe_line_and_column (INT_MIN, INT_MIN);
 
+  {
+    /* In order to test display columns vs byte columns, we need to create a
+       file for location_get_source_line() to read.  */
+
+    const char *const content = "smile \xf0\x9f\x98\x82\n";
+    const int line_bytes = strlen (content) - 1;
+    const int display_width = cpp_display_width (content, line_bytes);
+    ASSERT_EQ (line_bytes - 2, display_width);
+    temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
+    const char *const fname = tmp.get_filename ();
+    const int buf_len = strlen (fname) + 16;
+    char *const expected = XNEWVEC (char, buf_len);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, line_bytes);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  1, DIAGNOSTICS_COLUMN_UNIT_BYTE);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, line_bytes - 1);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  0, DIAGNOSTICS_COLUMN_UNIT_BYTE);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, display_width);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  1, DIAGNOSTICS_COLUMN_UNIT_DISPLAY);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, display_width - 1);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  0, DIAGNOSTICS_COLUMN_UNIT_DISPLAY);
+
+    XDELETEVEC (expected);
+  }
+
+
   progname = old_progname;
 }
 
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index 307dbcfb34a..ab152a129c9 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -24,6 +24,20 @@ along with GCC; see the file COPYING3.  If not see
 #include "pretty-print.h"
 #include "diagnostic-core.h"
 
+/* An enum for controlling what units to use for the column number
+   when diagnostics are output, used by the -fdiagnostics-column-unit option.
+   Tabs will be expanded or not according to the value of -ftabstop.  The origin
+   (default 1) is controlled by -fdiagnostics-column-origin.  */
+
+enum diagnostics_column_unit
+{
+  /* The new default: display columns.  */
+  DIAGNOSTICS_COLUMN_UNIT_DISPLAY,
+
+  /* The historical behavior: simple bytes.  */
+  DIAGNOSTICS_COLUMN_UNIT_BYTE
+};
+
 /* Enum for overriding the standard output format.  */
 
 enum diagnostics_output_format
@@ -280,6 +294,12 @@ struct diagnostic_context
      rest of the diagnostic.  */
   bool parseable_fixits_p;
 
+  /* What units to use when outputting the column number.  */
+  enum diagnostics_column_unit column_unit;
+
+  /* The origin for the column number (1-based or 0-based typically).  */
+  int column_origin;
+
   /* If non-NULL, an edit_context to which fix-it hints should be
      applied, for generating patches.  */
   edit_context *edit_context_ptr;
@@ -458,6 +478,8 @@ diagnostic_same_line (const diagnostic_context *context,
 }
 
 extern const char *diagnostic_get_color_for_kind (diagnostic_t kind);
+extern int diagnostic_converted_column (diagnostic_context *context,
+					expanded_location s);
 
 /* Pure text formatting support functions.  */
 extern char *file_name_as_prefix (diagnostic_context *, const char *);
@@ -470,6 +492,7 @@ extern void diagnostic_output_format_init (diagnostic_context *,
 /* Compute the number of digits in the decimal representation of an integer.  */
 extern int num_digits (int);
 
-extern json::value *json_from_expanded_location (location_t loc);
+extern json::value *json_from_expanded_location (diagnostic_context *context,
+						 location_t loc);
 
 #endif /* ! GCC_DIAGNOSTIC_H */
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 35e8242af5f..aa76f6acbae 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -290,7 +290,9 @@ Objective-C and Objective-C++ Dialects}.
 -fdiagnostics-show-template-tree  -fno-elide-type @gol
 -fdiagnostics-path-format=@r{[}none@r{|}separate-events@r{|}inline-events@r{]} @gol
 -fdiagnostics-show-path-depths @gol
--fno-show-column}
+-fno-show-column @gol
+-fdiagnostics-column-unit=@r{[}display@r{|}byte@r{]} @gol
+-fdiagnostics-column-origin=@var{origin}}
 
 @item Warning Options
 @xref{Warning Options,,Options to Request or Suppress Warnings}.
@@ -4418,6 +4420,29 @@ Do not print column numbers in diagnostics.  This may be necessary if
 diagnostics are being scanned by a program that does not understand the
 column numbers, such as @command{dejagnu}.
 
+@item -fdiagnostics-column-unit=@var{UNIT}
+@opindex fdiagnostics-column-unit
+Select the units for the column number.  This affects traditional diagnostics
+(in the absence of @option{-fno-show-column}), as well as JSON format
+diagnostics if requested.
+
+The default @var{UNIT}, @samp{display}, considers the number of display columns
+occupied by each character.  This may be larger than the number of bytes
+occupied, in the case of tab characters, or it may be smaller, in the case of
+multibyte characters.  For example, the UTF-8 character ``@U{03C0}'' occupies
+two bytes and one display column, while the character ``@U{1F642}'' occupies
+four bytes and two display columns.
+
+Setting @var{UNIT} to @samp{byte} changes the column number to the raw byte
+count in all cases, as was traditionally output by GCC prior to version 11.1.0.
+
+@item -fdiagnostics-column-origin=@var{ORIGIN}
+@opindex fdiagnostics-column-origin
+Select the origin for column numbers, i.e. the column number assigned to the
+first column.  The default value of 1 corresponds to traditional GCC
+behavior and to the GNU style guide.  Some utilities may perform better with an
+origin of 0; any non-negative value may be specified.
+
 @item -fdiagnostics-format=@var{FORMAT}
 @opindex fdiagnostics-format
 Select a different format for printing diagnostics.
@@ -4453,11 +4478,15 @@ might be printed in JSON form (after formatting) like this:
         "locations": [
             @{
                 "caret": @{
+                    "display-column": 3,
+                    "byte-column": 3,
                     "column": 3,
                     "file": "misleading-indentation.c",
                     "line": 15
                 @},
                 "finish": @{
+                    "display-column": 4,
+                    "byte-column": 4,
                     "column": 4,
                     "file": "misleading-indentation.c",
                     "line": 15
@@ -4473,6 +4502,8 @@ might be printed in JSON form (after formatting) like this:
                 "locations": [
                     @{
                         "caret": @{
+                            "display-column": 5,
+                            "byte-column": 5,
                             "column": 5,
                             "file": "misleading-indentation.c",
                             "line": 17
@@ -4482,6 +4513,7 @@ might be printed in JSON form (after formatting) like this:
                 "message": "...this statement, but the latter is @dots{}"
             @}
-        ]
+        ],
+        "column-origin": 1
     @},
     @dots{}
 ]
@@ -4494,10 +4526,22 @@ A diagnostic has a @code{kind}.  If this is @code{warning}, then there is
 an @code{option} key describing the command-line option controlling the
 warning.
 
-A diagnostic can contain zero or more locations.  Each location has up
-to three positions within it: a @code{caret} position and optional
-@code{start} and @code{finish} positions.  A location can also have
-an optional @code{label} string.  For example, this error:
+A diagnostic can contain zero or more locations.  Each location has an
+optional @code{label} string and up to three positions within it: a
+@code{caret} position and optional @code{start} and @code{finish} positions.
+A position is described by a @code{file} name, a @code{line} number, and
+three numbers indicating a column position: @code{display-column} counts
+display columns, accounting for tabs and multibyte characters;
+@code{byte-column} counts raw bytes; and @code{column} is equal to one of
+the previous two, as dictated by the @option{-fdiagnostics-column-unit}
+option.  All three columns are relative to the origin specified by
+@option{-fdiagnostics-column-origin}, which is typically equal to 1 but may
+be set, for instance, to 0 for compatibility with other utilities that
+number columns from 0.  The column origin is recorded in the JSON output in
+the @code{column-origin} tag.  In the remaining examples below, the extra
+column number outputs have been omitted for brevity.
+
+For example, this error:
 
 @smallexample
 bad-binary-ops.c:64:23: error: invalid operands to binary + (have 'S' @{aka
diff --git a/gcc/input.c b/gcc/input.c
index dd1d23df2f7..ab2fb7092d1 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -913,7 +913,7 @@ make_location (location_t caret, source_range src_range)
    source line in order to calculate the display width.  If that cannot be done
    for any reason, then returns the byte column as a fallback.  */
 int
-location_compute_display_column (expanded_location exploc)
+location_compute_display_column (expanded_location exploc, int tabstop)
 {
   if (!(exploc.file && *exploc.file && exploc.line && exploc.column))
     return exploc.column;
@@ -921,7 +921,7 @@ location_compute_display_column (expanded_location exploc)
   /* If line is NULL, this function returns exploc.column which is the
      desired fallback.  */
   return cpp_byte_column_to_display_column (line.get_buffer (), line.length (),
-					    exploc.column);
+					    exploc.column, tabstop);
 }
 
 /* Dump statistics to stderr about the memory usage of the line_table
@@ -3612,8 +3612,8 @@ void test_cpp_utf8 ()
   {
     int w_bad = cpp_display_width ("\xf0!\x9f!\x98!\x82!", 8);
     ASSERT_EQ (8, w_bad);
-    int w_ctrl = cpp_display_width ("\r\t\n\v\0\1", 6);
-    ASSERT_EQ (6, w_ctrl);
+    int w_ctrl = cpp_display_width ("\r\n\v\0\1", 5);
+    ASSERT_EQ (5, w_ctrl);
   }
 
   /* Verify that wcwidth of valid UTF-8 is as expected.  */
@@ -3635,6 +3635,15 @@ void test_cpp_utf8 ()
     ASSERT_EQ (18, w_mixed);
   }
 
+  /* Verify that display width properly expands tabs.  */
+  {
+    const char *tstr = "\tabc\td";
+    ASSERT_EQ (6, cpp_display_width (tstr, 6, 1));
+    ASSERT_EQ (10, cpp_display_width (tstr, 6, 3));
+    ASSERT_EQ (17, cpp_display_width (tstr, 6, 8));
+    ASSERT_EQ (1, cpp_display_column_to_byte_column (tstr, 6, 7, 8));
+  }
+
   /* Verify that cpp_byte_column_to_display_column can go past the end,
      and similar edge cases.  */
   {
diff --git a/gcc/input.h b/gcc/input.h
index df48ce63ef9..906d3ae244b 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -38,7 +38,12 @@ STATIC_ASSERT (BUILTINS_LOCATION < RESERVED_LOCATION_COUNT);
 
 extern bool is_location_from_builtin_token (location_t);
 extern expanded_location expand_location (location_t);
-extern int location_compute_display_column (expanded_location);
+
+/* As with cpp_byte_column_to_display_column(), TABSTOP <= 0 means to use the
+   global default cpp_get_tabstop(), which is typically set with the
+   -ftabstop option.  */
+extern int location_compute_display_column (expanded_location exploc,
+					    int tabstop = 0);
 
 /* A class capturing the bounds of a buffer, to allow for run-time
    bounds-checking in a checked build.  */
diff --git a/gcc/opts.c b/gcc/opts.c
index ec3ca0720f9..f6bd2d2972b 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "opt-suggestions.h"
 #include "diagnostic-color.h"
 #include "selftest.h"
+#include "cpplib.h"
 
 static void set_Wstrict_aliasing (struct gcc_options *opts, int onoff);
 
@@ -2439,6 +2440,14 @@ common_handle_option (struct gcc_options *opts,
       dc->parseable_fixits_p = value;
       break;
 
+    case OPT_fdiagnostics_column_unit_:
+      dc->column_unit = (enum diagnostics_column_unit)value;
+      break;
+
+    case OPT_fdiagnostics_column_origin_:
+      dc->column_origin = value;
+      break;
+
     case OPT_fdiagnostics_show_cwe:
       dc->show_cwe = value;
       break;
@@ -2827,6 +2836,12 @@ common_handle_option (struct gcc_options *opts,
       check_alignment_argument (loc, arg, "functions");
       break;
 
+    case OPT_ftabstop_:
+      /* It is documented that we silently ignore silly values.  */
+      if (value >= 1 && value <= 100)
+	cpp_set_tabstop (value);
+      break;
+
     default:
       /* If the flag was handled in a standard way, assume the lack of
 	 processing here is intentional.  */
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
index 870ba720c5f..2314ad42402 100644
--- a/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
@@ -36,20 +36,20 @@ int fn_6 (int a, int b, int c)
 	/* ... */
 	if ((err = foo (a)) != 0)
 		goto fail;
-	if ((err = foo (b)) != 0) /* { dg-message "2: this 'if' clause does not guard..." } */
+	if ((err = foo (b)) != 0) /* { dg-message "9: this 'if' clause does not guard..." } */
 		goto fail;
-		goto fail; /* { dg-message "3: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
+		goto fail; /* { dg-message "17: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
 	if ((err = foo (c)) != 0)
 		goto fail;
 	/* ... */
 
 /* { dg-begin-multiline-output "" }
-  if ((err = foo (b)) != 0)
-  ^~
+         if ((err = foo (b)) != 0)
+         ^~
    { dg-end-multiline-output "" } */
 /* { dg-begin-multiline-output "" }
-   goto fail;
-   ^~~~
+                 goto fail;
+                 ^~~~
    { dg-end-multiline-output "" } */
 
 fail:
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
index 5cdeba1cbba..202c6bc7fdf 100644
--- a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
@@ -65,9 +65,9 @@ int fn_6 (int a, int b, int c)
 	/* ... */
 	if ((err = foo (a)) != 0)
 		goto fail;
-	if ((err = foo (b)) != 0) /* { dg-message "2: this 'if' clause does not guard..." } */
+	if ((err = foo (b)) != 0) /* { dg-message "9: this 'if' clause does not guard..." } */
 		goto fail;
-		goto fail; /* { dg-message "3: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
+		goto fail; /* { dg-message "17: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
 	if ((err = foo (c)) != 0)
 		goto fail;
 	/* ... */
@@ -178,7 +178,7 @@ void fn_16_tabs (void)
     while (flagA)
       if (flagB) /* { dg-message "7: this 'if' clause does not guard..." } */
 	foo (0);
-	foo (1);/* { dg-message "2: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
+	foo (1);/* { dg-message "9: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
 }
 
 void fn_17_spaces (void)
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c
index 9359db48c17..740becb5548 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c
@@ -8,17 +8,22 @@
    We can't rely on any ordering of the keys.  */
 
 /* { dg-regexp "\"kind\": \"error\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \"#error message\"" } */
 
 /* { dg-regexp "\"caret\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 2" } */
+/* { dg-regexp "\"display-column\": 2" } */
+/* { dg-regexp "\"byte-column\": 2" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 6" } */
+/* { dg-regexp "\"display-column\": 6" } */
+/* { dg-regexp "\"byte-column\": 6" } */
 
 /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */
 /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c
index 557ccf8378b..2f24a6c6596 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c
@@ -8,6 +8,7 @@
    We can't rely on any ordering of the keys.  */
 
 /* { dg-regexp "\"kind\": \"warning\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \"#warning message\"" } */
 /* { dg-regexp "\"option\": \"-Wcpp\"" } */
 /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wcpp\"" } */
@@ -16,11 +17,15 @@
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 2" } */
+/* { dg-regexp "\"display-column\": 2" } */
+/* { dg-regexp "\"byte-column\": 2" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 8" } */
+/* { dg-regexp "\"display-column\": 8" } */
+/* { dg-regexp "\"byte-column\": 8" } */
 
 /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */
 /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c
index 378205c5bf5..afe96a9048f 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c
@@ -8,6 +8,7 @@
    We can't rely on any ordering of the keys.  */
 
 /* { dg-regexp "\"kind\": \"error\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \"#warning message\"" } */
 /* { dg-regexp "\"option\": \"-Werror=cpp\"" } */
 /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wcpp\"" } */
@@ -16,11 +17,15 @@
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 2" } */
+/* { dg-regexp "\"display-column\": 2" } */
+/* { dg-regexp "\"byte-column\": 2" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 8" } */
+/* { dg-regexp "\"display-column\": 8" } */
+/* { dg-regexp "\"byte-column\": 8" } */
 
 /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */
 /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c
index 2738be6548f..ae51091e0ea 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c
@@ -24,15 +24,20 @@ int test (void)
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 5" } */
+/* { dg-regexp "\"display-column\": 5" } */
+/* { dg-regexp "\"byte-column\": 5" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 10" } */
+/* { dg-regexp "\"display-column\": 10" } */
+/* { dg-regexp "\"byte-column\": 10" } */
 
 /* The outer diagnostic.  */
 
 /* { dg-regexp "\"kind\": \"warning\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \"this 'if' clause does not guard...\"" } */
 /* { dg-regexp "\"option\": \"-Wmisleading-indentation\"" } */
 /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wmisleading-indentation\"" } */
@@ -41,11 +46,15 @@ int test (void)
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */
 /* { dg-regexp "\"line\": 6" } */
 /* { dg-regexp "\"column\": 3" } */
+/* { dg-regexp "\"display-column\": 3" } */
+/* { dg-regexp "\"byte-column\": 3" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */
 /* { dg-regexp "\"line\": 6" } */
 /* { dg-regexp "\"column\": 4" } */
+/* { dg-regexp "\"display-column\": 4" } */
+/* { dg-regexp "\"byte-column\": 4" } */
 
 /* More from the nested diagnostic (we can't guarantee what order the
    "file" keys are consumed).  */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c
index f36e896d228..e0e9ce4be98 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c
@@ -13,6 +13,7 @@ int test (struct s *ptr)
    We can't rely on any ordering of the keys.  */
 
 /* { dg-regexp "\"kind\": \"error\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \".*\"" } */
 
 /* Verify fix-it hints.  */
@@ -23,11 +24,15 @@ int test (struct s *ptr)
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 15" } */
+/* { dg-regexp "\"display-column\": 15" } */
+/* { dg-regexp "\"byte-column\": 15" } */
 
 /* { dg-regexp "\"next\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 21" } */
+/* { dg-regexp "\"display-column\": 21" } */
+/* { dg-regexp "\"byte-column\": 21" } */
 
 /* { dg-regexp "\"fixits\": \[\[\{\}, \]*\]" } */
 
@@ -35,11 +40,15 @@ int test (struct s *ptr)
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 15" } */
+/* { dg-regexp "\"display-column\": 15" } */
+/* { dg-regexp "\"byte-column\": 15" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 20" } */
+/* { dg-regexp "\"display-column\": 20" } */
+/* { dg-regexp "\"byte-column\": 20" } */
 
 /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */
 /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-1.c b/gcc/testsuite/c-c++-common/diagnostic-units-1.c
new file mode 100644
index 00000000000..8d38b7de03e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-1.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 1 (via default)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-2.c b/gcc/testsuite/c-c++-common/diagnostic-units-2.c
new file mode 100644
index 00000000000..29a2edefd9f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-2.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=display -fshow-column -fdiagnostics-show-caret -Wmultichar" } */
+
+/* column units: display (via arg)
+   column origin: 1 (via default)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "25: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-3.c b/gcc/testsuite/c-c++-common/diagnostic-units-3.c
new file mode 100644
index 00000000000..714ee8f2de4
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-3.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -ftabstop=200 -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 1 (via default)
+   tabstop: 8 (via fallback from overly large argument) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-4.c b/gcc/testsuite/c-c++-common/diagnostic-units-4.c
new file mode 100644
index 00000000000..f9c9da914b2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-4.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=0 -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 0 (via arg)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "10: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "17: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-5.c b/gcc/testsuite/c-c++-common/diagnostic-units-5.c
new file mode 100644
index 00000000000..99d5299a732
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-5.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=display -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=0 -Wmultichar" } */
+
+/* column units: display (via arg)
+   column origin: 0 (via arg)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "17: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "17: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "24: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-6.c b/gcc/testsuite/c-c++-common/diagnostic-units-6.c
new file mode 100644
index 00000000000..c1e6e4ed477
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-6.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=100 -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 100 (via arg)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "110: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "117: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "118: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-7.c b/gcc/testsuite/c-c++-common/diagnostic-units-7.c
new file mode 100644
index 00000000000..dab221ae235
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-7.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -ftabstop=9 -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 1 (via default)
+   tabstop: 9 (via arg) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c1 = 'c1';
+                   ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+         int c2 = 'c2'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c2 = 'c2';
+                   ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+         int c3 = 	'c3'; /* { dg-warning "20: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c3 =          'c3';
+                            ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-8.c b/gcc/testsuite/c-c++-common/diagnostic-units-8.c
new file mode 100644
index 00000000000..d713b32dabc
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-8.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fshow-column -fdiagnostics-show-caret -ftabstop=9 -Wmultichar" } */
+
+/* column units: display (via default)
+   column origin: 1 (via default)
+   tabstop: 9 (via arg) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c1 = 'c1';
+                   ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+         int c2 = 'c2'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c2 = 'c2';
+                   ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+         int c3 = 	'c3'; /* { dg-warning "28: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c3 =          'c3';
+                            ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/missing-close-symbol.c b/gcc/testsuite/c-c++-common/missing-close-symbol.c
index abeb83748c1..9f1de3d0c47 100644
--- a/gcc/testsuite/c-c++-common/missing-close-symbol.c
+++ b/gcc/testsuite/c-c++-common/missing-close-symbol.c
@@ -24,9 +24,9 @@ void test_static_assert_different_line (void)
   _Static_assert(sizeof(int) >= sizeof(char), /* { dg-message "to match this '\\('" } */
 		 "msg"; /* { dg-error "expected '\\)' before ';' token" } */
   /* { dg-begin-multiline-output "" }
-    "msg";
-         ^
-         )
+                  "msg";
+                       ^
+                       )
      { dg-end-multiline-output "" } */
   /* { dg-begin-multiline-output "" }
    _Static_assert(sizeof(int) >= sizeof(char),
diff --git a/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C b/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C
index fab5849dfc7..ebbf3001055 100644
--- a/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C
+++ b/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C
@@ -33,10 +33,10 @@ int test_2 (void)
            ~~~~~~~~~~~~~~~~
                          |
                          s
-    + some_other_function ());
-    ^ ~~~~~~~~~~~~~~~~~~~~~~
-                          |
-                          t
+           + some_other_function ());
+           ^ ~~~~~~~~~~~~~~~~~~~~~~
+                                 |
+                                 t
    { dg-end-multiline-output "" } */
 }
 
diff --git a/gcc/testsuite/g++.dg/parse/error4.C b/gcc/testsuite/g++.dg/parse/error4.C
index 792bf4dc063..fe8de73790d 100644
--- a/gcc/testsuite/g++.dg/parse/error4.C
+++ b/gcc/testsuite/g++.dg/parse/error4.C
@@ -7,4 +7,4 @@ struct X {
 		 int);
 };
 
-// { dg-error "4:'itn' has not been declared" "" { target *-*-* } 6 }
+// { dg-error "18:'itn' has not been declared" "" { target *-*-* } 6 }
diff --git a/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C b/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C
index 96ebb71645c..d2b37a5122d 100644
--- a/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C
+++ b/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C
@@ -9,13 +9,13 @@ class A {
 	int	h;
 	A() { i=10; j=20; }
 	virtual void f1() { printf("i=%d j=%d\n",i,j); }
-	friend virtual void f2() { printf("i=%d j=%d\n",i,j); } // { dg-error "9:virtual functions cannot be friends" }
+	friend virtual void f2() { printf("i=%d j=%d\n",i,j); } // { dg-error "16:virtual functions cannot be friends" }
 };
 
 class B : public A {
     public:
 	virtual void f1() { printf("i=%d j=%d\n",i,j); }// { dg-error "" }  member.*// ERROR -  member.*
-	friend virtual void f2() { printf("i=%d j=%d\n",i,j); }  // { dg-error "9:virtual functions cannot be friends" }
+	friend virtual void f2() { printf("i=%d j=%d\n",i,j); }  // { dg-error "16:virtual functions cannot be friends" }
 // { dg-error "private" "" { target *-*-* } .-1 }
 };
 
diff --git a/gcc/testsuite/g++.old-deja/g++.pt/overload2.C b/gcc/testsuite/g++.old-deja/g++.pt/overload2.C
index b438543d445..bbc9e51aff6 100644
--- a/gcc/testsuite/g++.old-deja/g++.pt/overload2.C
+++ b/gcc/testsuite/g++.old-deja/g++.pt/overload2.C
@@ -12,5 +12,5 @@ int
 main()
 {
 	C<char*>	c;
-	char*		p = Z(c.O); //{ dg-error "13:'Z' was not declared" } ambiguous c.O
+	char*		p = Z(c.O); //{ dg-error "29:'Z' was not declared" } ambiguous c.O
 }
diff --git a/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C b/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C
index 6dc2c55be58..b98e8da6b1e 100644
--- a/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C
+++ b/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C
@@ -48,8 +48,8 @@ ostream& operator<<(ostream& os, Graph<VertexType,EdgeType>& G)
 
         // The compiler does not like this line!!!!!!
         typename Graph<VertexType, EdgeType>::Successor::iterator
-	  startN = G[i].second.begin(), // { dg-error "14:no match" } no index operator
-	  endN   = G[i].second.end();  // { dg-error "14:no match" } no index operator
+	  startN = G[i].second.begin(), // { dg-error "21:no match" } no index operator
+	  endN   = G[i].second.end();  // { dg-error "21:no match" } no index operator
 
         while(startN != endN)
         {
diff --git a/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c b/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c
index c5ff96e5644..51190c92391 100644
--- a/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c
+++ b/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c
@@ -288,7 +288,7 @@ int test_3 (int x, int y)
     |      |     ~~~~~~~~~~
     |      |     |
     |      |     (4) ...to here
-    |   NN |      to dereference it above
+    |   NN |                    to dereference it above
     |   NN |   return *ptr;
     |      |          ~~~~
     |      |          |
diff --git a/gcc/testsuite/gcc.dg/bad-binary-ops.c b/gcc/testsuite/gcc.dg/bad-binary-ops.c
index 46c158e6a5f..45668be0a29 100644
--- a/gcc/testsuite/gcc.dg/bad-binary-ops.c
+++ b/gcc/testsuite/gcc.dg/bad-binary-ops.c
@@ -35,10 +35,10 @@ int test_2 (void)
            ~~~~~~~~~~~~~~~~
            |
            struct s
-    + some_other_function ());
-    ^ ~~~~~~~~~~~~~~~~~~~~~~
-      |
-      struct t
+           + some_other_function ());
+           ^ ~~~~~~~~~~~~~~~~~~~~~~
+             |
+             struct t
    { dg-end-multiline-output "" } */
 }
 
diff --git a/gcc/testsuite/gcc.dg/format/branch-1.c b/gcc/testsuite/gcc.dg/format/branch-1.c
index 1782064645e..4ea39b52b2e 100644
--- a/gcc/testsuite/gcc.dg/format/branch-1.c
+++ b/gcc/testsuite/gcc.dg/format/branch-1.c
@@ -10,7 +10,7 @@ foo (long l, int nfoo)
 {
   printf ((nfoo > 1) ? "%d foos" : "%d foo", nfoo);
   printf ((l > 1) ? "%d foos" /* { dg-warning "23:int" "wrong type in conditional expr" } */
-	          : "%d foo", l); /* { dg-warning "16:int" "wrong type in conditional expr" } */
+	          : "%d foo", l); /* { dg-warning "23:int" "wrong type in conditional expr" } */
   printf ((l > 1) ? "%ld foos" : "%d foo", l); /* { dg-warning "36:int" "wrong type in conditional expr" } */
   printf ((l > 1) ? "%d foos" : "%ld foo", l); /* { dg-warning "23:int" "wrong type in conditional expr" } */
   /* Should allow one case to have extra arguments.  */
diff --git a/gcc/testsuite/gcc.dg/format/pr79210.c b/gcc/testsuite/gcc.dg/format/pr79210.c
index 71f5dd6e082..6bdabdf21ec 100644
--- a/gcc/testsuite/gcc.dg/format/pr79210.c
+++ b/gcc/testsuite/gcc.dg/format/pr79210.c
@@ -20,4 +20,4 @@ LPFC_VPORT_ATTR_R(peer_port_login,
 		  "Allow peer ports on the same physical port to login to each "
 		  "other.");
 
-/* { dg-warning "6: format .%d. expects argument of type .int., but argument 4 has type .unsigned int. " "" { target *-*-* } .-12 } */
+/* { dg-warning "20: format .%d. expects argument of type .int., but argument 4 has type .unsigned int. " "" { target *-*-* } .-12 } */
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
index 03b78042107..d7691e4be51 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
@@ -540,15 +540,15 @@ void test_builtin_types_compatible_p (unsigned long i)
   __emit_expression_range (0,
 			   f (i) + __builtin_types_compatible_p (long, int)); /* { dg-warning "range" } */
 /* { dg-begin-multiline-output "" }
-       f (i) + __builtin_types_compatible_p (long, int));
-       ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                            f (i) + __builtin_types_compatible_p (long, int));
+                            ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    { dg-end-multiline-output "" } */
 
   __emit_expression_range (0,
 			   __builtin_types_compatible_p (long, int) + f (i)); /* { dg-warning "range" } */
 /* { dg-begin-multiline-output "" }
-       __builtin_types_compatible_p (long, int) + f (i));
-       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
+                            __builtin_types_compatible_p (long, int) + f (i));
+                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
    { dg-end-multiline-output "" } */
 }
 
@@ -671,8 +671,8 @@ void test_multiple_ordinary_maps (void)
 /* { dg-begin-multiline-output "" }
    __emit_expression_range (0, foo (0,
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-        "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"));
-        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                    "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"));
+                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    { dg-end-multiline-output "" } */
 
   /* Another expression that transitions between ordinary maps; this
@@ -685,8 +685,8 @@ void test_multiple_ordinary_maps (void)
 /* { dg-begin-multiline-output "" }
    __emit_expression_range (0, foo (0, "012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678
90123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789",
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-        0));
-        ~~                      
+                                    0));
+                                    ~~
    { dg-end-multiline-output "" } */
 }
 
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c
index ac4fa1b52bd..4cba87be2ae 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c
@@ -335,11 +335,11 @@ pr87652 (const char *stem, int counter)
 /* { dg-error "unable to read substring location: unable to read source line" "" { target c } 329 } */
 /* { dg-error "unable to read substring location: failed to get ordinary maps" "" { target c++ } 329 } */
 /* { dg-begin-multiline-output "" }
-     __emit_string_literal_range(__FILE__":%5d: " format, \
+     __emit_string_literal_range(__FILE__":%5d: " format,        \
                                  ^~~~~~~~
      { dg-end-multiline-output "" { target c } } */
 /* { dg-begin-multiline-output "" }
-     __emit_string_literal_range(__FILE__":%5d: " format, \
+     __emit_string_literal_range(__FILE__":%5d: " format,        \
                                  ^
      { dg-end-multiline-output "" { target c++ } } */
 
diff --git a/gcc/testsuite/gcc.dg/redecl-4.c b/gcc/testsuite/gcc.dg/redecl-4.c
index 8f124886da8..2c214bb02c7 100644
--- a/gcc/testsuite/gcc.dg/redecl-4.c
+++ b/gcc/testsuite/gcc.dg/redecl-4.c
@@ -15,7 +15,7 @@ f (void)
     /* Should get format warnings even though the built-in declaration
        isn't "visible".  */
     printf (
-	    "%s", 1); /* { dg-warning "8:format" } */
+	    "%s", 1); /* { dg-warning "15:format" } */
     /* The type of strcmp here should have no prototype.  */
     if (0)
       strcmp (1);
diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90
index 7fade1f65fc..606fe0f891a 100644
--- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90
+++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90
@@ -8,17 +8,22 @@
 ! We can't rely on any ordering of the keys.
 
 ! { dg-regexp "\"kind\": \"error\"" }
+! { dg-regexp "\"column-origin\": 1" }
 ! { dg-regexp "\"message\": \"#error message\"" }
 
 ! { dg-regexp "\"caret\": \{" }
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 2" }
+! { dg-regexp "\"display-column\": 2" }
+! { dg-regexp "\"byte-column\": 2" }
 
 ! { dg-regexp "\"finish\": \{" }
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 6" }
+! { dg-regexp "\"display-column\": 6" }
+! { dg-regexp "\"byte-column\": 6" }
 
 ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }
 ! { dg-regexp "\"children\": \[\[\]\[\]\]" }
diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90
index bebcf68d431..56615f0ca5a 100644
--- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90
+++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90
@@ -8,6 +8,7 @@
 ! We can't rely on any ordering of the keys. 
 
 ! { dg-regexp "\"kind\": \"warning\"" }
+! { dg-regexp "\"column-origin\": 1" }
 ! { dg-regexp "\"message\": \"#warning message\"" }
 ! { dg-regexp "\"option\": \"-Wcpp\"" }
 ! { dg-regexp "\"option_url\": \"\[^\n\r\"\]*#index-Wcpp\"" }
@@ -16,11 +17,15 @@
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 2" }
+! { dg-regexp "\"display-column\": 2" }
+! { dg-regexp "\"byte-column\": 2" }
 
 ! { dg-regexp "\"finish\": \{" }
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 8" }
+! { dg-regexp "\"display-column\": 8" }
+! { dg-regexp "\"byte-column\": 8" }
 
 ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }
 ! { dg-regexp "\"children\": \[\[\]\[\]\]" }
diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90
index 7ab78eb570b..50214759091 100644
--- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90
+++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90
@@ -8,6 +8,7 @@
 ! We can't rely on any ordering of the keys.
 
 ! { dg-regexp "\"kind\": \"error\"" }
+! { dg-regexp "\"column-origin\": 1" }
 ! { dg-regexp "\"message\": \"#warning message\"" }
 ! { dg-regexp "\"option\": \"-Werror=cpp\"" }
 ! { dg-regexp "\"option_url\": \"\[^\n\r\"\]*#index-Wcpp\"" }
@@ -16,11 +17,15 @@
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 2" }
+! { dg-regexp "\"display-column\": 2" }
+! { dg-regexp "\"byte-column\": 2" }
 
 ! { dg-regexp "\"finish\": \{" }
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 8" }
+! { dg-regexp "\"display-column\": 8" }
+! { dg-regexp "\"byte-column\": 8" }
 
 ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }
 ! { dg-regexp "\"children\": \[\[\]\[\]\]" }
diff --git a/gcc/testsuite/go.dg/arrayclear.go b/gcc/testsuite/go.dg/arrayclear.go
index 6daebc0b8f5..aa5ba0761d7 100644
--- a/gcc/testsuite/go.dg/arrayclear.go
+++ b/gcc/testsuite/go.dg/arrayclear.go
@@ -1,5 +1,8 @@
 // { dg-do compile }
 // { dg-options "-fgo-debug-optimization" }
+// This comment is necessary to work around a dejagnu bug. Otherwise, the
+// column of the second error message would equal the row of the first one, and
+// since the errors are also identical, dejagnu is not able to distinguish them.
 
 package p
 
diff --git a/gcc/tree-diagnostic-path.cc b/gcc/tree-diagnostic-path.cc
index 381a49cb0b4..82b3c2d6b6a 100644
--- a/gcc/tree-diagnostic-path.cc
+++ b/gcc/tree-diagnostic-path.cc
@@ -493,7 +493,7 @@ default_tree_diagnostic_path_printer (diagnostic_context *context,
    doesn't have access to trees (for m_fndecl).  */
 
 json::value *
-default_tree_make_json_for_path (diagnostic_context *,
+default_tree_make_json_for_path (diagnostic_context *context,
 				 const diagnostic_path *path)
 {
   json::array *path_array = new json::array ();
@@ -504,7 +504,8 @@ default_tree_make_json_for_path (diagnostic_context *,
       json::object *event_obj = new json::object ();
       if (event.get_location ())
 	event_obj->set ("location",
-			json_from_expanded_location (event.get_location ()));
+			json_from_expanded_location (context,
+						     event.get_location ()));
       label_text event_text (event.get_desc (false));
       event_obj->set ("description", new json::string (event_text.m_buffer));
       event_text.maybe_free ();
diff --git a/libcpp/charset.c b/libcpp/charset.c
index d9281c5fb97..66a5f2b7f26 100644
--- a/libcpp/charset.c
+++ b/libcpp/charset.c
@@ -2276,49 +2276,105 @@ cpp_string_location_reader::get_next ()
   return result;
 }
 
-/* Helper for cpp_byte_column_to_display_column and its inverse.  Given a
-   pointer to a UTF-8-encoded character, compute its display width.  *INBUFP
-   points on entry to the start of the UTF-8 encoding of the character, and
-   is updated to point just after the last byte of the encoding.  *INBYTESLEFTP
-   contains on entry the remaining size of the buffer into which *INBUFP
-   points, and this is also updated accordingly.  If *INBUFP does not
+/* This is normally determined by the -ftabstop option.  We need to know it so
+   the display column computations below can expand tabs as well.  */
+
+static int global_tabstop = 8;
+
+int
+cpp_set_tabstop (int t)
+{
+  return global_tabstop = MAX (1, t);
+}
+
+int
+cpp_get_tabstop ()
+{
+  return global_tabstop;
+}
+
+cpp_display_width_computation::
+cpp_display_width_computation (const char *data, int data_length, int tabstop) :
+  m_begin (data),
+  m_next (m_begin),
+  m_bytes_left (data_length),
+  m_tabstop (tabstop > 0 ? tabstop : global_tabstop),
+  m_display_cols (0)
+{}
+
+
+/* The main implementation function for class cpp_display_width_computation.
+   m_next points on entry to the start of the UTF-8 encoding of the next
+   character, and is updated to point just after the last byte of the encoding.
+   m_bytes_left contains on entry the remaining size of the buffer into which
+   m_next points, and this is also updated accordingly.  If m_next does not
    point to a valid UTF-8-encoded sequence, then it will be treated as a single
-   byte with display width 1.  */
+   byte with display width 1.  m_display_cols is the current display column,
+   relative to which tab stops should be expanded.  Returns the display width of
+   the codepoint just processed.  */
 
-static inline int
-compute_next_display_width (const uchar **inbufp, size_t *inbytesleftp)
+int
+cpp_display_width_computation::process_next_codepoint ()
 {
   cppchar_t c;
-  if (one_utf8_to_cppchar (inbufp, inbytesleftp, &c) != 0)
+  int next_width;
+
+  if (*m_next == '\t')
+    {
+      ++m_next;
+      --m_bytes_left;
+      next_width = m_tabstop - (m_display_cols % m_tabstop);
+    }
+  else if (one_utf8_to_cppchar ((const uchar **) &m_next, &m_bytes_left, &c)
+	   != 0)
     {
       /* Input is not convertible to UTF-8.  This could be fine, e.g. in a
 	 string literal, so don't complain.  Just treat it as if it has a width
 	 of one.  */
-      ++*inbufp;
-      --*inbytesleftp;
-      return 1;
+      ++m_next;
+      --m_bytes_left;
+      next_width = 1;
     }
+  else
+    {
+      /*  one_utf8_to_cppchar() has updated m_next and m_bytes_left for us.  */
+      next_width = cpp_wcwidth (c);
+    }
+
+  m_display_cols += next_width;
+  return next_width;
+}
 
-  /*  one_utf8_to_cppchar() has updated inbufp and inbytesleftp for us.  */
-  return cpp_wcwidth (c);
+/*  Utility to advance the byte stream by the minimum amount needed to consume
+    N display columns.  Returns the number of display columns that were
+    actually skipped.  This could be less than N, if there was not enough data,
+    or more than N, if the last character to be skipped had a sufficiently large
+    display width.  */
+int
+cpp_display_width_computation::advance_display_cols (int n)
+{
+  const int start = m_display_cols;
+  const int target = start + n;
+  while (m_display_cols < target && !done ())
+    process_next_codepoint ();
+  return m_display_cols - start;
 }
 
 /*  For the string of length DATA_LENGTH bytes that begins at DATA, compute
     how many display columns are occupied by the first COLUMN bytes.  COLUMN
     may exceed DATA_LENGTH, in which case the phantom bytes at the end are
-    treated as if they have display width 1.  */
+    treated as if they have display width 1.  Tabs are expanded to the next tab
+    stop, relative to the start of DATA.  */
 
 int
 cpp_byte_column_to_display_column (const char *data, int data_length,
-				   int column)
+				   int column, int tabstop)
 {
-  int display_col = 0;
-  const uchar *udata = (const uchar *) data;
   const int offset = MAX (0, column - data_length);
-  size_t inbytesleft = column - offset;
-  while (inbytesleft)
-    display_col += compute_next_display_width (&udata, &inbytesleft);
-  return display_col + offset;
+  cpp_display_width_computation dw (data, column - offset, tabstop);
+  while (!dw.done ())
+    dw.process_next_codepoint ();
+  return dw.display_cols_processed () + offset;
 }
 
 /*  For the string of length DATA_LENGTH bytes that begins at DATA, compute
@@ -2328,14 +2384,11 @@ cpp_byte_column_to_display_column (const char *data, int data_length,
 
 int
 cpp_display_column_to_byte_column (const char *data, int data_length,
-				   int display_col)
+				   int display_col, int tabstop)
 {
-  int column = 0;
-  const uchar *udata = (const uchar *) data;
-  size_t inbytesleft = data_length;
-  while (column < display_col && inbytesleft)
-      column += compute_next_display_width (&udata, &inbytesleft);
-  return data_length - inbytesleft + MAX (0, display_col - column);
+  cpp_display_width_computation dw (data, data_length, tabstop);
+  const int avail_display = dw.advance_display_cols (display_col);
+  return dw.bytes_processed () + MAX (0, display_col - avail_display);
 }
 
 /* Our own version of wcwidth().  We don't use the actual wcwidth() in glibc,
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 03cc72a12e2..9bf866ad7b6 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -312,9 +312,6 @@ enum cpp_normalize_level {
    carries all the options visible to the command line.  */
 struct cpp_options
 {
-  /* Characters between tab stops.  */
-  unsigned int tabstop;
-
   /* The language we're preprocessing.  */
   enum c_lang lang;
 
@@ -1322,14 +1319,48 @@ extern const char * cpp_get_userdef_suffix
   (const cpp_token *);
 
 /* In charset.c */
+
+/* A class to manage the state while converting a UTF-8 sequence to cppchar_t
+   and computing the display width one character at a time.  */
+class cpp_display_width_computation {
+ public:
+  /* TABSTOP <= 0 means to use cpp_get_tabstop().  */
+  cpp_display_width_computation (const char *data, int data_length,
+				 int tabstop = 0);
+  const char *next_byte () const { return m_next; }
+  int bytes_processed () const { return m_next - m_begin; }
+  int bytes_left () const { return m_bytes_left; }
+  bool done () const { return !bytes_left (); }
+  int display_cols_processed () const { return m_display_cols; }
+
+  int process_next_codepoint ();
+  int advance_display_cols (int n);
+
+ private:
+  const char *const m_begin;
+  const char *m_next;
+  size_t m_bytes_left;
+  const int m_tabstop;
+  int m_display_cols;
+};
+
+/* Convenience functions that are simple use cases for class
+   cpp_display_width_computation.  Tab characters will be expanded to spaces
+   as determined by TABSTOP.  If TABSTOP <= 0, the tab width is set to the
+   global default cpp_get_tabstop (), which is typically set with the
+   -ftabstop option.  */
 int cpp_byte_column_to_display_column (const char *data, int data_length,
-				       int column);
-inline int cpp_display_width (const char *data, int data_length)
+				       int column, int tabstop = 0);
+inline int cpp_display_width (const char *data, int data_length,
+			      int tabstop = 0)
 {
-    return cpp_byte_column_to_display_column (data, data_length, data_length);
+  return cpp_byte_column_to_display_column (data, data_length, data_length,
+					    tabstop);
 }
 int cpp_display_column_to_byte_column (const char *data, int data_length,
-				       int display_col);
+				       int display_col, int tabstop = 0);
 int cpp_wcwidth (cppchar_t c);
+int cpp_set_tabstop (int t);
+int cpp_get_tabstop ();
 
 #endif /* ! LIBCPP_CPPLIB_H */
diff --git a/libcpp/init.c b/libcpp/init.c
index a3cd8e28f62..cb0d5006339 100644
--- a/libcpp/init.c
+++ b/libcpp/init.c
@@ -190,7 +190,6 @@ cpp_create_reader (enum c_lang lang, cpp_hash_table *table,
   CPP_OPTION (pfile, discard_comments) = 1;
   CPP_OPTION (pfile, discard_comments_in_macro_exp) = 1;
   CPP_OPTION (pfile, max_include_depth) = 200;
-  CPP_OPTION (pfile, tabstop) = 8;
   CPP_OPTION (pfile, operator_names) = 1;
   CPP_OPTION (pfile, warn_trigraphs) = 2;
   CPP_OPTION (pfile, warn_endif_labels) = 1;
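
For readers following the column arithmetic, here is a minimal standalone sketch of the tab-expansion rule used in process_next_codepoint () and of the two conversions built on it. It is not the libcpp code: the sketch_* names are hypothetical, every non-tab byte is assumed to have display width 1 (no UTF-8 decoding and no cpp_wcwidth), and a fixed default of 8 stands in for the global tabstop.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helpers for illustration only; they are not part of libcpp.
// Assumption: every non-tab byte occupies exactly one display column.

// Display width of byte C when COLS display columns precede it.
static int
sketch_width_at (char c, int cols, int tabstop)
{
  return c == '\t' ? tabstop - (cols % tabstop) : 1;
}

// Display columns occupied by the first COLUMN bytes of LINE.  Bytes past
// the end of LINE count as width 1 each, as in the patch.
static int
sketch_byte_column_to_display_column (const std::string &line, int column,
				      int tabstop = 8)
{
  assert (column >= 0);
  if (tabstop <= 0)
    tabstop = 8;  // Stand-in for the global default.
  const int len = static_cast<int> (line.size ());
  const int n = std::min (column, len);
  int cols = 0;
  for (int i = 0; i < n; ++i)
    cols += sketch_width_at (line[i], cols, tabstop);
  return cols + (column - n);
}

// Inverse direction: the fewest bytes needed to cover DISPLAY_COL display
// columns.  A tab may overshoot the target; a short line is padded with
// phantom single-width bytes.
static int
sketch_display_column_to_byte_column (const std::string &line,
				      int display_col, int tabstop = 8)
{
  assert (display_col >= 0);
  if (tabstop <= 0)
    tabstop = 8;  // Stand-in for the global default.
  const int len = static_cast<int> (line.size ());
  int cols = 0;
  int i = 0;
  while (cols < display_col && i < len)
    {
      cols += sketch_width_at (line[i], cols, tabstop);
      ++i;
    }
  return i + std::max (0, display_col - cols);
}
```

With the default tabstop of 8, the line from gcc.dg/redecl-4.c (a tab, four spaces, then "%s") maps byte column 8 to display column 15 under this sketch, which appears to match the 8 to 15 change in that test above.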

