[PATCH] diagnostics: Add options to control the column units [PR49973] [PR86904]

Lewis Hyatt lhyatt@gmail.com
Thu Jun 11 15:26:28 GMT 2020


On Wed, Jun 10, 2020 at 12:11:00PM -0400, David Malcolm wrote:
> Thanks for the patch; sorry about the delay in reviewing it.
> 
> Some high-level review points
> 
> - I like the patch overall
> 
> - This will deserve an item in the release notes
> 
> - I don't like adding "global_tabstop" (I don't like global
> variables).  Is there nowhere else we can handle this? I believe
> there's a cluster of functions in the callgraph that make use of
> it; can we simply pass around the tabstop value instead?  "tabstop"
> seems to have several meanings.  If I'm reading the patch correctly
>   * "tabstop > 0" means to expand tabs so that column numbers are a
> multiple of tabstop
>   * "tabstop == 0" means "don't expand tabs"
>   * "tabstop < 0" in some places means: use the global_tabstop value
> Is it possible to eliminate global_tabstop value?  Or is there some
> deep reason I'm missing?
> 
> I'll do a more thorough review once that's addressed/resolved (since
> eliminating global_tabstop might touch a few places).
>

Thanks for the feedback! The attached updated patch addresses these
concerns. Regarding tabstop, I have removed the new static variable
global_tabstop in charset.c. FWIW, the usage of "tabstop" arguments in the
various new APIs did previously work a bit more consistently than you
described. In all cases "tabstop <= 0" meant to use the default value,
otherwise it specified the tabstop to use (with tabstop=1 naturally
restoring the old behavior of changing tabs to a single space). In order
for libcpp to provide this feature (callers can pass tabstop <= 0 to get a
default, and the default can in turn be configured when processing the
-ftabstop option), it does need to remember the default, and this has to
be a file-level static variable because the routines need to work
independently of any cpp_reader instance. (Some frontends don't use
libcpp to read their input, for instance.) Anyway, I see the point that
this file-level static, being accessible with cpp_set_tabstop() and
cpp_get_tabstop(), is effectively just a global variable, so I have
removed this feature, which just means that all callers need to pass the
tabstop they want to use. Instead, I now use the diagnostic_context
object to remember the value passed to -ftabstop. The only place this
involves global variables is now in c-family/c-indentation.c, where if I
understood correctly, the only diagnostic_context available is global_dc,
so I am getting the tabstop value from there. Please let me know if
there's a better way to handle that. Prior to my patch, the tabstop was
obtained from a different global variable (extern cpp_options *cpp_opts),
so at least conservation of total globals is maintained. :)
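
For illustration only, here is a minimal sketch of the tabstop semantics
described above, where the caller always passes the tabstop it wants. This is
not the code in the patch: display_width_ascii is a hypothetical name, and the
sketch assumes ASCII input and tabstop >= 1.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of GCC or libcpp.  Returns the number of
// display columns occupied by the first NBYTES bytes of LINE, advancing each
// tab to the next multiple of TABSTOP.  Assumes ASCII input (every non-tab
// byte occupies one column) and TABSTOP >= 1.
static int
display_width_ascii (const std::string &line, std::size_t nbytes, int tabstop)
{
  int cols = 0;
  for (std::size_t i = 0; i < nbytes && i < line.size (); ++i)
    {
      if (line[i] == '\t')
        cols = (cols / tabstop + 1) * tabstop;  // jump to the next tab stop
      else
        ++cols;
    }
  return cols;
}
```

With a tabstop of 8, the 'x' in "\tx" ends at display column 9; with a tabstop
of 1, a tab advances by exactly one column, which matches the old behavior of
treating a tab as a single space.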

Compared to the previous version, this one is a bit longer, since 25 or
so call sites had to be modified to know the value of -ftabstop. Most of
the churn is in diagnostic-show-locus.c, because there are a fair number of
static helper functions and helper classes there, which just needed to
receive the diagnostic_context object from their callers. I could
have made this simpler by letting the tabstop argument default to
something like 8 in all functions that require it, which would remove the
need to pass it in all the selftests that are indifferent to it. I figured
it would be better to force this argument to be passed, though, or else in
the future it may be easy to forget to pass it where it is needed.

> Thanks for adding docs; some nits on them:
> 
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> 
> [...snip...]
> 
> > +@item -fdiagnostics-column-unit=@var{UNIT}
> > +@opindex fdiagnostics-column-unit
> > +Select the units for the column number.  This affects traditional diagnostics
> > +(in the absence of @option{-fno-show-column}), as well as JSON format
> > +diagnostics if requested.
> > +
> > +The default @var{UNIT}, @samp{display}, considers the number of display columns
> > +occupied by each character.  This may be larger than the number of bytes
> > +occupied, in the case of tab characters, or it may be smaller, in the case of
> > +multibyte characters.  For example, the UTF-8 character ``@U{03C0}'' occupies
> > +two bytes and one display column, while the character ``@U{1F642}'' occupies
> > +four bytes and two display columns.
> 
> This is imprecise.  A unicode code point occupies some number of display columns,
> and its *UTF-8 encoding* occupies some number of bytes.
> 
> [and my inner pedant is now thinking: what about combining diacritics? 
> But I don't think we can ever issue a diagnostic on a diacritic; I
> *think* we only ever care about the per-glyph level]
> 
> > +Setting @var{UNIT} to @samp{byte} changes the column number to the raw byte
> > +count in all cases, as was traditionally output by GCC prior to version 11.1.0.
> > +
> > +@item -fdiagnostics-column-origin=@var{ORIGIN}
> > +@opindex fdiagnostics-column-origin
> > +Select the origin for column numbers, i.e. the column number assigned to the
> > +first column.  The default value of 1 corresponds to traditional GCC
> > +behavior and to the GNU style guide.  Some utilities may perform better with an
> > +origin of 0; any non-negative value may be specified.
> > +
> >  @item -fdiagnostics-format=@var{FORMAT}
> >  @opindex fdiagnostics-format
> >  Select a different format for printing diagnostics.
> 
> [...snip...]
> 
> > +A diagnostic can contain zero or more locations.  Each location has an
> > +optional @code{label} string and up to three positions within it: a
> > +@code{caret} position and optional @code{start} and @code{finish} positions.
> > +A position is described by a @code{file} name, a @code{line} number, and
> > +three numbers indicating a column position: @code{display-column} counts
> > +display columns, accounting for tabs and multibyte characters;
> > +@code{byte-column} counts raw bytes; and @code{column} is equal to one of
> > +the previous two, as dictated by the @option{-fdiagnostics-column-unit}
> > +option.
> 
> Might be clearer to use an unordered list here for the three kinds of column.
> 
> > All three columns are relative to the origin specified by
> > +@option{-fdiagnostics-column-origin}, which is typically equal to 1 but may
> > +be set, for instance, to 0 for compatibility with other utilities that
> > +number columns from 0.  The column origin is recorded in the JSON output in
> > +the @code{column-origin} tag.  In the remaining examples below, the extra
> > +column number outputs have been omitted for brevity.
> 
> [...snip...]
> 

I improved the docs along these lines.
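
As a rough illustration of the distinction discussed above (bytes in the UTF-8
encoding versus display columns of the code point, plus the column origin),
here is a small sketch. It is not the implementation in the patch: the names
sketch_width and sketch_display_column are hypothetical, the width rule is a
crude stand-in for a real wcwidth table, the input is assumed to be valid
UTF-8, and tabs are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Crude stand-in for a real wcwidth table (an assumption of this sketch):
// treat U+1F300..U+1FAFF as two display columns and everything else as one.
static int
sketch_width (unsigned long cp)
{
  return (cp >= 0x1F300 && cp <= 0x1FAFF) ? 2 : 1;
}

// Hypothetical helper: sum the display widths of all characters that start
// within the first BYTE_COL bytes of LINE (valid UTF-8, no tabs), and report
// the result relative to ORIGIN instead of the traditional origin of 1.
static int
sketch_display_column (const std::string &line, int byte_col, int origin)
{
  int cols = 0;
  std::size_t i = 0;
  const std::size_t limit = static_cast<std::size_t> (byte_col);
  while (i < limit && i < line.size ())
    {
      const unsigned char b = line[i];
      int len = 1;
      unsigned long cp = b;
      // Decode the lead byte to find the sequence length and payload bits.
      if (b >= 0xF0) { len = 4; cp = b & 0x07; }
      else if (b >= 0xE0) { len = 3; cp = b & 0x0F; }
      else if (b >= 0xC0) { len = 2; cp = b & 0x1F; }
      // Fold in the continuation bytes, six payload bits each.
      for (int k = 1; k < len && i + k < line.size (); ++k)
        cp = (cp << 6) | (static_cast<unsigned char> (line[i + k]) & 0x3F);
      cols += sketch_width (cp);
      i += len;
    }
  return cols - 1 + origin;
}
```

For the line "\xCF\x80 = x" (U+03C0 followed by " = x"), the '=' sits at byte
column 4 but display column 3 with an origin of 1, or display column 2 with an
origin of 0. For "\xF0\x9F\x99\x82;" (U+1F642 followed by ';'), the ';' is at
byte column 5 and display column 3.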

> Thanks again for the patch; hope this is constructive
> Dave
>

Thanks for your time! BTW, I bootstrapped and regtested this version as well
on x86-64 Linux. It looks good: the new tests pass and the other results are
unchanged (counts before and after the patch):

FAIL 97 97
PASS 476837 477297
UNRESOLVED 7 7
UNSUPPORTED 11726 11726
UNTESTED 195 195
XFAIL 1807 1807
XPASS 37 37

-Lewis
-------------- next part --------------
From 7729ce3334b6768a25967a6dd4a0a5a2ed0923cc Mon Sep 17 00:00:00 2001
From: Lewis Hyatt <lhyatt@gmail.com>
Date: Wed, 10 Jun 2020 22:04:07 -0400
Subject: [PATCH] diagnostics: Support conversion of tabs to spaces [PR49973] [PR86904]

Supports conversion of tabs to spaces when outputting diagnostics. Also
adds -fdiagnostics-column-unit and -fdiagnostics-column-origin options to
control how the column number is output, thereby resolving the two PRs.

gcc/c-family/ChangeLog:

	PR other/86904
	* c-indentation.c (should_warn_for_misleading_indentation): Get
	global tabstop from the new source.
	* c-opts.c (c_common_handle_option): Remove handling of -ftabstop, which
	is now a common option.
	* c.opt: Likewise.

gcc/ChangeLog:

	PR preprocessor/49973
	PR other/86904
	* common.opt: Handle -ftabstop here instead of in c-family
	options.  Add -fdiagnostics-column-unit= and
	-fdiagnostics-column-origin= options.
	* opts.c (common_handle_option): Handle the new options.
	* diagnostic-format-json.cc (json_from_expanded_location): Add
	diagnostic_context argument.  Use it to convert column numbers as per
	the new options.
	(json_from_location_range): Likewise.
	(json_from_fixit_hint): Likewise.
	(json_end_diagnostic): Pass the new context argument to helper
	functions above.  Add "column-origin" field to the output.
	(test_unknown_location): Add the new context argument to calls to
	helper functions.
	(test_bad_endpoints): Likewise.
	* diagnostic-show-locus.c
	(exploc_with_display_col::exploc_with_display_col): Support
	tabstop parameter.
	(layout_point::layout_point): Make use of class
	exploc_with_display_col.
	(layout_range::layout_range): Likewise.
	(struct line_bounds): Clarify that the units are now always
	display columns.  Rename members accordingly.  Add constructor.
	(layout::print_source_line): Add support for tab expansion.
	(make_range): Adapt to class layout_range changes.
	(layout::maybe_add_location_range): Likewise.
	(layout::layout): Adapt to class exploc_with_display_col changes.
	(layout::calculate_x_offset_display): Support tabstop parameter.
	(layout::print_annotation_line): Adapt to struct line_bounds changes.
	(layout::print_line): Likewise.
	(line_label::line_label): Add diagnostic_context argument.
	(get_affected_range): Likewise.
	(get_printed_columns): Likewise.
	(layout::print_any_labels): Adapt to struct line_label changes.
	(class correction): Add m_tabstop member.
	(correction::correction): Add tabstop argument.
	(correction::compute_display_cols): Use m_tabstop.
	(class line_corrections): Add m_context member.
	(line_corrections::line_corrections): Add diagnostic_context argument.
	(line_corrections::add_hint): Use m_context to handle tabstops.
	(layout::print_trailing_fixits): Adapt to class line_corrections
	changes.
	(test_layout_x_offset_display_utf8): Support tabstop parameter.
	(test_layout_x_offset_display_tab): New selftest.
	(test_one_liner_colorized_utf8): Likewise.
	(test_tab_expansion): Likewise.
	(test_diagnostic_show_locus_one_liner_utf8): Call the new tests.
	(diagnostic_show_locus_c_tests): Likewise.
	(test_overlapped_fixit_printing): Adapt to helper class and
	function changes.
	(test_overlapped_fixit_printing_utf8): Likewise.
	(test_overlapped_fixit_printing_2): Likewise.
	* diagnostic.h (enum diagnostics_column_unit): New enum.
	(struct diagnostic_context): Add members for the new options.
	(diagnostic_converted_column): Declare.
	(json_from_expanded_location): Add new context argument.
	* diagnostic.c (diagnostic_initialize): Initialize new members.
	(diagnostic_converted_column): New function.
	(maybe_line_and_column): Be willing to output a column of 0.
	(diagnostic_get_location_text): Convert column number as per the new
	options.
	(diagnostic_report_current_module): Likewise.
	(assert_location_text): Add origin and column_unit arguments for
	testing the new functionality.
	(test_diagnostic_get_location_text): Test the new functionality.
	* doc/invoke.texi: Document the new options and behavior.
	* input.h (location_compute_display_column): Add tabstop argument.
	* input.c (location_compute_display_column): Likewise.
	(test_cpp_utf8): Add selftests for tab expansion.
	* tree-diagnostic-path.cc (default_tree_make_json_for_path): Pass the
	new context argument to json_from_expanded_location().

libcpp/ChangeLog:

	PR preprocessor/49973
	PR other/86904
	* include/cpplib.h (struct cpp_options):  Removed support for -ftabstop,
	which is now handled by diagnostic_context.
	(class cpp_display_width_computation): New class.
	(cpp_byte_column_to_display_column): Add optional tabstop argument.
	(cpp_display_width): Likewise.
	(cpp_display_column_to_byte_column): Likewise.
	* charset.c
	(cpp_display_width_computation::cpp_display_width_computation): New
	function.
	(cpp_display_width_computation::advance_display_cols): Likewise.
	(compute_next_display_width): Removed and implemented this
	functionality in a new function...
	(cpp_display_width_computation::process_next_codepoint): ...here.
	(cpp_byte_column_to_display_column): Added tabstop argument.
	Reimplemented in terms of class cpp_display_width_computation.
	(cpp_display_column_to_byte_column): Likewise.
	* init.c (cpp_create_reader): Remove handling of -ftabstop, which is now
	handled by diagnostic_context.

gcc/testsuite/ChangeLog:

	PR preprocessor/49973
	PR other/86904
	* c-c++-common/Wmisleading-indentation-3.c: Adjust expected output
	for new defaults.
	* c-c++-common/Wmisleading-indentation.c: Likewise.
	* c-c++-common/diagnostic-format-json-1.c: Likewise.
	* c-c++-common/diagnostic-format-json-2.c: Likewise.
	* c-c++-common/diagnostic-format-json-3.c: Likewise.
	* c-c++-common/diagnostic-format-json-4.c: Likewise.
	* c-c++-common/diagnostic-format-json-5.c: Likewise.
	* c-c++-common/missing-close-symbol.c: Likewise.
	* g++.dg/diagnostic/bad-binary-ops.C: Likewise.
	* g++.dg/parse/error4.C: Likewise.
	* g++.old-deja/g++.brendan/crash11.C: Likewise.
	* g++.old-deja/g++.pt/overload2.C: Likewise.
	* g++.old-deja/g++.robertl/eb109.C: Likewise.
	* gcc.dg/analyzer/malloc-paths-9.c: Likewise.
	* gcc.dg/bad-binary-ops.c: Likewise.
	* gcc.dg/format/branch-1.c: Likewise.
	* gcc.dg/format/pr79210.c: Likewise.
	* gcc.dg/plugin/diagnostic-test-expressions-1.c: Likewise.
	* gcc.dg/plugin/diagnostic-test-string-literals-1.c: Likewise.
	* gcc.dg/redecl-4.c: Likewise.
	* gfortran.dg/diagnostic-format-json-1.F90: Likewise.
	* gfortran.dg/diagnostic-format-json-2.F90: Likewise.
	* gfortran.dg/diagnostic-format-json-3.F90: Likewise.
	* go.dg/arrayclear.go: Add a comment explaining why adding a
	comment was necessary to work around a dejagnu bug.
	* c-c++-common/diagnostic-units-1.c: New test.
	* c-c++-common/diagnostic-units-2.c: New test.
	* c-c++-common/diagnostic-units-3.c: New test.
	* c-c++-common/diagnostic-units-4.c: New test.
	* c-c++-common/diagnostic-units-5.c: New test.
	* c-c++-common/diagnostic-units-6.c: New test.
	* c-c++-common/diagnostic-units-7.c: New test.
	* c-c++-common/diagnostic-units-8.c: New test.

diff --git a/gcc/c-family/c-indentation.c b/gcc/c-family/c-indentation.c
index 9fba3bcc67c..d814f6f29e6 100644
--- a/gcc/c-family/c-indentation.c
+++ b/gcc/c-family/c-indentation.c
@@ -24,8 +24,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "c-common.h"
 #include "c-indentation.h"
 #include "selftest.h"
-
-extern cpp_options *cpp_opts;
+#include "diagnostic.h"
 
 /* Round up VIS_COLUMN to nearest tab stop. */
 
@@ -299,7 +298,7 @@ should_warn_for_misleading_indentation (const token_indent_info &guard_tinfo,
   expanded_location next_stmt_exploc = expand_location (next_stmt_loc);
   expanded_location guard_exploc = expand_location (guard_loc);
 
-  const unsigned int tab_width = cpp_opts->tabstop;
+  const unsigned int tab_width = global_dc->tabstop;
 
   /* They must be in the same file.  */
   if (next_stmt_exploc.file != body_exploc.file)
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 8a5131b8ac6..f6588277565 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -504,12 +504,6 @@ c_common_handle_option (size_t scode, const char *arg, HOST_WIDE_INT value,
 	cpp_opts->track_macro_expansion = 2;
       break;
 
-    case OPT_ftabstop_:
-      /* It is documented that we silently ignore silly values.  */
-      if (value >= 1 && value <= 100)
-	cpp_opts->tabstop = value;
-      break;
-
     case OPT_fexec_charset_:
       cpp_opts->narrow_charset = arg;
       break;
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 89a58282b3f..913f91d818a 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1876,10 +1876,6 @@ Enum(strong_eval_order) String(some) Value(1)
 EnumValue
 Enum(strong_eval_order) String(all) Value(2)
 
-ftabstop=
-C ObjC C++ ObjC++ Joined RejectNegative UInteger
--ftabstop=<number>	Distance between tab stops for column reporting.
-
 ftemplate-backtrace-limit=
 C++ ObjC++ Joined RejectNegative UInteger Var(template_backtrace_limit) Init(10)
 Set the maximum number of template instantiation notes for a single warning or error.
diff --git a/gcc/common.opt b/gcc/common.opt
index df8af365d1b..a3893a4725e 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1328,6 +1328,14 @@ Enum(diagnostic_url_rule) String(always) Value(DIAGNOSTICS_URL_YES)
 EnumValue
 Enum(diagnostic_url_rule) String(auto) Value(DIAGNOSTICS_URL_AUTO)
 
+fdiagnostics-column-unit=
+Common Joined RejectNegative Enum(diagnostics_column_unit)
+-fdiagnostics-column-unit=[display|byte]	Select whether column numbers are output as display columns (default) or raw bytes.
+
+fdiagnostics-column-origin=
+Common Joined RejectNegative UInteger
+-fdiagnostics-column-origin=<number>	Set the number of the first column.  The default is 1-based as per GNU style, but some utilities may expect 0-based, for example.
+
 fdiagnostics-format=
 Common Joined RejectNegative Enum(diagnostics_output_format)
 -fdiagnostics-format=[text|json]	Select output format.
@@ -1336,6 +1344,15 @@ Common Joined RejectNegative Enum(diagnostics_output_format)
 SourceInclude
 diagnostic.h
 
+Enum
+Name(diagnostics_column_unit) Type(int)
+
+EnumValue
+Enum(diagnostics_column_unit) String(display) Value(DIAGNOSTICS_COLUMN_UNIT_DISPLAY)
+
+EnumValue
+Enum(diagnostics_column_unit) String(byte) Value(DIAGNOSTICS_COLUMN_UNIT_BYTE)
+
 Enum
 Name(diagnostics_output_format) Type(int)
 
@@ -1365,6 +1382,10 @@ fdiagnostics-path-format=
 Common Joined RejectNegative Var(flag_diagnostics_path_format) Enum(diagnostic_path_format) Init(DPF_INLINE_EVENTS)
 Specify how to print any control-flow path associated with a diagnostic.
 
+ftabstop=
+Common Joined RejectNegative UInteger
+-ftabstop=<number>      Distance between tab stops for column reporting.
+
 Enum
 Name(diagnostic_path_format) Type(int)
 
diff --git a/gcc/diagnostic-format-json.cc b/gcc/diagnostic-format-json.cc
index 7bda5c4ba83..465c42fdfde 100644
--- a/gcc/diagnostic-format-json.cc
+++ b/gcc/diagnostic-format-json.cc
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "diagnostic.h"
+#include "selftest-diagnostic.h"
 #include "diagnostic-metadata.h"
 #include "json.h"
 #include "selftest.h"
@@ -43,21 +44,43 @@ static json::array *cur_children_array;
 /* Generate a JSON object for LOC.  */
 
 json::value *
-json_from_expanded_location (location_t loc)
+json_from_expanded_location (diagnostic_context *context, location_t loc)
 {
   expanded_location exploc = expand_location (loc);
   json::object *result = new json::object ();
   if (exploc.file)
     result->set ("file", new json::string (exploc.file));
   result->set ("line", new json::integer_number (exploc.line));
-  result->set ("column", new json::integer_number (exploc.column));
+
+  const enum diagnostics_column_unit orig_unit = context->column_unit;
+  struct
+  {
+    const char *name;
+    enum diagnostics_column_unit unit;
+  } column_fields[] = {
+    {"display-column", DIAGNOSTICS_COLUMN_UNIT_DISPLAY},
+    {"byte-column", DIAGNOSTICS_COLUMN_UNIT_BYTE}
+  };
+  int the_column = INT_MIN;
+  for (int i = 0; i != sizeof column_fields / sizeof (*column_fields); ++i)
+    {
+      context->column_unit = column_fields[i].unit;
+      const int col = diagnostic_converted_column (context, exploc);
+      result->set (column_fields[i].name, new json::integer_number (col));
+      if (column_fields[i].unit == orig_unit)
+	the_column = col;
+    }
+  gcc_assert (the_column != INT_MIN);
+  result->set ("column", new json::integer_number (the_column));
+  context->column_unit = orig_unit;
   return result;
 }
 
 /* Generate a JSON object for LOC_RANGE.  */
 
 static json::object *
-json_from_location_range (const location_range *loc_range, unsigned range_idx)
+json_from_location_range (diagnostic_context *context,
+			  const location_range *loc_range, unsigned range_idx)
 {
   location_t caret_loc = get_pure_location (loc_range->m_loc);
 
@@ -68,13 +91,13 @@ json_from_location_range (const location_range *loc_range, unsigned range_idx)
   location_t finish_loc = get_finish (loc_range->m_loc);
 
   json::object *result = new json::object ();
-  result->set ("caret", json_from_expanded_location (caret_loc));
+  result->set ("caret", json_from_expanded_location (context, caret_loc));
   if (start_loc != caret_loc
       && start_loc != UNKNOWN_LOCATION)
-    result->set ("start", json_from_expanded_location (start_loc));
+    result->set ("start", json_from_expanded_location (context, start_loc));
   if (finish_loc != caret_loc
       && finish_loc != UNKNOWN_LOCATION)
-    result->set ("finish", json_from_expanded_location (finish_loc));
+    result->set ("finish", json_from_expanded_location (context, finish_loc));
 
   if (loc_range->m_label)
     {
@@ -91,14 +114,14 @@ json_from_location_range (const location_range *loc_range, unsigned range_idx)
 /* Generate a JSON object for HINT.  */
 
 static json::object *
-json_from_fixit_hint (const fixit_hint *hint)
+json_from_fixit_hint (diagnostic_context *context, const fixit_hint *hint)
 {
   json::object *fixit_obj = new json::object ();
 
   location_t start_loc = hint->get_start_loc ();
-  fixit_obj->set ("start", json_from_expanded_location (start_loc));
+  fixit_obj->set ("start", json_from_expanded_location (context, start_loc));
   location_t next_loc = hint->get_next_loc ();
-  fixit_obj->set ("next", json_from_expanded_location (next_loc));
+  fixit_obj->set ("next", json_from_expanded_location (context, next_loc));
   fixit_obj->set ("string", new json::string (hint->get_string ()));
 
   return fixit_obj;
@@ -190,11 +213,13 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,
   else
     {
       /* Otherwise, make diag_obj be the top-level object within the group;
-	 add a "children" array.  */
+	 add a "children" array and record the column origin.  */
       toplevel_array->append (diag_obj);
       cur_group = diag_obj;
       cur_children_array = new json::array ();
       diag_obj->set ("children", cur_children_array);
+      diag_obj->set ("column-origin",
+		     new json::integer_number (context->column_origin));
     }
 
   const rich_location *richloc = diagnostic->richloc;
@@ -205,7 +230,7 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,
   for (unsigned int i = 0; i < richloc->get_num_locations (); i++)
     {
       const location_range *loc_range = richloc->get_range (i);
-      json::object *loc_obj = json_from_location_range (loc_range, i);
+      json::object *loc_obj = json_from_location_range (context, loc_range, i);
       if (loc_obj)
 	loc_array->append (loc_obj);
     }
@@ -217,7 +242,7 @@ json_end_diagnostic (diagnostic_context *context, diagnostic_info *diagnostic,
       for (unsigned int i = 0; i < richloc->get_num_fixit_hints (); i++)
 	{
 	  const fixit_hint *hint = richloc->get_fixit_hint (i);
-	  json::object *fixit_obj = json_from_fixit_hint (hint);
+	  json::object *fixit_obj = json_from_fixit_hint (context, hint);
 	  fixit_array->append (fixit_obj);
 	}
     }
@@ -320,7 +345,8 @@ namespace selftest {
 static void
 test_unknown_location ()
 {
-  delete json_from_expanded_location (UNKNOWN_LOCATION);
+  test_diagnostic_context dc;
+  delete json_from_expanded_location (&dc, UNKNOWN_LOCATION);
 }
 
 /* Verify that we gracefully handle attempts to serialize bad
@@ -338,7 +364,8 @@ test_bad_endpoints ()
   loc_range.m_range_display_kind = SHOW_RANGE_WITH_CARET;
   loc_range.m_label = NULL;
 
-  json::object *obj = json_from_location_range (&loc_range, 0);
+  test_diagnostic_context dc;
+  json::object *obj = json_from_location_range (&dc, &loc_range, 0);
   /* We should have a "caret" value, but no "start" or "finish" values.  */
   ASSERT_TRUE (obj != NULL);
   ASSERT_TRUE (obj->get ("caret") != NULL);
diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 4618b4edb7d..da3c5b6a92d 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -175,9 +175,10 @@ enum column_unit {
 class exploc_with_display_col : public expanded_location
 {
  public:
-  exploc_with_display_col (const expanded_location &exploc)
+  exploc_with_display_col (const expanded_location &exploc, int tabstop)
     : expanded_location (exploc),
-      m_display_col (location_compute_display_column (exploc)) {}
+      m_display_col (location_compute_display_column (exploc, tabstop))
+  {}
 
   int m_display_col;
 };
@@ -189,11 +190,11 @@ class exploc_with_display_col : public expanded_location
 class layout_point
 {
  public:
-  layout_point (const expanded_location &exploc)
+  layout_point (const exploc_with_display_col &exploc)
     : m_line (exploc.line)
   {
     m_columns[CU_BYTES] = exploc.column;
-    m_columns[CU_DISPLAY_COLS] = location_compute_display_column (exploc);
+    m_columns[CU_DISPLAY_COLS] = exploc.m_display_col;
   }
 
   linenum_type m_line;
@@ -205,10 +206,10 @@ class layout_point
 class layout_range
 {
  public:
-  layout_range (const expanded_location *start_exploc,
-		const expanded_location *finish_exploc,
+  layout_range (const exploc_with_display_col &start_exploc,
+		const exploc_with_display_col &finish_exploc,
 		enum range_display_kind range_display_kind,
-		const expanded_location *caret_exploc,
+		const exploc_with_display_col &caret_exploc,
 		unsigned original_idx,
 		const range_label *label);
 
@@ -226,22 +227,18 @@ class layout_range
 
 /* A struct for use by layout::print_source_line for telling
    layout::print_annotation_line the extents of the source line that
-   it printed, so that underlines can be clipped appropriately.  */
+   it printed, so that underlines can be clipped appropriately.  Units
+   are 1-based display columns.  */
 
 struct line_bounds
 {
-  int m_first_non_ws;
-  int m_last_non_ws;
+  int m_first_non_ws_disp_col;
+  int m_last_non_ws_disp_col;
 
-  void convert_to_display_cols (char_span line)
+  line_bounds ()
   {
-    m_first_non_ws = cpp_byte_column_to_display_column (line.get_buffer (),
-							line.length (),
-							m_first_non_ws);
-
-    m_last_non_ws = cpp_byte_column_to_display_column (line.get_buffer (),
-						       line.length (),
-						       m_last_non_ws);
+    m_first_non_ws_disp_col = INT_MAX;
+    m_last_non_ws_disp_col = 0;
   }
 };
 
@@ -351,8 +348,8 @@ class layout
  private:
   bool will_show_line_p (linenum_type row) const;
   void print_leading_fixits (linenum_type row);
-  void print_source_line (linenum_type row, const char *line, int line_bytes,
-			  line_bounds *lbounds_out);
+  line_bounds print_source_line (linenum_type row, const char *line,
+				 int line_bytes);
   bool should_print_annotation_line_p (linenum_type row) const;
   void start_annotation_line (char margin_char = ' ') const;
   void print_annotation_line (linenum_type row, const line_bounds lbounds);
@@ -513,16 +510,16 @@ colorizer::get_color_by_name (const char *name)
    Initialize various layout_point fields from expanded_location
    equivalents; we've already filtered on file.  */
 
-layout_range::layout_range (const expanded_location *start_exploc,
-			    const expanded_location *finish_exploc,
+layout_range::layout_range (const exploc_with_display_col &start_exploc,
+			    const exploc_with_display_col &finish_exploc,
 			    enum range_display_kind range_display_kind,
-			    const expanded_location *caret_exploc,
+			    const exploc_with_display_col &caret_exploc,
 			    unsigned original_idx,
 			    const range_label *label)
-: m_start (*start_exploc),
-  m_finish (*finish_exploc),
+: m_start (start_exploc),
+  m_finish (finish_exploc),
   m_range_display_kind (range_display_kind),
-  m_caret (*caret_exploc),
+  m_caret (caret_exploc),
   m_original_idx (original_idx),
   m_label (label)
 {
@@ -646,6 +643,9 @@ layout_range::intersects_line_p (linenum_type row) const
 
 #if CHECKING_P
 
+/* Default for when we don't care what the tab expansion is set to.  */
+static const int def_tabstop = 8;
+
 /* Create some expanded locations for testing layout_range.  The filename
    member of the explocs is set to the empty string.  This member will only be
    inspected by the calls to location_compute_display_column() made from the
@@ -662,8 +662,11 @@ make_range (int start_line, int start_col, int end_line, int end_col)
     = {"", start_line, start_col, NULL, false};
   const expanded_location finish_exploc
     = {"", end_line, end_col, NULL, false};
-  return layout_range (&start_exploc, &finish_exploc, SHOW_RANGE_WITHOUT_CARET,
-		       &start_exploc, 0, NULL);
+  return layout_range (exploc_with_display_col (start_exploc, def_tabstop),
+		       exploc_with_display_col (finish_exploc, def_tabstop),
+		       SHOW_RANGE_WITHOUT_CARET,
+		       exploc_with_display_col (start_exploc, def_tabstop),
+		       0, NULL);
 }
 
 /* Selftests for layout_range::contains_point and
@@ -964,7 +967,7 @@ layout::layout (diagnostic_context * context,
 : m_context (context),
   m_pp (context->printer),
   m_primary_loc (richloc->get_range (0)->m_loc),
-  m_exploc (richloc->get_expanded_location (0)),
+  m_exploc (richloc->get_expanded_location (0), context->tabstop),
   m_colorizer (context, diagnostic_kind),
   m_colorize_source_p (context->colorize_source_p),
   m_show_labels_p (context->show_labels_p),
@@ -1060,7 +1063,10 @@ layout::maybe_add_location_range (const location_range *loc_range,
 
   /* Everything is now known to be in the correct source file,
      but it may require further sanitization.  */
-  layout_range ri (&start, &finish, loc_range->m_range_display_kind, &caret,
+  layout_range ri (exploc_with_display_col (start, m_context->tabstop),
+		   exploc_with_display_col (finish, m_context->tabstop),
+		   loc_range->m_range_display_kind,
+		   exploc_with_display_col (caret, m_context->tabstop),
 		   original_idx, loc_range->m_label);
 
   /* If we have a range that finishes before it starts (perhaps
@@ -1394,7 +1400,7 @@ layout::calculate_x_offset_display ()
     = get_line_bytes_without_trailing_whitespace (line.get_buffer (),
 						  line.length ());
   int eol_display_column
-    = cpp_display_width (line.get_buffer (), line_bytes);
+    = cpp_display_width (line.get_buffer (), line_bytes, m_context->tabstop);
   if (caret_display_column > eol_display_column
       || !caret_display_column)
     {
@@ -1445,16 +1451,13 @@ layout::calculate_x_offset_display ()
 }
 
 /* Print line ROW of source code, potentially colorized at any ranges, and
-   populate *LBOUNDS_OUT.
-   LINE is the source line (not necessarily 0-terminated) and LINE_BYTES
-   is its length in bytes.
-   This function deals only with byte offsets, not display columns, so
-   m_x_offset_display must be converted from display to byte units.  In
-   particular, LINE_BYTES and LBOUNDS_OUT are in bytes.  */
+   return the line bounds.  LINE is the source line (not necessarily
+   0-terminated) and LINE_BYTES is its length in bytes.  In order to handle both
+   colorization and tab expansion, this function tracks the line position in
+   both byte and display column units.  */
 
-void
-layout::print_source_line (linenum_type row, const char *line, int line_bytes,
-			   line_bounds *lbounds_out)
+line_bounds
+layout::print_source_line (linenum_type row, const char *line, int line_bytes)
 {
   m_colorizer.set_normal_text ();
 
@@ -1469,30 +1472,29 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,
   else
     pp_space (m_pp);
 
-  /* We will stop printing the source line at any trailing whitespace, and start
-     printing it as per m_x_offset_display.  */
+  /* We will stop printing the source line at any trailing whitespace.  */
   line_bytes = get_line_bytes_without_trailing_whitespace (line,
 							   line_bytes);
-  int x_offset_bytes = 0;
-  if (m_x_offset_display)
-    {
-      x_offset_bytes = cpp_display_column_to_byte_column (line, line_bytes,
-							  m_x_offset_display);
-      /* In case the leading portion of the line that will be skipped over ends
-	 with a character with wcwidth > 1, then it is possible we skipped too
-	 much, so account for that by padding with spaces.  */
-      const int overage
-	= cpp_byte_column_to_display_column (line, line_bytes, x_offset_bytes)
-	- m_x_offset_display;
-      for (int column = 0; column < overage; ++column)
-	pp_space (m_pp);
-      line += x_offset_bytes;
-    }
 
-  /* Print the line.  */
-  int first_non_ws = INT_MAX;
-  int last_non_ws = 0;
-  for (int col_byte = 1 + x_offset_bytes; col_byte <= line_bytes; col_byte++)
+  /* This object helps to keep track of which display column we are at, which is
+     necessary for computing the line bounds in display units, for doing
+     tab expansion, and for implementing m_x_offset_display.  */
+  cpp_display_width_computation dw (line, line_bytes, m_context->tabstop);
+
+  /* Skip the first m_x_offset_display display columns.  If the leading
+     portion that will be skipped ends with a character with wcwidth > 1, it
+     is possible that we skipped too much, so account for that by padding with
+     spaces.  This also does the right thing when a tab is the last character
+     skipped over; the tab is effectively replaced by the correct number of
+     trailing spaces needed to offset by the desired number of display
+     columns.  */
+  for (int skipped_display_cols = dw.advance_display_cols (m_x_offset_display);
+       skipped_display_cols > m_x_offset_display; --skipped_display_cols)
+    pp_space (m_pp);
+
+  /* Print the line and compute the line_bounds.  */
+  line_bounds lbounds;
+  while (!dw.done ())
     {
       /* Assuming colorization is enabled for the caret and underline
 	 characters, we may also colorize the associated characters
@@ -1510,7 +1512,8 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,
 	{
 	  bool in_range_p;
 	  point_state state;
-	  in_range_p = get_state_at_point (row, col_byte,
+	  const int start_byte_col = dw.bytes_processed () + 1;
+	  in_range_p = get_state_at_point (row, start_byte_col,
 					   0, INT_MAX,
 					   CU_BYTES,
 					   &state);
@@ -1519,22 +1522,44 @@ layout::print_source_line (linenum_type row, const char *line, int line_bytes,
 	  else
 	    m_colorizer.set_normal_text ();
 	}
-      char c = *line;
-      if (c == '\0' || c == '\t' || c == '\r')
-	c = ' ';
-      if (c != ' ')
+
+      /* Get the display width of the next character to be output, expanding
+	 tabs and replacing some control bytes with spaces as necessary.  */
+      const char *c = dw.next_byte ();
+      const int start_disp_col = dw.display_cols_processed () + 1;
+      const int this_display_width = dw.process_next_codepoint ();
+      if (*c == '\t')
+	{
+	  /* The returned display width is the number of spaces into which the
+	     tab should be expanded.  */
+	  for (int i = 0; i != this_display_width; ++i)
+	    pp_space (m_pp);
+	  continue;
+	}
+      if (*c == '\0' || *c == '\r')
 	{
-	  last_non_ws = col_byte;
-	  if (first_non_ws == INT_MAX)
-	    first_non_ws = col_byte;
+	  /* cpp_wcwidth() promises to return 1 for all control bytes, and we
+	     want to output these as a single space too, so this case is
+	     actually the same as the '\t' case.  */
+	  gcc_assert (this_display_width == 1);
+	  pp_space (m_pp);
+	  continue;
 	}
-      pp_character (m_pp, c);
-      line++;
+
+      /* We have a (possibly multibyte) character to output; update the line
+	 bounds if it is not whitespace.  */
+      if (*c != ' ')
+	{
+	  lbounds.m_last_non_ws_disp_col = dw.display_cols_processed ();
+	  if (lbounds.m_first_non_ws_disp_col == INT_MAX)
+	    lbounds.m_first_non_ws_disp_col = start_disp_col;
+	}
+
+      /* Output the character.  */
+      while (c != dw.next_byte ()) pp_character (m_pp, *c++);
     }
   print_newline ();
-
-  lbounds_out->m_first_non_ws = first_non_ws;
-  lbounds_out->m_last_non_ws = last_non_ws;
+  return lbounds;
 }
 
 /* Determine if we should print an annotation line for ROW.
@@ -1576,14 +1601,13 @@ layout::start_annotation_line (char margin_char) const
 }
 
 /* Print a line consisting of the caret/underlines for the given
-   source line.  This function works with display columns, rather than byte
-   counts; in particular, LBOUNDS should be in display column units.  */
+   source line.  */
 
 void
 layout::print_annotation_line (linenum_type row, const line_bounds lbounds)
 {
   int x_bound = get_x_bound_for_row (row, m_exploc.m_display_col,
-				     lbounds.m_last_non_ws);
+				     lbounds.m_last_non_ws_disp_col);
 
   start_annotation_line ();
   pp_space (m_pp);
@@ -1593,8 +1617,8 @@ layout::print_annotation_line (linenum_type row, const line_bounds lbounds)
       bool in_range_p;
       point_state state;
       in_range_p = get_state_at_point (row, column,
-				       lbounds.m_first_non_ws,
-				       lbounds.m_last_non_ws,
+				       lbounds.m_first_non_ws_disp_col,
+				       lbounds.m_last_non_ws_disp_col,
 				       CU_DISPLAY_COLS,
 				       &state);
       if (in_range_p)
@@ -1631,12 +1655,14 @@ layout::print_annotation_line (linenum_type row, const line_bounds lbounds)
 class line_label
 {
 public:
-  line_label (int state_idx, int column, label_text text)
+  line_label (diagnostic_context *context, int state_idx, int column,
+	      label_text text)
   : m_state_idx (state_idx), m_column (column),
     m_text (text), m_label_line (0), m_has_vbar (true)
   {
     const int bytes = strlen (text.m_buffer);
-    m_display_width = cpp_display_width (text.m_buffer, bytes);
+    m_display_width
+      = cpp_display_width (text.m_buffer, bytes, context->tabstop);
   }
 
   /* Sorting is primarily by column, then by state index.  */
@@ -1696,7 +1722,7 @@ layout::print_any_labels (linenum_type row)
 	if (text.m_buffer == NULL)
 	  continue;
 
-	labels.safe_push (line_label (i, disp_col, text));
+	labels.safe_push (line_label (m_context, i, disp_col, text));
       }
   }
 
@@ -1976,7 +2002,8 @@ public:
 
 /* Get the range of bytes or display columns that HINT would affect.  */
 static column_range
-get_affected_range (const fixit_hint *hint, enum column_unit col_unit)
+get_affected_range (diagnostic_context *context,
+		    const fixit_hint *hint, enum column_unit col_unit)
 {
   expanded_location exploc_start = expand_location (hint->get_start_loc ());
   expanded_location exploc_finish = expand_location (hint->get_next_loc ());
@@ -1986,11 +2013,13 @@ get_affected_range (const fixit_hint *hint, enum column_unit col_unit)
   int finish_column;
   if (col_unit == CU_DISPLAY_COLS)
     {
-      start_column = location_compute_display_column (exploc_start);
+      start_column
+	= location_compute_display_column (exploc_start, context->tabstop);
       if (hint->insertion_p ())
 	finish_column = start_column - 1;
       else
-	finish_column = location_compute_display_column (exploc_finish);
+	finish_column
+	  = location_compute_display_column (exploc_finish, context->tabstop);
     }
   else
     {
@@ -2003,12 +2032,12 @@ get_affected_range (const fixit_hint *hint, enum column_unit col_unit)
 /* Get the range of display columns that would be printed for HINT.  */
 
 static column_range
-get_printed_columns (const fixit_hint *hint)
+get_printed_columns (diagnostic_context *context, const fixit_hint *hint)
 {
   expanded_location exploc = expand_location (hint->get_start_loc ());
-  int start_column = location_compute_display_column (exploc);
-  int hint_width = cpp_display_width (hint->get_string (),
-				      hint->get_length ());
+  int start_column = location_compute_display_column (exploc, context->tabstop);
+  int hint_width = cpp_display_width (hint->get_string (), hint->get_length (),
+				      context->tabstop);
   int final_hint_column = start_column + hint_width - 1;
   if (hint->insertion_p ())
     {
@@ -2018,7 +2047,8 @@ get_printed_columns (const fixit_hint *hint)
     {
       exploc = expand_location (hint->get_next_loc ());
       --exploc.column;
-      int finish_column = location_compute_display_column (exploc);
+      int finish_column
+	= location_compute_display_column (exploc, context->tabstop);
       return column_range (start_column,
 			   MAX (finish_column, final_hint_column));
     }
@@ -2035,12 +2065,14 @@ public:
   correction (column_range affected_bytes,
 	      column_range affected_columns,
 	      column_range printed_columns,
-	      const char *new_text, size_t new_text_len)
+	      const char *new_text, size_t new_text_len,
+	      int tabstop)
   : m_affected_bytes (affected_bytes),
     m_affected_columns (affected_columns),
     m_printed_columns (printed_columns),
     m_text (xstrdup (new_text)),
     m_byte_length (new_text_len),
+    m_tabstop (tabstop),
     m_alloc_sz (new_text_len + 1)
   {
     compute_display_cols ();
@@ -2058,7 +2090,7 @@ public:
 
   void compute_display_cols ()
   {
-    m_display_cols = cpp_display_width (m_text, m_byte_length);
+    m_display_cols = cpp_display_width (m_text, m_byte_length, m_tabstop);
   }
 
   void overwrite (int dst_offset, const char_span &src_span)
@@ -2086,6 +2118,7 @@ public:
   char *m_text;
   size_t m_byte_length; /* Not including null-terminator.  */
   int m_display_cols;
+  int m_tabstop;
   size_t m_alloc_sz;
 };
 
@@ -2121,13 +2154,15 @@ correction::ensure_terminated ()
 class line_corrections
 {
 public:
-  line_corrections (const char *filename, linenum_type row)
-  : m_filename (filename), m_row (row)
+  line_corrections (diagnostic_context *context, const char *filename,
+		    linenum_type row)
+  : m_context (context), m_filename (filename), m_row (row)
   {}
   ~line_corrections ();
 
   void add_hint (const fixit_hint *hint);
 
+  diagnostic_context *m_context;
   const char *m_filename;
   linenum_type m_row;
   auto_vec <correction *> m_corrections;
@@ -2173,9 +2208,10 @@ source_line::source_line (const char *filename, int line)
 void
 line_corrections::add_hint (const fixit_hint *hint)
 {
-  column_range affected_bytes = get_affected_range (hint, CU_BYTES);
-  column_range affected_columns = get_affected_range (hint, CU_DISPLAY_COLS);
-  column_range printed_columns = get_printed_columns (hint);
+  column_range affected_bytes = get_affected_range (m_context, hint, CU_BYTES);
+  column_range affected_columns = get_affected_range (m_context, hint,
+						      CU_DISPLAY_COLS);
+  column_range printed_columns = get_printed_columns (m_context, hint);
 
   /* Potentially consolidate.  */
   if (!m_corrections.is_empty ())
@@ -2243,7 +2279,8 @@ line_corrections::add_hint (const fixit_hint *hint)
 					   affected_columns,
 					   printed_columns,
 					   hint->get_string (),
-					   hint->get_length ()));
+					   hint->get_length (),
+					   m_context->tabstop));
 }
 
 /* If there are any fixit hints on source line ROW, print them.
@@ -2257,7 +2294,7 @@ layout::print_trailing_fixits (linenum_type row)
 {
   /* Build a list of correction instances for the line,
      potentially consolidating hints (for the sake of readability).  */
-  line_corrections corrections (m_exploc.file, row);
+  line_corrections corrections (m_context, m_exploc.file, row);
   for (unsigned int i = 0; i < m_fixit_hints.length (); i++)
     {
       const fixit_hint *hint = m_fixit_hints[i];
@@ -2499,15 +2536,11 @@ layout::print_line (linenum_type row)
   if (!line)
     return;
 
-  line_bounds lbounds;
   print_leading_fixits (row);
-  print_source_line (row, line.get_buffer (), line.length (), &lbounds);
+  const line_bounds lbounds
+    = print_source_line (row, line.get_buffer (), line.length ());
   if (should_print_annotation_line_p (row))
-    {
-      if (lbounds.m_first_non_ws != INT_MAX)
-	lbounds.convert_to_display_cols (line);
-      print_annotation_line (row, lbounds);
-    }
+    print_annotation_line (row, lbounds);
   if (m_show_labels_p)
     print_any_labels (row);
   print_trailing_fixits (row);
@@ -2670,9 +2703,11 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)
 
   char_span lspan = location_get_source_line (tmp.get_filename (), 1);
   ASSERT_EQ (line_display_cols,
-	     cpp_display_width (lspan.get_buffer (), lspan.length ()));
+	     cpp_display_width (lspan.get_buffer (), lspan.length (),
+				def_tabstop));
   ASSERT_EQ (line_display_cols,
-	     location_compute_display_column (expand_location (line_end)));
+	     location_compute_display_column (expand_location (line_end),
+					      def_tabstop));
   ASSERT_EQ (0, memcmp (lspan.get_buffer () + (emoji_col - 1),
 			"\xf0\x9f\x98\x82\xf0\x9f\x98\x82", 8));
 
@@ -2774,6 +2809,111 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)
 
 }
 
+static void
+test_layout_x_offset_display_tab (const line_table_case &case_)
+{
+  const char *content
+    = "This line is very long, so that we can use it to test the logic for "
+      "clipping long lines.  Also this: `\t' is a tab that occupies 1 byte and "
+      "a variable number of display columns, starting at column #103.\n";
+
+  /* Number of bytes in the line, subtracting one to remove the newline.  */
+  const int line_bytes = strlen (content) - 1;
+
+  /* The column where the tab begins.  Byte or display is the same as there are
+     no multibyte characters earlier on the line.  */
+  const int tab_col = 103;
+
+  /* Effective extra size of the tab beyond what a single space would have taken
+     up, indexed by tabstop.  */
+  static const int num_tabstops = 11;
+  int extra_width[num_tabstops];
+  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
+    {
+      const int this_tab_size = tabstop - (tab_col - 1) % tabstop;
+      extra_width[tabstop] = this_tab_size - 1;
+    }
+  /* Example of this calculation: if tabstop is 10, the tab starting at column
+     #103 has to expand into 8 spaces, covering columns 103-110, so that the
+     next character is at column #111.  So it takes up 7 more columns than
+     a space would have taken up.  */
+  ASSERT_EQ (7, extra_width[10]);
+
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
+  line_table_test ltt (case_);
+
+  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+
+  location_t line_end = linemap_position_for_column (line_table, line_bytes);
+
+  /* Don't attempt to run the tests if column data might be unavailable.  */
+  if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
+    return;
+
+  /* Check that cpp_display_width handles the tabs as expected.  */
+  char_span lspan = location_get_source_line (tmp.get_filename (), 1);
+  ASSERT_EQ ('\t', *(lspan.get_buffer () + (tab_col - 1)));
+  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
+    {
+      ASSERT_EQ (line_bytes + extra_width[tabstop],
+		 cpp_display_width (lspan.get_buffer (), lspan.length (),
+				    tabstop));
+      ASSERT_EQ (line_bytes + extra_width[tabstop],
+		 location_compute_display_column (expand_location (line_end),
+						  tabstop));
+    }
+
+  /* Check that the tab is expanded to the expected number of spaces.  */
+  rich_location richloc (line_table,
+			 linemap_position_for_column (line_table,
+						      tab_col + 1));
+  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
+    {
+      test_diagnostic_context dc;
+      dc.tabstop = tabstop;
+      layout test_layout (&dc, &richloc, DK_ERROR);
+      test_layout.print_line (1);
+      const char *out = pp_formatted_text (dc.printer);
+      ASSERT_EQ (NULL, strchr (out, '\t'));
+      const char *left_quote = strchr (out, '`');
+      const char *right_quote = strchr (out, '\'');
+      ASSERT_NE (NULL, left_quote);
+      ASSERT_NE (NULL, right_quote);
+      ASSERT_EQ (right_quote - left_quote, extra_width[tabstop] + 2);
+    }
+
+  /* Check that the line is offset properly and that the tab is broken up
+     into the expected number of spaces when it is the last character skipped
+     over.  */
+  for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
+    {
+      test_diagnostic_context dc;
+      dc.tabstop = tabstop;
+      static const int small_width = 24;
+      dc.caret_max_width = small_width - 4;
+      dc.min_margin_width = test_left_margin - test_linenum_sep + 1;
+      dc.show_line_numbers_p = true;
+      layout test_layout (&dc, &richloc, DK_ERROR);
+      test_layout.print_line (1);
+
+      /* We have arranged things so that two columns will be printed before
+	 the caret.  If the tab results in more than one space, this should
+	 produce two spaces in the output; otherwise, it will be a single space
+	 preceded by the opening quote before the tab character.  */
+      const char *output1
+	= "   1 |   ' is a tab that occupies 1 byte and a variable number of "
+	  "display columns, starting at column #103.\n"
+	  "     |   ^\n\n";
+      const char *output2
+	= "   1 | ` ' is a tab that occupies 1 byte and a variable number of "
+	  "display columns, starting at column #103.\n"
+	  "     |   ^\n\n";
+      const char *expected_output = (extra_width[tabstop] ? output1 : output2);
+      ASSERT_STREQ (expected_output, pp_formatted_text (dc.printer));
+    }
+}
+
+
 /* Verify that diagnostic_show_locus works sanely on UNKNOWN_LOCATION.  */
 
 static void
@@ -3854,6 +3994,27 @@ test_one_liner_labels_utf8 ()
   }
 }
 
+/* Make sure that colorization codes don't interrupt a multibyte
+   sequence, which would corrupt it.  */
+static void
+test_one_liner_colorized_utf8 ()
+{
+  test_diagnostic_context dc;
+  dc.colorize_source_p = true;
+  diagnostic_color_init (&dc, DIAGNOSTICS_COLOR_YES);
+  const location_t pi = linemap_position_for_column (line_table, 12);
+  rich_location richloc (line_table, pi);
+  diagnostic_show_locus (&dc, &richloc, DK_ERROR);
+
+  /* In order to avoid having the test depend on exactly how the colorization
+     was effected, just confirm there are two pi characters in the output.  */
+  const char *result = pp_formatted_text (dc.printer);
+  const char *null_term = result + strlen (result);
+  const char *first_pi = strstr (result, "\xcf\x80");
+  ASSERT_TRUE (first_pi && first_pi <= null_term - 2);
+  ASSERT_STR_CONTAINS (first_pi + 2, "\xcf\x80");
+}
+
 /* Run the various one-liner tests.  */
 
 static void
@@ -3884,8 +4045,10 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)
   ASSERT_EQ (31, LOCATION_COLUMN (line_end));
 
   char_span lspan = location_get_source_line (tmp.get_filename (), 1);
-  ASSERT_EQ (25, cpp_display_width (lspan.get_buffer (), lspan.length ()));
-  ASSERT_EQ (25, location_compute_display_column (expand_location (line_end)));
+  ASSERT_EQ (25, cpp_display_width (lspan.get_buffer (), lspan.length (),
+				    def_tabstop));
+  ASSERT_EQ (25, location_compute_display_column (expand_location (line_end),
+						  def_tabstop));
 
   test_one_liner_simple_caret_utf8 ();
   test_one_liner_caret_and_range_utf8 ();
@@ -3900,6 +4063,7 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)
   test_one_liner_many_fixits_1_utf8 ();
   test_one_liner_many_fixits_2_utf8 ();
   test_one_liner_labels_utf8 ();
+  test_one_liner_colorized_utf8 ();
 }
 
 /* Verify that gcc_rich_location::add_location_if_nearby works.  */
@@ -4272,25 +4436,28 @@ test_overlapped_fixit_printing (const line_table_case &case_)
     /* Unit-test the line_corrections machinery.  */
     ASSERT_EQ (3, richloc.get_num_fixit_hints ());
     const fixit_hint *hint_0 = richloc.get_fixit_hint (0);
-    ASSERT_EQ (column_range (12, 12), get_affected_range (hint_0, CU_BYTES));
     ASSERT_EQ (column_range (12, 12),
-			   get_affected_range (hint_0, CU_DISPLAY_COLS));
-    ASSERT_EQ (column_range (12, 22), get_printed_columns (hint_0));
+	       get_affected_range (&dc, hint_0, CU_BYTES));
+    ASSERT_EQ (column_range (12, 12),
+	       get_affected_range (&dc, hint_0, CU_DISPLAY_COLS));
+    ASSERT_EQ (column_range (12, 22), get_printed_columns (&dc, hint_0));
     const fixit_hint *hint_1 = richloc.get_fixit_hint (1);
-    ASSERT_EQ (column_range (18, 18), get_affected_range (hint_1, CU_BYTES));
     ASSERT_EQ (column_range (18, 18),
-			   get_affected_range (hint_1, CU_DISPLAY_COLS));
-    ASSERT_EQ (column_range (18, 20), get_printed_columns (hint_1));
+	       get_affected_range (&dc, hint_1, CU_BYTES));
+    ASSERT_EQ (column_range (18, 18),
+	       get_affected_range (&dc, hint_1, CU_DISPLAY_COLS));
+    ASSERT_EQ (column_range (18, 20), get_printed_columns (&dc, hint_1));
     const fixit_hint *hint_2 = richloc.get_fixit_hint (2);
-    ASSERT_EQ (column_range (29, 28), get_affected_range (hint_2, CU_BYTES));
     ASSERT_EQ (column_range (29, 28),
-			   get_affected_range (hint_2, CU_DISPLAY_COLS));
-    ASSERT_EQ (column_range (29, 29), get_printed_columns (hint_2));
+	       get_affected_range (&dc, hint_2, CU_BYTES));
+    ASSERT_EQ (column_range (29, 28),
+	       get_affected_range (&dc, hint_2, CU_DISPLAY_COLS));
+    ASSERT_EQ (column_range (29, 29), get_printed_columns (&dc, hint_2));
 
     /* Add each hint in turn to a line_corrections instance,
        and verify that they are consolidated into one correction instance
        as expected.  */
-    line_corrections lc (tmp.get_filename (), 1);
+    line_corrections lc (&dc, tmp.get_filename (), 1);
 
     /* The first replace hint by itself.  */
     lc.add_hint (hint_0);
@@ -4484,25 +4651,28 @@ test_overlapped_fixit_printing_utf8 (const line_table_case &case_)
     /* Unit-test the line_corrections machinery.  */
     ASSERT_EQ (3, richloc.get_num_fixit_hints ());
     const fixit_hint *hint_0 = richloc.get_fixit_hint (0);
-    ASSERT_EQ (column_range (14, 14), get_affected_range (hint_0, CU_BYTES));
+    ASSERT_EQ (column_range (14, 14),
+	       get_affected_range (&dc, hint_0, CU_BYTES));
     ASSERT_EQ (column_range (12, 12),
-			   get_affected_range (hint_0, CU_DISPLAY_COLS));
-    ASSERT_EQ (column_range (12, 22), get_printed_columns (hint_0));
+	       get_affected_range (&dc, hint_0, CU_DISPLAY_COLS));
+    ASSERT_EQ (column_range (12, 22), get_printed_columns (&dc, hint_0));
     const fixit_hint *hint_1 = richloc.get_fixit_hint (1);
-    ASSERT_EQ (column_range (22, 22), get_affected_range (hint_1, CU_BYTES));
+    ASSERT_EQ (column_range (22, 22),
+	       get_affected_range (&dc, hint_1, CU_BYTES));
     ASSERT_EQ (column_range (18, 18),
-			   get_affected_range (hint_1, CU_DISPLAY_COLS));
-    ASSERT_EQ (column_range (18, 20), get_printed_columns (hint_1));
+	       get_affected_range (&dc, hint_1, CU_DISPLAY_COLS));
+    ASSERT_EQ (column_range (18, 20), get_printed_columns (&dc, hint_1));
     const fixit_hint *hint_2 = richloc.get_fixit_hint (2);
-    ASSERT_EQ (column_range (35, 34), get_affected_range (hint_2, CU_BYTES));
+    ASSERT_EQ (column_range (35, 34),
+	       get_affected_range (&dc, hint_2, CU_BYTES));
     ASSERT_EQ (column_range (30, 29),
-			   get_affected_range (hint_2, CU_DISPLAY_COLS));
-    ASSERT_EQ (column_range (30, 30), get_printed_columns (hint_2));
+	       get_affected_range (&dc, hint_2, CU_DISPLAY_COLS));
+    ASSERT_EQ (column_range (30, 30), get_printed_columns (&dc, hint_2));
 
     /* Add each hint in turn to a line_corrections instance,
        and verify that they are consolidated into one correction instance
        as expected.  */
-    line_corrections lc (tmp.get_filename (), 1);
+    line_corrections lc (&dc, tmp.get_filename (), 1);
 
     /* The first replace hint by itself.  */
     lc.add_hint (hint_0);
@@ -4689,6 +4859,8 @@ test_overlapped_fixit_printing_2 (const line_table_case &case_)
 
   /* Two insertions, in the wrong order.  */
   {
+    test_diagnostic_context dc;
+
     rich_location richloc (line_table, col_20);
     richloc.add_fixit_insert_before (col_23, "{");
     richloc.add_fixit_insert_before (col_21, "}");
@@ -4696,14 +4868,15 @@ test_overlapped_fixit_printing_2 (const line_table_case &case_)
     /* These fixits should be accepted; they can't be consolidated.  */
     ASSERT_EQ (2, richloc.get_num_fixit_hints ());
     const fixit_hint *hint_0 = richloc.get_fixit_hint (0);
-    ASSERT_EQ (column_range (23, 22), get_affected_range (hint_0, CU_BYTES));
-    ASSERT_EQ (column_range (23, 23), get_printed_columns (hint_0));
+    ASSERT_EQ (column_range (23, 22),
+	       get_affected_range (&dc, hint_0, CU_BYTES));
+    ASSERT_EQ (column_range (23, 23), get_printed_columns (&dc, hint_0));
     const fixit_hint *hint_1 = richloc.get_fixit_hint (1);
-    ASSERT_EQ (column_range (21, 20), get_affected_range (hint_1, CU_BYTES));
-    ASSERT_EQ (column_range (21, 21), get_printed_columns (hint_1));
+    ASSERT_EQ (column_range (21, 20),
+	       get_affected_range (&dc, hint_1, CU_BYTES));
+    ASSERT_EQ (column_range (21, 21), get_printed_columns (&dc, hint_1));
 
     /* Verify that they're printed correctly.  */
-    test_diagnostic_context dc;
     diagnostic_show_locus (&dc, &richloc, DK_ERROR);
     ASSERT_STREQ (" int a5[][0][0] = { 1, 2 };\n"
 		  "                    ^\n"
@@ -4955,6 +5128,65 @@ test_fixit_deletion_affecting_newline (const line_table_case &case_)
 		pp_formatted_text (dc.printer));
 }
 
+static void
+test_tab_expansion (const line_table_case &case_)
+{
+  /* Create a tempfile and write some text to it.  This example uses a tabstop
+     of 8, as the column numbers attempt to indicate:
+
+    .....................000.01111111111.22222333333  display
+    .....................123.90123456789.56789012345  columns  */
+  const char *content = "  \t   This: `\t' is a tab.\n";
+  /* ....................000 00000011111 11111222222  byte
+     ....................123 45678901234 56789012345  columns  */
+
+  const int tabstop = 8;
+  const int first_non_ws_byte_col = 7;
+  const int right_quote_byte_col = 15;
+  const int last_byte_col = 25;
+  ASSERT_EQ (35, cpp_display_width (content, last_byte_col, tabstop));
+
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
+  line_table_test ltt (case_);
+  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+
+  /* Don't attempt to run the tests if column data might be unavailable.  */
+  location_t line_end = linemap_position_for_column (line_table, last_byte_col);
+  if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
+    return;
+
+  /* Check that the leading whitespace with mixed tabs and spaces is expanded
+     into 11 spaces.  Recall that print_line() also puts one space before
+     everything.  */
+  {
+    test_diagnostic_context dc;
+    dc.tabstop = tabstop;
+    rich_location richloc (line_table,
+			   linemap_position_for_column (line_table,
+							first_non_ws_byte_col));
+    layout test_layout (&dc, &richloc, DK_ERROR);
+    test_layout.print_line (1);
+    ASSERT_STREQ ("            This: `      ' is a tab.\n"
+		  "            ^\n",
+		  pp_formatted_text (dc.printer));
+  }
+
+  /* Confirm the display width was tracked correctly across the internal tab
+     as well.  */
+  {
+    test_diagnostic_context dc;
+    dc.tabstop = tabstop;
+    rich_location richloc (line_table,
+			   linemap_position_for_column (line_table,
+							right_quote_byte_col));
+    layout test_layout (&dc, &richloc, DK_ERROR);
+    test_layout.print_line (1);
+    ASSERT_STREQ ("            This: `      ' is a tab.\n"
+		  "                         ^\n",
+		  pp_formatted_text (dc.printer));
+  }
+}
+
 /* Verify that line numbers are correctly printed for the case of
    a multiline range in which the width of the line numbers changes
    (e.g. from "9" to "10").  */
@@ -5012,6 +5244,7 @@ diagnostic_show_locus_c_tests ()
   test_layout_range_for_multiple_lines ();
 
   for_each_line_table_case (test_layout_x_offset_display_utf8);
+  for_each_line_table_case (test_layout_x_offset_display_tab);
 
   test_get_line_bytes_without_trailing_whitespace ();
 
@@ -5029,6 +5262,7 @@ diagnostic_show_locus_c_tests ()
   for_each_line_table_case (test_fixit_insert_containing_newline_2);
   for_each_line_table_case (test_fixit_replace_containing_newline);
   for_each_line_table_case (test_fixit_deletion_affecting_newline);
+  for_each_line_table_case (test_tab_expansion);
 
   test_line_numbers_multiline_range ();
 }
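
For readers following the tab-related selftests above, the arithmetic they rely on can be sketched in isolation. This is only an illustration, not part of the patch: tab_width_at and expand_tabs are hypothetical helpers, and the sketch assumes ASCII-only input, whereas the real code goes through cpp_display_width_computation and handles multibyte characters.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not in the patch): the number of display columns a tab
// occupies when it starts at 1-based display column COL.
static int
tab_width_at (int col, int tabstop)
{
  assert (col >= 1 && tabstop >= 1);
  return tabstop - (col - 1) % tabstop;
}

// Hypothetical helper (not in the patch): expand each tab in an ASCII-only
// LINE into the number of spaces needed to reach the next tab stop.
static std::string
expand_tabs (const std::string &line, int tabstop)
{
  std::string out;
  for (char c : line)
    {
      if (c == '\t')
        {
          // The tab starts at the column after what has been output so far.
          const int col = static_cast<int> (out.size ()) + 1;
          out.append (tab_width_at (col, tabstop), ' ');
        }
      else
        out.push_back (c);
    }
  return out;
}
```

With a tabstop of 10, a tab starting at column 103 expands to 8 spaces, matching the extra_width[10] == 7 check in test_layout_x_offset_display_tab; with a tabstop of 8, the test_tab_expansion line comes out 35 columns wide.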
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index ed52bc03d17..1b6c9845892 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "selftest.h"
 #include "selftest-diagnostic.h"
 #include "opts.h"
+#include "cpplib.h"
 
 #ifdef HAVE_TERMIOS_H
 # include <termios.h>
@@ -219,6 +220,9 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)
   context->min_margin_width = 0;
   context->show_ruler_p = false;
   context->parseable_fixits_p = false;
+  context->column_unit = DIAGNOSTICS_COLUMN_UNIT_DISPLAY;
+  context->column_origin = 1;
+  context->tabstop = 8;
   context->edit_context_ptr = NULL;
   context->diagnostic_group_nesting_depth = 0;
   context->diagnostic_group_emission_count = 0;
@@ -353,8 +357,37 @@ diagnostic_get_color_for_kind (diagnostic_t kind)
   return diagnostic_kind_color[kind];
 }
 
+/* Given an expanded_location, convert the column (which is in 1-based bytes)
+   to the requested units and origin.  Return -1 if the column is
+   invalid (<= 0).  */
+int
+diagnostic_converted_column (diagnostic_context *context, expanded_location s)
+{
+  if (s.column <= 0)
+    return -1;
+
+  int one_based_col;
+  switch (context->column_unit)
+    {
+    case DIAGNOSTICS_COLUMN_UNIT_DISPLAY:
+      one_based_col = location_compute_display_column (s, context->tabstop);
+      break;
+
+    case DIAGNOSTICS_COLUMN_UNIT_BYTE:
+      one_based_col = s.column;
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  return one_based_col + (context->column_origin - 1);
+}
+
 /* Return a formatted line and column ':%line:%column'.  Elided if
-   zero.  The result is a statically allocated buffer.  */
+   line == 0 or col < 0.  (A column of 0 may be valid due to the
+   -fdiagnostics-column-origin option.)
+   The result is a statically allocated buffer.  */
 
 static const char *
 maybe_line_and_column (int line, int col)
@@ -363,8 +396,9 @@ maybe_line_and_column (int line, int col)
 
   if (line)
     {
-      size_t l = snprintf (result, sizeof (result),
-			   col ? ":%d:%d" : ":%d", line, col);
+      size_t l
+	= snprintf (result, sizeof (result),
+		    col >= 0 ? ":%d:%d" : ":%d", line, col);
       gcc_checking_assert (l < sizeof (result));
     }
   else
@@ -383,8 +417,14 @@ diagnostic_get_location_text (diagnostic_context *context,
   const char *locus_cs = colorize_start (pp_show_color (pp), "locus");
   const char *locus_ce = colorize_stop (pp_show_color (pp));
   const char *file = s.file ? s.file : progname;
-  int line = strcmp (file, N_("<built-in>")) ? s.line : 0;
-  int col = context->show_column ? s.column : 0;
+  int line = 0;
+  int col = -1;
+  if (strcmp (file, N_("<built-in>")))
+    {
+      line = s.line;
+      if (context->show_column)
+	col = diagnostic_converted_column (context, s);
+    }
 
   const char *line_col = maybe_line_and_column (line, col);
   return build_message_string ("%s%s%s:%s", locus_cs, file,
@@ -650,14 +690,20 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
       if (! MAIN_FILE_P (map))
 	{
 	  bool first = true;
+	  expanded_location s = {};
 	  do
 	    {
 	      where = linemap_included_from (map);
 	      map = linemap_included_from_linemap (line_table, map);
-	      const char *line_col
-		= maybe_line_and_column (SOURCE_LINE (map, where),
-					 first && context->show_column
-					 ? SOURCE_COLUMN (map, where) : 0);
+	      s.file = LINEMAP_FILE (map);
+	      s.line = SOURCE_LINE (map, where);
+	      int col = -1;
+	      if (first && context->show_column)
+		{
+		  s.column = SOURCE_COLUMN (map, where);
+		  col = diagnostic_converted_column (context, s);
+		}
+	      const char *line_col = maybe_line_and_column (s.line, col);
 	      static const char *const msgs[] =
 		{
 		 N_("In file included from"),
@@ -666,7 +712,7 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
 	      unsigned index = !first;
 	      pp_verbatim (context->printer, "%s%s %r%s%s%R",
 			   first ? "" : ",\n", _(msgs[index]),
-			   "locus", LINEMAP_FILE (map), line_col);
+			   "locus", s.file, line_col);
 	      first = false;
 	    }
 	  while (! MAIN_FILE_P (map));
@@ -2042,10 +2088,15 @@ test_print_parseable_fixits_replace ()
 static void
 assert_location_text (const char *expected_loc_text,
 		      const char *filename, int line, int column,
-		      bool show_column)
+		      bool show_column,
+		      int origin = 1,
+		      enum diagnostics_column_unit column_unit
+			= DIAGNOSTICS_COLUMN_UNIT_BYTE)
 {
   test_diagnostic_context dc;
   dc.show_column = show_column;
+  dc.column_unit = column_unit;
+  dc.column_origin = origin;
 
   expanded_location xloc;
   xloc.file = filename;
@@ -2069,7 +2120,10 @@ test_diagnostic_get_location_text ()
   assert_location_text ("PROGNAME:", NULL, 0, 0, true);
   assert_location_text ("<built-in>:", "<built-in>", 42, 10, true);
   assert_location_text ("foo.c:42:10:", "foo.c", 42, 10, true);
-  assert_location_text ("foo.c:42:", "foo.c", 42, 0, true);
+  assert_location_text ("foo.c:42:9:", "foo.c", 42, 10, true, 0);
+  assert_location_text ("foo.c:42:1010:", "foo.c", 42, 10, true, 1001);
+  for (int origin = 0; origin != 2; ++origin)
+    assert_location_text ("foo.c:42:", "foo.c", 42, 0, true, origin);
   assert_location_text ("foo.c:", "foo.c", 0, 10, true);
   assert_location_text ("foo.c:42:", "foo.c", 42, 10, false);
   assert_location_text ("foo.c:", "foo.c", 0, 10, false);
@@ -2077,6 +2131,41 @@ test_diagnostic_get_location_text ()
   maybe_line_and_column (INT_MAX, INT_MAX);
   maybe_line_and_column (INT_MIN, INT_MIN);
 
+  {
+    /* In order to test display columns vs byte columns, we need to create a
+       file for location_get_source_line() to read.  */
+
+    const char *const content = "smile \xf0\x9f\x98\x82\n";
+    const int line_bytes = strlen (content) - 1;
+    const int def_tabstop = 8;
+    const int display_width = cpp_display_width (content, line_bytes,
+						 def_tabstop);
+    ASSERT_EQ (line_bytes - 2, display_width);
+    temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
+    const char *const fname = tmp.get_filename ();
+    const int buf_len = strlen (fname) + 16;
+    char *const expected = XNEWVEC (char, buf_len);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, line_bytes);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  1, DIAGNOSTICS_COLUMN_UNIT_BYTE);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, line_bytes - 1);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  0, DIAGNOSTICS_COLUMN_UNIT_BYTE);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, display_width);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  1, DIAGNOSTICS_COLUMN_UNIT_DISPLAY);
+
+    snprintf (expected, buf_len, "%s:1:%d:", fname, display_width - 1);
+    assert_location_text (expected, fname, 1, line_bytes, true,
+			  0, DIAGNOSTICS_COLUMN_UNIT_DISPLAY);
+
+    XDELETEVEC (expected);
+  }
+
+
   progname = old_progname;
 }
 
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index 307dbcfb34a..75706c5f4d8 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -24,6 +24,20 @@ along with GCC; see the file COPYING3.  If not see
 #include "pretty-print.h"
 #include "diagnostic-core.h"
 
+/* An enum for controlling what units to use for the column number
+   when diagnostics are output, used by the -fdiagnostics-column-unit option.
+   Tabs will be expanded or not according to the value of -ftabstop.  The origin
+   (default 1) is controlled by -fdiagnostics-column-origin.  */
+
+enum diagnostics_column_unit
+{
+  /* The new default: display columns.  */
+  DIAGNOSTICS_COLUMN_UNIT_DISPLAY,
+
+  /* The historical behavior: simple bytes.  */
+  DIAGNOSTICS_COLUMN_UNIT_BYTE
+};
+
 /* Enum for overriding the standard output format.  */
 
 enum diagnostics_output_format
@@ -280,6 +294,15 @@ struct diagnostic_context
      rest of the diagnostic.  */
   bool parseable_fixits_p;
 
+  /* What units to use when outputting the column number.  */
+  enum diagnostics_column_unit column_unit;
+
+  /* The origin for the column number (1-based or 0-based typically).  */
+  int column_origin;
+
+  /* The size of the tabstop for tab expansion.  */
+  int tabstop;
+
   /* If non-NULL, an edit_context to which fix-it hints should be
      applied, for generating patches.  */
   edit_context *edit_context_ptr;
@@ -458,6 +481,8 @@ diagnostic_same_line (const diagnostic_context *context,
 }
 
 extern const char *diagnostic_get_color_for_kind (diagnostic_t kind);
+extern int diagnostic_converted_column (diagnostic_context *context,
+					expanded_location s);
 
 /* Pure text formatting support functions.  */
 extern char *file_name_as_prefix (diagnostic_context *, const char *);
@@ -470,6 +495,7 @@ extern void diagnostic_output_format_init (diagnostic_context *,
 /* Compute the number of digits in the decimal representation of an integer.  */
 extern int num_digits (int);
 
-extern json::value *json_from_expanded_location (location_t loc);
+extern json::value *json_from_expanded_location (diagnostic_context *context,
+						 location_t loc);
 
 #endif /* ! GCC_DIAGNOSTIC_H */
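
For readers following along, here is a minimal standalone sketch (not part of the patch) of how the column origin is expected to affect the location text, based only on the expectations in test_diagnostic_get_location_text. The names sketch_converted_column and sketch_location_text are hypothetical, and the sketch assumes the conversion simply shifts a 1-based column by (origin - 1) and treats a column of 0 as unknown. The unit conversion (byte vs. display) is left out entirely.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the conversion step: shift a 1-based column
   to the requested origin, or return -1 when the column is unknown.  */
static int
sketch_converted_column (int one_based_col, int origin)
{
  if (one_based_col <= 0)
    return -1;
  return one_based_col + (origin - 1);
}

/* Format "file:line:col:" in the shape the selftests expect.  A line of 0
   or a built-in file suppresses both numbers; a negative converted
   column suppresses only the column.  */
static void
sketch_location_text (char *buf, size_t len, const char *file,
                      int line, int col, int show_column, int origin)
{
  assert (buf != NULL && len > 0);
  int converted = -1;
  if (strcmp (file, "<built-in>") == 0)
    line = 0;
  else if (show_column)
    converted = sketch_converted_column (col, origin);

  if (line == 0)
    snprintf (buf, len, "%s:", file);
  else if (converted >= 0)
    snprintf (buf, len, "%s:%d:%d:", file, line, converted);
  else
    snprintf (buf, len, "%s:%d:", file, line);
}
```

With these assumptions, a location of foo.c line 42 column 10 formats as "foo.c:42:9:" for an origin of 0 and as "foo.c:42:1010:" for an origin of 1001, matching the selftest expectations.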
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 06a04e3d7dd..f463275bc8b 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -292,7 +292,9 @@ Objective-C and Objective-C++ Dialects}.
 -fdiagnostics-show-template-tree  -fno-elide-type @gol
 -fdiagnostics-path-format=@r{[}none@r{|}separate-events@r{|}inline-events@r{]} @gol
 -fdiagnostics-show-path-depths @gol
--fno-show-column}
+-fno-show-column @gol
+-fdiagnostics-column-unit=@r{[}display@r{|}byte@r{]} @gol
+-fdiagnostics-column-origin=@var{origin}}
 
 @item Warning Options
 @xref{Warning Options,,Options to Request or Suppress Warnings}.
@@ -4729,6 +4731,31 @@ Do not print column numbers in diagnostics.  This may be necessary if
 diagnostics are being scanned by a program that does not understand the
 column numbers, such as @command{dejagnu}.
 
+@item -fdiagnostics-column-unit=@var{UNIT}
+@opindex fdiagnostics-column-unit
+Select the units for the column number.  This affects traditional diagnostics
+(in the absence of @option{-fno-show-column}), as well as JSON format
+diagnostics if requested.
+
+The default @var{UNIT}, @samp{display}, considers the number of display
+columns occupied by each character.  This may be larger than the number
+of bytes required to encode the character, in the case of tab
+characters, or it may be smaller, in the case of multibyte characters.
+For example, the character ``@U{03C0}'' occupies one display column,
+and its UTF-8 encoding requires two bytes; the character ``@U{1F642}''
+occupies two display columns, and its UTF-8 encoding requires four
+bytes.
+
+Setting @var{UNIT} to @samp{byte} changes the column number to the raw byte
+count in all cases, as was traditionally output by GCC prior to version 11.1.0.
+
+@item -fdiagnostics-column-origin=@var{ORIGIN}
+@opindex fdiagnostics-column-origin
+Select the origin for column numbers, i.e. the column number assigned to the
+first column.  The default value of 1 corresponds to traditional GCC
+behavior and to the GNU style guide.  Some utilities may perform better with an
+origin of 0; any non-negative value may be specified.
+
 @item -fdiagnostics-format=@var{FORMAT}
 @opindex fdiagnostics-format
 Select a different format for printing diagnostics.
@@ -4764,11 +4791,15 @@ might be printed in JSON form (after formatting) like this:
         "locations": [
             @{
                 "caret": @{
+		    "display-column": 3,
+		    "byte-column": 3,
                     "column": 3,
                     "file": "misleading-indentation.c",
                     "line": 15
                 @},
                 "finish": @{
+		    "display-column": 4,
+		    "byte-column": 4,
                     "column": 4,
                     "file": "misleading-indentation.c",
                     "line": 15
@@ -4784,6 +4815,8 @@ might be printed in JSON form (after formatting) like this:
                 "locations": [
                     @{
                         "caret": @{
+			    "display-column": 5,
+			    "byte-column": 5,
                             "column": 5,
                             "file": "misleading-indentation.c",
                             "line": 17
@@ -4793,6 +4826,7 @@ might be printed in JSON form (after formatting) like this:
                 "message": "...this statement, but the latter is @dots{}"
             @}
         ]
+	"column-origin": 1,
     @},
     @dots{}
 ]
@@ -4805,10 +4839,34 @@ A diagnostic has a @code{kind}.  If this is @code{warning}, then there is
 an @code{option} key describing the command-line option controlling the
 warning.
 
-A diagnostic can contain zero or more locations.  Each location has up
-to three positions within it: a @code{caret} position and optional
-@code{start} and @code{finish} positions.  A location can also have
-an optional @code{label} string.  For example, this error:
+A diagnostic can contain zero or more locations.  Each location has an
+optional @code{label} string and up to three positions within it: a
+@code{caret} position and optional @code{start} and @code{finish} positions.
+A position is described by a @code{file} name, a @code{line} number, and
+three numbers indicating a column position:
+@itemize @bullet
+
+@item
+@code{display-column} counts display columns, accounting for tabs and
+multibyte characters.
+
+@item
+@code{byte-column} counts raw bytes.
+
+@item
+@code{column} is equal to one of
+the previous two, as dictated by the @option{-fdiagnostics-column-unit}
+option.
+
+@end itemize
+All three columns are relative to the origin specified by
+@option{-fdiagnostics-column-origin}, which is typically equal to 1 but may
+be set, for instance, to 0 for compatibility with other utilities that
+number columns from 0.  The column origin is recorded in the JSON output in
+the @code{column-origin} tag.  In the remaining examples below, the extra
+column number outputs have been omitted for brevity.
+
+For example, this error:
 
 @smallexample
 bad-binary-ops.c:64:23: error: invalid operands to binary + (have 'S' @{aka
diff --git a/gcc/input.c b/gcc/input.c
index dd1d23df2f7..d573b90341a 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -913,7 +913,7 @@ make_location (location_t caret, source_range src_range)
    source line in order to calculate the display width.  If that cannot be done
    for any reason, then returns the byte column as a fallback.  */
 int
-location_compute_display_column (expanded_location exploc)
+location_compute_display_column (expanded_location exploc, int tabstop)
 {
   if (!(exploc.file && *exploc.file && exploc.line && exploc.column))
     return exploc.column;
@@ -921,7 +921,7 @@ location_compute_display_column (expanded_location exploc)
   /* If line is NULL, this function returns exploc.column which is the
      desired fallback.  */
   return cpp_byte_column_to_display_column (line.get_buffer (), line.length (),
-					    exploc.column);
+					    exploc.column, tabstop);
 }
 
 /* Dump statistics to stderr about the memory usage of the line_table
@@ -3608,33 +3608,46 @@ test_line_offset_overflow ()
 
 void test_cpp_utf8 ()
 {
+  const int def_tabstop = 8;
   /* Verify that wcwidth of invalid UTF-8 or control bytes is 1.  */
   {
-    int w_bad = cpp_display_width ("\xf0!\x9f!\x98!\x82!", 8);
+    int w_bad = cpp_display_width ("\xf0!\x9f!\x98!\x82!", 8, def_tabstop);
     ASSERT_EQ (8, w_bad);
-    int w_ctrl = cpp_display_width ("\r\t\n\v\0\1", 6);
-    ASSERT_EQ (6, w_ctrl);
+    int w_ctrl = cpp_display_width ("\r\n\v\0\1", 5, def_tabstop);
+    ASSERT_EQ (5, w_ctrl);
   }
 
   /* Verify that wcwidth of valid UTF-8 is as expected.  */
   {
-    const int w_pi = cpp_display_width ("\xcf\x80", 2);
+    const int w_pi = cpp_display_width ("\xcf\x80", 2, def_tabstop);
     ASSERT_EQ (1, w_pi);
-    const int w_emoji = cpp_display_width ("\xf0\x9f\x98\x82", 4);
+    const int w_emoji = cpp_display_width ("\xf0\x9f\x98\x82", 4, def_tabstop);
     ASSERT_EQ (2, w_emoji);
-    const int w_umlaut_precomposed = cpp_display_width ("\xc3\xbf", 2);
+    const int w_umlaut_precomposed = cpp_display_width ("\xc3\xbf", 2,
+							def_tabstop);
     ASSERT_EQ (1, w_umlaut_precomposed);
-    const int w_umlaut_combining = cpp_display_width ("y\xcc\x88", 3);
+    const int w_umlaut_combining = cpp_display_width ("y\xcc\x88", 3,
+						      def_tabstop);
     ASSERT_EQ (1, w_umlaut_combining);
-    const int w_han = cpp_display_width ("\xe4\xb8\xba", 3);
+    const int w_han = cpp_display_width ("\xe4\xb8\xba", 3, def_tabstop);
     ASSERT_EQ (2, w_han);
-    const int w_ascii = cpp_display_width ("GCC", 3);
+    const int w_ascii = cpp_display_width ("GCC", 3, def_tabstop);
     ASSERT_EQ (3, w_ascii);
     const int w_mixed = cpp_display_width ("\xcf\x80 = 3.14 \xf0\x9f\x98\x82"
-					   "\x9f! \xe4\xb8\xba y\xcc\x88", 24);
+					   "\x9f! \xe4\xb8\xba y\xcc\x88",
+					   24, def_tabstop);
     ASSERT_EQ (18, w_mixed);
   }
 
+  /* Verify that display width properly expands tabs.  */
+  {
+    const char *tstr = "\tabc\td";
+    ASSERT_EQ (6, cpp_display_width (tstr, 6, 1));
+    ASSERT_EQ (10, cpp_display_width (tstr, 6, 3));
+    ASSERT_EQ (17, cpp_display_width (tstr, 6, 8));
+    ASSERT_EQ (1, cpp_display_column_to_byte_column (tstr, 6, 7, 8));
+  }
+
   /* Verify that cpp_byte_column_to_display_column can go past the end,
      and similar edge cases.  */
   {
@@ -3645,10 +3658,13 @@ void test_cpp_utf8 ()
       /* 111122223456
 	 Byte columns.  */
 
-    ASSERT_EQ (5, cpp_display_width (str, 6));
-    ASSERT_EQ (105, cpp_byte_column_to_display_column (str, 6, 106));
-    ASSERT_EQ (10000, cpp_byte_column_to_display_column (NULL, 0, 10000));
-    ASSERT_EQ (0, cpp_byte_column_to_display_column (NULL, 10000, 0));
+    ASSERT_EQ (5, cpp_display_width (str, 6, def_tabstop));
+    ASSERT_EQ (105,
+	       cpp_byte_column_to_display_column (str, 6, 106, def_tabstop));
+    ASSERT_EQ (10000,
+	       cpp_byte_column_to_display_column (NULL, 0, 10000, def_tabstop));
+    ASSERT_EQ (0,
+	       cpp_byte_column_to_display_column (NULL, 10000, 0, def_tabstop));
   }
 
   /* Verify that cpp_display_column_to_byte_column can go past the end,
@@ -3662,21 +3678,25 @@ void test_cpp_utf8 ()
       /* 000000000000000000000000000000000111111
 	 111122223333444456666777788889999012345
 	 Byte columns.  */
-    ASSERT_EQ (4, cpp_display_column_to_byte_column (str, 15, 2));
-    ASSERT_EQ (15, cpp_display_column_to_byte_column (str, 15, 11));
-    ASSERT_EQ (115, cpp_display_column_to_byte_column (str, 15, 111));
-    ASSERT_EQ (10000, cpp_display_column_to_byte_column (NULL, 0, 10000));
-    ASSERT_EQ (0, cpp_display_column_to_byte_column (NULL, 10000, 0));
+    ASSERT_EQ (4, cpp_display_column_to_byte_column (str, 15, 2, def_tabstop));
+    ASSERT_EQ (15,
+	       cpp_display_column_to_byte_column (str, 15, 11, def_tabstop));
+    ASSERT_EQ (115,
+	       cpp_display_column_to_byte_column (str, 15, 111, def_tabstop));
+    ASSERT_EQ (10000,
+	       cpp_display_column_to_byte_column (NULL, 0, 10000, def_tabstop));
+    ASSERT_EQ (0,
+	       cpp_display_column_to_byte_column (NULL, 10000, 0, def_tabstop));
 
     /* Verify that we do not interrupt a UTF-8 sequence.  */
-    ASSERT_EQ (4, cpp_display_column_to_byte_column (str, 15, 1));
+    ASSERT_EQ (4, cpp_display_column_to_byte_column (str, 15, 1, def_tabstop));
 
     for (int byte_col = 1; byte_col <= 15; ++byte_col)
       {
-	const int disp_col = cpp_byte_column_to_display_column (str, 15,
-								byte_col);
-	const int byte_col2 = cpp_display_column_to_byte_column (str, 15,
-								 disp_col);
+	const int disp_col
+	  = cpp_byte_column_to_display_column (str, 15, byte_col, def_tabstop);
+	const int byte_col2
+	  = cpp_display_column_to_byte_column (str, 15, disp_col, def_tabstop);
 
 	/* If we ask for the display column in the middle of a UTF-8
 	   sequence, it will return the length of the partial sequence,
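
As an illustration of the tab-expansion selftest above (again, not part of the patch), the following sketch computes the display width of an ASCII-only buffer. The function name sketch_ascii_display_width is hypothetical. It assumes that each tab advances to the next multiple of the tabstop and that every other byte occupies one column. The real cpp_display_width also handles multibyte and wide characters, which are ignored here.

```c
#include <assert.h>
#include <stddef.h>

/* Display width of an ASCII-only buffer, expanding each tab to the next
   multiple of TABSTOP.  Multibyte and wide characters are deliberately
   not handled in this sketch.  */
static int
sketch_ascii_display_width (const char *data, size_t len, int tabstop)
{
  assert (tabstop > 0);
  int width = 0;
  for (size_t i = 0; i < len; ++i)
    {
      if (data[i] == '\t')
        width += tabstop - width % tabstop;
      else
        ++width;
    }
  return width;
}
```

For the string "\tabc\td" this reproduces the widths the selftest expects for tabstops of 1, 3 and 8 (6, 10 and 17 respectively).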
diff --git a/gcc/input.h b/gcc/input.h
index df48ce63ef9..4790a571c6a 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -38,7 +38,9 @@ STATIC_ASSERT (BUILTINS_LOCATION < RESERVED_LOCATION_COUNT);
 
 extern bool is_location_from_builtin_token (location_t);
 extern expanded_location expand_location (location_t);
-extern int location_compute_display_column (expanded_location);
+
+extern int location_compute_display_column (expanded_location exploc,
+					    int tabstop);
 
 /* A class capturing the bounds of a buffer, to allow for run-time
    bounds-checking in a checked build.  */
diff --git a/gcc/opts.c b/gcc/opts.c
index 340d99434b3..525f44d079f 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "opt-suggestions.h"
 #include "diagnostic-color.h"
 #include "selftest.h"
+#include "cpplib.h"
 
 static void set_Wstrict_aliasing (struct gcc_options *opts, int onoff);
 
@@ -2404,6 +2405,14 @@ common_handle_option (struct gcc_options *opts,
       dc->parseable_fixits_p = value;
       break;
 
+    case OPT_fdiagnostics_column_unit_:
+      dc->column_unit = (enum diagnostics_column_unit)value;
+      break;
+
+    case OPT_fdiagnostics_column_origin_:
+      dc->column_origin = value;
+      break;
+
     case OPT_fdiagnostics_show_cwe:
       dc->show_cwe = value;
       break;
@@ -2792,6 +2801,12 @@ common_handle_option (struct gcc_options *opts,
       check_alignment_argument (loc, arg, "functions");
       break;
 
+    case OPT_ftabstop_:
+      /* It is documented that we silently ignore silly values.  */
+      if (value >= 1 && value <= 100)
+	dc->tabstop = value;
+      break;
+
     default:
       /* If the flag was handled in a standard way, assume the lack of
 	 processing here is intentional.  */
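
The -ftabstop handling in the hunk above can be summarized by the sketch below (not part of the patch). The helper sketch_apply_tabstop is hypothetical. It assumes, as the diagnostic-units-3.c test suggests, that an out-of-range value such as 200 leaves the previous setting (8 by default) in place.

```c
#include <assert.h>

/* Hypothetical helper mirroring the range check: accept VALUE only when
   it lies in [1, 100]; otherwise keep the CURRENT tabstop.  */
static int
sketch_apply_tabstop (int current, int value)
{
  assert (current >= 1 && current <= 100);
  return (value >= 1 && value <= 100) ? value : current;
}
```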
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
index 870ba720c5f..2314ad42402 100644
--- a/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
@@ -36,20 +36,20 @@ int fn_6 (int a, int b, int c)
 	/* ... */
 	if ((err = foo (a)) != 0)
 		goto fail;
-	if ((err = foo (b)) != 0) /* { dg-message "2: this 'if' clause does not guard..." } */
+	if ((err = foo (b)) != 0) /* { dg-message "9: this 'if' clause does not guard..." } */
 		goto fail;
-		goto fail; /* { dg-message "3: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
+		goto fail; /* { dg-message "17: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
 	if ((err = foo (c)) != 0)
 		goto fail;
 	/* ... */
 
 /* { dg-begin-multiline-output "" }
-  if ((err = foo (b)) != 0)
-  ^~
+         if ((err = foo (b)) != 0)
+         ^~
    { dg-end-multiline-output "" } */
 /* { dg-begin-multiline-output "" }
-   goto fail;
-   ^~~~
+                 goto fail;
+                 ^~~~
    { dg-end-multiline-output "" } */
 
 fail:
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
index 5cdeba1cbba..202c6bc7fdf 100644
--- a/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation.c
@@ -65,9 +65,9 @@ int fn_6 (int a, int b, int c)
 	/* ... */
 	if ((err = foo (a)) != 0)
 		goto fail;
-	if ((err = foo (b)) != 0) /* { dg-message "2: this 'if' clause does not guard..." } */
+	if ((err = foo (b)) != 0) /* { dg-message "9: this 'if' clause does not guard..." } */
 		goto fail;
-		goto fail; /* { dg-message "3: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
+		goto fail; /* { dg-message "17: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
 	if ((err = foo (c)) != 0)
 		goto fail;
 	/* ... */
@@ -178,7 +178,7 @@ void fn_16_tabs (void)
     while (flagA)
       if (flagB) /* { dg-message "7: this 'if' clause does not guard..." } */
 	foo (0);
-	foo (1);/* { dg-message "2: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
+	foo (1);/* { dg-message "9: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'" } */
 }
 
 void fn_17_spaces (void)
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c
index 9359db48c17..740becb5548 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-1.c
@@ -8,17 +8,22 @@
    We can't rely on any ordering of the keys.  */
 
 /* { dg-regexp "\"kind\": \"error\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \"#error message\"" } */
 
 /* { dg-regexp "\"caret\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 2" } */
+/* { dg-regexp "\"display-column\": 2" } */
+/* { dg-regexp "\"byte-column\": 2" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 6" } */
+/* { dg-regexp "\"display-column\": 6" } */
+/* { dg-regexp "\"byte-column\": 6" } */
 
 /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */
 /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c
index 557ccf8378b..2f24a6c6596 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-2.c
@@ -8,6 +8,7 @@
    We can't rely on any ordering of the keys.  */
 
 /* { dg-regexp "\"kind\": \"warning\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \"#warning message\"" } */
 /* { dg-regexp "\"option\": \"-Wcpp\"" } */
 /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wcpp\"" } */
@@ -16,11 +17,15 @@
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 2" } */
+/* { dg-regexp "\"display-column\": 2" } */
+/* { dg-regexp "\"byte-column\": 2" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 8" } */
+/* { dg-regexp "\"display-column\": 8" } */
+/* { dg-regexp "\"byte-column\": 8" } */
 
 /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */
 /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c
index 378205c5bf5..afe96a9048f 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-3.c
@@ -8,6 +8,7 @@
    We can't rely on any ordering of the keys.  */
 
 /* { dg-regexp "\"kind\": \"error\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \"#warning message\"" } */
 /* { dg-regexp "\"option\": \"-Werror=cpp\"" } */
 /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wcpp\"" } */
@@ -16,11 +17,15 @@
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 2" } */
+/* { dg-regexp "\"display-column\": 2" } */
+/* { dg-regexp "\"byte-column\": 2" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.c\"" } */
 /* { dg-regexp "\"line\": 4" } */
 /* { dg-regexp "\"column\": 8" } */
+/* { dg-regexp "\"display-column\": 8" } */
+/* { dg-regexp "\"byte-column\": 8" } */
 
 /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */
 /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c
index 2738be6548f..ae51091e0ea 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-4.c
@@ -24,15 +24,20 @@ int test (void)
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 5" } */
+/* { dg-regexp "\"display-column\": 5" } */
+/* { dg-regexp "\"byte-column\": 5" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 10" } */
+/* { dg-regexp "\"display-column\": 10" } */
+/* { dg-regexp "\"byte-column\": 10" } */
 
 /* The outer diagnostic.  */
 
 /* { dg-regexp "\"kind\": \"warning\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \"this 'if' clause does not guard...\"" } */
 /* { dg-regexp "\"option\": \"-Wmisleading-indentation\"" } */
 /* { dg-regexp "\"option_url\": \"https:\[^\n\r\"\]*#index-Wmisleading-indentation\"" } */
@@ -41,11 +46,15 @@ int test (void)
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */
 /* { dg-regexp "\"line\": 6" } */
 /* { dg-regexp "\"column\": 3" } */
+/* { dg-regexp "\"display-column\": 3" } */
+/* { dg-regexp "\"byte-column\": 3" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-4.c\"" } */
 /* { dg-regexp "\"line\": 6" } */
 /* { dg-regexp "\"column\": 4" } */
+/* { dg-regexp "\"display-column\": 4" } */
+/* { dg-regexp "\"byte-column\": 4" } */
 
 /* More from the nested diagnostic (we can't guarantee what order the
    "file" keys are consumed).  */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c b/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c
index f36e896d228..e0e9ce4be98 100644
--- a/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-json-5.c
@@ -13,6 +13,7 @@ int test (struct s *ptr)
    We can't rely on any ordering of the keys.  */
 
 /* { dg-regexp "\"kind\": \"error\"" } */
+/* { dg-regexp "\"column-origin\": 1" } */
 /* { dg-regexp "\"message\": \".*\"" } */
 
 /* Verify fix-it hints.  */
@@ -23,11 +24,15 @@ int test (struct s *ptr)
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 15" } */
+/* { dg-regexp "\"display-column\": 15" } */
+/* { dg-regexp "\"byte-column\": 15" } */
 
 /* { dg-regexp "\"next\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 21" } */
+/* { dg-regexp "\"display-column\": 21" } */
+/* { dg-regexp "\"byte-column\": 21" } */
 
 /* { dg-regexp "\"fixits\": \[\[\{\}, \]*\]" } */
 
@@ -35,11 +40,15 @@ int test (struct s *ptr)
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 15" } */
+/* { dg-regexp "\"display-column\": 15" } */
+/* { dg-regexp "\"byte-column\": 15" } */
 
 /* { dg-regexp "\"finish\": \{" } */
 /* { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-5.c\"" } */
 /* { dg-regexp "\"line\": 8" } */
 /* { dg-regexp "\"column\": 20" } */
+/* { dg-regexp "\"display-column\": 20" } */
+/* { dg-regexp "\"byte-column\": 20" } */
 
 /* { dg-regexp "\"locations\": \[\[\{\}, \]*\]" } */
 /* { dg-regexp "\"children\": \[\[\]\[\]\]" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-1.c b/gcc/testsuite/c-c++-common/diagnostic-units-1.c
new file mode 100644
index 00000000000..8d38b7de03e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-1.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 1 (via default)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-2.c b/gcc/testsuite/c-c++-common/diagnostic-units-2.c
new file mode 100644
index 00000000000..29a2edefd9f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-2.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=display -fshow-column -fdiagnostics-show-caret -Wmultichar" } */
+
+/* column units: display (via arg)
+   column origin: 1 (via default)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "25: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-3.c b/gcc/testsuite/c-c++-common/diagnostic-units-3.c
new file mode 100644
index 00000000000..714ee8f2de4
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-3.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -ftabstop=200 -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 1 (via default)
+   tabstop: 8 (via fallback from overly large argument) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-4.c b/gcc/testsuite/c-c++-common/diagnostic-units-4.c
new file mode 100644
index 00000000000..f9c9da914b2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-4.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=0 -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 0 (via arg)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "10: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "17: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "18: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-5.c b/gcc/testsuite/c-c++-common/diagnostic-units-5.c
new file mode 100644
index 00000000000..99d5299a732
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-5.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=display -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=0 -Wmultichar" } */
+
+/* column units: display (via arg)
+   column origin: 0 (via arg)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "17: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "17: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "24: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-6.c b/gcc/testsuite/c-c++-common/diagnostic-units-6.c
new file mode 100644
index 00000000000..c1e6e4ed477
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-6.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -fdiagnostics-column-origin=100 -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 100 (via arg)
+   tabstop: 8 (via default) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "110: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c1 = 'c1';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+        int c2 = 'c2'; /* { dg-warning "117: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c2 = 'c2';
+                  ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+        int c3 = 	'c3'; /* { dg-warning "118: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+         int c3 =        'c3';
+                         ^~~~
+   { dg-end-multiline-output "" } */
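Note on the expected columns in diagnostic-units-4.c and diagnostic-units-6.c:
they appear to follow from shifting the usual 1-based column by the requested
origin. The sketch below only illustrates that arithmetic; the helper name
sketch_reported_column is hypothetical and is not part of the patch.

```cpp
#include <cassert>

// Hypothetical helper, not part of the patch: shift a 1-based column so
// that the first column of a line is reported as ORIGIN instead of 1.
static int sketch_reported_column (int one_based_column, int origin)
{
  assert (one_based_column >= 1);
  return one_based_column - 1 + origin;
}
```

For the line that starts with a tab, the opening quote is byte 11 when
counting from 1, so this gives 10 for -fdiagnostics-column-origin=0 and 110
for -fdiagnostics-column-origin=100, which matches the dg-warning columns in
those two tests.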
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-7.c b/gcc/testsuite/c-c++-common/diagnostic-units-7.c
new file mode 100644
index 00000000000..dab221ae235
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-7.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdiagnostics-column-unit=byte -fshow-column -fdiagnostics-show-caret -ftabstop=9 -Wmultichar" } */
+
+/* column units: bytes (via arg)
+   column origin: 1 (via default)
+   tabstop: 9 (via arg) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "11: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c1 = 'c1';
+                   ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+         int c2 = 'c2'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c2 = 'c2';
+                   ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+         int c3 = 	'c3'; /* { dg-warning "20: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c3 =          'c3';
+                            ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/diagnostic-units-8.c b/gcc/testsuite/c-c++-common/diagnostic-units-8.c
new file mode 100644
index 00000000000..d713b32dabc
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-units-8.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fshow-column -fdiagnostics-show-caret -ftabstop=9 -Wmultichar" } */
+
+/* column units: display (via default)
+   column origin: 1 (via default)
+   tabstop: 9 (via arg) */
+
+/* This line starts with a tab.  */
+	int c1 = 'c1'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c1 = 'c1';
+                   ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces.  */
+         int c2 = 'c2'; /* { dg-warning "19: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c2 = 'c2';
+                   ^~~~
+   { dg-end-multiline-output "" } */
+
+/* This line starts with <tabstop> spaces and has an internal tab after
+   a space.  */
+         int c3 = 	'c3'; /* { dg-warning "28: multi-character character constant" } */
+/* { dg-begin-multiline-output "" }
+          int c3 =          'c3';
+                            ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/missing-close-symbol.c b/gcc/testsuite/c-c++-common/missing-close-symbol.c
index abeb83748c1..9f1de3d0c47 100644
--- a/gcc/testsuite/c-c++-common/missing-close-symbol.c
+++ b/gcc/testsuite/c-c++-common/missing-close-symbol.c
@@ -24,9 +24,9 @@ void test_static_assert_different_line (void)
   _Static_assert(sizeof(int) >= sizeof(char), /* { dg-message "to match this '\\('" } */
 		 "msg"; /* { dg-error "expected '\\)' before ';' token" } */
   /* { dg-begin-multiline-output "" }
-    "msg";
-         ^
-         )
+                  "msg";
+                       ^
+                       )
      { dg-end-multiline-output "" } */
   /* { dg-begin-multiline-output "" }
    _Static_assert(sizeof(int) >= sizeof(char),
diff --git a/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C b/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C
index fab5849dfc7..ebbf3001055 100644
--- a/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C
+++ b/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C
@@ -33,10 +33,10 @@ int test_2 (void)
            ~~~~~~~~~~~~~~~~
                          |
                          s
-    + some_other_function ());
-    ^ ~~~~~~~~~~~~~~~~~~~~~~
-                          |
-                          t
+           + some_other_function ());
+           ^ ~~~~~~~~~~~~~~~~~~~~~~
+                                 |
+                                 t
    { dg-end-multiline-output "" } */
 }
 
diff --git a/gcc/testsuite/g++.dg/parse/error4.C b/gcc/testsuite/g++.dg/parse/error4.C
index 792bf4dc063..fe8de73790d 100644
--- a/gcc/testsuite/g++.dg/parse/error4.C
+++ b/gcc/testsuite/g++.dg/parse/error4.C
@@ -7,4 +7,4 @@ struct X {
 		 int);
 };
 
-// { dg-error "4:'itn' has not been declared" "" { target *-*-* } 6 }
+// { dg-error "18:'itn' has not been declared" "" { target *-*-* } 6 }
diff --git a/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C b/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C
index 96ebb71645c..d2b37a5122d 100644
--- a/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C
+++ b/gcc/testsuite/g++.old-deja/g++.brendan/crash11.C
@@ -9,13 +9,13 @@ class A {
 	int	h;
 	A() { i=10; j=20; }
 	virtual void f1() { printf("i=%d j=%d\n",i,j); }
-	friend virtual void f2() { printf("i=%d j=%d\n",i,j); } // { dg-error "9:virtual functions cannot be friends" }
+	friend virtual void f2() { printf("i=%d j=%d\n",i,j); } // { dg-error "16:virtual functions cannot be friends" }
 };
 
 class B : public A {
     public:
 	virtual void f1() { printf("i=%d j=%d\n",i,j); }// { dg-error "" }  member.*// ERROR -  member.*
-	friend virtual void f2() { printf("i=%d j=%d\n",i,j); }  // { dg-error "9:virtual functions cannot be friends" }
+	friend virtual void f2() { printf("i=%d j=%d\n",i,j); }  // { dg-error "16:virtual functions cannot be friends" }
 // { dg-error "private" "" { target *-*-* } .-1 }
 };
 
diff --git a/gcc/testsuite/g++.old-deja/g++.pt/overload2.C b/gcc/testsuite/g++.old-deja/g++.pt/overload2.C
index b438543d445..bbc9e51aff6 100644
--- a/gcc/testsuite/g++.old-deja/g++.pt/overload2.C
+++ b/gcc/testsuite/g++.old-deja/g++.pt/overload2.C
@@ -12,5 +12,5 @@ int
 main()
 {
 	C<char*>	c;
-	char*		p = Z(c.O); //{ dg-error "13:'Z' was not declared" } ambiguous c.O
+	char*		p = Z(c.O); //{ dg-error "29:'Z' was not declared" } ambiguous c.O
 }
diff --git a/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C b/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C
index 6dc2c55be58..b98e8da6b1e 100644
--- a/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C
+++ b/gcc/testsuite/g++.old-deja/g++.robertl/eb109.C
@@ -48,8 +48,8 @@ ostream& operator<<(ostream& os, Graph<VertexType,EdgeType>& G)
 
         // The compiler does not like this line!!!!!!
         typename Graph<VertexType, EdgeType>::Successor::iterator
-	  startN = G[i].second.begin(), // { dg-error "14:no match" } no index operator
-	  endN   = G[i].second.end();  // { dg-error "14:no match" } no index operator
+	  startN = G[i].second.begin(), // { dg-error "21:no match" } no index operator
+	  endN   = G[i].second.end();  // { dg-error "21:no match" } no index operator
 
         while(startN != endN)
         {
diff --git a/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c b/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c
index c5ff96e5644..51190c92391 100644
--- a/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c
+++ b/gcc/testsuite/gcc.dg/analyzer/malloc-paths-9.c
@@ -288,7 +288,7 @@ int test_3 (int x, int y)
     |      |     ~~~~~~~~~~
     |      |     |
     |      |     (4) ...to here
-    |   NN |      to dereference it above
+    |   NN |                    to dereference it above
     |   NN |   return *ptr;
     |      |          ~~~~
     |      |          |
diff --git a/gcc/testsuite/gcc.dg/bad-binary-ops.c b/gcc/testsuite/gcc.dg/bad-binary-ops.c
index 46c158e6a5f..45668be0a29 100644
--- a/gcc/testsuite/gcc.dg/bad-binary-ops.c
+++ b/gcc/testsuite/gcc.dg/bad-binary-ops.c
@@ -35,10 +35,10 @@ int test_2 (void)
            ~~~~~~~~~~~~~~~~
            |
            struct s
-    + some_other_function ());
-    ^ ~~~~~~~~~~~~~~~~~~~~~~
-      |
-      struct t
+           + some_other_function ());
+           ^ ~~~~~~~~~~~~~~~~~~~~~~
+             |
+             struct t
    { dg-end-multiline-output "" } */
 }
 
diff --git a/gcc/testsuite/gcc.dg/format/branch-1.c b/gcc/testsuite/gcc.dg/format/branch-1.c
index 1782064645e..4ea39b52b2e 100644
--- a/gcc/testsuite/gcc.dg/format/branch-1.c
+++ b/gcc/testsuite/gcc.dg/format/branch-1.c
@@ -10,7 +10,7 @@ foo (long l, int nfoo)
 {
   printf ((nfoo > 1) ? "%d foos" : "%d foo", nfoo);
   printf ((l > 1) ? "%d foos" /* { dg-warning "23:int" "wrong type in conditional expr" } */
-	          : "%d foo", l); /* { dg-warning "16:int" "wrong type in conditional expr" } */
+	          : "%d foo", l); /* { dg-warning "23:int" "wrong type in conditional expr" } */
   printf ((l > 1) ? "%ld foos" : "%d foo", l); /* { dg-warning "36:int" "wrong type in conditional expr" } */
   printf ((l > 1) ? "%d foos" : "%ld foo", l); /* { dg-warning "23:int" "wrong type in conditional expr" } */
   /* Should allow one case to have extra arguments.  */
diff --git a/gcc/testsuite/gcc.dg/format/pr79210.c b/gcc/testsuite/gcc.dg/format/pr79210.c
index 71f5dd6e082..6bdabdf21ec 100644
--- a/gcc/testsuite/gcc.dg/format/pr79210.c
+++ b/gcc/testsuite/gcc.dg/format/pr79210.c
@@ -20,4 +20,4 @@ LPFC_VPORT_ATTR_R(peer_port_login,
 		  "Allow peer ports on the same physical port to login to each "
 		  "other.");
 
-/* { dg-warning "6: format .%d. expects argument of type .int., but argument 4 has type .unsigned int. " "" { target *-*-* } .-12 } */
+/* { dg-warning "20: format .%d. expects argument of type .int., but argument 4 has type .unsigned int. " "" { target *-*-* } .-12 } */
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
index 03b78042107..d7691e4be51 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
@@ -540,15 +540,15 @@ void test_builtin_types_compatible_p (unsigned long i)
   __emit_expression_range (0,
 			   f (i) + __builtin_types_compatible_p (long, int)); /* { dg-warning "range" } */
 /* { dg-begin-multiline-output "" }
-       f (i) + __builtin_types_compatible_p (long, int));
-       ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                            f (i) + __builtin_types_compatible_p (long, int));
+                            ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    { dg-end-multiline-output "" } */
 
   __emit_expression_range (0,
 			   __builtin_types_compatible_p (long, int) + f (i)); /* { dg-warning "range" } */
 /* { dg-begin-multiline-output "" }
-       __builtin_types_compatible_p (long, int) + f (i));
-       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
+                            __builtin_types_compatible_p (long, int) + f (i));
+                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
    { dg-end-multiline-output "" } */
 }
 
@@ -671,8 +671,8 @@ void test_multiple_ordinary_maps (void)
 /* { dg-begin-multiline-output "" }
    __emit_expression_range (0, foo (0,
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-        "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"));
-        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                    "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"));
+                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    { dg-end-multiline-output "" } */
 
   /* Another expression that transitions between ordinary maps; this
@@ -685,8 +685,8 @@ void test_multiple_ordinary_maps (void)
 /* { dg-begin-multiline-output "" }
    __emit_expression_range (0, foo (0, "012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678
90123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789",
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-        0));
-        ~~                      
+                                    0));
+                                    ~~
    { dg-end-multiline-output "" } */
 }
 
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c
index ac4fa1b52bd..4cba87be2ae 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c
@@ -335,11 +335,11 @@ pr87652 (const char *stem, int counter)
 /* { dg-error "unable to read substring location: unable to read source line" "" { target c } 329 } */
 /* { dg-error "unable to read substring location: failed to get ordinary maps" "" { target c++ } 329 } */
 /* { dg-begin-multiline-output "" }
-     __emit_string_literal_range(__FILE__":%5d: " format, \
+     __emit_string_literal_range(__FILE__":%5d: " format,        \
                                  ^~~~~~~~
      { dg-end-multiline-output "" { target c } } */
 /* { dg-begin-multiline-output "" }
-     __emit_string_literal_range(__FILE__":%5d: " format, \
+     __emit_string_literal_range(__FILE__":%5d: " format,        \
                                  ^
      { dg-end-multiline-output "" { target c++ } } */
 
diff --git a/gcc/testsuite/gcc.dg/redecl-4.c b/gcc/testsuite/gcc.dg/redecl-4.c
index 8f124886da8..2c214bb02c7 100644
--- a/gcc/testsuite/gcc.dg/redecl-4.c
+++ b/gcc/testsuite/gcc.dg/redecl-4.c
@@ -15,7 +15,7 @@ f (void)
     /* Should get format warnings even though the built-in declaration
        isn't "visible".  */
     printf (
-	    "%s", 1); /* { dg-warning "8:format" } */
+	    "%s", 1); /* { dg-warning "15:format" } */
     /* The type of strcmp here should have no prototype.  */
     if (0)
       strcmp (1);
diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90
index 7fade1f65fc..606fe0f891a 100644
--- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90
+++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-1.F90
@@ -8,17 +8,22 @@
 ! We can't rely on any ordering of the keys.
 
 ! { dg-regexp "\"kind\": \"error\"" }
+! { dg-regexp "\"column-origin\": 1" }
 ! { dg-regexp "\"message\": \"#error message\"" }
 
 ! { dg-regexp "\"caret\": \{" }
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 2" }
+! { dg-regexp "\"display-column\": 2" }
+! { dg-regexp "\"byte-column\": 2" }
 
 ! { dg-regexp "\"finish\": \{" }
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-1.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 6" }
+! { dg-regexp "\"display-column\": 6" }
+! { dg-regexp "\"byte-column\": 6" }
 
 ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }
 ! { dg-regexp "\"children\": \[\[\]\[\]\]" }
diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90
index bebcf68d431..56615f0ca5a 100644
--- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90
+++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-2.F90
@@ -8,6 +8,7 @@
 ! We can't rely on any ordering of the keys. 
 
 ! { dg-regexp "\"kind\": \"warning\"" }
+! { dg-regexp "\"column-origin\": 1" }
 ! { dg-regexp "\"message\": \"#warning message\"" }
 ! { dg-regexp "\"option\": \"-Wcpp\"" }
 ! { dg-regexp "\"option_url\": \"\[^\n\r\"\]*#index-Wcpp\"" }
@@ -16,11 +17,15 @@
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 2" }
+! { dg-regexp "\"display-column\": 2" }
+! { dg-regexp "\"byte-column\": 2" }
 
 ! { dg-regexp "\"finish\": \{" }
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-2.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 8" }
+! { dg-regexp "\"display-column\": 8" }
+! { dg-regexp "\"byte-column\": 8" }
 
 ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }
 ! { dg-regexp "\"children\": \[\[\]\[\]\]" }
diff --git a/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90 b/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90
index 7ab78eb570b..50214759091 100644
--- a/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90
+++ b/gcc/testsuite/gfortran.dg/diagnostic-format-json-3.F90
@@ -8,6 +8,7 @@
 ! We can't rely on any ordering of the keys.
 
 ! { dg-regexp "\"kind\": \"error\"" }
+! { dg-regexp "\"column-origin\": 1" }
 ! { dg-regexp "\"message\": \"#warning message\"" }
 ! { dg-regexp "\"option\": \"-Werror=cpp\"" }
 ! { dg-regexp "\"option_url\": \"\[^\n\r\"\]*#index-Wcpp\"" }
@@ -16,11 +17,15 @@
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 2" }
+! { dg-regexp "\"display-column\": 2" }
+! { dg-regexp "\"byte-column\": 2" }
 
 ! { dg-regexp "\"finish\": \{" }
 ! { dg-regexp "\"file\": \"\[^\n\r\"\]*diagnostic-format-json-3.F90\"" }
 ! { dg-regexp "\"line\": 4" }
 ! { dg-regexp "\"column\": 8" }
+! { dg-regexp "\"display-column\": 8" }
+! { dg-regexp "\"byte-column\": 8" }
 
 ! { dg-regexp "\"locations\": \[\[\{\}, \]*\]" }
 ! { dg-regexp "\"children\": \[\[\]\[\]\]" }
diff --git a/gcc/testsuite/go.dg/arrayclear.go b/gcc/testsuite/go.dg/arrayclear.go
index 6daebc0b8f5..aa5ba0761d7 100644
--- a/gcc/testsuite/go.dg/arrayclear.go
+++ b/gcc/testsuite/go.dg/arrayclear.go
@@ -1,5 +1,8 @@
 // { dg-do compile }
 // { dg-options "-fgo-debug-optimization" }
+// This comment is necessary to work around a dejagnu bug. Otherwise, the
+// column of the second error message would equal the row of the first one, and
+// since the errors are also identical, dejagnu is not able to distinguish them.
 
 package p
 
diff --git a/gcc/tree-diagnostic-path.cc b/gcc/tree-diagnostic-path.cc
index 381a49cb0b4..82b3c2d6b6a 100644
--- a/gcc/tree-diagnostic-path.cc
+++ b/gcc/tree-diagnostic-path.cc
@@ -493,7 +493,7 @@ default_tree_diagnostic_path_printer (diagnostic_context *context,
    doesn't have access to trees (for m_fndecl).  */
 
 json::value *
-default_tree_make_json_for_path (diagnostic_context *,
+default_tree_make_json_for_path (diagnostic_context *context,
 				 const diagnostic_path *path)
 {
   json::array *path_array = new json::array ();
@@ -504,7 +504,8 @@ default_tree_make_json_for_path (diagnostic_context *,
       json::object *event_obj = new json::object ();
       if (event.get_location ())
 	event_obj->set ("location",
-			json_from_expanded_location (event.get_location ()));
+			json_from_expanded_location (context,
+						     event.get_location ()));
       label_text event_text (event.get_desc (false));
       event_obj->set ("description", new json::string (event_text.m_buffer));
       event_text.maybe_free ();
diff --git a/libcpp/charset.c b/libcpp/charset.c
index db47235b847..28b81c9c864 100644
--- a/libcpp/charset.c
+++ b/libcpp/charset.c
@@ -2276,49 +2276,90 @@ cpp_string_location_reader::get_next ()
   return result;
 }
 
-/* Helper for cpp_byte_column_to_display_column and its inverse.  Given a
-   pointer to a UTF-8-encoded character, compute its display width.  *INBUFP
-   points on entry to the start of the UTF-8 encoding of the character, and
-   is updated to point just after the last byte of the encoding.  *INBYTESLEFTP
-   contains on entry the remaining size of the buffer into which *INBUFP
-   points, and this is also updated accordingly.  If *INBUFP does not
+cpp_display_width_computation::
+cpp_display_width_computation (const char *data, int data_length, int tabstop) :
+  m_begin (data),
+  m_next (m_begin),
+  m_bytes_left (data_length),
+  m_tabstop (tabstop),
+  m_display_cols (0)
+{
+  gcc_assert (m_tabstop > 0);
+}
+
+
+/* The main implementation function for class cpp_display_width_computation.
+   m_next points on entry to the start of the UTF-8 encoding of the next
+   character, and is updated to point just after the last byte of the encoding.
+   m_bytes_left contains on entry the remaining size of the buffer into which
+   m_next points, and this is also updated accordingly.  If m_next does not
    point to a valid UTF-8-encoded sequence, then it will be treated as a single
-   byte with display width 1.  */
+   byte with display width 1.  m_display_cols is the current display column,
+   relative to which tab stops should be expanded.  Returns the display width of
+   the codepoint just processed.  */
 
-static inline int
-compute_next_display_width (const uchar **inbufp, size_t *inbytesleftp)
+int
+cpp_display_width_computation::process_next_codepoint ()
 {
   cppchar_t c;
-  if (one_utf8_to_cppchar (inbufp, inbytesleftp, &c) != 0)
+  int next_width;
+
+  if (*m_next == '\t')
+    {
+      ++m_next;
+      --m_bytes_left;
+      next_width = m_tabstop - (m_display_cols % m_tabstop);
+    }
+  else if (one_utf8_to_cppchar ((const uchar **) &m_next, &m_bytes_left, &c)
+	   != 0)
     {
       /* Input is not convertible to UTF-8.  This could be fine, e.g. in a
 	 string literal, so don't complain.  Just treat it as if it has a width
 	 of one.  */
-      ++*inbufp;
-      --*inbytesleftp;
-      return 1;
+      ++m_next;
+      --m_bytes_left;
+      next_width = 1;
+    }
+  else
+    {
+      /*  one_utf8_to_cppchar() has updated m_next and m_bytes_left for us.  */
+      next_width = cpp_wcwidth (c);
     }
 
-  /*  one_utf8_to_cppchar() has updated inbufp and inbytesleftp for us.  */
-  return cpp_wcwidth (c);
+  m_display_cols += next_width;
+  return next_width;
+}
+
+/*  Utility to advance the byte stream by the minimum amount needed to consume
+    N display columns.  Returns the number of display columns that were
+    actually skipped.  This could be less than N, if there was not enough data,
+    or more than N, if the last character to be skipped had a sufficiently large
+    display width.  */
+int
+cpp_display_width_computation::advance_display_cols (int n)
+{
+  const int start = m_display_cols;
+  const int target = start + n;
+  while (m_display_cols < target && !done ())
+    process_next_codepoint ();
+  return m_display_cols - start;
 }
 
 /*  For the string of length DATA_LENGTH bytes that begins at DATA, compute
     how many display columns are occupied by the first COLUMN bytes.  COLUMN
     may exceed DATA_LENGTH, in which case the phantom bytes at the end are
-    treated as if they have display width 1.  */
+    treated as if they have display width 1.  Tabs are expanded to the next tab
+    stop, relative to the start of DATA.  */
 
 int
 cpp_byte_column_to_display_column (const char *data, int data_length,
-				   int column)
+				   int column, int tabstop)
 {
-  int display_col = 0;
-  const uchar *udata = (const uchar *) data;
   const int offset = MAX (0, column - data_length);
-  size_t inbytesleft = column - offset;
-  while (inbytesleft)
-    display_col += compute_next_display_width (&udata, &inbytesleft);
-  return display_col + offset;
+  cpp_display_width_computation dw (data, column - offset, tabstop);
+  while (!dw.done ())
+    dw.process_next_codepoint ();
+  return dw.display_cols_processed () + offset;
 }
 
 /*  For the string of length DATA_LENGTH bytes that begins at DATA, compute
@@ -2328,14 +2369,11 @@ cpp_byte_column_to_display_column (const char *data, int data_length,
 
 int
 cpp_display_column_to_byte_column (const char *data, int data_length,
-				   int display_col)
+				   int display_col, int tabstop)
 {
-  int column = 0;
-  const uchar *udata = (const uchar *) data;
-  size_t inbytesleft = data_length;
-  while (column < display_col && inbytesleft)
-      column += compute_next_display_width (&udata, &inbytesleft);
-  return data_length - inbytesleft + MAX (0, display_col - column);
+  cpp_display_width_computation dw (data, data_length, tabstop);
+  const int avail_display = dw.advance_display_cols (display_col);
+  return dw.bytes_processed () + MAX (0, display_col - avail_display);
 }
 
 /* Our own version of wcwidth().  We don't use the actual wcwidth() in glibc,
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 544735a51af..c18f455f82a 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -312,9 +312,6 @@ enum cpp_normalize_level {
    carries all the options visible to the command line.  */
 struct cpp_options
 {
-  /* Characters between tab stops.  */
-  unsigned int tabstop;
-
   /* The language we're preprocessing.  */
   enum c_lang lang;
 
@@ -1334,14 +1331,43 @@ extern const char * cpp_get_userdef_suffix
   (const cpp_token *);
 
 /* In charset.c */
+
+/* A class to manage the state while converting a UTF-8 sequence to cppchar_t
+   and computing the display width one character at a time.  */
+class cpp_display_width_computation {
+ public:
+  cpp_display_width_computation (const char *data, int data_length,
+				 int tabstop);
+  const char *next_byte () const { return m_next; }
+  int bytes_processed () const { return m_next - m_begin; }
+  int bytes_left () const { return m_bytes_left; }
+  bool done () const { return !bytes_left (); }
+  int display_cols_processed () const { return m_display_cols; }
+
+  int process_next_codepoint ();
+  int advance_display_cols (int n);
+
+ private:
+  const char *const m_begin;
+  const char *m_next;
+  size_t m_bytes_left;
+  const int m_tabstop;
+  int m_display_cols;
+};
+
+/* Convenience functions that are simple use cases for class
+   cpp_display_width_computation.  Tab characters will be expanded to spaces
+   as determined by TABSTOP.  */
 int cpp_byte_column_to_display_column (const char *data, int data_length,
-				       int column);
-inline int cpp_display_width (const char *data, int data_length)
+				       int column, int tabstop);
+inline int cpp_display_width (const char *data, int data_length,
+			      int tabstop)
 {
-    return cpp_byte_column_to_display_column (data, data_length, data_length);
+  return cpp_byte_column_to_display_column (data, data_length, data_length,
+					    tabstop);
 }
 int cpp_display_column_to_byte_column (const char *data, int data_length,
-				       int display_col);
+				       int display_col, int tabstop);
 int cpp_wcwidth (cppchar_t c);
 
 #endif /* ! LIBCPP_CPPLIB_H */
diff --git a/libcpp/init.c b/libcpp/init.c
index 63124c8161e..6e94c486059 100644
--- a/libcpp/init.c
+++ b/libcpp/init.c
@@ -190,7 +190,6 @@ cpp_create_reader (enum c_lang lang, cpp_hash_table *table,
   CPP_OPTION (pfile, discard_comments) = 1;
   CPP_OPTION (pfile, discard_comments_in_macro_exp) = 1;
   CPP_OPTION (pfile, max_include_depth) = 200;
-  CPP_OPTION (pfile, tabstop) = 8;
   CPP_OPTION (pfile, operator_names) = 1;
   CPP_OPTION (pfile, warn_trigraphs) = 2;
   CPP_OPTION (pfile, warn_endif_labels) = 1;
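
For readers following the column conversions above, here is a minimal, ASCII-only sketch of the two mappings. It is not the code in the patch. The sketch_* names are hypothetical, every non-tab byte is assumed to be one column wide (no UTF-8 decoding, no cpp_wcwidth), TABSTOP is assumed to be positive, and a tab is assumed to advance to the next multiple of TABSTOP counted from the start of DATA.

```cpp
#include <algorithm>

// Hypothetical stand-in for cpp_byte_column_to_display_column.
// Assumes ASCII input and tabstop > 0.
int
sketch_byte_column_to_display_column (const char *data, int data_length,
                                      int column, int tabstop)
{
  // Bytes requested past the end of DATA are phantom bytes of width 1.
  const int offset = std::max (0, column - data_length);
  int display_cols = 0;
  for (int i = 0; i < column - offset; ++i)
    {
      if (data[i] == '\t')
        // Advance to the next tab stop (assumed rule, see lead-in).
        display_cols += tabstop - display_cols % tabstop;
      else
        ++display_cols;
    }
  return display_cols + offset;
}

// Hypothetical stand-in for cpp_display_column_to_byte_column.
// Assumes ASCII input and tabstop > 0.
int
sketch_display_column_to_byte_column (const char *data, int data_length,
                                      int display_col, int tabstop)
{
  int display_cols = 0;
  int bytes = 0;
  // Consume bytes until the requested display column is reached; a tab
  // may overshoot it, just as a wide character could in the real code.
  while (display_cols < display_col && bytes < data_length)
    {
      if (data[bytes] == '\t')
        display_cols += tabstop - display_cols % tabstop;
      else
        ++display_cols;
      ++bytes;
    }
  // Display columns past the end of DATA map to phantom bytes of width 1.
  return bytes + std::max (0, display_col - display_cols);
}
```

With the string "a\tb" and a tabstop of 8, the tab in this sketch stretches from display column 1 to display column 8, so any display column that falls inside the tab maps back to the byte just past it.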

