[PATCH 1/2] Flag CPP_W_BIDIRECTIONAL so that source lines are escaped
David Malcolm
dmalcolm@redhat.com
Tue Nov 2 21:07:25 GMT 2021
On Tue, 2021-11-02 at 16:58 -0400, David Malcolm wrote:
> Before:
>
> Wbidirectional-1.c: In function ‘main’:
> Wbidirectional-1.c:6:43: warning: unpaired UTF-8 bidirectional
> character detected [-Wbidirectional=]
> 6 | /* } if (isAdmin) begin admins only */
> | ^
> Wbidirectional-1.c:9:28: warning: unpaired UTF-8 bidirectional
> character detected [-Wbidirectional=]
> 9 | /* end admins only { */
> | ^
>
> Wbidirectional-11.c:6:15: warning: UTF-8 vs UCN mismatch when
> closing a context by "U+202C (POP DIRECTIONAL FORMATTING)"
> [-Wbidirectional=]
> 6 | int LRE__PDF_\u202c;
> | ^
>
> After setting rich_loc.set_escape_on_output (true):
>
> Wbidirectional-1.c:6:43: warning: unpaired UTF-8 bidirectional
> character detected [-Wbidirectional=]
> 6 | /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */
> | ^
> Wbidirectional-1.c:9:28: warning: unpaired UTF-8 bidirectional
> character detected [-Wbidirectional=]
> 9 | /* end admins only <U+202E> { <U+2066>*/
> | ^
>
> Wbidirectional-11.c:6:15: warning: UTF-8 vs UCN mismatch when
> closing a context by "U+202C (POP DIRECTIONAL FORMATTING)"
> [-Wbidirectional=]
> 6 | int LRE_<U+202A>_PDF_\u202c;
> | ^
>
> libcpp/ChangeLog:
> * lex.c (maybe_warn_bidi_on_close): Use a rich_location
> and call set_escape_on_output (true) on it.
> (maybe_warn_bidi_on_char): Likewise.
>
> Signed-off-by: David Malcolm <dmalcolm@redhat.com>
[...snip...]
To be more explicit: part of the benefit of escaping non-ASCII bytes in
the source line is that it further mitigates CVE-2021-42574, since it
"defangs" the bidi control characters. Everything is turned into ASCII,
so the user can see the logical ordering of the characters directly. A
similar consideration applies to homoglyph attacks.
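
To illustrate the "defanging" idea, here is a minimal standalone sketch
in C. It is not the libcpp/diagnostics code touched by the patch, and
the names decode_utf8 and escape_line are hypothetical. It rewrites a
UTF-8 source line so that every non-ASCII code point appears as
<U+XXXX>; rendering invalid bytes as <XX> is an assumption of this
sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: decode one UTF-8 sequence from the n bytes at s.
   Stores the code point in *cp and returns the number of bytes used,
   or 0 if the bytes do not form a valid UTF-8 sequence.  */
static size_t decode_utf8(const unsigned char *s, size_t n, unsigned long *cp)
{
    size_t len, i;
    unsigned long c, min;

    if (n == 0)
        return 0;
    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; min = 0x10000; }
    else
        return 0;                       /* stray continuation or bad lead byte */
    if (n < len)
        return 0;                       /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        c = (c << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates and out-of-range values.  */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;
    *cp = c;
    return len;
}

/* Hypothetical helper: write an ASCII-only copy of line into out
   (capacity cap).  ASCII bytes are copied, other code points become
   <U+XXXX>, invalid bytes become <XX>.  Returns 0 on success, or -1 if
   out is too small.  */
static int escape_line(const char *line, char *out, size_t cap)
{
    const unsigned char *s;
    size_t n, i = 0, used = 0;

    assert(line != NULL && out != NULL);
    s = (const unsigned char *)line;
    n = strlen(line);

    while (i < n) {
        unsigned long cp;
        size_t len = decode_utf8(s + i, n - i, &cp);
        int w;

        if (len == 1)
            w = snprintf(out + used, cap - used, "%c", (int)cp);
        else if (len > 1)
            w = snprintf(out + used, cap - used, "<U+%04lX>", cp);
        else {
            w = snprintf(out + used, cap - used, "<%02X>", (unsigned)s[i]);
            len = 1;                    /* skip just the offending byte */
        }
        if (w < 0 || (size_t)w >= cap - used)
            return -1;                  /* output truncated */
        used += (size_t)w;
        i += len;
    }
    if (cap == 0)
        return -1;
    out[used] = '\0';
    return 0;
}
```

Given the bytes of line 6 of Wbidirectional-1.c, escape_line produces
the same "/*<U+202E> } <U+2066>if (isAdmin)<U+2069> ..." text as in the
"After" output quoted above, so the reader sees the characters in
logical order rather than in the order a bidi-aware terminal would
display them.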
Dave