[Bug c++/92919] New: invalid memory access in wide_str_to_charconst when running ucn2.C testcase (caught by hwasan)
matmal01 at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Thu Dec 12 12:02:00 GMT 2019
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92919
Bug ID: 92919
Summary: invalid memory access in wide_str_to_charconst when
running ucn2.C testcase (caught by hwasan)
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: matmal01 at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
Target Milestone: ---
Target: aarch64-none-linux-gnu
When running the ucn2.C testcase, hwasan catches an invalid access in the
function `wide_str_to_charconst`.
The problematic line is:
const char16_t p = u'\U00110003';
This appears to depend on the value of the constant, since the line below does
not trigger the invalid access:
const char16_t j = u'\U0001F914';
Changing that constant to the following, however, does trigger it:
const char16_t j = u'\U0011F914';
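One thing the three constants have in common with the behaviour seen: U+1F914 lies inside the Unicode code space (which ends at U+10FFFF), while U+110003 and U+11F914 lie beyond it. Whether that is the actual trigger is an assumption, not something confirmed in this report. A minimal sketch of the distinction, where `utf16_units` and `check_report_constants` are hypothetical helpers and not libcpp functions:

```cpp
#include <cassert>
#include <cstdint>

// Largest valid Unicode code point.
constexpr std::uint32_t kMaxCodePoint = 0x10FFFF;

// Number of UTF-16 code units needed to encode cp, or 0 if cp cannot be
// encoded at all (beyond U+10FFFF, or a surrogate code point).
constexpr int utf16_units(std::uint32_t cp) {
    if (cp > kMaxCodePoint) return 0;
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    return cp >= 0x10000 ? 2 : 1;
}

// Runs the helper over the three constants quoted in the report.
void check_report_constants() {
    assert(utf16_units(0x1F914) == 2);   // no invalid access reported
    assert(utf16_units(0x110003) == 0);  // invalid access reported
    assert(utf16_units(0x11F914) == 0);  // invalid access reported
}
```

Under this reading, only the two constants that trigger the invalid access are outside the encodable range.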
HWASAN output is below.
==9608==ERROR: HWAddressSanitizer: tag-mismatch on address 0xefdf000080bf at pc
0x000000651270
READ of size 1 at 0xefdf000080bf tags: 5f/79 (ptr/mem) in thread T0
#0 0x65126c in SigTrap<0>
../../../../gcc-pdtl/libsanitizer/hwasan/hwasan_checks.h:27
#1 0x65126c in CheckAddress<(__hwasan::ErrorAction)0,
(__hwasan::AccessType)0, 0>
../../../../gcc-pdtl/libsanitizer/hwasan/hwasan_checks.h:88
#2 0x65126c in __hwasan_load1
../../../../gcc-pdtl/libsanitizer/hwasan/hwasan.cpp:469
#3 0x2b143dc in wide_str_to_charconst ../../gcc-pdtl/libcpp/charset.c:1980
#4 0x2b143dc in cpp_interpret_charconst(cpp_reader*, cpp_token const*,
unsigned int*, int*) ../../gcc-pdtl/libcpp/charset.c:2045
#5 0xb31a48 in lex_charconst ../../gcc-pdtl/gcc/c-family/c-lex.c:1368
#6 0xb35964 in c_lex_with_flags(tree_node**, unsigned int*, unsigned char*,
int) ../../gcc-pdtl/gcc/c-family/c-lex.c:617
#7 0x89c6bc in cp_lexer_get_preprocessor_token
../../gcc-pdtl/gcc/cp/parser.c:807
#8 0x943cc0 in cp_lexer_new_main ../../gcc-pdtl/gcc/cp/parser.c:654
#9 0x943cc0 in cp_parser_new ../../gcc-pdtl/gcc/cp/parser.c:3968
#10 0x943cc0 in c_parse_file() ../../gcc-pdtl/gcc/cp/parser.c:42963
#11 0xb50c90 in c_common_parse_file()
../../gcc-pdtl/gcc/c-family/c-opts.c:1185
#12 0x16a49fc in compile_file ../../gcc-pdtl/gcc/toplev.c:458
#13 0x6466bc in do_compile ../../gcc-pdtl/gcc/toplev.c:2280
#14 0x6466bc in toplev::main(int, char**) ../../gcc-pdtl/gcc/toplev.c:2419
#15 0x649468 in main ../../gcc-pdtl/gcc/main.c:39
#16 0xffff93dd689c in __libc_start_main
(/lib/aarch64-linux-gnu/libc.so.6+0x1f89c)
[0xefdf000080a0,0xefdf000080c0) is a small unallocated heap chunk; size: 32
offset: 31
0xefdf000080bf is located 1 bytes to the left of 2-byte region
[0xefdf000080c0,0xefdf000080c2)
allocated here:
#0 0x652bc0 in __sanitizer_realloc
../../../../gcc-pdtl/libsanitizer/hwasan/hwasan_interceptors.cpp:146
#1 0x2b95f40 in xrealloc ../../gcc-pdtl/libiberty/xmalloc.c:179
#2 0x2b122ec in cpp_interpret_string_1 ../../gcc-pdtl/libcpp/charset.c:1753
#3 0x2b14284 in cpp_interpret_string(cpp_reader*, cpp_string const*,
unsigned long, cpp_string*, cpp_ttype) ../../gcc-pdtl/libcpp/charset.c:1784
#4 0x2b14284 in cpp_interpret_charconst(cpp_reader*, cpp_token const*,
unsigned int*, int*) ../../gcc-pdtl/libcpp/charset.c:2036
#5 0xb31a48 in lex_charconst ../../gcc-pdtl/gcc/c-family/c-lex.c:1368
#6 0xb35964 in c_lex_with_flags(tree_node**, unsigned int*, unsigned char*,
int) ../../gcc-pdtl/gcc/c-family/c-lex.c:617
#7 0x89c6bc in cp_lexer_get_preprocessor_token
../../gcc-pdtl/gcc/cp/parser.c:807
#8 0x943cc0 in cp_lexer_new_main ../../gcc-pdtl/gcc/cp/parser.c:654
#9 0x943cc0 in cp_parser_new ../../gcc-pdtl/gcc/cp/parser.c:3968
#10 0x943cc0 in c_parse_file() ../../gcc-pdtl/gcc/cp/parser.c:42963
#11 0xb50c90 in c_common_parse_file()
../../gcc-pdtl/gcc/c-family/c-opts.c:1185
#12 0x16a49fc in compile_file ../../gcc-pdtl/gcc/toplev.c:458
#13 0x6466bc in do_compile ../../gcc-pdtl/gcc/toplev.c:2280
#14 0x6466bc in toplev::main(int, char**) ../../gcc-pdtl/gcc/toplev.c:2419
#15 0x649468 in main ../../gcc-pdtl/gcc/main.c:39
#16 0xffff93dd689c in __libc_start_main
(/lib/aarch64-linux-gnu/libc.so.6+0x1f89c)
#17 0x64cb24
(/home/ubuntu/working-directory/gcc-hwasan-install/libexec/gcc/aarch64-unknown-linux-gnu/10.0.0/cc1plus+0x64cb24)
Thread: T0 0xeffe00002000 stack: [0xffffe544a000,0xffffe944a000) sz: 67108864
tls: [0xffff94020000,0xffff94020850)
Memory tags around the buggy address (one tag corresponds to 16 bytes):
0d 00 09 00 09 00 e7 09 09 00 e2 0c 9a 0c 0a 4a
e7 0c 0d 00 0d 00 05 00 0d 00 08 00 08 00 08 00
08 00 0b 00 0b 00 0b 00 0b 00 0e 00 0e 00 05 00
0e 00 08 00 08 00 09 00 08 00 0c 00 0c 00 09 00
0c 00 0c 00 0c 00 08 00 0c 00 0b 00 0b 00 07 00
0b 00 0a 00 0a 00 09 00 0a 00 0c 00 0c 00 ec 0f
0c 00 08 00 07 00 58 03 0d 00 5b 0f 08 00 4f 4f
08 00 ab ab 09 00 09 00 09 00 09 00 09 00 09 00
=> 09 00 09 00 28 08 0e 00 cd 0b 79 [79] 02 72 71 71 <=
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Tags for short granules around the buggy address (one tag corresponds to 16
bytes):
af .. .. .. 4b .. 9d .. 45 .. 3f .. 7b .. 11 ..
=> c9 .. 74 .. .. 28 d8 .. .. cd .. [..] 5f .. .. .. <=
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
See
https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html#short-granules
for a description of short granule tags
SUMMARY: HWAddressSanitizer: tag-mismatch
../../../../gcc-pdtl/libsanitizer/hwasan/hwasan_checks.h:27 in SigTrap<0>
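As a reading aid for the dump above: one tag covers 16 bytes, so the faulting address (the last byte of its granule) and the start of the 2-byte region (the first byte of the next granule) are checked against different memory tags. A minimal sketch of that arithmetic follows; `granule_index`, `offset_in_granule` and `check_dump_addresses` are hypothetical helpers for illustration, not part of the HWASAN runtime.

```cpp
#include <cassert>
#include <cstdint>

// One memory tag corresponds to 16 bytes, as stated in the dump.
constexpr std::uint64_t kGranuleSize = 16;

// Hypothetical helpers for reading the dump; these are not HWASAN APIs.
constexpr std::uint64_t granule_index(std::uint64_t addr) {
    return addr / kGranuleSize;
}
constexpr std::uint64_t offset_in_granule(std::uint64_t addr) {
    return addr % kGranuleSize;
}

// Checks the two addresses quoted in the dump against each other.
void check_dump_addresses() {
    constexpr std::uint64_t fault  = 0xefdf000080bfULL;  // faulting address
    constexpr std::uint64_t region = 0xefdf000080c0ULL;  // start of the 2-byte region
    assert(region - fault == 1);                 // "1 bytes to the left"
    assert(offset_in_granule(fault) == 15);      // last byte of its granule
    assert(offset_in_granule(region) == 0);      // first byte of the next granule
    assert(granule_index(region) == granule_index(fault) + 1);
}
```

This matches the dump: the pointer carries the tag of the allocated region, while the byte just before it belongs to the preceding granule with a different tag.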
When running the testcase (with just the problematic line) under GDB, we can
stop on entry to the function `wide_str_to_charconst` and inspect the relevant
variables.
It seems that `str.len` is 2, `cwidth` is 8, `bigend` is false, and `width` is
16. Hence the access on line 1980
c = bigend ? str.text[off + i] : str.text[off + nbwc - i - 1];
becomes, on the first loop iteration (i = 0) with `bigend` false:
nbwc = width / cwidth = width / 8
off = str.len - (nbwc * 2) = 2 - (nbwc * 2)
c = str.text[off + nbwc - i - 1]
c = str.text[2 - nbwc - 1]
c = str.text[2 - (width / 8) - 1]
c = str.text[2 - (16 / 8) - 1]
c = str.text[-1]
This accesses one byte before the start of the text buffer (as reported in the
HWASAN dump).
(The inspection in GDB was largely to demonstrate this isn't a bug in HWASAN).
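The index arithmetic above can be reproduced outside the compiler. The sketch below assumes the variables are unsigned (`size_t`) so that `off` wraps around, and that `nbwc` is `width / cwidth`; `first_read_offset` and `has_room_for_char` are hypothetical names, and the guard is only one possible way to reject a buffer that is too short, not a proposed patch.

```cpp
#include <cassert>
#include <cstddef>

// Offset into str.text read on the first iteration (i == 0) when bigend is
// false, mirroring the expression quoted from line 1980. The arithmetic is
// done in size_t (so it wraps) and converted to a signed offset for
// readability.
constexpr std::ptrdiff_t first_read_offset(std::size_t len,
                                           std::size_t width,
                                           std::size_t cwidth) {
    const std::size_t nbwc = width / cwidth;
    const std::size_t off = len - nbwc * 2;  // wraps when len < nbwc * 2
    return static_cast<std::ptrdiff_t>(off + nbwc - 0 - 1);
}

// Hypothetical guard: the buffer must hold at least one character plus one
// more wide character (nbwc * 2 bytes in total) for the read to stay in
// bounds.
constexpr bool has_room_for_char(std::size_t len,
                                 std::size_t width,
                                 std::size_t cwidth) {
    return len >= (width / cwidth) * 2;
}
```

With the values seen in GDB (`str.len` 2, `width` 16, `cwidth` 8), `first_read_offset` yields -1 and the guard reports too little room; with a 4-byte buffer the offset is 1, which is inside the buffer.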