The following code incorrectly triggers undefined behavior sanitizer: // ------------------------------------- #include <iostream> #include <string_view> template<typename Derived> struct Base1 { char c1; }; template<typename Derived> struct Base2 { char c2; auto& get2() const { return static_cast<const Derived&>(*this); } }; struct X : public Base1<X>, public Base2<X> { X(std::string_view d) : data{d} {} std::string_view data; }; int main() { auto x = X{"cheesecake"}; std::cout << x.get2().data << std::endl; } // ------------------------------------- See also discussion here: https://stackoverflow.com/a/65178320/2173029
Here is the actual error message: // ----- example.cpp:16:49: runtime error: reference binding to misaligned address 0x7ffdc34cab61 for type 'const struct X', which requires 8 byte alignment 0x7ffdc34cab61: note: pointer points here 00 00 00 02 00 00 00 00 00 00 00 0a 00 00 00 00 00 00 00 af 20 40 00 00 00 00 00 20 17 40 00 00 ^ // ----- This is referring to return static_cast<const Derived&>(*this);
Confirmed, can you Jakub please take a look?
Testcase without any headers: template <typename Derived> struct Base1 { char c1; }; template <typename Derived> struct Base2 { char c2; auto &get2 () const { return static_cast<const Derived &> (*this); } }; struct X : public Base1<X>, public Base2<X> { X (const char *d) : data{d} {} const char *data; }; int main () { X x = X{"cheesecake"}; const char *p = x.get2 ().data; } The problem is that ubsan_maybe_instrument_reference is done only during genericization, but we fold: (const struct X &) ((const struct Base2 *) this + 18446744073709551615) to (const struct X &) this + 18446744073709551615 much earlier than that (during parsing even!), and if it wasn't during parsing, it would be during fully folding of the function which is also before genericization. In fold-const.c, it is the: /* Convert (T1)(X p+ Y) into ((T1)X p+ Y), for pointer type, when the new cast (T1)X will fold away. We assume that this happens when X itself is a cast. */ if (POINTER_TYPE_P (type) && TREE_CODE (arg0) == POINTER_PLUS_EXPR && CONVERT_EXPR_P (TREE_OPERAND (arg0, 0))) { tree arg00 = TREE_OPERAND (arg0, 0); tree arg01 = TREE_OPERAND (arg0, 1); return fold_build_pointer_plus_loc (loc, fold_convert_loc (loc, type, arg00), arg01); } optimization that does that. So, one way around this is for !in_gimple_form && sanitize_flags_p (SANITIZE_ALIGNMENT) to avoid the above optimization if type has higher alignment than the p+ first operand's type.
Created attachment 49855 [details] gcc11-pr98206.patch Untested fix.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:a9ec9902d7f1a9bf7a2778c3fb8fc75bc2df2cef commit r11-6375-ga9ec9902d7f1a9bf7a2778c3fb8fc75bc2df2cef Author: Jakub Jelinek <jakub@redhat.com> Date: Thu Dec 31 10:20:39 2020 +0100 fold-const: Avoid (cast) ((cast2) x p+ y) folding for -fsanitize=alignment [PR98206] The following testcase is diagnosed by UBSan as invalid, even when it is valid. We have a derived type Base2 at offset 1 with alignment 1 and do: (const Derived &) ((const Base2 *) this + -1) but the folder before ubsan in the FE gets a chance to instrument it optimizes that into: (const Derived &) this + -1 and so we require that this has 8-byte alignment which Derived class needs. Fixed by avoiding such an optimization when -fsanitize=alignment is in effect if it would affect the alignments (and guarded with !in_gimple_form because we don't really care during GIMPLE, though pointer conversions are useless then and so such folding isn't needed very much during GIMPLE). 2020-12-31 Jakub Jelinek <jakub@redhat.com> PR c++/98206 * fold-const.c: Include asan.h. (fold_unary_loc): Don't optimize (ptr_type) (((ptr_type2) x) p+ y) into ((ptr_type) x) p+ y if sanitizing alignment in GENERIC and ptr_type points to type with higher alignment than ptr_type2. * g++.dg/ubsan/align-4.C: New test.
Great! This also works well in the original (more complex) scenario (https://github.com/rbock/sqlpp11/issues/355).