Bug 98206 - UBSan: Casting from multiple inheritance base to derived class triggers undefined behavior sanitizer
Summary: UBSan: Casting from multiple inheritance base to derived class triggers undef...
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: sanitizer (show other bugs)
Version: 11.0
: P3 normal
Target Milestone: ---
Assignee: Jakub Jelinek
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-12-08 19:30 UTC by Roland B
Modified: 2021-01-01 12:13 UTC (History)
6 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-12-08 00:00:00


Attachments
gcc11-pr98206.patch (1.09 KB, patch)
2020-12-29 14:44 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Roland B 2020-12-08 19:30:36 UTC
The following code incorrectly triggers undefined behavior sanitizer:

// -------------------------------------
#include <iostream>
#include <string_view>

template<typename Derived>
struct Base1
{
    char c1;
};

template<typename Derived>
struct Base2
{
    char c2;
    auto& get2() const
    {
        return static_cast<const Derived&>(*this);
    }
};

struct X : public Base1<X>, public Base2<X>
{
    X(std::string_view d) : data{d} {}

    std::string_view data;
};


int main()
{
    auto x = X{"cheesecake"};

    std::cout << x.get2().data << std::endl;
}
// -------------------------------------


See also discussion here:
https://stackoverflow.com/a/65178320/2173029
Comment 1 Roland B 2020-12-08 19:40:31 UTC
Here is the actual error message:


// -----
example.cpp:16:49: runtime error: reference binding to misaligned address 0x7ffdc34cab61 for type 'const struct X', which requires 8 byte alignment
0x7ffdc34cab61: note: pointer points here
 00 00 00  02 00 00 00 00 00 00 00  0a 00 00 00 00 00 00 00  af 20 40 00 00 00 00 00  20 17 40 00 00
              ^ 
// -----

This is referring to 

return static_cast<const Derived&>(*this);
Comment 2 Martin Liška 2020-12-08 20:51:13 UTC
Confirmed, can you Jakub please take a look?
Comment 3 Jakub Jelinek 2020-12-29 14:10:53 UTC
Testcase without any headers:
template <typename Derived>
struct Base1
{
  char c1;
};

template <typename Derived>
struct Base2
{
  char c2;
  auto &get2 () const { return static_cast<const Derived &> (*this); }
};

struct X : public Base1<X>, public Base2<X>
{
  X (const char *d) : data{d} {}
  const char *data;
};

int
main ()
{
  X x = X{"cheesecake"};
  const char *p = x.get2 ().data;
}

The problem is that ubsan_maybe_instrument_reference is done only during genericization, but we fold:
(const struct X &) ((const struct Base2 *) this + 18446744073709551615)
to
(const struct X &) this + 18446744073709551615
much earlier than that (during parsing even!), and if it wasn't during parsing, it would be during fully folding of the function which is also before genericization.
In fold-const.c, it is the:
      /* Convert (T1)(X p+ Y) into ((T1)X p+ Y), for pointer type, when the new
         cast (T1)X will fold away.  We assume that this happens when X itself
         is a cast.  */
      if (POINTER_TYPE_P (type)
          && TREE_CODE (arg0) == POINTER_PLUS_EXPR
          && CONVERT_EXPR_P (TREE_OPERAND (arg0, 0)))
        {
          tree arg00 = TREE_OPERAND (arg0, 0);
          tree arg01 = TREE_OPERAND (arg0, 1);

          return fold_build_pointer_plus_loc
                   (loc, fold_convert_loc (loc, type, arg00), arg01);
        }
optimization that does that.
So, one way around this is for
!in_gimple_form && sanitize_flags_p (SANITIZE_ALIGNMENT)
to avoid the above optimization if type has higher alignment than the p+ first operand's type.
Comment 4 Jakub Jelinek 2020-12-29 14:44:04 UTC
Created attachment 49855 [details]
gcc11-pr98206.patch

Untested fix.
Comment 5 GCC Commits 2020-12-31 09:21:13 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:a9ec9902d7f1a9bf7a2778c3fb8fc75bc2df2cef

commit r11-6375-ga9ec9902d7f1a9bf7a2778c3fb8fc75bc2df2cef
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Dec 31 10:20:39 2020 +0100

    fold-const: Avoid (cast) ((cast2) x p+ y) folding for -fsanitize=alignment [PR98206]
    
    The following testcase is diagnosed by UBSan as invalid, even when it is
    valid.
    We have a derived type Base2 at offset 1 with alignment 1 and do:
    (const Derived &) ((const Base2 *) this + -1)
    but the folder before ubsan in the FE gets a chance to instrument it
    optimizes that into:
    (const Derived &) this + -1
    and so we require that this has 8-byte alignment which Derived class needs.
    
    Fixed by avoiding such an optimization when -fsanitize=alignment is in
    effect if it would affect the alignments (and guarded with !in_gimple_form
    because we don't really care during GIMPLE, though pointer conversions are
    useless then and so such folding isn't needed very much during GIMPLE).
    
    2020-12-31  Jakub Jelinek  <jakub@redhat.com>
    
            PR c++/98206
            * fold-const.c: Include asan.h.
            (fold_unary_loc): Don't optimize (ptr_type) (((ptr_type2) x) p+ y)
            into ((ptr_type) x) p+ y if sanitizing alignment in GENERIC and
            ptr_type points to type with higher alignment than ptr_type2.
    
            * g++.dg/ubsan/align-4.C: New test.
Comment 6 Roland B 2021-01-01 12:13:00 UTC
Great!

This also works well in the original (more complex) scenario (https://github.com/rbock/sqlpp11/issues/355).