Bug 86173

Summary: Default construction of a union (in std::optional)
Product: gcc
Reporter: Marc Glisse <glisse>
Component: c++
Assignee: Not yet assigned to anyone <unassigned>
Status: NEW
Severity: normal
CC: ebotcazou, sebastian.flothow, webrown.cpp
Priority: P3
Keywords: missed-optimization
Version: 9.0
Target Milestone: ---
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2018-06-18 00:00:00

Description Marc Glisse 2018-06-16 09:58:57 UTC
Default construction of std::optional<A> always starts with a memset of the whole optional to 0, while it doesn't with clang using the same libstdc++.

#include <optional>
struct AA {
  double a[1024];
#ifndef TRIVIAL
  AA(); AA(AA const&); AA& operator=(AA const&); ~AA();
#endif
};
typedef std::optional<AA> O;
// O fff(){ return {}; }
O fff(){ O o; return o; }

The .original dump has

*<retval> = {.D.34926={._M_payload={.D.34026={._M_empty={}}, ._M_engaged=0}}}

which looks good: it says it is initializing only the small _M_empty part of the union. The gimple dump, however, has

*<retval> = {};

which eagerly zeroes everything.
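
To make the pattern easier to see without the libstdc++ internals, here is a minimal sketch of the kind of storage involved. The names Big, Payload, Empty and emplace are hypothetical and only loosely modelled on the _M_payload / _M_empty / _M_engaged members visible in the dump; this is not the libstdc++ implementation. The point is that the constructor names only the small union member and the flag, so nothing in the source asks for the large member to be zeroed.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for the non-trivial AA from the testcase.
struct Big {
  double a[1024];
  Big() { a[0] = 1.0; }
  ~Big() {}
};

// Hypothetical, simplified model of an optional's storage: a union of a
// small "empty" member and the payload, plus an engaged flag.
struct Payload {
  struct Empty {};
  union {
    Empty empty;
    Big value;
  };
  bool engaged;

  // Only 'empty' and 'engaged' are initialized; the 8 KiB 'value'
  // member is deliberately left untouched.
  Payload() : empty(), engaged(false) {}

  // The union cannot destroy 'value' by itself, so do it by hand.
  ~Payload() { if (engaged) value.~Big(); }

  // Construct the payload in place and mark the storage as engaged.
  void emplace() {
    assert(!engaged);
    ::new (static_cast<void*>(&value)) Big();
    engaged = true;
  }
};
```

Whether a given compiler version emits a memset for Payload's default constructor would have to be checked in the generated code; the sketch only illustrates what the source requests.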
Comment 1 Marc Glisse 2018-06-16 10:08:35 UTC
Note that constructing optional from std::nullopt does avoid the memset.
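
For comparison, the two forms can be put side by side as below. This is a sketch that assumes C++17 and defines AA's special members inline so that it links on its own (the testcase only declares them); from_default and from_nullopt are hypothetical names. Both functions return a disengaged optional, so any difference shows up only in the generated code.

```cpp
#include <optional>

// Same shape as the testcase's AA, with the special members defined inline
// (an assumption made so the snippet is self-contained).
struct AA {
  double a[1024];
  AA() {}
  AA(AA const&) {}
  AA& operator=(AA const&) { return *this; }
  ~AA() {}
};
typedef std::optional<AA> O;

// Default construction: the form reported above to start with a memset.
O from_default() { O o; return o; }

// Construction from std::nullopt: the form reported to avoid the memset.
O from_nullopt() { return std::nullopt; }
```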
Comment 2 Richard Biener 2018-06-18 07:39:12 UTC
This is because gimplification interprets a CONSTRUCTOR with missing elements as a request to clear them unless CONSTRUCTOR_NO_CLEARING is set (which it isn't).

So the GENERIC _doesn't_ look good since it says (implicitly) that 'a' is
zeroed.

Confirmed as C++ issue.

There's also the following weak heuristic that might kick in if CONSTRUCTOR_NO_CLEARING would be set:

        else if (num_ctor_elements - num_nonzero_elements
                 > CLEAR_RATIO (optimize_function_for_speed_p (cfun))
                 && num_nonzero_elements < num_ctor_elements / 4)
          /* If there are "lots" of zeros, it's more efficient to clear
             the memory and then set the nonzero elements.  */
          cleared = true;

with CONSTRUCTOR_NO_CLEARING this heuristic is off by not honoring
the constructor elements being not present (but for the testcase it
doesn't matter).  CCing Eric for this specific issue (not the C++ one).
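
The arithmetic of the quoted condition can be tried in isolation with a small restatement like the one below. It is only a sketch: prefers_clearing is a hypothetical helper, and clear_ratio is passed as a plain parameter here, whereas GCC obtains the value from the CLEAR_RATIO macro. The value 16 in the comments is an arbitrary illustration, not a claim about any target.

```cpp
// Restates the quoted condition: clear first when the constructor has
// "lots" of zero elements and nonzero elements are under a quarter of it.
bool prefers_clearing(long num_ctor_elements,
                      long num_nonzero_elements,
                      long clear_ratio) {
  return num_ctor_elements - num_nonzero_elements > clear_ratio
         && num_nonzero_elements < num_ctor_elements / 4;
}

// With clear_ratio = 16 (arbitrary):
//   prefers_clearing(1024, 1, 16): 1023 > 16 and 1 < 256, so true
//   prefers_clearing(100, 30, 16): 70 > 16 but 30 is not < 25, so false
//   prefers_clearing(8, 4, 16):    4 is not > 16, so false
```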
Comment 3 Eric Botcazou 2018-06-18 08:16:16 UTC
> There's also the following weak heuristic that might kick in if
> CONSTRUCTOR_NO_CLEARING would be set:
> 
>         else if (num_ctor_elements - num_nonzero_elements
>                  > CLEAR_RATIO (optimize_function_for_speed_p (cfun))
>                  && num_nonzero_elements < num_ctor_elements / 4)
>           /* If there are "lots" of zeros, it's more efficient to clear
>              the memory and then set the nonzero elements.  */
>           cleared = true;
> 
> with CONSTRUCTOR_NO_CLEARING this heuristic is off by not honoring
> the constructor elements being not present (but for the testcase it
> doesn't matter).  CCing Eric for this specific issue (not the C++ one).

Ugh.  I didn't write this but, yes, there is an oversight. The check on the flag should probably be on entry instead:

  if (CONSTRUCTOR_NO_CLEARING (ctor))
    cleared = false;
  else if (int_size_in_bytes (TREE_TYPE (ctor)) < 0)
   ...
  else if
Comment 4 Marc Glisse 2018-06-20 12:41:36 UTC
Recent related commits: r261758 r261735 (they don't fix the issue).
Comment 5 Marc Glisse 2019-09-27 18:08:31 UTC
A similar example was reported on https://stackoverflow.com/q/57964217/1918193