Bug 69534

Summary: [6 Regression] openjade is miscompiled
Product: gcc Reporter: Richard Biener <rguenth>
Component: c++Assignee: Richard Biener <rguenth>
Status: RESOLVED INVALID    
Severity: normal CC: jason, webrown.cpp
Priority: P3    
Version: 6.0   
Target Milestone: 6.0   
Host: Target: x86_64-*-*, i?86-*-*, ppc64le-*-*
Build: Known to work:
Known to fail: Last reconfirmed: 2016-02-15 00:00:00

Description Richard Biener 2016-01-28 12:21:13 UTC
openjade is miscompiled at -O[12], -O0 works.  Running it on its testsuite segfaults, running it on files from docbook produces segfaults with backtraces
from its garbage collector 'Collector' class invoking a virtual destructor.

Not further tracked down yet.
Comment 1 Andrew Pinski 2016-01-30 07:37:59 UTC
Try using -fno-delete-null-pointer-checks, a lot of C++ code violates the rule about calling a class method with a null pointer.
Comment 2 rguenther@suse.de 2016-01-30 10:35:54 UTC
On January 30, 2016 8:37:59 AM GMT+01:00, "pinskia at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69534
>
>--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
>Try using -fno-delete-null-pointer-checks, a lot of C++ code violates
>the rule
>about calling a class method with a null pointer.

I did. It doesn't help. Neither does disabling devirtualization.
Comment 3 Pavel Raiskup 2016-02-11 12:34:47 UTC
Issue seen on Fedora 24 too:
https://bugzilla.redhat.com/show_bug.cgi?id=1306162
Comment 4 Pavel Raiskup 2016-02-11 13:27:29 UTC
Seems like -fno-tree-dse helps on Fedora.
Comment 5 rguenther@suse.de 2016-02-11 13:29:11 UTC
On Thu, 11 Feb 2016, praiskup at redhat dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69534
> 
> --- Comment #4 from Pavel Raiskup <praiskup at redhat dot com> ---
> Seems like -fno-tree-dse helps on Fedora.

Can you bisect to a file?
Comment 6 Kamil Dudka 2016-02-12 14:22:24 UTC
(In reply to rguenther@suse.de from comment #5)
> Can you bisect to a file?

The cause seems to be incorrectly compiled code of the following method:

inline
void *Collector::allocateObject(bool hasFinalizer)
{
  if (freePtr_ == &allObjectsList_)
    makeSpace();
  Object *tem = freePtr_;
  freePtr_ = freePtr_->next();
  tem->setColor(currentColor_);
  tem->hasFinalizer_ = hasFinalizer;
  if (hasFinalizer)
    tem->moveAfter(&allObjectsList_);
  return tem;
}

If I move the method from Collector.h to Collector.cxx and make it non-inline, the problem goes away because the value of hasFinalizer is no longer known at compile time.

Otherwise, from the modules where Collector::allocateObject() is called with hasFinalizer equal to zero (I inspected the assembler code generated out of primitive.cxx with/without -fno-tree-dse), the optimizer correctly drops the code:

  if (hasFinalizer)
    tem->moveAfter(&allObjectsList_);

... but at the same time it incorrectly drops also the two assignments above the condition:

  tem->setColor(currentColor_);
  tem->hasFinalizer_ = hasFinalizer;

Consequently one can see many uses of uninitialized values in valgrind's output.  If I qualify the 'tem' pointer as volatile in order to suppress the optimization, the code of openjade starts to work reliably, even if compiled with default compilation flags, and with completely clean valgrind's output:

--- a/style/Collector.h
+++ b/style/Collector.h
@@ -138,11 +138,11 @@ void Collector::Object::moveAfter(Object *tail)
 inline
 void *Collector::allocateObject(bool hasFinalizer)
 {
   if (freePtr_ == &allObjectsList_)
     makeSpace();
-  Object *tem = freePtr_;
+  Object *volatile tem = freePtr_;
   freePtr_ = freePtr_->next();
   tem->setColor(currentColor_);
   tem->hasFinalizer_ = hasFinalizer;
   if (hasFinalizer)
     tem->moveAfter(&allObjectsList_);

Unless there is some undefined behavior in the code, this looks like a bug in the optimizer.
Comment 7 Richard Biener 2016-02-15 10:41:57 UTC
Thanks for the detective work, I'll investigate further.
Comment 8 Richard Biener 2016-02-15 11:54:46 UTC
We for example have

  tem_23->hasFinalizer_ = 1;
...
  MEM[(struct Object *)interp_8(D) + 16B].next_ = tem_23;
  MEM[(struct  &)tem_23] ={v} {CLOBBER};

which means that tem_23 escapes through memory pointed to by a paramter but
is marked as lifetime ended.

This looks like it is because the new expression in

DEFPRIMITIVE(CurrentNodePageNumberSosofo, argc, argv, context, interp, loc)
{
  if (!context.currentNode)
    return noCurrentNodeError(interp, loc);
  return new (interp) CurrentNodePageNumberSosofoObj(context.currentNode);
}

ends up generating


;; Function OpenJade_DSSSL::CurrentNodePageNumberSosofoObj::CurrentNodePageNumberSosofoObj(const OpenJade_Grove::NodePtr&) (null)
;; enabled by -tree-original

<<cleanup_point <<< Unknown tree: expr_stmt
  (void) (*(struct
  {
    struct
    {
      struct
      {
        struct
        {
          int (*__vtbl_ptr_type) () * _vptr.Object;
          struct Object * prev_;
          struct Object * next_;
          char color_;
          char hasFinalizer_;
          char hasSubObjects_;
          char readOnly_;
        } D.16565;
      } D.31225;
    } D.31630;
    struct NodePtr node_;
  } &) this = {CLOBBER}) >>>>>;
{
  <<cleanup_point <<< Unknown tree: expr_stmt
  OpenJade_DSSSL::SosofoObj::SosofoObj (&((struct CurrentNodePageNumberSosofoObj *) this)->D.31629) >>>>>;
  try
    {
      <<cleanup_point <<< Unknown 
...

and thus first clobbering the memory state of the object!  That of course
interferes with the allocator GCC sees.

Remember those objects have new operators that first call

inline
void *Collector::allocateObject(bool hasFinalizer)
{
  if (freePtr_ == &allObjectsList_)
    makeSpace();
  Object *tem = freePtr_;
  freePtr_ = freePtr_->next();
  tem->setColor(currentColor_);
  tem->hasFinalizer_ = hasFinalizer;
  if (hasFinalizer)
    tem->moveAfter(&allObjectsList_);
  return tem;
}
Comment 9 Richard Biener 2016-02-15 12:05:00 UTC
So the issue here is that the allocator state is embedded into the actual objects
but the C++ standard defines that the old object content becomes undefined after it is re-used (allocation) or destructed.

openjade is simply non-conforming here.  -fno-lifetime-dse helps.

"Caused" by r222135.

I suppose this warrants some notice in porting_to.html.
Comment 10 Kamil Dudka 2016-02-15 17:02:23 UTC
It makes sense to me.  Thanks for the explanation!