31268 – Non-deterministic bug producing a run-time infinite loop

Summary: Non-deterministic bug producing a run-time infinite loop

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	middle-end
Version:	4.2.0

Importance:	P1 normal
Target Milestone:	4.2.5
Assignee:	Not yet assigned to anyone

URL:
Keywords:	alias, wrong-code

Depends on:
Blocks:

Reported:	2007-03-19 16:15 UTC by Sylvain Pion
Modified:	2008-09-03 02:19 UTC
CC List:	4 users

See Also:
Host:	i686-pc-linux-gnu
Target:
Build:
Known to work:	4.3.0
Known to fail:
Last reconfirmed:

Attachments
pre-processed source file (161.33 KB, application/octet-stream) 2007-03-19 19:37 UTC, Sylvain Pion

Description Sylvain Pion 2007-03-19 16:15:19 UTC

It took me several hours to try to extract a usable test-case for this issue,
so I hope you will be able to make sense out of it.

The problem: compiling the attached program with the current g++ 4.2 at -O2
makes it loop forever, which it is not expected to do.

It does not loop with g++ 4.3 or with versions older than 4.2.  It also does
not loop with g++ 4.2 when adding -fno-strict-aliasing, or when compiling
with -O only.

Even stranger: it does not loop when I remove some unused bits of
the program (which is why I had a hard time shrinking it down), for example
unused typedefs (look for "limb2").

The relevant part of the program is a small class (MP_Float) containing an
std::vector<short>, and some code around it, namely the operator_minus()
function which is called, and which loops.

I attach the pre-processed file, as well as the main small file so that you
can see what is the relevant part of it, and decide if it is a compiler bug
or an issue with my program.  My program does some type conversions, which
may be the source of the problem (triggering undefined behavior?), but they
seem fine to me.

Comment 1 Sylvain Pion 2007-03-19 16:27:54 UTC

I will happily create the attachments once Bugzilla works...

Comment 2 Richard Biener 2007-03-19 16:32:16 UTC

Just wild guessing - try -fwrapv.

Comment 3 Andrew Pinski 2007-03-19 16:45:15 UTC

(In reply to comment #2)
> Just wild guessing - try -fwrapv.

Well, it already does not loop with -fno-strict-aliasing, so I am assuming an aliasing bug in your code until you provide the sources.
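
As background for the -fwrapv guess: signed integer overflow is undefined behavior in C++, so the optimizer may assume it never happens, whereas -fwrapv makes signed arithmetic wrap. A minimal sketch of checking for overflow before an int addition, independent of either flag; the helper name add_would_overflow is hypothetical and does not appear in the test case:

```cpp
#include <limits>

// Hypothetical helper (not part of the reported program): returns true
// if computing a + b in int would overflow. The check itself never
// overflows, so it behaves the same with or without -fwrapv.
inline bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > std::numeric_limits<int>::max() - b;
    return a < std::numeric_limits<int>::min() - b;
}
```

If a program only misbehaves without -fwrapv, a check of this kind around the suspect arithmetic is one way to see whether an overflow is actually involved.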

Comment 4 Sylvain Pion 2007-03-19 16:50:50 UTC

(sorry I still can't create attachments)

-ftrapw makes the program work (no loop).

Let me copy-paste here the non-preprocessed source files which show everything
which is executed, while waiting for bugzilla to allow me to add the large
pre-processed file.

-----------------------------
#ifndef CGAL_MP_FLOAT_H
#define CGAL_MP_FLOAT_H

#include <vector>

typedef short      limb;   // unused
typedef int        limb2;  // unused

struct MP_Float
{
  typedef std::vector<short>  V;
  typedef V::iterator        iterator; // unused

  V v;
  int exp; // unused

  MP_Float(short i)
    : v(1)
  {
    v[0] = i;
    canonicalize();
  }

  void remove_leading_zeros()
  {
    while ((!v.empty()) && (v.back() == 0))
      v.pop_back();
  }

  void remove_trailing_zeros()
  {
    if (v.empty() || (v.front() != 0))
      return;

    V::iterator i = v.begin();
    for (++i; *i == 0; ++i) ;
    //v.erase(v.begin(), i);
  }

  void canonicalize()
  {
    remove_leading_zeros();
    remove_trailing_zeros();
  }

  // replacing int by std::size_t appears to also fix the loop...
  int max_exp() const
  {
    return v.size();
  }

  short of_exp(int i) const
  {
    if (i >= max_exp()) return 0;
    return v[i];
  }
};

// This union is used to convert an unsigned short to a short with
// the same binary representation, without invoking implementation-defined
// behavior (standard 4.7.3).
union to_signed {
    unsigned short us;
    short s;
};

inline
void split(int l, short & high, short & low)
{
    to_signed l2 = {l};
    low = l2.s;
    high = (l - low) >> 16;
}

MP_Float
operator_minus(const MP_Float &a, const MP_Float & b /* unused */)
{
  int max_exp = std::max(a.max_exp(), b.max_exp());

  MP_Float r(0);
  r.v.resize(2);
  for(int i = 0; i < max_exp ; ++i)
  {
    int tmp = r.v[i] + (a.of_exp(i) - b.of_exp(i));
    split(tmp, r.v[i+1], r.v[i]);
  }
  r.canonicalize();
  return r;
}

#endif // CGAL_MP_FLOAT_H

// #include <CGAL/MP_Float_loop.h>
#include <CGAL/number_type_basic_loop.h> // this one pulls up unrelated stuff but necessary for the bug to show up

int main()
{
  MP_Float a=2, b=1;
  MP_Float d= operator_minus(a, a);
}

-----------------------------
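
For comparison, the conversion performed by the to_signed union in split() can also be written with std::memcpy, which copies the object representation without reading an inactive union member. This is only a sketch, assuming a two's complement short; the names low_bits_as_short and split_memcpy are hypothetical and are not part of CGAL or of the test case above:

```cpp
#include <cstring>

// Hypothetical helper: reinterpret the low 16 bits of an int as a short
// with the same binary representation, using memcpy instead of a union.
inline short low_bits_as_short(int l)
{
    // Conversion to an unsigned type is modular, hence well defined.
    unsigned short us = static_cast<unsigned short>(l);
    short s;
    std::memcpy(&s, &us, sizeof s); // copy the bytes, no type punning
    return s;
}

// Same interface as split() in the test case: low receives the low 16 bits
// as a signed value, high receives what remains after removing low.
inline void split_memcpy(int l, short & high, short & low)
{
    low = low_bits_as_short(l);
    high = static_cast<short>((l - low) >> 16);
}
```

With this variant, split_memcpy(0x12345, high, low) yields high == 1 and low == 0x2345.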

Comment 5 Sylvain Pion 2007-03-19 16:55:12 UTC

Subject: Re: Non-deterministic bug producing a run-time infinite loop

Let me try to attach the pre-processed file through an email
attachment.

Comment 6 Sylvain Pion 2007-03-19 19:37:38 UTC

Created attachment 13235
pre-processed source file

Comment 7 Mark Mitchell 2007-03-26 01:44:39 UTC

Richard, are you able to confirm this bug?

Comment 8 Andrew Pinski 2007-03-26 01:59:12 UTC

This works on the trunk at least on powerpc-darwin.

Comment 9 Richard Biener 2007-03-26 09:46:20 UTC

Seems to work for me on x86_64 with -m32.

Comment 10 Sylvain Pion 2007-03-26 18:03:37 UTC

Let me mention that this is against 4.2.  The trunk works well for me.
I tried several times during March (including today), and the bug is still here.

Note that varying the conditions slightly (even removing an innocent unused
typedef) makes it work, so I'm not surprised that testing on x86_64 or powerpc
makes it also work.  It sounds like usage of uninitialized memory or some bad
non-deterministic bug of this kind, although a quick test with valgrind did not
show anything.

Here are more details about my configuration: Fedora Core 5 on x86, compiler built with:
> .../g++ -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: /proj/geometrica/home/GCC/gcc-4_2-branch/configure --disable-shared --disable-nls --enable-languages=c++ --prefix=/proj/geometrica/home/GCC/Linux-fc5-4.2
Thread model: posix
gcc version 4.2.0 20070324 (prerelease)

Comment 11 Andrew Pinski 2007-04-15 06:32:01 UTC

I can reproduce this with -O2. With -O2 -fno-strict-aliasing, the infinite loop goes away, so this might be an aliasing violation.

Comment 12 Andrew Pinski 2007-04-15 06:32:50 UTC

But note this was with a compiler from March 9th so this might already be fixed.

Comment 13 Sylvain Pion 2007-04-17 13:17:45 UTC

I just built g++ 4.2 yesterday, and the failure is still there.
Note that if you want to check for an aliasing violation, even though the preprocessed code is huge, the parts which are executed are relatively small
(see comment #4).

Comment 14 Richard Biener 2007-12-11 18:58:26 UTC

I cannot reproduce this on native i686 either with

g++-4.2 --version
g++-4.2 (GCC) 4.2.3 20071123 (prerelease) (Debian 4.2.2-4)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Comment 15 Sylvain Pion 2007-12-23 13:34:18 UTC

I also cannot reproduce it with today's g++ 4.2.

That said, the original code from which the test-case is extracted is still
failing.  The original code is from the CGAL library.  I can give a way
to reproduce it if someone is interested in digging into this issue.

As original submitter, I do not care that much about it, since g++ 4.3 is not
affected, and there is an easy workaround for g++ 4.2 (-fno-strict-aliasing).

Comment 16 Andrew Pinski 2008-09-03 02:19:33 UTC

Let's mark this as fixed; there have been some aliasing fixes between the release of 4.2.0 and now.