Bug 40757 - gcc 4.4.0 miscompiles mpfr-2.4.1
Summary: gcc 4.4.0 miscompiles mpfr-2.4.1
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.4.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-07-15 00:36 UTC by Paul Zimmermann
Modified: 2009-10-04 11:01 UTC (History)
5 users (show)

See Also:
Host: sparc-sun-solaris2.10
Target: sparc-sun-solaris2.10
Build: sparc-sun-solaris2.10
Known to work:
Known to fail:
Last reconfirmed:


Attachments
preprocessed version of the file mpn_exp.c from mpfr-2.4.1 (10.58 KB, text/plain)
2009-07-16 07:52 UTC, Paul Zimmermann
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Paul Zimmermann 2009-07-15 00:36:44 UTC
See http://websympa.loria.fr/wwsympa/arc/mpfr/2009-07/msg00031.html
and the following discussion.

This was on t2.math.washington.edu with /usr/local/gcc-4.4.0-sun-linker/bin/gcc:
zimmerma@t2:/tmp/mpfr-2.4.1$ /usr/local/gcc-4.4.0-sun-linker/bin/gcc -v
Using built-in specs.
Target: sparc-sun-solaris2.10
Configured with: /home/kirkby/gcc-4.4.0/configure CC=/usr/sfw/bin/gcc --prefix=/usr/local/gcc-4.4.0-sun-linker --without-gnu-as --without-gnu-ld --with-as=/usr/ccs/bin/as --with-ld=/usr/ccs/bin/ld --enable-languages=c,c++,fortran --with-mpfr-lib=/usr/local/lib --with-mpfr-include=/usr/local/include --with-gmp-include=/usr/local/include --with-gmp-lib=/usr/local/lib --with-libiconv-prefix=/usr/lib/iconv
Thread model: posix
gcc version 4.4.0 (GCC)
Comment 1 Paul Zimmermann 2009-07-15 02:02:00 UTC
Note this bug was noticed on a Sun T5240, and might be specific to T2+.
David Kirkby offers access to the machine for gcc developers who might want
to reproduce/isolate/fix the bug.
Comment 2 Eric Botcazou 2009-07-15 06:27:11 UTC
Thanks for the report, but we need a preprocessed testcase, see instructions at
  http://gcc.gnu.org/bugs.html
Comment 3 Richard Biener 2009-07-15 09:55:11 UTC
I would also recommend to try a newer snapshot from the gcc 4.4 release branch.
Comment 4 Mikael Pettersson 2009-07-15 13:15:32 UTC
mpfr-2.4.1 compiles and tests Ok for me on an Ultra5 (USIIi) running sparc64-linux, with gmp-4.2.4 (compiled by gcc-4.3.4) and gcc 4.3.4, 4.4.0, and 4.4.1 20090630.

I don't have a T2, but could possibly do some tests on USIIIi/Solaris 9.
Comment 5 Paul Zimmermann 2009-07-16 07:52:40 UTC
Created attachment 18203 [details]
preprocessed version of the file mpn_exp.c from mpfr-2.4.1

Note that replacing line 74:
   MPN_ZERO (a, n - 1);
by:
   { int n1 = n - 1; MPN_ZERO (a, n1); }
fixes the problem, where MPN_ZERO is defined as:
#define MPN_ZERO(dst, n) memset((dst), 0, (n)*BYTES_PER_MP_LIMB)
and BYTES_PER_MP_LIMB is 4.

If I write "size_t n1" or "unsigned int n1" above instead of "int n1",
the bug reappears.
Comment 6 Mikael Pettersson 2009-07-16 08:31:14 UTC
(In reply to comment #5)
> Created an attachment (id=18203) [edit]
> preprocessed version of the file mpn_exp.c from mpfr-2.4.1
> 
> Note that replacing line 74:
>    MPN_ZERO (a, n - 1);
> by:
>    { int n1 = n - 1; MPN_ZERO (a, n1); }
> fixes the problem, where MPN_ZERO is defined as:
> #define MPN_ZERO(dst, n) memset((dst), 0, (n)*BYTES_PER_MP_LIMB)
> and BYTES_PER_MP_LIMB is 4.
> 
> If I write "size_t n1" or "unsigned int n1" above instead of "int n1",
> the bug reappears.

Sounds a lot like PR39867 and PR40747 are hitting you. Can you grab those fixes, apply them to your 4.4.0, rebuild it, and test mpfr again? Or get the 4.4.1-RC and test that instead.

I just finished building 4.3.4 and 4.4.0 on USIIIi/Solaris 9, and they built gmp-4.2.4 and mpfr-2.4.1 fine, with both passing make check.
Comment 7 Dr. David Kirkby 2009-07-16 10:19:09 UTC
(In reply to comment #4)
> mpfr-2.4.1 compiles and tests Ok for me on an Ultra5 (USIIi) running
> sparc64-linux, with gmp-4.2.4 (compiled by gcc-4.3.4) and gcc 4.3.4, 4.4.0, and
> 4.4.1 20090630.
> 
> I don't have a T2, but could possibly do some tests on USIIIi/Solaris 9.
> 

I believe the problem is likely to only be seen on the T2+ or similar processors. Someone noticed the library code in OpenSolaris is different on the T2+ processor to what it would be on a more common SPARC processor. 

I have built gcc 4.4.0 on Solaris 10 on a Sun Blade 2000 (UltraSPARC II processors) and have no problem with mpfr, but on the T2+ processors of the T5240 server, it does not work.  

I'll try a later snapshot, but any serious gcc developer would be welcome to an account on the machine where it fails. That does not mean anyone that just happens to be interested and fancies playing on a 16-core machine, but if you are a serious gcc developer, then I could give you an account. 

Dave 

Comment 8 Dr. David Kirkby 2009-07-16 10:24:21 UTC
(In reply to comment #4)
 
> Sounds a lot like PR39867 and PR40747 are hitting you. Can you grab those
> fixes, apply them to your 4.4.0, rebuild it, and test mpfr again? Or get the
> 4.4.1-RC and test that instead.
> 
> I just finished building 4.3.4 and 4.4.0 on USIIIi/Solaris 9, and they built
> gmp-4.2.4 and mpfr-2.4.1 fine, with both passing make check.
 
I can't see how it is similar to PR39867 and PR40747 only occurs at an optimisation of 1 or higher. This occurs with no optimisation at all. 

Dave 
Comment 9 Jakub Jelinek 2009-07-16 10:31:11 UTC
folding happens even at -O0 and both bugs are in the folder.  So, please try
ftp://sources.redhat.com/pub/gcc/snapshots/4.4.1-RC-20090715/
first.
Comment 10 Dr. David Kirkby 2009-07-16 12:32:37 UTC
(In reply to comment #9)
> folding happens even at -O0 and both bugs are in the folder.  So, please try
> ftp://sources.redhat.com/pub/gcc/snapshots/4.4.1-RC-20090715/
> first.
> 

I tried it. 

kirkby@t2:[/tmp/kirkby/mpfr-2.4.1] $  gcc -v
Using built-in specs.
Target: sparc-sun-solaris2.10
Configured with: ../gcc-4.4.1-RC-20090715/configure --prefix=/usr/local/gcc-4.4.1-RC-20090715-sun-linker --with-as=/usr/ccs/bin/as --without-gnu-as --with-ld=/usr/ccs/bin/ld --without-gnu-ld --enable-languages=c,c++,fortran --with-mpfr-include=/usr/local/include --with-mpfr-lib=/usr/local/lib --with-gmp-include=/usr/local/include --with-gmp-lib=/usr/local/lib
Thread model: posix
gcc version 4.4.1 20090715 (prerelease) (GCC)


but again got 20 test failures on mpfr 2.4.1
Comment 11 Jakub Jelinek 2009-07-16 14:07:18 UTC
You haven't mentioned what options you compiled this file with.  So, assuming -O2, I see:
        add     %i4, -1, %l5    ! n,, tmp186
        sethi   %hi(1073740800), %o2    !, tmp189
        sll     %l5, 2, %l5     ! tmp186,, D.4491
        or      %o2, 1023, %o2  ! tmp189,, tmp188
        st      %g1, [%i0+%l5]  !,* D.4491
        add     %i4, %o2, %o2   ! n, tmp188, tmp187
        mov     %i0, %o0        ! a,
        sll     %o2, 2, %o2     ! tmp187,,
        call    memset, 0       !,
         mov    0, %o1  !,
for this memset call, which looks correct to me.  The st %g1, [%i0+%l5] line
stores to %i0 a[n-1] and memset is called with memset (a, 0, (n + 0x3fffffffU) << 2);  So, if this doesn't work (and you see the same), you hit a bug in Solaris memset implementation, which doesn't handle properly length with garbage in upper 32-bits, guess it could use brz,pn %o2, do_nothing or something similar, which is fine for 64-bit code, but certainly not for 32-bit code.
Comment 12 Marc Glisse 2009-07-16 20:34:29 UTC
(In reply to comment #11)
> for this memset call, which looks correct to me.  The st %g1, [%i0+%l5] line
> stores to %i0 a[n-1] and memset is called with memset (a, 0, (n + 0x3fffffffU)
> << 2);  So, if this doesn't work (and you see the same), you hit a bug in
> Solaris memset implementation, which doesn't handle properly length with
> garbage in upper 32-bits, guess it could use brz,pn %o2, do_nothing or
> something similar, which is fine for 64-bit code, but certainly not for 32-bit
> code.

The sun4v implementation in opensolaris looks fine (but may not have been backported to solaris 10). The following one on the other hand seems to use the same brnz for 32 and 64 bit code:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/sparc_hwcap1/common/gen/memset.s#88

It would be good to know which implementation is used...
Comment 13 Dr. David Kirkby 2009-07-16 21:29:11 UTC
(In reply to comment #11)
> You haven't mentioned what options you compiled this file with.  So, assuming
> -O2, I see:
>         add     %i4, -1, %l5    ! n,, tmp186
>         sethi   %hi(1073740800), %o2    !, tmp189
>         sll     %l5, 2, %l5     ! tmp186,, D.4491
>         or      %o2, 1023, %o2  ! tmp189,, tmp188
>         st      %g1, [%i0+%l5]  !,* D.4491
>         add     %i4, %o2, %o2   ! n, tmp188, tmp187
>         mov     %i0, %o0        ! a,
>         sll     %o2, 2, %o2     ! tmp187,,
>         call    memset, 0       !,
>          mov    0, %o1  !,
> for this memset call, which looks correct to me.  The st %g1, [%i0+%l5] line
> stores to %i0 a[n-1] and memset is called with memset (a, 0, (n + 0x3fffffffU)
> << 2);  So, if this doesn't work (and you see the same), you hit a bug in
> Solaris memset implementation, which doesn't handle properly length with
> garbage in upper 32-bits, guess it could use brz,pn %o2, do_nothing or
> something similar, which is fine for 64-bit code, but certainly not for 32-bit
> code.
> 

I should add we are using the Sun assembler and linker, not the GNU ones. I don't know whether that would effect the output. If so, can you tell me how to generate the assembler code? Or as I say, you can have an account. 
Comment 14 Paul Zimmermann 2009-07-17 00:57:46 UTC
> You haven't mentioned what options you compiled this file with. 

the problem appears both with -O0, -O1 and -O2.

Paul
Comment 15 Dr. David Kirkby 2009-07-17 03:21:55 UTC
(In reply to comment #14)
> > You haven't mentioned what options you compiled this file with. 
> 
> the problem appears both with -O0, -O1 and -O2.
> 
> Paul
> 

Also worth noting is that this builds fine with some versions of gcc

gcc 4.1.1 OK
gcc 4.2.1 OK  
gcc 4.2.4 OK 

I could build gcc 4.3.0, but it would never install properly, so I have not tested on gcc 4.3.0. 

gcc 4.3.1 MPFR        fails 20 tests
gcc 4.3.3 MPFR        fails 20 tests
gcc 4.4.0 MPFR        fails 20 tests. 
gcc-4.4.1-RC-20090715 MPFR fails 20 tests
Comment 16 Dr. David Kirkby 2009-07-17 04:11:13 UTC
(In reply to comment #0)
> See http://websympa.loria.fr/wwsympa/arc/mpfr/2009-07/msg00031.html
> and the following discussion.
> 
> This was on t2.math.washington.edu with
> /usr/local/gcc-4.4.0-sun-linker/bin/gcc:
> zimmerma@t2:/tmp/mpfr-2.4.1$ /usr/local/gcc-4.4.0-sun-linker/bin/gcc -v
> Using built-in specs.
> Target: sparc-sun-solaris2.10
> Configured with: /home/kirkby/gcc-4.4.0/configure CC=/usr/sfw/bin/gcc
> --prefix=/usr/local/gcc-4.4.0-sun-linker --without-gnu-as --without-gnu-ld
> --with-as=/usr/ccs/bin/as --with-ld=/usr/ccs/bin/ld
> --enable-languages=c,c++,fortran --with-mpfr-lib=/usr/local/lib
> --with-mpfr-include=/usr/local/include --with-gmp-include=/usr/local/include
> --with-gmp-lib=/usr/local/lib --with-libiconv-prefix=/usr/lib/iconv
> Thread model: posix
> gcc version 4.4.0 (GCC)


It would be useful if we had an exact statement of the problem, in terms of "memset fails for ..." or whatever. I'd ask a few Sun people to have a look here, and comment, but it's unclear precisely what is believed to be the issue.  

Comment 17 Jakub Jelinek 2009-07-17 07:49:21 UTC
Try:
typedef __SIZE_TYPE__ size_t;
extern void *memset (void *, const void *, size_t);
extern void abort (void);
volatile size_t i = 0x80000000U, j = 0x80000000U;
char buf[16];

int main (void)
{
  if (sizeof (size_t) != 4)
    return 0;
  buf[0] = 6;
  memset (buf, 0, i + j);
  if (buf[0] != 6)
    abort ();
  return 0;
}
In 32-bit code, size_t is 32-bit, memset is called with buf, 0, 0 arguments, but the %o2 register passed to it doesn't contain 0, but 0x100000000.  That is fine, this is 32-bit code, so the uppermost bits are always undefined.  But when 32-bit memset uses brnz %o2, ... instruction, it needs to first zero-extend it to 64-bits, as brnz operates on all 64 bits only.  Or not use brnz, but compare and be %icc, ...
Comment 18 Dr. David Kirkby 2009-07-17 11:19:13 UTC
(In reply to comment #17)
> Try:
> typedef __SIZE_TYPE__ size_t;
> extern void *memset (void *, const void *, size_t);
> extern void abort (void);
> volatile size_t i = 0x80000000U, j = 0x80000000U;
> char buf[16];
> 
> int main (void)
> {
>   if (sizeof (size_t) != 4)
>     return 0;
>   buf[0] = 6;
>   memset (buf, 0, i + j);
>   if (buf[0] != 6)
>     abort ();
>   return 0;
> }
> In 32-bit code, size_t is 32-bit, memset is called with buf, 0, 0 arguments,
> but the %o2 register passed to it doesn't contain 0, but 0x100000000.  That is
> fine, this is 32-bit code, so the uppermost bits are always undefined.  But
> when 32-bit memset uses brnz %o2, ... instruction, it needs to first
> zero-extend it to 64-bits, as brnz operates on all 64 bits only.  Or not use
> brnz, but compare and be %icc, ...
> 


I've compiled and linked that code. It does not abort. i.e. I do NOT see:
Abort (core dumped).

In 64-bit mode, it issues a warning:

$ gcc -m64  check.c
check.c:2: warning: conflicting types for built-in function 'memset'

but again builds an executable which exits normally. 

If you want the assembler output posted from Tim's preprocessed file, let me know how to do it and I'll do it, just in case gcc is behaving differently on this machine to yours.

As I say, if you want an account Jakub, you can have one. Just tell me a user name. It might be the simplest way for you to look at this. 

Dave 
Comment 19 Marc Glisse 2009-07-17 15:51:20 UTC
(In reply to comment #18)
> I've compiled and linked that code. It does not abort.

Bad point for this theory :-(
The result could still depend on the alignment of the pointer, so you could try replacing buf[0] by buf[4] and calling memset(buf+4,...) (try several small values instead of 4).
Comment 20 Dr. David Kirkby 2009-07-18 01:14:09 UTC
(In reply to comment #19)
> (In reply to comment #18)
> > I've compiled and linked that code. It does not abort.
> 
> Bad point for this theory :-(
> The result could still depend on the alignment of the pointer, so you could try
> replacing buf[0] by buf[4] and calling memset(buf+4,...) (try several small
> values instead of 4).
> 

I tried this in a loop. It still works, with no abort. 


typedef __SIZE_TYPE__ size_t;
extern void *memset (void *, const void *, size_t);
extern void abort (void);
volatile size_t i = 0x80000000U, j = 0x80000000U;
char buf[16];

int main (void)
{
  int n;

  for (n=0 ; n <=64;++n) {
    if (sizeof (size_t) != 4)
      return 0;
    buf[n] = 6;
    memset (buf+n, 0, i + j);
    if (buf[0] != 6)
      abort ();
  }
  return 0;
}

I extended 'n' to larger values and find it does abort when it is 5592, but not exactly the 'small' value you were looking for. 

Comment 21 Dr. David Kirkby 2009-07-18 01:18:37 UTC
(In reply to comment #20)
> (In reply to comment #19)
> > (In reply to comment #18)
> > > I've compiled and linked that code. It does not abort.
> > 
> > Bad point for this theory :-(
> > The result could still depend on the alignment of the pointer, so you could try
> > replacing buf[0] by buf[4] and calling memset(buf+4,...) (try several small
> > values instead of 4).
> > 
> 
> I tried this in a loop. It still works, with no abort. 
> 
> 
> typedef __SIZE_TYPE__ size_t;
> extern void *memset (void *, const void *, size_t);
> extern void abort (void);
> volatile size_t i = 0x80000000U, j = 0x80000000U;
> char buf[16];
> 
> int main (void)
> {
>   int n;
> 
>   for (n=0 ; n <=64;++n) {
>     if (sizeof (size_t) != 4)
>       return 0;
>     buf[n] = 6;
>     memset (buf+n, 0, i + j);
>     if (buf[0] != 6)
>       abort ();
>   }
>   return 0;
> }
> 
> I extended 'n' to larger values and find it does abort when it is 5592, but not
> exactly the 'small' value you were looking for. 
> 

Looking closely, you only declared allocated 16 bytes for the buffer, so it was hardly surprising it aborted at large n. When I increased the buffer size, so it worked as expected, even for n as large as 10000

dave
Comment 22 Marc Glisse 2009-07-18 04:50:06 UTC
(In reply to comment #20)
>     buf[n] = 6;
>     memset (buf+n, 0, i + j);
>     if (buf[0] != 6)

It looks like you forgot to replace the second buf[0] by buf[n].
Comment 23 Dr. David Kirkby 2009-07-18 14:36:14 UTC
(In reply to comment #22)
> (In reply to comment #20)
> >     buf[n] = 6;
> >     memset (buf+n, 0, i + j);
> >     if (buf[0] != 6)
> 
> It looks like you forgot to replace the second buf[0] by buf[n].
> 

Sorry, my mistake. This is the current code and it does show something intersting. 

drkirkby@kestrel:[~] $ cat check2.c
#include <stdio.h>
typedef __SIZE_TYPE__ size_t;
extern void *memset (void *, const void *, size_t);
extern void abort (void);
volatile size_t i = 0x80000000U, j = 0x80000000U;
char buf[16];

int main (void)
{
  int n;

  for (n=0 ; n <=10;++n) {
    printf("n=%d\n",n);
    if (sizeof (size_t) != 4)
      return 0;
    buf[n] = 6;
    memset (buf+n, 0, i + j);
    if (buf[n] != 6)
      abort ();
  }
  return 0;
}

On the machine with the T2+ processor, which fails the mpfr tests:

kirkby@t2:[~] $ gcc -Wall check2.c
check2.c: In function 'main':
check2.c:17: warning: null argument where non-null required (argument 2)
kirkby@t2:[~] $ ./a.out
n=0
n=1
Abort (core dumped)

I tried with the Sun compiler too, but that refuses to compile the code. 

kirkby@t2:[~] $ /opt/SUNWspro/bin/cc check2.c
"check2.c", line 2: warning: no explicit type given
"check2.c", line 2: syntax error before or at: size_t
"check2.c", line 2: warning: useless declaration
cc: acomp failed for check2.c


However, when I went back and took away all the loop stuff and dropped it to the bare minimum. 

kirkby@t2:[~] $ cat check3.c
typedef __SIZE_TYPE__ size_t;
extern void *memset (void *, const void *, size_t);
extern void abort (void);
volatile size_t i = 0x80000000U, j = 0x80000000U;
char buf[16];

int main (void)
{
  if (sizeof (size_t) != 4)
    return 0;
  buf[1] = 6;
  memset (buf+1, 0, i + j);
  if (buf[1] != 6)
    abort ();
  return 0;
}

then it does not dump core. 

kirkby@t2:[~] $ gcc check3.c
kirkby@t2:[~] $ ./a.out
kirkby@t2:[~] $

In fact, trying a few small values manually, without the loop, causes no problem. 


I also tried it on my own personal machine, a Sun Blade 2000 with the sun4u architecture (unlike the T5240 with it's T2+ processors, which is sun4v). 

drkirkby@kestrel:[~] $ ./a.out
n=0
n=1
n=2
n=3
n=4
n=5
n=6
n=7
n=8
n=9
n=10
Comment 24 Dr. David Kirkby 2009-07-18 14:44:23 UTC
I should have added, the core dumps were observed on gcc versions

3.4.3
4.2.4
4.4.0
4.4.1 20090715 (prerelease) 

on the Sun T5240 with it's T2+ processors. 

The success on the Sun Blade 2000 was only tried with gcc 3.4.4 and 4.4.0. 

To me at least, that does rather point the finger at Solaris's memset(). 

Comment 25 Dr. David Kirkby 2009-07-18 19:33:10 UTC
(In reply to comment #24)
> I should have added, the core dumps were observed on gcc versions
> 
> 3.4.3
> 4.2.4
> 4.4.0
> 4.4.1 20090715 (prerelease) 
> 
> on the Sun T5240 with it's T2+ processors. 
> 
> The success on the Sun Blade 2000 was only tried with gcc 3.4.4 and 4.4.0. 
> 
> To me at least, that does rather point the finger at Solaris's memset(). 
> 

From what I now understand of this, it does indeed indicate a bug in the memset() implementation on Solaris for this particular machine. 



Comment 26 Dr. David Kirkby 2009-10-01 08:34:41 UTC
I should have added this some time back, but Sun have confirmed this is a bug. I should have a IDR from Sun myself in a couple of weeks, and a patch will be on sunsolve some time after that. It will be fixed in Solaris 10 update 8. 
Comment 27 Eric Botcazou 2009-10-04 11:01:11 UTC
Not a GCC bug.