Bug 33383 - [4.3 Regression] Revision 128092 miscompiles 400.perlbench
Summary: [4.3 Regression] Revision 128092 miscompiles 400.perlbench
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.3.0
: P1 blocker
Target Milestone: 4.3.0
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on: 30375
Blocks: 33389
  Show dependency treegraph
 
Reported: 2007-09-11 05:01 UTC by H.J. Lu
Modified: 2007-10-27 09:49 UTC (History)
6 users (show)

See Also:
Host:
Target: x86_64-*-*, i?86-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
perl-5.8.7-aliasing.patch (858 bytes, patch)
2007-10-27 09:42 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description H.J. Lu 2007-09-11 05:01:59 UTC
Revision 128092 miscompiles 400.perlbench in SPEC CPU 2006
on Linux/x86-64 with -O2 -ffast-math. I got

Running Benchmarks
  Running 400.perlbench ref base o2 default

Kernel killed it:

Out of Memory: Killed process 27337 (perlbench_base.).
Comment 1 Jan Hubicka 2007-09-11 14:10:29 UTC
This seems to be latent problem unconvered by -finline-small-functions patch.  Perlbmk was failing at -O3 on older SVN snapshots too.  -finline-small-functions should work as an workaround until the issue is identified (it also might be aliasing problem in perl)

Honza
Comment 2 H.J. Lu 2007-09-11 14:16:36 UTC
It is very likely that it is perl which is miscompiled. It looks quite
serious to me.
Comment 3 Jan Hubicka 2007-09-11 14:22:30 UTC
Sure, it will be neccessary to do some binary to figure out what inlining breaks the benchmark.  Not having the CPU2006 sources, I will be able to do so until Monday.  But if I remember right, it reproduced down to 3.x GCC's.
Comment 4 H.J. Lu 2007-09-11 23:29:51 UTC
This workaround:

--- gcc/opts.c.small    2007-09-09 12:56:26.000000000 -0700
+++ gcc/opts.c  2007-09-11 07:13:15.000000000 -0700
@@ -822,7 +822,9 @@ decode_options (unsigned int argc, const
 
   if (optimize >= 2)
     {
+#if 0
       flag_inline_small_functions = 1;
+#endif
       flag_thread_jumps = 1;
       flag_crossjumping = 1;
       flag_optimize_sibling_calls = 1;

works for me.
Comment 5 Richard Biener 2007-09-24 11:06:53 UTC
The regular perl testsuite fails as well:

t/op/override.............................ok
t/op/pack.................................Invalid type ']' in unpack at op/pack.t line 1363.
# Looks like you planned 13864 tests but ran 4348.
FAILED--expected 13864 tests, saw 4348

this is with perl 5.8.8 and building with -O2.  It might just be a latent
problem with perl though.
Comment 6 Richard Biener 2007-09-25 13:59:47 UTC
Perl itself passes with -fno-tree-dse (see PR33389), I didn't check 400.perlbench.
Comment 7 Richard Biener 2007-09-26 11:57:36 UTC
This should now be fixed.  Waiting for next results from the spec-tester to
verify.
Comment 8 H.J. Lu 2007-09-26 17:39:51 UTC
Revision 128810:

http://gcc.gnu.org/ml/gcc-cvs/2007-09/msg00806.html

doesn't fix 400.perlbench on Linux/x86-64. I am testing revision 128815:

http://gcc.gnu.org/ml/gcc-cvs/2007-09/msg00812.html
Comment 9 H.J. Lu 2007-09-26 19:57:38 UTC
(In reply to comment #8)
> Revision 128810:
> 
> http://gcc.gnu.org/ml/gcc-cvs/2007-09/msg00806.html
> 
> doesn't fix 400.perlbench on Linux/x86-64. I am testing revision 128815:
> 
> http://gcc.gnu.org/ml/gcc-cvs/2007-09/msg00812.html

Revision 128815 doesn't fix the problem. perlbench executable still ran
out of memory on Linux/x86-64 with -O2 -ffast-math.

Comment 10 Richard Biener 2007-10-04 11:58:02 UTC
It fixes it for me to build 400.perlbench with -fno-strict-aliasing.  The
different inlining decisions after r128092 might expose some problems with
the benchmark.
Comment 11 Mark Mitchell 2007-10-10 17:57:00 UTC
I understand that we don't know whether this is a problem in GCC or in Perl.  However, until we know, I think this should be P1 -- having GCC releases that don't work with SPEC, without an explanation, undermines GCC's credibility.
Comment 12 Jakub Jelinek 2007-10-12 20:22:19 UTC
Trying to create self-contained testcase from it.  So far I have found which
function is miscompiled at -O2 (and not -O2 -fno-strict-aliasing) if only
3 other functions are inlined into it, the rest is noinline.  Unfortunately
the function is quite large.
Comment 13 Jakub Jelinek 2007-10-15 11:20:18 UTC
To me this looks like aliasing violation in 400.perlbench (and in perl 5.8.8
too, so I'll cite the latter).

#define       IVTYPE          long            /**/
typedef IVTYPE IV;
typedef struct xpviv XPVIV;
typedef size_t STRLEN;
struct xpviv {
    char *      xpv_pv;         /* pointer to malloced string */
    STRLEN      xpv_cur;        /* length of xpv_pv as a C string */
    STRLEN      xpv_len;        /* allocated size */
    IV          xiv_iv;         /* integer value or pv offset */
};
#   define STRUCT_OFFSET(s,m)  (Size_t)(&(((s *)0)->m))
#define STATIC static
#  define pTHX_
#  define LOCK_SV_MUTEX
#  define UNLOCK_SV_MUTEX
extern IV * PL_xiv_root;

In sv.c, we have:

STATIC void
S_del_xiv(pTHX_ XPVIV *p)
{
    IV* xiv = (IV*)((char*)(p) + STRUCT_OFFSET(XPVIV, xiv_iv));
    LOCK_SV_MUTEX;
    *(IV**)xiv = PL_xiv_root;
    PL_xiv_root = xiv;
    UNLOCK_SV_MUTEX;
}

The problem is when Perl_sv_upgrade function is called, with (sv->sv_flags & 0xff) == SVt_IV, mt == SVt_PVNV, ((XPVIV*) (sv)->sv_any)->xiv_iv == 0 and
PL_xiv_root != NULL:
    IV          iv;
...
    switch (SvTYPE(sv)) {
    case SVt_NULL:
        break;
    case SVt_IV:
        iv      = SvIVX(sv);
        del_XIV(SvANY(sv));
        if (mt == SVt_NV)
            mt = SVt_PVNV;
        else if (mt < SVt_PVIV)
            mt = SVt_PVIV;
        break;
...
    switch (mt) {
    case SVt_PVNV:
        SvANY(sv) = new_XPVNV();
        SvPV_set(sv, pv);
        SvCUR_set(sv, cur);
        SvLEN_set(sv, len);
        SvIV_set(sv, iv);
        SvNV_set(sv, nv);
        break;
...

or better preprocessed:

    switch (((sv)->sv_flags & 0xff)) {
    case SVt_NULL:
 break;
    case SVt_IV:
 iv = ((XPVIV*) (sv)->sv_any)->xiv_iv;
 S_del_xiv((XPVIV*) (sv)->sv_any);
 if (mt == SVt_NV)
     mt = SVt_PVNV;
 else if (mt < SVt_PVIV)
     mt = SVt_PVIV;   
 break;
...
    switch (mt) {
...
    case SVt_PVNV:
 (sv)->sv_any = (void*)S_new_xpvnv();
 ((XPV*) (sv)->sv_any)->xpv_pv = pv; 
 ((XPV*) (sv)->sv_any)->xpv_cur = cur;
 ((XPV*) (sv)->sv_any)->xpv_len = len;
 ((XPVIV*) (sv)->sv_any)->xiv_iv = iv;
 ((XPVNV*)(sv)->sv_any)->xnv_nv = nv; 
 break;

when sv.c is compiled with -O2 -fstrict-aliasing, then the
*(IV**)xiv = PL_xiv_root; write isn't considered aliased with the
iv = ((XPVIV*) (sv)->sv_any)->xiv_iv;, because the former accesses the object
as long *, while the latter as long.  The difference between -O2 -fstrict-aliasing and -O2 -fno-strict-aliasing compiled sv.c is then that
in this call ((XPVIV*) (sv)->sv_any)->xiv_iv = iv; stores (long) PL_xiv_root value before entering this function with -fstrict-aliasing and 0 (i.e. what
((XPVIV*) (sv)->sv_any)->xiv_iv contained on function entry) with -fno-strict-aliasing.

It seems perl is aware of this, but doesn't care to fix it - Configure has instead:
        case "$gccversion" in
        1*) ;;
        2.[0-8]*) ;;
        ?*)     echo " "
                echo "Checking if your compiler accepts -fno-strict-aliasing" 2
>&1
                echo 'int main(void) { return 0; }' > gcctest.c
                if $cc -O2 -fno-strict-aliasing -o gcctest gcctest.c; then
                        echo "Yes, it does." 2>&1
                        case "$ccflags" in
                        *strict-aliasing*)
                                echo "Leaving current flags $ccflags alone." 2>
&1
                                ;;
                        *) dflt="$dflt -fno-strict-aliasing" ;;
                        esac
                else
                        echo "Nope, it doesn't, but that's ok." 2>&1
                fi
                ;;
        esac

Not sure what is the best way forward with CPU2006 and GCC though (and what are other compilers doing to pass 400.perlbench).  Is it acceptable to add -fno-strict-aliasing just for this test?
Comment 14 Jan Hubicka 2007-10-15 11:39:47 UTC
Subject: Re:  [4.3 Regression]  Revision 128092 miscompiles 400.perlbench

> 
> when sv.c is compiled with -O2 -fstrict-aliasing, then the
> *(IV**)xiv = PL_xiv_root; write isn't considered aliased with the
> iv = ((XPVIV*) (sv)->sv_any)->xiv_iv;, because the former accesses the object

Duh, thanks and congratulations for finding this!
> 
> It seems perl is aware of this, but doesn't care to fix it - Configure has
> instead:

It also might be that perl developers figured out by experimentation
that -fno-strict-aliasing helps, we probably could try to let them know.
>         case "$gccversion" in
>         1*) ;;
>         2.[0-8]*) ;;
>         ?*)     echo " "
>                 echo "Checking if your compiler accepts -fno-strict-aliasing" 2
> >&1
>                 echo 'int main(void) { return 0; }' > gcctest.c
>                 if $cc -O2 -fno-strict-aliasing -o gcctest gcctest.c; then
>                         echo "Yes, it does." 2>&1
>                         case "$ccflags" in
>                         *strict-aliasing*)
>                                 echo "Leaving current flags $ccflags alone." 2>
> &1
>                                 ;;
>                         *) dflt="$dflt -fno-strict-aliasing" ;;
>                         esac
>                 else
>                         echo "Nope, it doesn't, but that's ok." 2>&1
>                 fi
>                 ;;
>         esac
> 
> Not sure what is the best way forward with CPU2006 and GCC though (and what are
> other compilers doing to pass 400.perlbench).  Is it acceptable to add
> -fno-strict-aliasing just for this test?

This should not be that dificult to fix. Ideally we can convince SPEC to
produce official patch as they did with similar problems with CPU2000?

Honza
Comment 15 H.J. Lu 2007-10-15 13:58:03 UTC
(In reply to comment #14)
> Subject: Re:  [4.3 Regression]  Revision 128092 miscompiles 400.perlbench
?
> 
> This should not be that dificult to fix. Ideally we can convince SPEC to
> produce official patch as they did with similar problems with CPU2000?
> 

Can we come up with a patch to fix perl? I can send it to our SPEC
representatives.
Comment 16 Jakub Jelinek 2007-10-27 09:42:03 UTC
Created attachment 14413 [details]
perl-5.8.7-aliasing.patch

Sure, here is a minimal patch against perl 5.8.7 which cures this (and for whatever reason happens to apply to SPEC2006 as well).  The S_{new,del,more}_xiv
changes IMHO aren't controversial at all, perl assumes that IV aka long
can hold a pointer.  For S_{new,del,more}_xnv where NV is double there is an option to use memcpy as in the patch (for S_new_xnv and S_del_xnv is GCC 4.3 able
to compile exactly the same code as before when those functions are noinline,
S_more_xnv is 2 or 3 insns larger, but nothing serious) which IMHO is standard conforming, or could e.g. use union { NV nv; NV *xnv; } u; and stuff values through it (but, as double is bigger than pointer on 32-bit arches wouldn't we
risk issues with signalling NaNs?), or of course can do much bigger changes and
change the data structures that contain e.g. xnv_nv fields to an anonymous union.

From quick skimming other S_{new,del,more}_x* routines the other routines don't violate strict aliasing.
Comment 17 Jakub Jelinek 2007-10-27 09:43:48 UTC
Closing as this isn't really a GCC bug, but perl and 400.perlbench bug.