Bug 38052 - [4.4 Regression] genautomata segfaults when -O2 is enabled
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.4.0
Importance: P3 major
Target Milestone: 4.4.0
Assignee: rsandifo@gcc.gnu.org
URL:
Keywords: build, wrong-code
Depends on:
Blocks:
 
Reported: 2008-11-07 15:32 UTC by Zhang Le
Modified: 2008-11-16 21:08 UTC
CC List: 3 users

See Also:
Host: mipsel-unknown-linux-gnu
Target: mipsel-unknown-linux-gnu
Build: mipsel-unknown-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2008-11-15 20:42:11


Description Zhang Le 2008-11-07 15:32:35 UTC
I have done some detailed research on this problem. My analysis follows the build information below; please have a look.

The source was checked out on 2008-11-07.

The system uses the O32 ABI.

The configure options are:
var/tmp/portage/sys-devel/gcc-4.4.0_pre9999/work/gcc-4.4.0-9999/gcc/configure --cache-file=./config.cache --with-stabs --prefix=/usr --bindir=/usr/mipsel-unknown-linux-gnu/gcc-bin/4.4.0-pre9999 --includedir=/usr/lib/gcc/mipsel-unknown-linux-gnu/4.4.0-pre9999/include --datadir=/usr/share/gcc-data/mipsel-unknown-linux-gnu/4.4.0-pre9999 --mandir=/usr/share/gcc-data/mipsel-unknown-linux-gnu/4.4.0-pre9999/man --infodir=/usr/share/gcc-data/mipsel-unknown-linux-gnu/4.4.0-pre9999/info --with-gxx-include-dir=/usr/lib/gcc/mipsel-unknown-linux-gnu/4.4.0-pre9999/include/g++-v4 --disable-altivec --disable-fixed-point --enable-nls --without-included-gettext --with-system-zlib --disable-checking --disable-werror --enable-secureplt --disable-multilib --enable-libmudflap --disable-libssp --disable-libgomp --enable-cld --disable-libgcj --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --with-bugurl=http://bugs.gentoo.org/ --with-pkgversion= --enable-linux-futex --enable-languages=c,c++ --program-transform-name=s,y,y, --build=mipsel-unknown-linux-gnu --host=mipsel-unknown-linux-gnu --target=mipsel-unknown-linux-gnu --srcdir=/var/tmp/portage/sys-devel/gcc-4.4.0_pre9999/work/gcc-4.4.0-9999/gcc --with-build-libsubdir=.

The command to build genautomata is:
/var/tmp/portage/sys-devel/gcc-4.4.0_pre9999/work/build/./prev-gcc/xgcc -B/var/tmp/portage/sys-devel/gcc-4.4.0_pre9999/work/build/./prev-gcc/ -B/usr/mipsel-unknown-linux-gnu/bin/ -c  -O2 -g -pipe -DIN_GCC   -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wcast-qual -Wold-style-definition -Wc++-compat -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings   -DHAVE_CONFIG_H -DGENERATOR_FILE -I. -Ibuild -I/var/tmp/portage/sys-devel/gcc-4.4.0_pre9999/work/gcc-4.4.0-9999/gcc -I/var/tmp/portage/sys-devel/gcc-4.4.0_pre9999/work/gcc-4.4.0-9999/gcc/build -I/var/tmp/portage/sys-devel/gcc-4.4.0_pre9999/work/gcc-4.4.0-9999/gcc/../include -I/var/tmp/portage/sys-devel/gcc-4.4.0_pre9999/work/gcc-4.4.0-9999/gcc/../libcpp/include  -I/var/tmp/portage/sys-devel/gcc-4.4.0_pre9999/work/gcc-4.4.0-9999/gcc/../libdecnumber -I/var/tmp/portage/sys-devel/gcc-4.4.0_pre9999/work/gcc-4.4.0-9999/gcc/../libdecnumber/dpd -I../libdecnumber  -DCLOOG_PPL_BACKEND   -o build/genautomata.o /var/tmp/portage/sys-devel/gcc-4.4.0_pre9999/work/gcc-4.4.0-9999/gcc/genautomata.c
/var/tmp/portage/sys-devel/gcc-4.4.0_pre9999/work/build/./prev-gcc/xgcc -B/var/tmp/portage/sys-devel/gcc-4.4.0_pre9999/work/build/./prev-gcc/ -B/usr/mipsel-unknown-linux-gnu/bin/  -O2 -g -pipe -DIN_GCC   -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wcast-qual -Wold-style-definition -Wc++-compat -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings   -DHAVE_CONFIG_H -DGENERATOR_FILE  -o build/genautomata \
            build/genautomata.o build/rtl.o build/read-rtl.o build/ggc-none.o build/vec.o build/min-insn-modes.o build/gensupport.o build/print-rtl.o build/errors.o .././libiberty/libiberty.a -lm

The following command segfaulted:
build/genautomata /var/tmp/portage/sys-devel/gcc-4.4.0_pre9999/work/gcc-4.4.0-9999/gcc/config/mips/mips.md insn-conditions.md


I debugged this executable with gdb and found the exact instruction that causes the segfault. It has to do with how the $gp register is manipulated.

According to readelf -a genautomata, the canonical gp value of the primary GOT is 00440090.

Normally, before calling a function, the gp register must be loaded with this value. This can be observed when the file is compiled without optimization.

However, if -O2 is enabled, the following code in genautomata.c (lines 6975-6978):
  fprintf (output_file, "static const ");
  output_range_type (output_file, 0, automaton->insn_equiv_classes_num);
  fprintf (output_file, " ");
  output_translate_vect_name (output_file, automaton);
becomes (this was generated with the -S option):
        lw      $28,16($sp)
        lw      $7,%lo(output_file)($17)
        lw      $25,%call16(fwrite)($28)
        lui     $4,%hi($LC171)
        addiu   $4,$4,%lo($LC171)
        li      $5,1                    # 0x1
        jalr    $25
        li      $6,13                   # 0xd

        lw      $6,16($18)
        lw      $4,%lo(output_file)($17)
        .option pic0
        jal     output_range_type
        .option pic2
        move    $5,$0

        lw      $28,16($sp)
        lw      $5,%lo(output_file)($17)
        lw      $25,%call16(fputc)($28)
        nop
        jalr    $25
        li      $4,32                   # 0x20

        lw      $4,%lo(output_file)($17)
        .option pic0
        jal     output_translate_vect_name
        .option pic2
        move    $5,$18
We can see that there is no "lw      $28,16($sp)" before "jal     output_range_type" or "jal     output_translate_vect_name".

In the case of output_range_type, there is no problem, because inside that function the only access to the GOT is to find fwrite's GOT entry. That entry already contains the resolved address, since there is a call to fwrite immediately before the call to output_range_type.

The problem with output_translate_vect_name is that it contains two calls to fprintf, and these are the first calls to fprintf in the program, so the lazy resolver has to be invoked. That is exactly where the program segfaults: since gp's value is incorrect, the resolver's address is incorrect too.

That is what I have so far. I hope this helps to solve the problem.
Comment 1 Zhang Le 2008-11-07 15:41:29 UTC
(In reply to comment #0)
> The problem with output_translate_vect_name is this function has two calls to
> fprintf, and they are called for the first time in this program. 

Another thing I don't understand is why fprintf is replaced with fwrite elsewhere, including in the output_range_type function, yet remains fprintf in output_translate_vect_name.

I have done an experiment. Please take a look at the beginning of the output_translate_vect_name function:
0x0040647c <output_translate_vect_name+0>:      addiu  sp,sp,-8
0x00406480 <output_translate_vect_name+4>:      lw     v0,4(a1)
0x00406484 <output_translate_vect_name+8>:      sw     gp,4(sp)
0x00406488 <output_translate_vect_name+12>:     lui    gp,0x44
0x0040648c <output_translate_vect_name+16>:     beqz   v0,0x4064b0 <output_translate_vect_name+52>
0x00406490 <output_translate_vect_name+20>:     addiu  gp,gp,4240
0x00406494 <output_translate_vect_name+24>:     lw     t9,-32532(gp)
0x00406498 <output_translate_vect_name+28>:     lui    a1,0x42
0x0040649c <output_translate_vect_name+32>:     lw     gp,4(sp)
0x004064a0 <output_translate_vect_name+36>:     lw     a2,0(v0)
0x004064a4 <output_translate_vect_name+40>:     addiu  a1,a1,-8008
0x004064a8 <output_translate_vect_name+44>:     jr     t9

If I replace the following instruction with a nop, genautomata succeeds:
0x0040649c <output_translate_vect_name+32>:     lw     gp,4(sp)
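
The address arithmetic in the listing above can be checked with a small C sketch. The helper names are hypothetical and exist only for this illustration; the constants come from the disassembly. Note that the lui/addiu pair yields 0x441090, which does not match the 00440090 quoted in comment #0. Whether that is a typo or the result of a different build is not clear from this report.

```c
#include <stdint.h>

/* Value left in a register by "lui reg,hi" followed by
   "addiu reg,reg,lo".  addiu sign-extends its 16-bit immediate,
   hence the signed type for lo.  (Hypothetical helper.)  */
uint32_t
lui_addiu (uint16_t hi, int16_t lo)
{
  return ((uint32_t) hi << 16) + (uint32_t) (int32_t) lo;
}

/* Address accessed by "lw reg,offset(base)"; the offset is also a
   sign-extended 16-bit immediate.  (Hypothetical helper.)  */
uint32_t
lw_address (uint32_t base, int16_t offset)
{
  return base + (uint32_t) (int32_t) offset;
}
```

With these helpers, lui_addiu (0x44, 4240) gives the gp value 0x441090 set up at +12/+20, lw_address (0x441090, -32532) gives 0x43917c as the GOT slot read into t9 at +24, and lui_addiu (0x42, -8008) gives 0x41e0b8 as the format-string address placed in a1.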
Comment 2 Joshua Kinard 2008-11-12 01:01:35 UTC
I ran into this too.  The problem flag is -foptimize-sibling-calls.  Passing it with -O1 triggers the bug, but passing it with -O0 does not.  Some other optimization in -O1 seems to interact with this one to cause the flaw.

I ran into this on mips-unknown-linux-gnu, btw.  MIPS-specific, maybe?
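
To make the pattern concrete, here is a minimal sketch of the kind of code involved. The struct and function names are hypothetical, not taken from genautomata.c, and this reduction has not been confirmed to reproduce the segfault. It only shows calls to fprintf in tail position, which is what -foptimize-sibling-calls may turn into a jump, called from a function that also makes library calls of its own.

```c
#include <stdio.h>

/* Hypothetical stand-in for the automaton type; only the two fields
   needed by the sketch are modelled.  */
struct automaton_sketch
{
  const char *name;   /* may be NULL */
  int order_num;
};

/* Both fprintf calls are the last action of the function, so the
   compiler is allowed to emit them as sibling calls (a jump rather
   than a call) when -foptimize-sibling-calls is in effect.  */
void
print_vect_name (FILE *f, const struct automaton_sketch *a)
{
  if (a->name == NULL)
    fprintf (f, "translate_%d", a->order_num);
  else
    fprintf (f, "%s_translate", a->name);
}

/* Caller shaped like the code quoted in comment #0: library calls
   followed by a direct call to a local function.  */
void
print_decl (FILE *f, const struct automaton_sketch *a)
{
  fprintf (f, "static const ");
  fprintf (f, " ");
  print_vect_name (f, a);
}
```

Compiling a file like this with -O1 -foptimize-sibling-calls -S and checking whether print_vect_name ends in a jump through $25 instead of a jalr would be one way to see whether the sibling-call path is taken on a given compiler.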
Comment 3 Zhang Le 2008-11-13 09:53:21 UTC
The liblbxutil-1.0.1 package can also be used to reproduce this bug.
I will investigate this later when I have time.
Comment 4 Zhang Le 2008-11-13 11:36:56 UTC
and sed-4.1.5, too.
Comment 5 Zhang Le 2008-11-13 17:27:32 UTC
I am trying to find which specific flag or flags trigger this problem when used together with -foptimize-sibling-calls.
As a first step, I tried to find a set of flags that triggers the problem when combined with -O0 and -foptimize-sibling-calls. Presumably this set consists of flags that are disabled at -O0 but enabled at -O1.

So I did the following to find the differences:
gcc -c -Q -O0 --help=optimizers > /tmp/O0-opts
gcc -c -Q -O1 --help=optimizers > /tmp/O1-opts
diff /tmp/O0-opts /tmp/O1-opts | grep enabled | cut -d " " -f 4

Then I used these flags with -O0 and -foptimize-sibling-calls together, but this didn't trigger the bug.

Is there anything I have overlooked here?
Comment 6 Eric Botcazou 2008-11-13 17:34:11 UTC
> Then I used these flags with -O0 and -foptimize-sibling-calls together, but
> this didn't trigger the bug.
> 
> Is there anything I have overlooked here?

Yes, not all optimizations are controlled by a specific flag, as explained in
the manual.
Comment 7 rsandifo@gcc.gnu.org 2008-11-15 20:42:11 UTC
I'll try to look at this tomorrow.  The code in comment #1 is certainly wrong: the store at <output_translate_vect_name+8> is supposed to come after the GP addiu at <output_translate_vect_name+20>.  With that fixed, the function should work as expected.

I'm guessing this is a scheduling bug, but time will tell.
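
As a rough illustration of why the position of the store matters, the following C sketch models only the $gp traffic in the listing from comment #1. It is a simplified model built on the description above, not the committed fix, and the function name and the incoming $gp value used in the example are hypothetical.

```c
#include <stdint.h>

enum { GP_SLOT = 1 };   /* models the stack word at 4(sp) */

/* Returns the value of $gp at "jr t9".  INCOMING_GP is whatever the
   caller left in $gp, OWN_GP is the value built by the lui/addiu
   pair.  SAVE_AFTER_SETUP selects whether "sw gp,4(sp)" runs after
   the lui/addiu pair (the intended order) or before it (the order in
   the listing).  Hypothetical model, for illustration only.  */
uint32_t
gp_at_sibcall (uint32_t incoming_gp, uint32_t own_gp, int save_after_setup)
{
  uint32_t stack[2] = { 0, 0 };
  uint32_t gp = incoming_gp;

  if (!save_after_setup)
    stack[GP_SLOT] = gp;   /* sw gp,4(sp) at +8: saves the caller's value */

  gp = own_gp;             /* lui gp,0x44 ; addiu gp,gp,4240 */

  if (save_after_setup)
    stack[GP_SLOT] = gp;   /* store placed after the addiu */

  gp = stack[GP_SLOT];     /* lw gp,4(sp) at +32 */
  return gp;               /* what the target of "jr t9" sees */
}
```

With a made-up incoming value such as 0x2aaa8000, gp_at_sibcall (0x2aaa8000, 0x441090, 0) returns 0x2aaa8000, so the lazy-binding stub reached through t9 would run with the caller's stale $gp, whereas gp_at_sibcall (0x2aaa8000, 0x441090, 1) returns 0x441090. This is also consistent with the experiment in comment #1, where replacing the reload with a nop leaves the freshly computed value in place.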
Comment 8 rsandifo@gcc.gnu.org 2008-11-16 20:27:06 UTC
Subject: Bug 38052

Author: rsandifo
Date: Sun Nov 16 20:25:40 2008
New Revision: 141925

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=141925
Log:
gcc/
	PR target/38052
	* config/mips/mips.c (machine_function): Update the comment
	above global_pointer.
	(mips_global_pointer): Use INVALID_REGNUM rather than 0 to indicate
	that a function doesn't need a global pointer.
	(mips_current_loadgp_style): Update accordingly.
	(mips_restore_gp): Likewise.
	(mips_output_cplocal): Likewise.
	(mips_expand_prologue): Likewise.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/mips/mips.c

Comment 9 rsandifo@gcc.gnu.org 2008-11-16 20:32:36 UTC
Subject: Bug 38052

Author: rsandifo
Date: Sun Nov 16 20:31:13 2008
New Revision: 141926

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=141926
Log:
gcc/
	PR target/38052
	* config/mips/mips.c (mips_cfun_call_saved_reg_p)
	(mips_cfun_might_clobber_call_saved_reg_p): New functions,
	split out from...
	(mips_save_reg_p): ...here.  Always consult TARGET_CALL_SAVED_GP
	rather than call_really_used_regs when handling $gp.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/mips/mips.c

Comment 10 rsandifo@gcc.gnu.org 2008-11-16 21:08:50 UTC
Fixed on mainline.