Bug 98689 - [11 Regression] FAIL: gcc.dg/torture/stackalign/builtin-return-1.c -O1 execution test
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 11.0
Importance: P4 normal
Target Milestone: 11.0
Assignee: Richard Sandiford
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-01-14 19:25 UTC by John David Anglin
Modified: 2021-04-16 11:41 UTC
CC List: 2 users

See Also:
Host: hppa*-*-*
Target: hppa*-*-*
Build: hppa*-*-*
Known to work:
Known to fail:
Last reconfirmed: 2021-03-30 00:00:00


Attachments
.s file (763 bytes, text/plain)
2021-01-14 19:25 UTC, John David Anglin
.s diff to gcc10 .s (447 bytes, text/plain)
2021-03-06 21:41 UTC, John David Anglin

Description John David Anglin 2021-01-14 19:25:40 UTC
Created attachment 49967
.s file

Executing on host: /home/dave/gnu/gcc/objdir/gcc/xgcc -B/home/dave/gnu/gcc/objdir/gcc/ /home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/torture/stackalign/builtin-return-1.c    -fdiagnostics-plain-output    -O1    -lm  -o ./builtin-return-1.exe    (timeout = 300)
spawn -ignore SIGHUP /home/dave/gnu/gcc/objdir/gcc/xgcc -B/home/dave/gnu/gcc/objdir/gcc/ /home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/torture/stackalign/builtin-return-1.c -fdiagnostics-plain-output -O1 -lm -o ./builtin-return-1.exe
PASS: gcc.dg/torture/stackalign/builtin-return-1.c   -O1  (test for excess errors)
Setting LD_LIBRARY_PATH to :/home/dave/gnu/gcc/objdir/gcc:/home/dave/gnu/gcc/objdir/hppa-linux-gnu/./libatomic/.libs::/home/dave/gnu/gcc/objdir/gcc:/home/dave/gnu/gcc/objdir/hppa-linux-gnu/./libatomic/.libs:/home/dave/gnu/gcc/objdir/hppa-linux-gnu/libstdc++-v3/src/.libs:/home/dave/gnu/gcc/objdir/hppa-linux-gnu/libssp/.libs:/home/dave/gnu/gcc/objdir/hppa-linux-gnu/libphobos/src/.libs:/home/dave/gnu/gcc/objdir/hppa-linux-gnu/libgomp/.libs:/home/dave/gnu/gcc/objdir/hppa-linux-gnu/libatomic/.libs:/home/dave/gnu/gcc/objdir/./gcc:/home/dave/gnu/gcc/objdir/./prev-gcc:/home/dave/gnu/gcc/objdir/hppa-linux-gnu/libstdc++-v3/src/.libs:/home/dave/gnu/gcc/objdir/hppa-linux-gnu/libssp/.libs:/home/dave/gnu/gcc/objdir/hppa-linux-gnu/libphobos/src/.libs:/home/dave/gnu/gcc/objdir/hppa-linux-gnu/libgomp/.libs:/home/dave/gnu/gcc/objdir/hppa-linux-gnu/libatomic/.libs:/home/dave/gnu/gcc/objdir/./gcc:/home/dave/gnu/gcc/objdir/./prev-gcc
Execution timeout is: 300
spawn [open ...]
FAIL: gcc.dg/torture/stackalign/builtin-return-1.c   -O1  execution test

Similar fails:
FAIL: gcc.dg/torture/stackalign/builtin-return-1.c   -O1 -fpic execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-1.c   -O2  execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-1.c   -O2 -fpic execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-1.c   -O3 -g  execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-1.c   -O3 -g -fpic execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-1.c   -Os  execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-1.c   -Os -fpic execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-1.c   -O2 -flto -flto-partition=none  execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-1.c   -O2 -flto -flto-partition=none -fpic execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-1.c   -O2 -flto  execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-1.c   -O2 -flto -fpic execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-2.c   -O0  execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-2.c   -O0 -fpic execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-2.c   -O1  execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-2.c   -O1 -fpic execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-2.c   -O2  execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-2.c   -O2 -fpic execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-2.c   -O3 -g  execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-2.c   -O3 -g -fpic execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-2.c   -Os  execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-2.c   -Os -fpic execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-2.c   -O2 -flto -flto-partition=none  execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-2.c   -O2 -flto -flto-partition=none -fpic execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-2.c   -O2 -flto  execution test
FAIL: gcc.dg/torture/stackalign/builtin-return-2.c   -O2 -flto -fpic execution test
Comment 1 John David Anglin 2021-01-21 15:21:40 UTC
Revision c4a6b2dadcd:b9a7bc9531b:b5f24739632389d50903bfde6d1bfc06d522c56b was okay.
Comment 2 John David Anglin 2021-01-23 23:28:34 UTC
Introduced by the following commit:

commit 0b76990a9d75d97b84014e37519086b81824c307 (HEAD)
Author: Richard Sandiford <richard.sandiford@arm.com>
Date:   Thu Dec 17 00:15:12 2020 +0000

    fwprop: Rewrite to use RTL SSA

    This patch rewrites fwprop.c to use the RTL SSA framework.  It tries
    as far as possible to mimic the old behaviour, even in cases where
    that doesn't fit naturally with the new framework.  I've added ???
    comments to mark those places, but I think “fixing” them should
    be done separately to make bisection easier.

    In particular:

    * The old implementation iterated over uses, and after a successful
      substitution, the new insn's uses were added to the end of the list.
      The pass still processed those uses, but because it processed them at
      the end, it didn't fully optimise one instruction before propagating
      it into the next.

      The new version follows the same approach for comparison purposes,
      but I'd like to drop that as a follow-on patch.

    * The old implementation operated on single use sites (DF_REF_LOCs).
      This doesn't work well for instructions with match_dups, where it's
      necessary to update both an operand and its dups at the same time.
      For example, attempting to substitute into a divmod instruction would
      fail because only the div or the mod side would be updated.

      The new version again follows this to some extent for comparison
      purposes (although not exactly).  Again I'd like to drop it as a
      follow-on patch.

      One difference is that if a register occurs in multiple MEM addresses
      in a set, the new version will try to update them all at once.  This is
      what causes the SVE ACLE st4* output to improve.

    Also, the old version didn't naturally guarantee termination (PR79405),
    whereas the new one does.

    gcc/
            * fwprop.c: Rewrite to use the RTL SSA framework.

    gcc/testsuite/
            * gcc.dg/rtl/x86_64/test-return-const.c.before-fwprop.c: Don't
            expect insn updates to be deferred.
            * gcc.target/aarch64/sve/acle/asm/st4_s8.c: Expect the addition
            to be folded into the address.
            * gcc.target/aarch64/sve/acle/asm/st4_u8.c: Likewise.
Comment 3 John David Anglin 2021-03-06 21:41:22 UTC
Created attachment 50320
.s diff to gcc10 .s

The following code is wrong in gcc-11:

-       ldw 12(%r3),%r28
+       ldw 12(%r3),%r5
+       copy %r5,%r28
        bl foo,%r2
        ldo 64(%r3),%r4
-       stw %r28,0(%r4)
+       stw %r5,0(%r4)

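In gcc-10, the return value from foo (in %r28) is stored in 0(%r4). In gcc-11, the store uses %r5 instead, which still holds the value loaded before the call, so the return value from foo is lost.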
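
To make the difference concrete, here is a toy C model of the two instruction sequences in the diff above. It is only a sketch: the function names are hypothetical, the value 42 standing in for foo's result is made up, and it assumes that %r5 keeps its value across the call while %r28 receives the result.

```c
/* Hypothetical stand-in for the callee; on HPPA its result arrives in %r28. */
int foo_result(void)
{
    return 42;
}

/* Models the gcc-10 sequence: the store after the call reads %r28. */
int stored_by_gcc10(int loaded)
{
    int r28 = loaded;        /* ldw 12(%r3),%r28 */
    r28 = foo_result();      /* bl foo,%r2: the call overwrites %r28 */
    return r28;              /* stw %r28,0(%r4): stores foo's result */
}

/* Models the gcc-11 sequence: the store after the call reads %r5. */
int stored_by_gcc11(int loaded)
{
    int r5 = loaded;         /* ldw 12(%r3),%r5 */
    int r28 = r5;            /* copy %r5,%r28 */
    r28 = foo_result();      /* bl foo,%r2: %r28 now holds foo's result */
    (void) r28;              /* ...but nothing reads it afterwards */
    return r5;               /* stw %r5,0(%r4): stores the pre-call value */
}
```

Calling both functions with the same loaded value shows the gcc-10 model returning foo's result and the gcc-11 model returning the value that was loaded before the call.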
Comment 4 Richard Sandiford 2021-03-30 12:10:04 UTC
Mine.
Comment 5 GCC Commits 2021-04-16 11:38:24 UTC
The master branch has been updated by Richard Sandiford <rsandifo@gcc.gnu.org>:

https://gcc.gnu.org/g:49e651990a6966936a0273138dd56ac394e57b16

commit r11-8214-g49e651990a6966936a0273138dd56ac394e57b16
Author: Richard Sandiford <richard.sandiford@arm.com>
Date:   Fri Apr 16 12:38:02 2021 +0100

    Mark untyped calls and handle them specially [PR98689]
    
    This patch fixes a regression introduced by the rtl-ssa patches.
    It was seen on HPPA but it might be latent elsewhere.
    
    The problem is that the traditional way of expanding an untyped_call
    is to emit sequences like:
    
       (call (mem (symbol_ref "foo")))
       (set (reg pseudo1) (reg result1))
       ...
       (set (reg pseudon) (reg resultn))
    
    The ABI specifies that result1..resultn are clobbered by the call but
    nothing in the RTL indicates that result1..resultn are the results
    of the call.  Normally, using a clobbered value gives undefined results,
    but in this case the results are well-defined and matter for correctness.
    
    This seems like a niche case, so I think it would be better to mark
    it explicitly rather than try to detect it heuristically.
    
    Note that in expand_builtin_apply we already have an rtx_insn *,
    so it doesn't matter whether we call emit_call_insn or emit_insn.
    Calling emit_insn seems more natural now that the gen_* call
    has been split out.  It also matches later code in the function.
    
    gcc/
            PR rtl-optimization/98689
            * reg-notes.def (UNTYPED_CALL): New note.
            * combine.c (distribute_notes): Handle it.
            * emit-rtl.c (try_split): Likewise.
            * rtlanal.c (rtx_properties::try_to_add_insn): Likewise.  Assume
            that calls with the note implicitly set all return value registers.
            * builtins.c (expand_builtin_apply): Add a REG_UNTYPED_CALL
            to untyped_calls.
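
For context, untyped calls come from GCC's __builtin_apply family, which the failing builtin-return tests exercise. A minimal sketch of that pattern follows. It is not the testsuite source; the names foo and bar and the 64-byte argument-block size are assumptions for illustration, and it needs GCC, since other compilers may not provide these builtins.

```c
/* Hypothetical callee; its result comes back in the target's
   return-value register(s). */
int foo(int n)
{
    return n + 1;
}

/* Forwards its own incoming arguments to foo without knowing foo's
   type, then returns whatever foo returned. */
int bar(int n)
{
    (void) n;  /* n is forwarded implicitly via __builtin_apply_args */

    /* 64 is an assumed upper bound on the size of stack-passed arguments. */
    void *result = __builtin_apply((void (*)(void)) foo,
                                   __builtin_apply_args(), 64);

    /* Reload the saved return-value registers and return from bar. */
    __builtin_return(result);
}
```

Because the compiler does not know foo's return type here, it has to save every register that could carry a return value right after the call, which is the copy sequence described in the commit message above.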
Comment 6 Richard Sandiford 2021-04-16 11:41:37 UTC
Finally fixed, sorry for the long delay.