This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug middle-end/86763] [8/9 Regression] Wrong code comparing member of copy of a 237 byte object with nontrivial default constructor on x86-64 arch
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Thu, 02 Aug 2018 07:46:53 +0000
- Subject: [Bug middle-end/86763] [8/9 Regression] Wrong code comparing member of copy of a 237 byte object with nontrivial default constructor on x86-64 arch
- Auto-submitted: auto-generated
- References: <bug-86763-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86763
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |alias
Status|NEW |ASSIGNED
Component|rtl-optimization |middle-end
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #5)
> (In reply to Richard Biener from comment #3)
> > see how the compare is moved very far away from the branch possibly across
> > flag clobbering insns (rep movsq?). Target sounds good.
>
> Scheduler (sched2) pass is moving the compare:
>
> (insn 21 17 22 2 (set (reg:CCZ 17 flags)
> (compare:CCZ (mem/c:DI (plus:DI (reg/f:DI 7 sp)
> (const_int 8 [0x8])) [1 MEM[(struct Msg &)&t].id+0 S8
> A64])
> (const_int 1001 [0x3e9]))) "t4.cpp":32 12 {*cmpdi_1}
> (nil))
>
> in front of rep movsd:
>
> (insn 11 10 12 2 (parallel [
> (set (reg:DI 2 cx [92])
> (const_int 0 [0]))
> (set (reg/f:DI 5 di [90])
> (plus:DI (ashift:DI (reg:DI 2 cx [92])
> (const_int 3 [0x3]))
> (reg/f:DI 5 di [90])))
> (set (reg/f:DI 4 si [91])
> (plus:DI (ashift:DI (reg:DI 2 cx [92])
> (const_int 3 [0x3]))
> (reg/f:DI 4 si [91])))
> (set (mem/c:BLK (reg/f:DI 5 di [90]) [7 MEM[(struct T *)&t]+0
> S232 A128])
> (mem/c:BLK (reg/f:DI 4 si [91]) [7 MEM[(struct T
> *)&D.2891]+0 S232 A128]))
> (use (reg:DI 2 cx [92]))
> ]) "t4.cpp":31 993 {*rep_movdi_rex64}
> (expr_list:REG_UNUSED (reg:DI 2 cx [92])
> (nil)))
>
> Scheduler doesn't notice that rep movsd target overlaps compare operand.
>
> Looks like rtl-optimization (scheduler) to me.
I notice that -O -fstrict-aliasing -fschedule-insns2 fails and -O2
-fno-strict-aliasing succeeds. The above shows the block move uses alias-set 7
and the
compare alias-set 1. Alias set 1 is a subset of alias set 7 but not the
other way around - this suggests that sched-deps is querying the oracle
in a wrong way. Huh, no.
Breakpoint 2, true_dependence_1 (mem=0x7ffff6a5ec00, mem_mode=E_QImode,
mem_addr=0x3106258, x=0x7ffff6a5edb0, x_addr=0x7ffff6a61780,
mem_canonicalized=true)
at /space/rguenther/src/svn/gcc-8-branch/gcc/alias.c:2905
2905 gcc_checking_assert (mem_canonicalized ? (mem_addr != NULL_RTX)
(mem/c:QI (reg/f:DI 90) [7 MEM[(struct T *)&t]+236 S1 A32])
$27 = void
(mem/c:DI (plus:DI (reg/f:DI 20 frame)
(const_int -232 [0xffffffffffffff18])) [1 MEM[(struct Msg &)&t].id+0 S8
A64])
$28 = void
(gdb) fin
Run till exit from #0 true_dependence_1 (mem=0x7ffff6a5ec00,
mem_mode=E_QImode, mem_addr=0x3106258, x=0x7ffff6a5edb0,
x_addr=0x7ffff6a61780, mem_canonicalized=true)
at /space/rguenther/src/svn/gcc-8-branch/gcc/alias.c:2905
canon_true_dependence (mem=0x7ffff6a5ec00, mem_mode=E_QImode,
mem_addr=0x3106258, x=0x7ffff6a5edb0, x_addr=0x7ffff6a61780)
at /space/rguenther/src/svn/gcc-8-branch/gcc/alias.c:2998
2998 }
Value returned is $29 = 0
so the oracle says the load cannot alias the store. We enter with
(gdb) p debug_rtx (mem_addr)
(value:DI 10:17131 @0x3106258/0x30e6310)
(gdb) p debug_rtx (x_addr)
(plus:DI (reg/f:DI 20 frame)
(const_int -232 [0xffffffffffffff18]))
and find_base_term (x_addr) returns (address:DI -3) (same as mem_base)
Hmm. alias_sets_conflict_p returns false?! Ah, alias_set_subset_of
returns true because of has_zero_child. alias-set 7 has no children,
it looks like struct T has no members but it does have a member of type V.
So we do record_component_aliases arriving at that field and get_alias_set
on this field returns zero due to TYPE_TYPELESS_STORAGE (which is new
in GCC 8 which explains why there may be a regression). So we record
the ->has_zero_child relationship (instead of previously recording
a distinct alias-set).
Now - some earlier change - improvement to alias_sets_conflict_p in fact,
made us not consider ->has_zero_child there basically to avoid making
struct X { int i; }; and struct Y { float f; char x[4]; } conflict. But
this is exactly what we are trying to allow with TYPE_TYPELESS_STORAGE :/
So TYPE_TYPELESS_STORAGE doesn't work as desired and introduces wrong-code.