Bug 67573 - [SH] wrong code generated for libstdc++-v3/src/c++11/cxx11-shim_facets.cc at -mlra
Summary: [SH] wrong code generated for libstdc++-v3/src/c++11/cxx11-shim_facets.cc at ...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 6.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks: 55212
  Show dependency treegraph
 
Reported: 2015-09-14 13:00 UTC by Kazumoto Kojima
Modified: 2015-09-20 23:54 UTC (History)
0 users

See Also:
Host:
Target: sh*-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
test case (85.26 KB, application/x-bzip)
2015-09-14 13:09 UTC, Kazumoto Kojima
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Kazumoto Kojima 2015-09-14 13:00:16 UTC
Wrong code is generated with sh-linux/sh-elf compilers for
void std::__facet_shims::__moneypunct_fill_cache<wchar_t, true>(std::integral_constant<bool, true>, std::locale::facet const*, std::__moneypunct_cache<wchar_t, true>*)
at "-m4 -ml -O2 -fPIC -mlra -std=gnu++11".

It uses __copy<wchar_t> inline function and __copy<wchar_t> calls
std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >::copy(wchar_t*, unsigned int, unsigned int)

with the wrong code

	mov	r11,r4
	mov.l	.L1122,r7
	mov	r0,r5
	mov.l	@r15,r6
	mov	r10,r4
	bsrf	r7
.LPCS203:
	mov.l	r0,@(4,r15)
	...
.L1122:
	.long	_ZNKSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE4copyEPwjj@PLT-(.LPCS203+2-.)

where r7 should be 0 as the last argument.  With -mno-lra,
the corresponding code is:

	mov.l	.L1050,r1
	mov	r0,r14
	mov	#0,r7
	mov	r13,r6
	mov	r0,r5
	bsrf	r1
.LPCS191:
	mov	r10,r4
	...

It seems that dse/dce removes mov #0, r7.

.pro_and_epilogue:

(insn 204 831 830 22 (set (reg:SI 7 r7)
        (const_int 0 [0])) fa.cc:484 253 {movsi_ie}
     (nil))

.dse2 marks r7 unused:

(insn 204 831 830 22 (set (reg:SI 7 r7)
        (const_int 0 [0])) fa.cc:484 253 {movsi_ie}
     (expr_list:REG_UNUSED (reg:SI 7 r7)
        (nil)))

.jump2 eliminates insn 204:

DCE: Deleting insn 204
deleting insn with uid = 204.
Comment 1 Kazumoto Kojima 2015-09-14 13:09:35 UTC
Created attachment 36333 [details]
test case
Comment 2 Kazumoto Kojima 2015-09-15 11:27:58 UTC
It seems that LRA allocates r7 for the scratch reg at

(define_insn_and_split "call_value_pcrel"
  [(set (match_operand 0 "" "=rf")
	(call (mem:SI (match_operand:SI 1 "symbol_ref_operand" ""))
	      (match_operand 2 "" "")))
   (use (reg:SI FPSCR_MODES_REG))
   (use (reg:SI PIC_REG))
   (clobber (reg:SI PR_REG))
   (use (match_scratch:SI 3 "=r"))]

in the problematic case and the insn

(call_insn 208 207 209 22 (parallel [
            (set (reg:SI 0 r0)
                (call (mem:SI (symbol_ref:SI ("_ZNKSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE4copyEPwjj") [flags 0x41]  <function_decl 0xb72c0280 copy>) [0 copy S4 A32])
                    (const_int 0 [0])))
            (use (reg:SI 154 fpscr0))
            (use (reg:SI 12 r12))
            (clobber (reg:SI 146 pr))
            (clobber (reg:SI 7 r7 [357]))
        ]) fa.cc:484 322 {call_value_pcrel}
     (expr_list:REG_EH_REGION (const_int 4 [0x4])
        (expr_list:REG_CALL_DECL (symbol_ref:SI ("_ZNKSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE4copyEPwjj") [flags 0x41]  <function_decl 0xb72c0280 copy>)
            (nil)))
    (expr_list:SI (use (reg:SI 4 r4))
        (expr_list:SI (use (reg:SI 5 r5))
            (expr_list:SI (use (reg:SI 6 r6))
                (expr_list:SI (use (reg:SI 7 r7))
                    (nil))))))

made the following passes confused.
Comment 3 Kazumoto Kojima 2015-09-15 14:13:22 UTC
I've wrongly cut&paste call_value_pcrel.  It's

(define_insn_and_split "call_value_pcrel"
  [(set (match_operand 0 "" "=rf")
	(call (mem:SI (match_operand:SI 1 "symbol_ref_operand" ""))
	      (match_operand 2 "" "")))
   (use (reg:SI FPSCR_MODES_REG))
   (use (reg:SI PIC_REG))
   (clobber (reg:SI PR_REG))
   (clobber (match_scratch:SI 3 "=&r"))]
   ...

I think that the last clobber should be

   (clobber (match_scratch:SI 3 "=&r"))]

i.e. the scratch register is early clobbered.  With that change,
it looks the wrong code go away.  I'll come up with the tested
patch for all similar *call_*pcrel insns.
Comment 4 Oleg Endo 2015-09-15 23:14:46 UTC
Maybe FPSCR_STAT_REG should be in the clobber list, too?  Otherwise stores of FP exception bits etc (get_fpscr builtin) could be wrongly CSE'd across function calls... However, I don't know if this is a problem at the moment.
Comment 5 Kazumoto Kojima 2015-09-16 06:19:00 UTC
(In reply to Oleg Endo from comment #4)
> Maybe FPSCR_STAT_REG should be in the clobber list, too?  Otherwise stores
> of FP exception bits etc (get_fpscr builtin) could be wrongly CSE'd across
> function calls... However, I don't know if this is a problem at the moment.

I'm not sure whether it's a problem or not ATM too.  The original
issue happens with the aggressive allocation by LRA.
If the scenario for FPSCR_MODES_REG is real, it could be seen
with the old RA already.
Comment 6 Kazumoto Kojima 2015-09-17 00:13:29 UTC
Author: kkojima
Date: Thu Sep 17 00:12:57 2015
New Revision: 227837

URL: https://gcc.gnu.org/viewcvs?rev=227837&root=gcc&view=rev
Log:
PR target/67573
* config/sh/sh.md: Add early clobber to scratch	operand of *call_*pcrel
insn_and_split so not to CSE scratch reg.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.md
Comment 7 Kazumoto Kojima 2015-09-17 00:24:20 UTC
Fixed on trunk.
Oleg, now we can propose to make -mlra default on trunk.
Comment 8 Oleg Endo 2015-09-17 14:28:01 UTC
(In reply to Kazumoto Kojima from comment #7)
> Fixed on trunk.
> Oleg, now we can propose to make -mlra default on trunk.

Nice, thank you!
Comment 9 Oleg Endo 2015-09-19 01:24:59 UTC
I think this should be backported to GCC 5.  Even if it might not be triggered often, there is a possibility for silent wrong-code bugs.
Comment 10 Kazumoto Kojima 2015-09-19 12:52:51 UTC
(In reply to Oleg Endo from comment #9)
> I think this should be backported to GCC 5.  Even if it might not be
> triggered often, there is a possibility for silent wrong-code bugs.

OK, will do.
Comment 11 Kazumoto Kojima 2015-09-20 23:54:35 UTC
Author: kkojima
Date: Sun Sep 20 23:54:03 2015
New Revision: 227953

URL: https://gcc.gnu.org/viewcvs?rev=227953&root=gcc&view=rev
Log:
PR target/67573
* config/sh/sh.md: Add early clobber to scratch	operand of *call_*pcrel
insn_and_split so to avoid CSEd scratch reg.


Modified:
    branches/gcc-5-branch/gcc/ChangeLog
    branches/gcc-5-branch/gcc/config/sh/sh.md