57003 – [4.8/4.9/5.0 Regression] gcc breaks -O2 optimization with Wine(64) - links/info/bisect of commits included

Bug 57003 - [4.8/4.9/5.0 Regression] gcc breaks -O2 optimization with Wine(64) - links/info/bisect of commits included

Summary: [4.8/4.9/5.0 Regression] gcc breaks -O2 optimization with Wine(64) - links/in...

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	rtl-optimization (show other bugs)
Version:	unknown

Importance:	P3 normal
Target Milestone:	4.8.4
Assignee:	Jakub Jelinek

URL:
Keywords:

Depends on:
Blocks:

Reported:	2013-04-19 00:45 UTC by Jordan
Modified:	2023-12-29 10:57 UTC (History)
CC List:	6 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:	2013-04-19 00:00:00

Attachments
gcc -E output. (112.62 KB, application/octet-stream) 2013-04-24 16:06 UTC, Kirill Smirnov	Details
gcc49-pr57003.patch (1.01 KB, patch) 2013-04-25 09:54 UTC, Jakub Jelinek	Details \| Diff
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Jordan 2013-04-19 00:45:55 UTC

Hi,

myself and others have been trying to track down problems with compiling (and using) wine64 with GCC-4.8.0. ~ Optimization level -O2 - which is usually the default optimization level for building wine, breaks wine64. Obviously, with gcc-4.7.2/3 -O2 works just fine. 

Here is the Wine bug report; http://bugs.winehq.org/show_bug.cgi?id=33307

Austin English (Wine-dev) ran a bisect on GCC to try to narrow down the problems. here is his relevant post on that, regarding this bug report;

So the bad commit is:
commit c8010b803d34fa7e096747067e33c650b36ecc06
Author: bernds <bernds@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Mon May 14 12:08:31 2012 +0000

        * attribs.c (decl_attributes): Avoid emitting a warning if
        ATTR_FLAG_BUILT_IN.
        * doc/rtl.texi (CALL_INSN_FUNCTION_USAGE): Use lowercase for rtx
        codes.  Document meaning of sets inside CALL_INSN_FUNCTION_USAGE.
        * c-family/c-common.c (DEF_ATTR_STRING): Define and undefine as
        necessary.
        * builtin-attrs.def (DEF_ATTR_FOR_STRING): Define.  Use it to
        define a string "1".
        (ATTR_RET1_NOTHROW_NONNULL_LEAF): New attr definition.
        * builtins.def (BUILT_IN_MEMCPY, BUILT_IN_MEMMOVE, BUILT_IN_MEMSET,
        BUILT_IN_STRCPY): Use it for these functions.
        * postreload.c (reload_combine): Deal with SETs inside
        CALL_INSN_FUNCTION_USAGE.
        * caller-save.c (setup_save_areas, save_call_clobbered_regs):
        Look for REG_RETURNED notes and use a cheap restore if possible.
        * ira-int.h (struct ira_allocno): New member cheap_calls_crossed_num.
        (ALLOCNO_CHEAP_CALLS_CROSSED_NUM): New macro.
        * ira-build.c (ira_create_allocno, create_cap_allocno,
        propagate_allocno_info, propagate_some_info_from_allocno,
        copy_info_to_removed_store_destination, ira_flattening): Handle it.
        * ira-lives.c (pseudo_regno_single_word_and_live_p,
        find_call_crossed_cheap_reg): New static functions.
        (process_bb_node_lives): Look for SETs in CALL_INSN_FUNCTION_USAGE,
        and set ALLOCNO_CHEAP_CALLS_CROSSED_NUM if possible.  Also make
        a REG_RETURNED note in that case.
        * ira.c (setup_reg_renumber): Change assert to allow cases where
        allocnos only cross calls for which they are cheap to restore.
        * ira-costs.c (ira_tune_allocno_costs): Compare
        ALLOCNO_CALLS_CROSSED_NUM to ALLOCNO_CHEAP_CALLS_CROSSED_NUM rather
        than 0.
        * reg-notes.def (REG_RETURNED): New note.
        * cse.c (cse_insn): Likewise.
        * sched-deps.c (sched_analyze_insn): Likewise.
        * expr.c (init_block_move_fn): Set a "fn spec" attribute.
        * calls.c (decl_return_flags): New static function.
        (expand_call): Generate a SET in CALL_INSN_FUNCTION_USAGE for
        functions that return one of their arguments.
        * lto/lto-lang.c (handle_fnspec_attribute): New static function.
        (lto_attribute_table): Add "fn spec".
        (DEF_ATTR_STRING): Define and undefine along with the other macros.
        * regcprop.c (struct kill_set_value_data): New.
        (kill_set_value): Interpret data as a pointer to such a struct.
        Do nothing if the caller wants the register to be ignored.
        (copyprop_hardreg_forward_1): Handle SETs in CALL_INSN_FUNCTION_USAGE.

    testsuite/
        * gcc.target/i386/retarg.c: New test.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@187459
138bc75d-0d04-0410-961f-82ee72b054a4

after this commit, wine will fail to build (with default -O2 settings, gcc will
crash with an internal error). The compile itself is fixed by a later commit:
commit d73df2920c77116fd88e03cd95dd352f16511a3f
Author: bernds <bernds@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Mon May 21 21:37:01 2012 +0000

        PR rtl-optimization/53373
        * caller-save.c (save_call_clobbered_regs): Look into a possible
        PARALLEL manually rather than using single_set on a call insn.
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@187745
138bc75d-0d04-0410-96

after this, wine will compile fine, but segfaults on launch.

A quick search on http://gcc.gnu.org/bugzilla/ doesn't show anything for that
commit. If someone else would file a bug upstream with this info, it would be
appreciated.
______

If i can provide any more info - please let me know, or even better ~ post the the WineHQ bug report that i have linked to above ~ as Austin or possibly another Wine-Developer may be able to help further.

thanks.

Jordan

Comment 1 Andrew Pinski 2013-04-19 00:55:11 UTC

IIRC this is usually due to mis writing of memcpy/memset where you return null rather than the dest (first argument).

Comment 2 Andrew Pinski 2013-04-19 01:00:03 UTC

(In reply to comment #1)
> IIRC this is usually due to mis writing of memcpy/memset where you return null
> rather than the dest (first argument).

See PR56881.

Comment 3 Richard Biener 2013-04-19 08:26:30 UTC

To narrow down the problem more you can cherry-pick d73df2920c77116fd88e03cd95dd352f16511a3f to after the commit that broke build
and continue bi-secting?

Comment 4 dank 2013-04-21 14:22:41 UTC

Austin English said

"I did...that's what led to the original commit showing the segfault.
IOW:
$ git reset --hard c8010b803d34fa7e096747067e33c650b36ecc06
# build gcc/wine - gcc fails to build wine, internal compiler error
$ git show d73df2920c77116fd88e03cd95dd352f16511a3f | patch -p1
# build gcc/wine - wine segfaults on launch"

So bisecting seems complete.

Also, Kirill Smirnov reports that building wine using just -O2 runs into
this bug for him, but building wine with "-O2 -fno-builtin-memcpy" doesn't.

Next step may be to bisect wine's source code to see which file or files
misbehave when built with -O2 but without -fno-builtin-memcpy.

Comment 5 Andrew Pinski 2013-04-21 18:06:36 UTC

(In reply to comment #4)
> Next step may be to bisect wine's source code to see which file or files
> misbehave when built with -O2 but without -fno-builtin-memcpy.

Actually it should be easier than that.  Look at the implementation of memcpy that wine has included in it and see if it returns the correct value.

Comment 6 dank 2013-04-22 16:39:37 UTC

You'd think... but I didn't find any obvious memcpy replacement.

I spent some time bisecting the wine source yesterday.  There appear to be
at least three or four affected sites.  I bisected one of the affected files with
#pragma GCC optimize("-O2")
...
#pragma GCC optimize("-O1")
and the offending memcpy() there seems to be
http://source.winehq.org/source/dlls/kernel32/process.c#L1316

There are plenty of clues to chase down, hope I have time to spend on it 
sometime soon.

Comment 7 Kirill Smirnov 2013-04-23 09:44:03 UTC

It seems gcc over-optimizes series of memcpy() function calls one after another. The piece of code does not work:

memcpy( buffer, DIR_Windows, len * sizeof(WCHAR) );
memcpy( buffer + len, default_syswow64W, sizeof(default_syswow64W)

There is a wrapper around memcpy() called memcpy_unaligned() to avoid builtin/inlining.
And these pieces of code work:

memcpy( buffer, DIR_Windows, len * sizeof(WCHAR) );
memcpy_unaligned( buffer + len, default_syswow64W, sizeof(default_syswow64W) );

and

memcpy_unaligned( buffer, DIR_Windows, len * sizeof(WCHAR) );
memcpy( buffer + len, default_syswow64W, sizeof(default_syswow64W) );

I'm sorry for copy-and-pasting wine code as is, I tried but failed to create a refined test case.


So this case is opposite as previously suggested: the memcpy_unaligned() wrapper is OK, but native memcpy() is failing.

Comment 8 Jakub Jelinek 2013-04-23 09:51:23 UTC

The important question is what that
   memcpy( buffer, DIR_Windows, len * sizeof(WCHAR) );
   memcpy( buffer + len, default_syswow64W, sizeof(default_syswow64W) );
compiles to.  If it is
...
call memcpy
leal (%rax, ...), %rdi ! or similar, the important is that buffer is preserved in return value of the previous memcpy call
...
call memcpy
Then if it doesn't work, you need to look at whatever memcpy implementation you are calling and see whether it correctly returns the first argument it has been passed to it in all cases.

Comment 9 Kirill Smirnov 2013-04-23 15:56:59 UTC

>... whatever memcpy implementation you are calling and see whether it correctly returns the first argument it has been passed to it in all cases.

Fails (gcc version of memcpy):
   __builtin_memcpy( buffer, DIR_Windows, len * sizeof(WCHAR) );
   __builtin_memcpy( buffer + len, default_syswow64W, sizeof(default_syswow64W) );


Works (glibc version of memcpy):
   memcpy( buffer, DIR_Windows, len * sizeof(WCHAR) );
   memcpy( buffer + len, default_syswow64W, sizeof(default_syswow64W) );

Comment 10 Jakub Jelinek 2013-04-23 16:00:52 UTC

But __builtin_memcpy isn't necessarily the inline memcpy code, it can very well be a library call too.
Anyway, this bugreport doesn't have a preprocessed source attached to it, nor list of all gcc options to compile it, so there is nothing to look at.

Comment 11 Kirill Smirnov 2013-04-23 21:33:10 UTC

I'm sorry I cannot reproduce invalid behaviour within a refined test case.

Instead I can provide commented asm dump from wine.

This block of code works: the returned value (rax register) is used as a pointer to destination buffer.

// memcpy( buffer, DIR_Windows, len * sizeof(WCHAR) );
  mov    rsi,QWORD PTR [rip+0x3e80aa]      # 455c30 <DIR_Windows>
  mov    rdx,rbx
  mov    rdi,rax
  call   26000 <memcpy@plt>
// HERE rax points to destination buffer and rdi is corrupted by memcpy
  mov    rcx,rax
// memcpy( buffer + len, default_syswow64W, sizeof(default_syswow64W) );
  mov    rax,QWORD PTR [rip+0x33d95]       # a1930 <default_syswow64W.21831>
  xor    edx,edx
  mov    QWORD PTR [rcx+rbx*1],rax
  mov    rax,QWORD PTR [rip+0x33d90]       # a1938<default_syswow64W.21831+0x8>
  mov    QWORD PTR [rip+0x3e8071],rcx      # 455c20 <DIR_SysWow64>
  mov    QWORD PTR [rcx+rbx*1+0x8],rax



This block of code does not work: the returned value (rax) is immediately overwritten and rdi (corrupted by memcpy()) is used as a destination buffer.

// memcpy( buffer, DIR_Windows, len * sizeof(WCHAR) );
  mov    rsi,QWORD PTR [rip+0x3e296a]        # 455d10 <DIR_Windows>
  mov    rdi,rax
  mov    rdx,rbx
  call   26060 <memcpy@plt>
// memcpy( buffer + len, default_syswow64W, sizeof(default_syswow64W) );
// HERE rdi is corrupted my GLIBC memcpy and rax is going to be overwritten
  mov    rax,QWORD PTR [rip+0x2e698]        # a1a50 <default_syswow64W.21831>
  xor    edx,edx
  mov    rcx,rdi
  mov    QWORD PTR [rdi+rbx*1],rax


I'm not sure whether registers with arguments must be kept intact after function returns, but it seems the bug is found - in gcc or glibc.

Comment 12 Kirill Smirnov 2013-04-23 22:01:18 UTC

I' sorry, forgot to mention compiler flags: -O2 -g

Comment 13 Bernd Schmidt 2013-04-24 09:25:47 UTC

We need at least preprocessed source of the failing code, produced with gcc -E.

Comment 14 Kirill Smirnov 2013-04-24 16:06:41 UTC

Created attachment 29929 [details]
gcc -E output.

Attached gcc -E output. Lines around 22354 are being investigated.

Comment 15 Jakub Jelinek 2013-04-24 16:50:21 UTC

Ah, ok, I can see it.  To look at 22354 it helps to sed -i 's/^# .*$//', then look for 22354 in the dump.  Seems the code is still correct at *.ce3 pass,
Bernd's optimization kicks in during RA time before that and the pseudo holding
buffer is assigned to %rdi before the call memcpy and assumed to live in %rax right after the call.  But cprop_hardreg breaks this, changes the %rax after the memcpy call into %rdi, even when %rdi is call clobbered register, and even the call pattern contains (clobber (reg:DI 5 di)) and REG_DEAD note for the same register.

Comment 16 Jakub Jelinek 2013-04-24 18:24:23 UTC

Reduced testcase (non-runtime, but one can see that %rdi which should be
clobbered by the memcpy call is used immediately after the memcpy call).
Probably wouldn't be too hard to turn this into an executable testcase, by adding some noinline/noclone attributes, define bar, baz functions, and in main initialize b.

const unsigned short *b, *c;
unsigned bar (void);
unsigned short *baz (unsigned long);

void __attribute__ ((ms_abi))
test (void)
{
  unsigned d;
  unsigned short *e;
  if ((d = bar ()))
    {
      e = baz (d * sizeof (unsigned short) + 20);
      __builtin_memcpy (e, b, d * sizeof (unsigned short));
      c = e;
    }
}

The ms_abi attribute seems to be essential for this, so perhaps something is broken in the ms ABI support or when mixing the two ABIs?

Comment 17 Jakub Jelinek 2013-04-24 18:45:28 UTC

Runtime testcase for -O2, works with -O2 -fno-cprop-registers.
It doesn't fail always, but around in 50% of cases, and heavily depends on what glibc is in use and what CPU too (as contemporary glibc's have IFUNC memcpy and select implementations based on cpuid).

#define N 2001
unsigned short *b, *c, *d;

__attribute__ ((noinline, noclone)) unsigned
bar (void)
{
  asm volatile ("" : : : "memory");
  return N;
}

__attribute__ ((noinline, noclone)) unsigned short *
baz (unsigned long x)
{
  if (x != N * sizeof (unsigned short) + 20)
    __builtin_abort ();
  asm volatile ("" : : : "memory");
  return d;
}

__attribute__ ((ms_abi, noinline, noclone))
foo (void)
{
  unsigned d;
  unsigned short *e;
  if ((d = bar ()))
    {
      e = baz (d * sizeof (unsigned short) + 20);
      __builtin_memcpy (e, b, d * sizeof (unsigned short));
      c = e;
    }
}

int
main ()
{
  unsigned short a[2 * N];
  int i;
  for (i = 0; i < 2 * N; i++)
    a[i] = i + 1;
  b = a;
  d = a + N;
  asm volatile ("" : : : "memory");
  foo ();
  for (i = 0; i < N; i++)
    if (a[i] != i + 1 || a[i + N] != i + 1)
      __builtin_abort ();
  if (c != a + N)
    __builtin_abort ();
  return 0;
}

Comment 18 Jakub Jelinek 2013-04-25 07:51:48 UTC

Ah, so the issue is related to the fact that %rdi/%rsi aren't call clobbered in ms_abi, but are call clobbered in the sysv abi.  The CALL_INSN pattern has clobbers for rsi/rdi, but copyprop_hardreg_forward_1 processes those early, perhaps then the CALL_INSN_FUNCTION_USAGE changes something and relies on
regs_invalidated_by_call processing to invalidate whatever is necessary.

Comment 19 Jakub Jelinek 2013-04-25 09:54:20 UTC

Created attachment 29936 [details]
gcc49-pr57003.patch

Untested fix.  While we kill_clobbered_value early, for SET in CALL_INSN_FUNCTION_USAGE we add a value equivalence.  In standard x86_64 ABI %rdi is invalidated by the call and present in the hard reg set to invalidate, so it is removed again, and perhaps if there is some target where the first argument is passed in non-call-clobbered register, the equivalence is correct.

So, what the patch does is in this case apply the clobbers again.  Bernd, does this look reasonable to you as the author of those changes?

Comment 20 Bernd Schmidt 2013-04-25 10:12:12 UTC

Patch is OK, thanks Jakub - you were too fast for me on this one.

Comment 21 Jakub Jelinek 2013-04-26 13:16:21 UTC

Author: jakub
Date: Thu Apr 25 21:49:22 2013
New Revision: 198320

URL: http://gcc.gnu.org/viewcvs?rev=198320&root=gcc&view=rev
Log:
	PR rtl-optimization/57003
	* regcprop.c (copyprop_hardreg_forward_1): If ksvd.ignore_set_reg,
	call note_stores with kill_clobbered_value callback again after
	killing regs_invalidated_by_call.

	* gcc.target/i386/pr57003.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr57003.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/regcprop.c
    trunk/gcc/testsuite/ChangeLog

Author: jakub
Date: Thu Apr 25 21:50:26 2013
New Revision: 198321

URL: http://gcc.gnu.org/viewcvs?rev=198321&root=gcc&view=rev
Log:
	PR rtl-optimization/57003
	* regcprop.c (copyprop_hardreg_forward_1): If ksvd.ignore_set_reg,
	call note_stores with kill_clobbered_value callback again after
	killing regs_invalidated_by_call.

	* gcc.target/i386/pr57003.c: New test.

Added:
    branches/gcc-4_8-branch/gcc/testsuite/gcc.target/i386/pr57003.c
Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/regcprop.c
    branches/gcc-4_8-branch/gcc/testsuite/ChangeLog

Comment 22 Kirill Smirnov 2013-04-26 17:50:07 UTC

Confirming: the attached patch fixes the problem with wine.

Thank you!

Comment 23 Uroš Bizjak 2014-10-03 14:23:06 UTC

My r215428 change regressed this PR again. The difference is:

--- pr57003.s   2014-10-03 15:08:24.000000000 +0200
+++ pr57003_.s  2014-10-03 15:08:19.000000000 +0200
@@ -78,7 +78,7 @@
        leaq    -20(%rbx), %rdx
        movq    %rax, %rdi
        call    memcpy
-       movq    %rdi, c(%rip)
+       movq    %rax, c(%rip)
 .L8:
        movaps  (%rsp), %xmm6
        movaps  16(%rsp), %xmm7
@@ -321,5 +321,5 @@
        .byte   0xb
        .align 8
 .LEFDE7:
-       .ident  "GCC: (GNU) 5.0.0 20141002 (experimental) [trunk revision 215797]"
+       .ident  "GCC: (GNU) 4.9.2 20141001 (prerelease) [gcc-4_9-branch revision 215749]"
        .section        .note.GNU-stack,"",@progbits

So, gcc-5.0 does not detect that %rdi is clobbered in ELF ABI.

The runtime failure happens only on CentOS5 (and not in Fedora20), which supports findings in Comment #17.

The difference is, that previously we emit memcpy call as:

#(call_insn:TI 24 23 27 3 (set (reg:DI 0 ax)
#        (call (mem:QI (symbol_ref:DI ("memcpy") [flags 0x41] <function_decl 0x7fce6f586438 memcpy>) [0 memcpy S1 A8])
#            (const_int 0 [0]))) pr57003.c:32 661 {*call_value}
#     (expr_list:REG_DEAD (reg:DI 5 di)
#        (expr_list:REG_DEAD (reg:DI 4 si)
#            (expr_list:REG_DEAD (reg:DI 1 dx)
#                (expr_list:REG_UNUSED (reg:DI 0 ax)
#                    (expr_list:REG_RETURNED (reg/v/f:DI 2 cx [orig:87 e ] [87])
#                        (expr_list:REG_CALL_DECL (symbol_ref:DI ("memcpy") [flags 0x41] <function_decl 0x7fce6f586438 memcpy>)
#                            (expr_list:REG_EH_REGION (const_int 0 [0])
#                                (nil))))))))
#    (expr_list (clobber (reg:TI 52 xmm15))
#        (expr_list (clobber (reg:TI 51 xmm14))
#            (expr_list (clobber (reg:TI 50 xmm13))
#                (expr_list (clobber (reg:TI 49 xmm12))
#                    (expr_list (clobber (reg:TI 48 xmm11))
#                        (expr_list (clobber (reg:TI 47 xmm10))
#                            (expr_list (clobber (reg:TI 46 xmm9))
#                                (expr_list (clobber (reg:TI 45 xmm8))
#                                    (expr_list (clobber (reg:TI 28 xmm7))
#                                        (expr_list (clobber (reg:TI 27 xmm6))
#                                            (expr_list (clobber (reg:DI 5 di))
#                                                (expr_list (clobber (reg:DI 4 si))
#                                                    (expr_list:DI (set (reg:DI 0 ax)
#                                                            (reg:DI 5 di))
#                                                        (expr_list:DI (use (reg:DI 5 di))
#                                                            (expr_list:DI (use (reg:DI 4 si))
#                                                                (expr_list:DI (use (reg:DI 1 dx))
#                                                                    (nil))))))))))))))))))

which is alternate, but equivalent form of what was generated previously:

#(call_insn:TI 24 23 27 3 (parallel [
#            (set (reg:DI 0 ax)
#                (call (mem:QI (symbol_ref:DI ("memcpy") [flags 0x41] <function_decl 0x7fd91824a800 memcpy>) [0 memcpy S1 A8])
#                    (const_int 0 [0])))
#            (unspec [
#                    (const_int 0 [0])
#                ] UNSPEC_MS_TO_SYSV_CALL)
#            (clobber (reg:DI 4 si))
#            (clobber (reg:DI 5 di))
#            (clobber (reg:TI 27 xmm6))
#            (clobber (reg:TI 28 xmm7))
#            (clobber (reg:TI 45 xmm8))
#            (clobber (reg:TI 46 xmm9))
#            (clobber (reg:TI 47 xmm10))
#            (clobber (reg:TI 48 xmm11))
#            (clobber (reg:TI 49 xmm12))
#            (clobber (reg:TI 50 xmm13))
#            (clobber (reg:TI 51 xmm14))
#            (clobber (reg:TI 52 xmm15))
#        ]) pr57003.c:32 652 {*call_value_rex64_ms_sysv}
#     (expr_list:REG_DEAD (reg:DI 5 di)
#        (expr_list:REG_DEAD (reg:DI 4 si)
#            (expr_list:REG_DEAD (reg:DI 1 dx)
#                (expr_list:REG_RETURNED (reg/v/f:DI 2 cx [orig:87 e ] [87])
#                    (expr_list:REG_EH_REGION (const_int 0 [0])
#                        (nil))))))
#    (expr_list:DI (set (reg:DI 0 ax)
#            (reg:DI 5 di))
#        (expr_list:DI (use (reg:DI 5 di))
#            (expr_list:DI (use (reg:DI 4 si))
#                (expr_list:DI (use (reg:DI 1 dx))
#                    (nil))))))

It looks that Jakub's patch, proposed in Comment #21 doesn't cover alternative form, so it doesn't record clobbers in the alternative form properly.

Reopened as 5.0 regression.

Comment 24 Uroš Bizjak 2014-10-03 14:24:47 UTC

(In reply to Uroš Bizjak from comment #23)

> The difference is, that previously we emit memcpy call as:

Slip of the tongue, this should read:

... that now we emit memcpy call as:

Comment 25 Uroš Bizjak 2014-10-03 16:30:55 UTC

Patch in testing:

--cut here--
Index: regcprop.c
===================================================================
--- regcprop.c  (revision 215852)
+++ regcprop.c  (working copy)
@@ -1030,6 +1030,12 @@ copyprop_hardreg_forward_1 (basic_block bb, struct
             assume the value in it is still live.  */
          if (ksvd.ignore_set_reg)
            note_stores (PATTERN (insn), kill_clobbered_value, vd);
+         for (exp = CALL_INSN_FUNCTION_USAGE (insn); exp; exp = XEXP (exp, 1))
+           {
+             rtx x = XEXP (exp, 0);
+             if (GET_CODE (x) == CLOBBER)
+               kill_value (SET_DEST (x), vd);
+           }
        }
 
       /* Notice stores.  */
--cut here--

Comment 26 Uroš Bizjak 2014-10-03 17:55:28 UTC

Additional patch at [1]

[1] https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00332.html

Comment 27 Jakub Jelinek 2014-10-03 18:08:29 UTC

Shouldn't it be guarded by if (ksvd.ignored_set_reg) too?

Comment 28 Uroš Bizjak 2014-10-03 18:10:41 UTC

(In reply to Jakub Jelinek from comment #27)
> Shouldn't it be guarded by if (ksvd.ignored_set_reg) too?

Yes, and the published patch implements just that.

Comment 29 uros 2014-10-09 06:36:40 UTC

Author: uros
Date: Thu Oct  9 06:36:08 2014
New Revision: 216026

URL: https://gcc.gnu.org/viewcvs?rev=216026&root=gcc&view=rev
Log:
	PR rtl-optimization/57003
	* regcprop.c (copyprop_hardreg_forward_1): If ksvd.ignore_set_reg,
	also check CALL_INSN_FUNCTION_USAGE for clobbers again after
	killing regs_invalidated_by_call.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/regcprop.c

Comment 30 uros 2014-10-09 08:22:54 UTC

Author: uros
Date: Thu Oct  9 08:22:23 2014
New Revision: 216030

URL: https://gcc.gnu.org/viewcvs?rev=216030&root=gcc&view=rev
Log:
	Backport from mainline
	2014-10-09  Uros Bizjak  <ubizjak@gmail.com>

	PR rtl-optimization/57003
	* regcprop.c (copyprop_hardreg_forward_1): If ksvd.ignore_set_reg,
	also check CALL_INSN_FUNCTION_USAGE for clobbers again after
	killing regs_invalidated_by_call.


Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/regcprop.c

Comment 31 uros 2014-10-09 09:06:09 UTC

Author: uros
Date: Thu Oct  9 09:05:37 2014
New Revision: 216035

URL: https://gcc.gnu.org/viewcvs?rev=216035&root=gcc&view=rev
Log:
	Backport from mainline
	2014-10-09  Uros Bizjak  <ubizjak@gmail.com>

	PR rtl-optimization/57003
	* regcprop.c (copyprop_hardreg_forward_1): If ksvd.ignore_set_reg,
	also check CALL_INSN_FUNCTION_USAGE for clobbers again after
	killing regs_invalidated_by_call.


Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/regcprop.c

Comment 32 Uroš Bizjak 2014-10-09 13:26:22 UTC

Again fixed everywhere.