Bug 16427 - dead memset not optimized away
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.0.0
Importance: P2 enhancement
Target Milestone: ---
Assignee: Richard Biener
URL:
Keywords: missed-optimization
Depends on:
Blocks: 36602
Reported: 2004-07-08 01:25 UTC by Andi Kleen
Modified: 2016-12-15 21:05 UTC

See Also:
Host:
Target: x86_64-linux
Build:
Known to work:
Known to fail:
Last reconfirmed: 2010-06-09 11:59:50


Description Andi Kleen 2004-07-08 01:25:16 UTC
gcc version 3.5.0 20040704 (experimental)

#include <string.h>

void f(void)
{
        unsigned long x[16];
        memset(x, 0, sizeof(x));
}

generates 

f:
.LFB2:
        subq    $136, %rsp
.LCFI0:
        movl    $128, %edx
        xorl    %esi, %esi
        movq    %rsp, %rdi
        call    memset
        addq    $136, %rsp
        ret

but when the 0 is changed to 0xff or any other value != 0 you get

f:
.LFB2:
        subq    $16, %rsp
.LCFI0:
        addq    $16, %rsp
        ret

(which btw is also weird because it leaks 16 bytes, but that's a different issue)

I would expect the 0 memset to be optimized away too.

Looking at the code, builtin_memset calls clear_storage() for the 0 case
and store_by_pieces() directly for any other value.
Clearly something clear_storage() does is giving the optimizer fits.
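
For side-by-side comparison, the two cases described above can be put in one file, together with a control where the buffer is actually read. This is only a sketch: the names f_zero, f_ff and live_zero are made up for illustration and do not appear in the report. Compiling with something like "gcc -O2 -S" and comparing the bodies of f_zero and f_ff should show whether the zero-fill case still emits a call.

```c
#include <string.h>

/* Dead store, zero fill: the case reported here as not optimized away. */
void f_zero(void)
{
    unsigned long x[16];
    memset(x, 0, sizeof(x));
}

/* Dead store, non-zero fill: the case reported as already eliminated. */
void f_ff(void)
{
    unsigned long x[16];
    memset(x, 0xff, sizeof(x));
}

/* Control (hypothetical, not from the report): the buffer is read after
   the memset, so the store is live and its effect must be preserved. */
unsigned long live_zero(void)
{
    unsigned long x[16];
    unsigned long sum = 0;
    size_t i;

    memset(x, 0, sizeof(x));
    for (i = 0; i < 16; i++)
        sum += x[i];
    return sum;
}
```

Only live_zero has an observable result. The other two functions have no visible effect, so a compiler is free to reduce each of them to a bare return.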
Comment 1 Andrew Pinski 2004-07-08 01:31:55 UTC
Oh that is not leaking.

The real issue is that the memset is not removed at the tree level.
Comment 2 Andrew Pinski 2008-06-22 21:54:27 UTC
Related to bug 36602.
Comment 3 Andi Kleen 2010-06-09 11:09:49 UTC
Jakub, for this example: how would you suggest to work around this warning?
Comment 4 Richard Biener 2010-06-09 11:59:49 UTC
It's now optimized by RTL DSE but we keep the stack allocated.

Re-confirmed on the tree-level.  Should be easy to extend DSE to handle this.
Comment 5 Richard Biener 2011-03-09 14:56:26 UTC
Related to this is

struct X { int i; int j; int k; };
void foo (void)
{
  struct X a, b;
  __builtin_memcpy (&a, &b, 4);
}

where we are unable to DCE the memcpy call.

Both issues should be tackled at tree DCE level by better handling of
aliased (local) variables.  Needs the same infrastructure changes as
PR41490.
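
A possible way to check the memcpy case is to pair the dead copy with a control whose result is used, then look at the output of -fdump-tree-optimized. The sketch below assumes nothing beyond the test case above. The names foo_dead and foo_live are hypothetical, and the initializer values are arbitrary.

```c
#include <string.h>

struct X { int i; int j; int k; };

/* Dead copy, as in the test case above: neither a nor b is used
   afterwards, so the call has no observable effect. */
void foo_dead(void)
{
    struct X a, b;
    __builtin_memcpy(&a, &b, 4);
}

/* Control (hypothetical): the first member of a is read after the copy,
   so the memcpy is live and its effect must be preserved. */
int foo_live(void)
{
    struct X a = { 0, 0, 0 };
    struct X b = { 42, 1, 2 };

    memcpy(&a, &b, sizeof(int));
    return a.i;
}
```

If the dead copy is removed at the tree level, foo_dead should show up in the dump as an empty function body, while foo_live still has to produce the copied value.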
Comment 6 Jeffrey A. Law 2016-12-15 21:05:02 UTC
Both cases compile down to simple returns at the tree level now.  I did not bother to bisect precisely which changes are responsible for fixing this BZ.