Bug 11877 - gcc should use xor trick with -Os
Status: ASSIGNED
Product: gcc
Classification: Unclassified
Component: target
Version: 3.3.1
Importance: P2 enhancement
Target Milestone: ---
Assigned To: Kazu Hirata
URL: http://gcc.gnu.org/ml/gcc-patches/200...
Keywords: patch
: 23102
Reported: 2003-08-10 15:47 UTC by Debian GCC Maintainers
Modified: 2006-01-05 20:22 UTC (History)

See Also:
Host: i386-linux
Target: i386-linux
Build: i386-linux
Known to work:
Known to fail:
Last reconfirmed: 2005-07-22 20:08:16


Description Debian GCC Maintainers 2003-08-10 15:47:28 UTC
[forwarded from http://bugs.debian.org/204687]

The program 
 
-- 
#include <linux/time.h> 
 
void foo(struct timespec *t, struct timespec *u) 
{ 
        struct timespec zero = {0, 0}; 
 
        *t = zero; 
        *u = zero; 
} 
-- 
 
produces the following code on i386 with -Os 
 
-- 
        .file   "b.c" 
        .text 
.globl foo 
        .type   foo, @function 
foo: 
        pushl   %ebp 
        movl    %esp, %ebp 
        movl    8(%ebp), %eax 
        movl    $0, (%eax) 
        movl    $0, 4(%eax) 
        movl    12(%ebp), %eax 
        movl    $0, (%eax) 
        movl    $0, 4(%eax) 
        leave 
        ret 
        .size   foo, .-foo 
        .ident  "GCC: (GNU) 3.3.1 20030626 (Debian prerelease)" 
-- 
 
It would be much better size-wise if it stored a zero in a register 
and then stored the register into those locations.
Comment 1 Andrew Pinski 2003-08-10 16:03:50 UTC
I can confirm that the xor trick will be a size win.

The movl still happens with the mainline (20030810).
Comment 2 Kazu Hirata 2003-12-31 21:19:28 UTC
(set (mem:DI ...) (const_int 0)) is split into two moves in SImode after
reload.
We could delay the split until after peephole2.
In peephole2, if a scratch reg is available,
load 0 into it with XOR and then copy that reg to two mem:SI locations.

Reduced to:

void
foo (long long *p)
{
  *p = 0;
}

The reduction from 13 bytes down to 7 bytes sounds impressive.
My proposed solution would still leave two XORs, though.
Comment 3 Kazu Hirata 2004-01-04 06:55:49 UTC
Patch posted:

http://gcc.gnu.org/ml/gcc-patches/2004-01/msg00153.html
Comment 4 Andrew Pinski 2004-01-04 07:50:46 UTC
What about expanding (set (mem:DI ...) (const_int 0)) at expand time? This
would expose more optimization opportunities, and the decision would then be
up to other parts of the compiler.
It looks like an easy change to ix86_expand_move.
Also interesting is that this testcase:
void
foo (long *p)
{
  *p = 0;
  p[1] = 0;
}
does not use the xor trick either.
Comment 5 Kazu Hirata 2004-02-01 17:24:57 UTC
Then we need something like an un-cse pass.
Comment 6 Kazu Hirata 2004-03-25 19:16:13 UTC
Even if you split the long move in ix86_expand_move,
the constant 0 is propagated into the two moves.
I guess the right way may be to un-CSE sometime after register allocation.
Comment 7 Andrew Pinski 2005-08-12 05:27:38 UTC
*** Bug 23338 has been marked as a duplicate of this bug. ***
Comment 8 Andrew Pinski 2005-08-12 05:28:32 UTC
PR 23102 is the bug for multiple xors.
Comment 9 Dan Nicolaescu 2006-01-05 20:22:31 UTC
(In reply to comment #7)
> *** Bug 23338 has been marked as a duplicate of this bug. ***
> 

Bug 23338 contained a patch that might fix this issue. Here it is, so
that it can be evaluated.


*** i386.md    08 Aug 2005 16:38:37 -0700    1.652
--- i386.md    11 Aug 2005 11:27:11 -0700    
***************
*** 18874,18881 ****
    [(match_scratch:SI 1 "r")
     (set (match_operand:SI 0 "memory_operand" "")
          (const_int 0))]
!   "! optimize_size
!    && ! TARGET_USE_MOV0
     && TARGET_SPLIT_LONG_MOVES
     && get_attr_length (insn) >= ix86_cost->large_insn
     && peep2_regno_dead_p (0, FLAGS_REG)"
--- 18874,18880 ----
    [(match_scratch:SI 1 "r")
     (set (match_operand:SI 0 "memory_operand" "")
          (const_int 0))]
!   "! TARGET_USE_MOV0
     && TARGET_SPLIT_LONG_MOVES
     && get_attr_length (insn) >= ix86_cost->large_insn
     && peep2_regno_dead_p (0, FLAGS_REG)"