[PATCH] Fix *_BY_PIECES_P

Roger Sayle roger@eyesopen.com
Mon Aug 9 16:23:00 GMT 2004


On Mon, 9 Aug 2004, Jakub Jelinek wrote:
> I have added a 5 insn cap for bzero on i386, because GCC doesn't optimize
> many identical constant arguments by loading the constant into a register
> and using that register in the instructions.
> For very small number of instructions that isn't worthwhile, but e.g.
> movl $0, (%edi)
> movl $0, 4(%edi)
> movl $0, 8(%edi)
> movl $0, 12(%edi)
> movl $0, 16(%edi)
> movl $0, 20(%edi)
> movl $0, 24(%edi)
> movl $0, 28(%edi)
> is already much bigger and also slightly slower on e.g. P4 than:
> xorl %eax, %eax
> movl %eax, (%edi)
> movl %eax, 4(%edi)
> movl %eax, 8(%edi)
> movl %eax, 12(%edi)
> movl %eax, 16(%edi)
> movl %eax, 20(%edi)
> movl %eax, 24(%edi)
> movl %eax, 28(%edi)

I think this is a red herring.  Yes, GCC doesn't un-CSE the zero constant
into its own register, but the CLEAR_RATIO threshold decides between using
the inefficient inline sequence and a clrmem or libcall.  If the top sequence
of eight instructions is faster than a call to memset or a stos* sequence,
then CLEAR_RATIO should be higher than six.
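
To make the threshold concrete, here is a rough sketch in C of the
comparison involved.  It is a simplified model under stated assumptions,
not the code in expr.c: count_piece_insns and clear_by_pieces_p are
hypothetical names, alignment is ignored, pieces are assumed to be powers
of two strictly smaller than max_size, and the inline form is assumed to
be chosen when the store count is below the ratio.

```c
/* Hypothetical, simplified stand-in for move_by_pieces_ninsns.
   Counts the stores needed to clear LEN bytes, widest piece first.
   Pieces are powers of two strictly smaller than MAX_SIZE (so a
   caller would pass STORE_MAX_PIECES + 1).  Alignment is ignored,
   and MAX_SIZE is assumed to be at least 2.  */
static unsigned int
count_piece_insns (unsigned int len, unsigned int max_size)
{
  unsigned int n_insns = 0;
  unsigned int piece = 1;

  /* Find the widest power-of-two piece below MAX_SIZE.  */
  while (piece * 2 < max_size)
    piece *= 2;

  /* Use as many of each piece width as fit, then halve the width.  */
  for (; piece >= 1 && len > 0; piece /= 2)
    {
      n_insns += len / piece;
      len %= piece;
    }

  return n_insns;
}

/* Hypothetical model of the CLEAR_BY_PIECES_P decision: clear inline
   only if the store count is below CLEAR_RATIO (an assumption about
   the comparison); otherwise a clrmem pattern or libcall is used.  */
static int
clear_by_pieces_p (unsigned int len, unsigned int max_size,
                   unsigned int clear_ratio)
{
  return count_piece_insns (len, max_size) < clear_ratio;
}
```

Under this model, with 4-byte pieces (max_size of 5) and a ratio of 6, a
32-byte clear needs eight stores and so goes to the clrmem or libcall
path, while a 20-byte clear needs five and stays inline.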



> 2004-08-09  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR target/3144
> 	* expr.c (move_by_pieces_ninsns): Add max_size argument.
> 	(MOVE_BY_PIECES_P): Pass MOVE_MAX_PIECES + 1 to it.
> 	(CLEAR_BY_PIECES_P): Pass STORE_MAX_PIECES + 1 to it.
> 	(STORE_BY_PIECES_P): Define similarly to CLEAR_BY_PIECES_P,
> 	but using MOVE_RATIO.
> 	(move_by_pieces): Pass max_size to move_by_pieces_ninsns.
> 	(can_store_by_pieces): Change max_size type to unsigned int.
> 	(store_by_pieces_1): Likewise.  Pass max_size to
> 	move_by_pieces_ninsns.
> 	* config/s390/s390.h (STORE_BY_PIECES_P): Define.
> 	* config/sh/sh.c (MOVE_BY_PIECES_P): Pass MOVE_MAX_PIECES + 1
> 	to move_by_pieces_ninsns.
> 	(STORE_BY_PIECES_P): Define.
> 	* config/ns32k/ns32k.h (STORE_BY_PIECES_P): Pass STORE_MAX_PIECES + 1
> 	to move_by_pieces_ninsns.
> 	* doc/tm.texi (STORE_BY_PIECES_P): Document changed default.
>
> 	* config/i386/i386.h (CLEAR_RATIO): Define.


Ok for mainline, with the fix to the PR number pointed out by Giovanni.

Roger
--


