This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/34027] [4.3 regression] -Os code size nearly doubled
- From: "rguenther at suse dot de" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 9 Nov 2007 12:37:32 -0000
- Subject: [Bug tree-optimization/34027] [4.3 regression] -Os code size nearly doubled
- References: <bug-34027-9876@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #5 from rguenther at suse dot de 2007-11-09 12:37 -------
Subject: Re: [4.3 regression] -Os code size nearly doubled
On Fri, 9 Nov 2007, jakub at gcc dot gnu dot org wrote:
> ------- Comment #4 from jakub at gcc dot gnu dot org 2007-11-09 12:30 -------
> So then shouldn't this bug be about:
> unsigned long long
> foo (unsigned long long ns)
> {
>   return ns % 1000000000L;
> }
>
> unsigned long long
> bar (unsigned long long ns)
> {
>   return ns - (ns / 1000000000L) * 1000000000L;
> }
>
> not compiling the same code at -Os? On x86_64 with -O2 it actually produces
> identical code with the subtraction, supposedly that's faster. Guess even
> (ns / 1000000000L) * 1000000000L should be folded into
> ns - (ns % 1000000000L).
With -O2 we express the division by the constant as a multiply/shift
sequence. But at both -Os and -O2 we get the extra multiplication (the
-Os output for bar comes first, then the -O2 output):
bar:
.LFB3:
        movl    $1000000000, %esi
        movq    %rdi, %rax
        xorl    %edx, %edx
        divq    %rsi
        movq    %rdi, %rcx
        imulq   $1000000000, %rax, %rdx
        subq    %rdx, %rcx
        movq    %rcx, %rax
        ret

bar:
.LFB3:
        movq    %rdi, %rdx
        movabsq $19342813113834067, %rax
        shrq    $9, %rdx
        mulq    %rdx
        shrq    $11, %rdx
        imulq   $1000000000, %rdx, %rdx
        subq    %rdx, %rdi
        movq    %rdi, %rax
        ret
because we miss this folding opportunity.
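To make the equivalence concrete, here is a minimal C sketch of the three
forms discussed above: the plain remainder, the subtract-of-multiply form,
and a transcription of the multiply/shift sequence from the second listing.
The function names (foo_mod, bar_sub, bar_magic, all_agree) are hypothetical
and chosen only for illustration, and bar_magic assumes the unsigned __int128
extension available in GCC and Clang on 64-bit targets. It is an illustration
of the arithmetic, not of what the folder does internally.

```c
#include <stdint.h>

/* Divisor used in the bug report. */
#define NS_PER_SEC 1000000000ULL

/* Remainder written directly, as in foo() from comment #4. */
uint64_t foo_mod(uint64_t ns)
{
  return ns % NS_PER_SEC;
}

/* Remainder written as ns - (ns / C) * C, as in bar() from comment #4. */
uint64_t bar_sub(uint64_t ns)
{
  return ns - (ns / NS_PER_SEC) * NS_PER_SEC;
}

/* Step-by-step transcription of the multiply/shift listing.
   Assumption: unsigned __int128 is available (GCC/Clang extension).  */
uint64_t bar_magic(uint64_t ns)
{
  uint64_t n = ns >> 9;                               /* shrq $9, %rdx  */
  unsigned __int128 p =
    (unsigned __int128) n * 19342813113834067ULL;     /* mulq %rdx      */
  uint64_t q = (uint64_t) (p >> 64) >> 11;            /* shrq $11, %rdx */
  return ns - q * NS_PER_SEC;                         /* imulq, subq    */
}

/* Hypothetical helper: nonzero when all three forms agree for ns. */
int all_agree(uint64_t ns)
{
  uint64_t r = foo_mod(ns);
  return r == bar_sub(ns) && r == bar_magic(ns);
}
```

Calling all_agree on a few boundary values (0, 999999999, 1000000000,
UINT64_MAX) is a quick sanity check that the forms compute the same
remainder, which is why the trailing imulq/subq pair is redundant work once
the quotient and remainder could be shared.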
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34027