Bug 17958 - expand_divmod fails to optimize division of 64-bit quantity by small constant when BITS_PER_WORD is 32
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.0.0
Importance: P2 enhancement
Target Milestone: 11.0
Assignee: Dinar Temirbulatov
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2004-10-12 22:48 UTC by Zack Weinberg
Modified: 2021-08-15 12:14 UTC

See Also:
Host:
Target: powerpc-*-*
Build:
Known to work:
Known to fail: 4.8.3, 4.9.3, 5.3.0, 6.0
Last reconfirmed: 2016-01-27 00:00:00


Description Zack Weinberg 2004-10-12 22:48:17 UTC
expand_divmod cannot optimize code such as

long long div10(long long n) { return n / 10; }

when BITS_PER_WORD is 32.  A call to __divdi3 gets generated.  By contrast, when
BITS_PER_WORD is 64, this is optimized to a multiply and a shift.  I noticed
this problem on powerpc32, but it ought to affect any 32-bit target.
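
For illustration, the multiply-and-shift form can be written by hand using only 32x32->64 multiplies, which is what a 32-bit target has available. This is a minimal sketch under stated assumptions, not the sequence GCC emits: the helper names mulhi_u64 and div10_mulshift are hypothetical, and the sign is handled by dividing the magnitude as an unsigned value rather than by the signed-magic-number form a compiler would normally choose. The constant 0xCCCCCCCCCCCCCCCD is ceil(2^67 / 10), so the quotient is the high 64 bits of the product shifted right by 3.

```c
#include <assert.h>
#include <stdint.h>

/* High 64 bits of the 128-bit product a * b, built from four
   32x32->64 partial products (hypothetical helper). */
static uint64_t mulhi_u64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t lo_lo = a_lo * b_lo;
    uint64_t hi_lo = a_hi * b_lo;
    uint64_t lo_hi = a_lo * b_hi;
    uint64_t hi_hi = a_hi * b_hi;

    /* Sum of the three terms that land in bits 32..63; its own
       upper half is the carry into the high word. */
    uint64_t cross = (lo_lo >> 32) + (uint32_t)hi_lo + (uint32_t)lo_hi;

    return hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (cross >> 32);
}

/* n / 10 with C truncation toward zero, without a division
   (hypothetical name; sketch only). */
int64_t div10_mulshift(int64_t n)
{
    /* Magnitude as unsigned; well defined even for INT64_MIN. */
    uint64_t mag = n < 0 ? 0 - (uint64_t)n : (uint64_t)n;

    /* floor(mag / 10) == (mag * ceil(2^67 / 10)) >> 67 */
    uint64_t q = mulhi_u64(mag, UINT64_C(0xCCCCCCCCCCCCCCCD)) >> 3;

    /* q is at most 2^63 / 10, so it fits in int64_t. */
    assert(q <= (uint64_t)INT64_MAX);

    return n < 0 ? -(int64_t)q : (int64_t)q;
}
```

The point of the sketch is that the whole computation needs four word-sized multiplies plus a few adds and shifts, which is typically much cheaper than a library call to __divdi3.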
Comment 1 Andrew Pinski 2004-10-12 22:55:35 UTC
Not all constants are affected, though: powers of 2 are fine. For example, division by 16 gives:
        mr r12,r3
        srawi r11,r3,31
        srawi r9,r11,31
        srawi r10,r11,31
        srwi r10,r9,28
        li r9,0
        addc r12,r10,r4
        adde r11,r9,r3
        srwi r4,r12,4
        insrwi r4,r11,4,0
        srawi r3,r11,4


Confirmed.
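
The power-of-two case in comment 1 follows the usual bias-then-shift pattern: add 15 to negative inputs so that the arithmetic shift rounds toward zero instead of toward negative infinity. A rough C equivalent is sketched below. The name div16_shift is hypothetical, and the code assumes that right-shifting a negative signed value is an arithmetic shift, which is implementation-defined in ISO C but is what GCC does.

```c
#include <stdint.h>

/* n / 16 with truncation toward zero, using only shifts and an add
   (hypothetical name; assumes >> on negative values is arithmetic). */
int64_t div16_shift(int64_t n)
{
    /* n >> 63 is all ones for negative n and zero otherwise;
       the logical shift by 60 keeps the low four bits: 15 or 0. */
    uint64_t bias = (uint64_t)(n >> 63) >> 60;

    /* Adding the bias cannot overflow: it is nonzero only when n < 0. */
    return (n + (int64_t)bias) >> 4;
}
```

Without the bias, -1 >> 4 would yield -1, whereas C requires -1 / 16 == 0.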
Comment 3 Martin Sebor 2016-01-27 22:54:00 UTC
It doesn't look like the patch referenced in comment #2 was ever committed, and the 32-bit code still emits a call to __divdi3, not just on powerpc but also on x86_64.  This affects all still-supported GCC versions.

$ cat ~/tmp/t.c && /build/gcc-trunk/gcc/xgcc -B /build/gcc-trunk/gcc -O2 -S -Wall -Wextra -Wpedantic -m32 -o/dev/stdout ~/tmp/t.c
long long div10(long long n) { return n / 10; }
	.file	"t.c"
	.machine power4
	.globl __divdi3
	.section	".text"
	.align 2
	.p2align 4,,15
	.globl div10
	.type	div10, @function
div10:
	stwu 1,-16(1)
	li 5,0
	mflr 0
	li 6,10
	stw 0,20(1)
	bl __divdi3
	lwz 0,20(1)
	addi 1,1,16
	mtlr 0
	blr
	.size	div10,.-div10
	.ident	"GCC: (GNU) 6.0.0 20160125 (experimental)"
	.section	.note.GNU-stack,"",@progbits
Comment 4 Andrew Pinski 2021-08-15 12:14:19 UTC
Implemented by r11-5533, r11-5614 (PPC improvement), and r11-5648.