This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/58897] New: Improve 128/64 division
- From: "glisse at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sun, 27 Oct 2013 21:35:29 +0000
- Subject: [Bug target/58897] New: Improve 128/64 division
- Auto-submitted: auto-generated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58897
Bug ID: 58897
Summary: Improve 128/64 division
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: glisse at gcc dot gnu.org
Target: x86_64-linux-gnu
typedef unsigned __int128 ui;
ui f(ui a, unsigned long b){
  return a/b;
}
is compiled to a library call to __udivti3, which is implemented as a rather
long loop. However, it seems to me that two divq instructions should do it (and
sometimes only one, if range information shows that the quotient fits in 64 bits).
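
For illustration, the two-divq idea can be sketched as follows. This is only a sketch, not what GCC or libgcc does: the names div_128_by_64 and f_two_div are hypothetical, uint64_t stands in for unsigned long on x86_64-linux-gnu, and the inline assembly assumes GNU C on x86-64 (other targets fall back to plain __int128 arithmetic). The first division handles the high word. Its remainder is smaller than b, so the second divq on remainder:low cannot overflow.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 ui;

/* Hypothetical helper: divide the 128-bit value hi:lo by b.
   Requires hi < b so that the quotient fits in 64 bits; divq faults otherwise. */
static uint64_t div_128_by_64(uint64_t hi, uint64_t lo, uint64_t b, uint64_t *rem)
{
    assert(hi < b);
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t q, r;
    /* divq divides rdx:rax by the operand; quotient in rax, remainder in rdx. */
    __asm__("divq %4" : "=a"(q), "=d"(r) : "0"(lo), "1"(hi), "r"(b) : "cc");
    *rem = r;
    return q;
#else
    /* Portable fallback so the sketch still builds on other 64-bit targets. */
    ui n = ((ui)hi << 64) | lo;
    *rem = (uint64_t)(n % b);
    return (uint64_t)(n / b);
#endif
}

/* Same result as a/b for b != 0, using two 128-by-64 divisions. */
ui f_two_div(ui a, uint64_t b)
{
    uint64_t a_hi = (uint64_t)(a >> 64);
    uint64_t a_lo = (uint64_t)a;
    uint64_t r;
    uint64_t q_hi = div_128_by_64(0, a_hi, b, &r); /* high word: a_hi / b */
    uint64_t q_lo = div_128_by_64(r, a_lo, b, &r); /* r < b, so no overflow */
    return ((ui)q_hi << 64) | q_lo;
}
```

If a_hi < b is known in advance (the range information mentioned above), the first division yields a zero quotient and can be skipped, which leaves a single divq.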
Ideally the following would eventually compile to just mul+div, but that's
probably too complicated for now.
unsigned long prod(unsigned long a, unsigned long b, unsigned long m){
  if (a >= m || b >= m) __builtin_unreachable ();
  return ((unsigned __int128) a * b) % m;
}
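
A hand-written version of the mul+div sequence might look like the sketch below. The name prod_mul_div is hypothetical, uint64_t stands in for unsigned long, and the inline assembly again assumes GNU C on x86-64. The single divq is safe because a < m and b < m imply a*b < m*2^64, so the high half of the product is smaller than m and the quotient fits in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: (a * b) % m with one widening multiply and one divq.
   Requires a < m and b < m, as in prod() above. */
uint64_t prod_mul_div(uint64_t a, uint64_t b, uint64_t m)
{
    assert(a < m && b < m);
    /* 64x64 -> 128 multiply; GCC emits a single mul for this on x86-64. */
    unsigned __int128 p = (unsigned __int128)a * b;
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t lo = (uint64_t)p;
    uint64_t hi = (uint64_t)(p >> 64); /* hi < m, so divq cannot overflow */
    uint64_t q, r;
    /* divq divides rdx:rax by the operand; remainder ends up in rdx. */
    __asm__("divq %4" : "=a"(q), "=d"(r) : "0"(lo), "1"(hi), "r"(m) : "cc");
    (void)q; /* only the remainder is needed */
    return r;
#else
    /* Portable fallback for other targets with __int128 support. */
    return (uint64_t)(p % m);
#endif
}
```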