Traps for signed arithmetic overflow
Helmut Eller
eller.helmut@gmail.com
Fri Nov 23 20:02:00 GMT 2018
Hello,
when compiling this example with gcc -O2 -ftrapv:
    long foo (long x, long y) { return x + y; }

    long bar (long x, long y) {
      long z;
      if (__builtin_add_overflow (x, y, &z))
        __builtin_trap ();
      return z;
    }
then GCC seems to produce less efficient code for foo than for bar:
    foo:
            subq    $8, %rsp
            call    __addvdi3@PLT
            addq    $8, %rsp
            ret
    bar:
            movq    %rdi, %rax
            addq    %rsi, %rax
            jo      .L9
            rep ret
    .L9:
            ud2
I see several inefficiencies:
1.) __addvdi3 is not inlined.
2.) %rsp is adjusted before calling __addvdi3. Why is that needed?
3.) The call to __addvdi3 is evidently not emitted as a sibling call
(tail call), even though -O2 should enable that optimization.
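
For reference, my understanding is that the out-of-line helper does little
more than add the operands and call abort() on overflow. The following is
only a rough sketch of that behaviour in portable C, not the libgcc source;
the names add_overflows and trapping_add are made up for illustration.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not a libgcc function): returns 1 if a + b
   would overflow the range of long, 0 otherwise.  The checks only
   subtract in directions that cannot themselves overflow. */
static int add_overflows (long a, long b)
{
  if (b > 0)
    return a > LONG_MAX - b;   /* sum would exceed LONG_MAX */
  if (b < 0)
    return a < LONG_MIN - b;   /* sum would fall below LONG_MIN */
  return 0;                    /* adding zero never overflows */
}

/* Hypothetical stand-in for what a trapping add does, assuming the
   runtime helper aborts on overflow: check first, then add. */
static long trapping_add (long a, long b)
{
  if (add_overflows (a, b))
    abort ();
  return a + b;
}
```

With that shape in mind, the body is small enough that inlining it as an
add followed by a conditional jump, as in bar, looks like the natural
expansion.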
Where should I start if I wanted to teach GCC to produce the same
code for foo as for bar? Would it be enough to add a pattern to
i386.md? There is already a pattern for "addv<mode>4", but apparently
it is not used in this case.
Helmut