This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug rtl-optimization/48986] Missed optimization in atomic decrement on x86/x64
- From: "jakub at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 16 May 2011 11:39:09 +0000
- Subject: [Bug rtl-optimization/48986] Missed optimization in atomic decrement on x86/x64
- Auto-submitted: auto-generated
- References: <bug-48986-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48986
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
                 CC|                            |uros at gcc dot gnu.org
         AssignedTo|jakub at gcc dot gnu.org    |unassigned at gcc dot
                   |                            |gnu.org
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-05-16 11:26:51 UTC ---
On:
int
foo (int *p)
{
  return __sync_fetch_and_add (p, -1) == 1;
}

int
bar (int *p)
{
  return __sync_add_and_fetch (p, -1) == 0;
}
I get better generated code for the second routine if I do:
--- gcc/config/i386/sync.md.jj	2010-05-21 11:46:29.000000000 +0200
+++ gcc/config/i386/sync.md	2011-05-16 13:06:13.000000000 +0200
@@ -170,7 +170,7 @@
 	  [(match_operand:SWI 1 "memory_operand" "+m")] UNSPECV_XCHG))
    (set (match_dup 1)
 	(plus:SWI (match_dup 1)
-		  (match_operand:SWI 2 "register_operand" "0")))
+		  (match_operand:SWI 2 "nonmemory_operand" "0")))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_XADD"
   "lock{%;} xadd{<imodesuffix>}\t{%0, %1|%1, %0}")
and identical code for foo, so maybe that change is always beneficial, since it
lets the combiner and other early RTL passes see a constant there instead of a REG.
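For reference, the two routines test the same condition: the old value being 1
is the same as the new value being 0. The sketch below restates foo and bar
and adds a helper, same_result, which is a hypothetical name and not part of
the report or of GCC. It only checks the semantics and says nothing about the
code either routine compiles to.

```c
/* The two routines from the report, unchanged.  */
int
foo (int *p)
{
  return __sync_fetch_and_add (p, -1) == 1;
}

int
bar (int *p)
{
  return __sync_add_and_fetch (p, -1) == 0;
}

/* Hypothetical helper, not from the report: run both routines on
   separate copies of the same starting value and return nonzero if
   they agree on the result and on the value left in memory.
   START must be greater than INT_MIN so that START - 1 is defined.  */
int
same_result (int start)
{
  int a = start;
  int b = start;
  int ra = foo (&a);
  int rb = bar (&b);

  return ra == rb && a == b && a == start - 1;
}
```

Calling same_result with any starting value above INT_MIN should return
nonzero, which is why one would hope for the same code for both routines.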
Unfortunately, even with this change the combiner doesn't attempt to combine
this pattern with the following cmpsi_1 pattern, presumably because the
sync_old_addsi pattern isn't single_set. I guess we could handle this during
expansion, but it would be a mess, or in some other pass (e.g. peephole2 or
something similar). peephole2 might be too late though: by that time the
constant must already be loaded into some register, so we'd need to peephole2 3
insns, where the load of the constant might often not be the first one.
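
To illustrate what a combined pattern would presumably look like, here is a
hand-written sketch that decrements and tests the flags directly, with no
xadd and no separate compare. The sequence is an assumption based on the bug
title, not output from any GCC version, and dec_and_test is a hypothetical
name. On targets other than x86 it falls back to the builtin.

```c
/* Hypothetical helper, not from the report: atomically decrement *P
   and return nonzero iff the new value is zero.  */
int
dec_and_test (int *p)
{
#if defined (__GNUC__) && (defined (__i386__) || defined (__x86_64__))
  unsigned char zero;

  /* lock decl sets ZF from the decremented value, and sete copies ZF
     into a byte register, so no constant load or compare is needed.  */
  __asm__ __volatile__ ("lock; decl %0\n\t"
			"sete %1"
			: "+m" (*p), "=q" (zero)
			:
			: "memory", "cc");
  return zero;
#else
  /* Fallback for other targets: same semantics via the builtin.  */
  return __sync_sub_and_fetch (p, 1) == 0;
#endif
}
```

With *p == 1 this returns nonzero and leaves 0 in *p, matching foo and bar
above. For any other starting value it returns 0.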