This is the mail archive of the
mailing list for the GCC project.
Re: -fno-tree-cselim not working?
- From: Andi Kleen <andi at firstfloor dot org>
- To: "Richard Guenther" <richard dot guenther at gmail dot com>
- Cc: gcc at gcc dot gnu dot org
- Date: Fri, 26 Oct 2007 21:56:00 +0200
- Subject: Re: -fno-tree-cselim not working?
"Richard Guenther" <email@example.com> writes:
> I hope we're not trying to support such w/o volatile counter. Whatever
> POSIX says, this would pessimize generic code too much.
It is dubious whether this transformation is an optimization at all for memory.
E.g. consider the case where counter is not in cache.
You'll add a cache miss, which will be 2-3 orders of magnitude
more costly than what you can save by not jumping. Full cache misses are so
expensive that even when they happen rarely they still hurt a lot.
There might be a case for doing it on memory when you can pretty much
guarantee the variable is in L1 (e.g. it is in the stack frame and
the stack frame is very small) or when it lives only in a register.
But for other cases it's likely better to not do it at all.
BTW there is a cache-friendly (and incidentally thread-safe) alternative way
to eliminate the jump when the CPU has CMOV available.
You can use
    int dummy = 0;       // on stack, likely in L1
    int *ptr = &dummy;
    if (cond)            // can be implemented jumpless using CMOV
        ptr = &counter;
    (*ptr)++;            // the store only touches counter when cond is true
This will take more registers though.
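
To make the comparison concrete, here is a minimal sketch of the three variants side by side. The function names and the file-scope `counter` are hypothetical, chosen only for illustration, and `bump_speculative` is just an approximation in C of what the transformation does, not the actual GCC output. Whether the compiler really emits CMOV for `bump_select` depends on the target, compiler version and flags.

```c
#include <stdbool.h>

/* Hypothetical shared counter, standing in for the `counter` in this thread. */
static int counter;

/* Original form: counter is only loaded and stored when cond is true. */
void bump_branch(bool cond)
{
    if (cond)
        counter++;
}

/* Rough C rendering of the speculative form: counter is loaded and
 * stored unconditionally. This touches the cache line even when cond
 * is false, and the write-back can clobber another thread's update. */
void bump_speculative(bool cond)
{
    int tmp = counter;
    if (cond)
        tmp++;
    counter = tmp;
}

/* Pointer-select form: the store is unconditional, but it goes to a
 * stack dummy unless cond is true, so counter is never touched on the
 * false path. The pointer selection is a candidate for CMOV. */
void bump_select(bool cond)
{
    int dummy = 0;          /* on stack, likely in L1 */
    int *ptr = &dummy;
    if (cond)
        ptr = &counter;
    (*ptr)++;
}
```

All three give the same result single-threaded; they differ only in which memory gets touched when cond is false.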