This is the mail archive of the
java@gcc.gnu.org
mailing list for the Java project.
RE: Fibonacci and performance
- To: "'tromey at redhat dot com'" <tromey at redhat dot com>, Jeff Sturm <jsturm at one-point dot com>
- Subject: RE: Fibonacci and performance
- From: "Boehm, Hans" <hans_boehm at hp dot com>
- Date: Tue, 1 May 2001 09:57:28 -0700
- Cc: "Boehm, Hans" <hans_boehm at hp dot com>, "'green at redhat dot com'" <green at redhat dot com>, Java Discuss List <java at gcc dot gnu dot org>
> From: Tom Tromey [mailto:tromey@redhat.com]
>
> Hans> Isn't there a fundamental problem here? The testing of the
> Hans> "initialized" flag may not be reordered with respect to variable
> Hans> references, and hence should be treated as volatile (and
> Hans> possibly requiring a memory barrier) by the back end. But if
> Hans> it's treated as volatile, you can't optimize out redundant
> Hans> tests?
>
> Jeff> I think AG is describing something else. His method-local flag
> Jeff> serves to eliminate redundant calls to _Jv_InitClass at compile
> Jeff> time.
I'm confused. Is the method-local flag static? I had assumed so.
If not, then the reordering issues go away, but you've only solved part of
the problem (which is fine for now).
>
Tom> But Hans is saying that in AG's scenario the check of the method-local
Tom> flag could be reordered with respect to access to class variables,
Tom> unless the flag is volatile.
Tom>
Tom> I'm not sure this can actually happen. The test will look like:
Tom>
Tom> if (! method-local-flag) { _Jv_InitClass (...); m-l-f = true; }
Tom>
Tom> Will the compiler really pull a class variable access before this? I
Tom> don't see how it could do that. It seems to me that if the compiler
Tom> could do that then our current approach of always calling
Tom> _Jv_InitClass is also broken.
After I posted the earlier message, I temporarily arrived at the same
conclusion as Tom just did. But after thinking about it some more, I think
the compiler reordering may not happen yet in gcc, but there are other
compilers that might already do so, maybe even with good reason. And gcc is
likely to start doing this sort of thing.
Let s_f be a static field, and x be a local variable.
I claim that (at least on something with more than 8 registers) the compiler
should usually transform

    if (! m_l_f) { _Jv_InitClass (...); m_l_f = true; }
    x = s_f;
    ... x ...

to something like

    x = s_f;
    if (! m_l_f) { _Jv_InitClass (...); m_l_f = true; x = s_f; }
    ... x ...

This hides more of the potential load latency for s_f along the fast path.
(Though both loads should happen before the conditional, I suspect that
usually the actual load of m_l_f should still precede the load of s_f in the
final schedule, since it's used first. But since they're now in the same
basic block, and there is no dependency, I'd be unwilling to bet on that.
The outcome may depend on other considerations.)
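
The two orderings above can be written out as a small C sketch. The helpers
init_class, init_calls and reset are hypothetical and exist only for
illustration; init_class stands in for _Jv_InitClass, and the value 42 is
arbitrary. In a single thread both forms return the same value, which is what
would make the transformation legal for a compiler that sees no other threads.

```c
#include <stdbool.h>

static int s_f;          /* static field, set by class initialization */
static bool m_l_f;       /* static "class already initialized" flag */
static int init_calls;   /* hypothetical counter, for illustration only */

/* Hypothetical stand-in for _Jv_InitClass: runs the static initializer. */
static void init_class(void) { s_f = 42; init_calls++; }

/* Original order: test the flag first, then load the field. */
static int read_original(void) {
    if (!m_l_f) { init_class(); m_l_f = true; }
    int x = s_f;
    return x;
}

/* Transformed order: load the field speculatively, reload on the slow path. */
static int read_hoisted(void) {
    int x = s_f;                       /* may see the uninitialized value */
    if (!m_l_f) { init_class(); m_l_f = true; x = s_f; }
    return x;
}

/* Hypothetical helper: put the flag and field back in the initial state. */
static void reset(void) { s_f = 0; m_l_f = false; init_calls = 0; }
```

The hazard only appears when another thread performs the initialization: in
read_hoisted the load of s_f is no longer guarded by the test of m_l_f, so the
stale value can be used if the flag is observed as set afterwards.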
Probably the more important issue is hardware reordering. Treating m_l_f as
volatile deals with that issue on Itanium, but it's not sufficient on Alpha.
None of this applies if m_l_f is an automatic variable.
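
One way to state the required ordering explicitly is with C11 acquire/release
atomics, which postdate this discussion. This is a minimal sketch under stated
assumptions, not what gcj generates: class_initialized plays the role of a
static m_l_f, static_field plays the role of s_f, and run_static_initializer
is a hypothetical stand-in for _Jv_InitClass. It also assumes that concurrent
initializers are serialized elsewhere, as the real runtime would presumably
have to do.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int static_field;               /* plays the role of s_f */
static atomic_bool class_initialized;  /* plays the role of a static m_l_f */

/* Hypothetical stand-in for _Jv_InitClass; 42 is an arbitrary value. */
static void run_static_initializer(void) { static_field = 42; }

static int read_static_field(void) {
    /* Acquire load: the later load of static_field may not be moved
       before this test, by the compiler or by the hardware. */
    if (!atomic_load_explicit(&class_initialized, memory_order_acquire)) {
        run_static_initializer();
        /* Release store: the initializer's writes become visible before
           another thread can observe the flag as set. */
        atomic_store_explicit(&class_initialized, true, memory_order_release);
    }
    return static_field;
}
```

With plain volatile accesses the compiler-level reordering is constrained, but
nothing in the C language then asks for a hardware barrier; the acquire and
release orderings are what request one where the target needs it.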
Hans