This is the mail archive of the
gcc-help@gcc.gnu.org
mailing list for the GCC project.
Re: Bloated Struct Problem
On Fri, Jun 5, 2009 at 7:50 AM, Brian Budge <brian.budge@gmail.com> wrote:
> I think that one famous experiment showcasing a small subset of
> problems is probably not enough to definitively say that we should be
> packing our structs :)  As always, experimentation and profiling on
> your individual program should be done before overriding the compiler
> with a micro-optimization.
Hi Brian,
First, as to your alignment question, the struct size is 20 because I'm
compiling to a 32-bit target. So yes, I expect that your result of 24 is
due to x86-64. With the packed attribute applied, it shrinks to 17.
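For readers who want to reproduce numbers like these, here is a minimal sketch. The struct below is hypothetical (it is not the struct from this thread, which is not shown in this message); it was chosen only because it has 17 bytes of payload, and it assumes a 4-byte int and an 8-byte double. The function name print_sizes is likewise made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record, NOT the struct from the thread: 17 bytes of payload. */
struct rec_default {
    double value;   /* 8 bytes */
    int    id;      /* 4 bytes */
    int    count;   /* 4 bytes */
    char   flag;    /* 1 byte, followed by tail padding */
};

/* Same members, but GCC's packed attribute removes all padding. */
struct rec_packed {
    double value;
    int    id;
    int    count;
    char   flag;
} __attribute__((packed));

/* Print both sizes so the effect of padding is visible on the current target. */
void print_sizes(void)
{
    /* Packing can only remove padding, never add it. */
    assert(sizeof(struct rec_packed) <= sizeof(struct rec_default));
    printf("default: %zu bytes\n", sizeof(struct rec_default));
    printf("packed:  %zu bytes\n", sizeof(struct rec_packed));
}
```

Calling print_sizes() from main should show the padded size growing with the alignment of double: I would expect 20 on 32-bit x86 Linux (for example with gcc -m32, where double is 4-byte aligned inside a struct) and 24 on x86-64, with the packed variant at 17 in both cases. Other ABIs may differ, so treat those figures as expectations to check rather than guarantees.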
Secondly, as to the more general problem, I think the statistics of
most real-world usage will bear out the results of this experiment.
(In other words, I think the default for x86(-64) should be packed,
but I'll leave it to others to have that flame war.) Granted, the
experiment pitted an extremely bus-intensive algorithm against a large
number of starving cores. But it should be clear that, as the number
of cores approaches infinity, the rate of forward progress of any
cache-blowing algorithm approaches a linear relationship with the
frontside bus frequency, and not the number of cores (i.e. utilization
approaches zero). This is why a footprint reduction could ideally
result in a concomitant and reciprocal performance improvement. This
assumes, of course, that there are multiple cores on a single
frontside bus, as opposed to a network of individual cores; such is the
case with most x86 configurations.
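The limit argument above can be sketched with a toy model. This is only an illustration under strong assumptions: every record is fetched over the shared bus exactly once, caches do not help, and the cores scale perfectly until the bus saturates. The function records_per_sec and all of the numbers fed to it are hypothetical, not measurements from the experiment.

```c
#include <assert.h>

/*
 * Toy model: progress is capped by whichever is smaller, the combined
 * compute rate of all cores or the number of records the shared bus can
 * deliver per second.
 */
double records_per_sec(int cores, double per_core_rate,
                       double bus_bytes_per_sec, double record_bytes)
{
    assert(cores > 0 && record_bytes > 0.0);

    double compute_limit = cores * per_core_rate;            /* grows with cores */
    double bus_limit = bus_bytes_per_sec / record_bytes;     /* independent of cores */

    return compute_limit < bus_limit ? compute_limit : bus_limit;
}
```

In this model, once compute_limit exceeds bus_limit, adding cores changes nothing and the rate is inversely proportional to record_bytes. Shrinking a record from 20 to 17 bytes then raises the rate by a factor of 20/17, roughly 18%, which is the "reciprocal" improvement in the ideal case. With a single core and the same inputs, the compute limit dominates and the record size makes no difference, which is why profiling the specific program still matters.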
I can imagine that unless this problem gets solved in the hardware,
algorithms running on centicore chips will actually perform better if
they use full-blown data compression when talking through the
frontside bus.
But yes, in any specific case, I am still in favor of profiling.