Need C optimization help

Mark Rose markrose@markrose.ca
Sat May 15 20:01:00 GMT 2010


Hi All,

I have a line that gets executed approximately 2 trillion times according to 
valgrind, and I would love any suggestions for speeding it up. I have taken 
over the project from someone else, and my C abilities are only intermediate. 
I've attached a trimmed-down version of the function.

Here are some notes about the code:
* tsMemAlloc and tsMemFree are basically wrappers to malloc/free.
* dsfmt_genrand_close_open() generates a number in the range [0,1)

And notes about the data:
* n_topics is 35 (loaded from a configuration file; changeable)
* n_tokens is about 7M (and increases as the data set grows)
* params->iter is 500
* topics->wcount size is about 190k items
* topics->dcount size is about 275k items
* topics->tcount size is 35 items

I'm compiling on an Opteron with these options: -O3 -std=gnu99 -march=native 
-ffast-math -ftree-loop-distribution -ftree-loop-linear -ftree-interchange 
-floop-block -ftree-vectorizer-verbose=8

I'm using GCC 4.4.3.

The tree vectorizer spits out this note about the loop I'm particularly 
interested in:

for (int k = 1; k < n_topics; k++)
	cdf[k] = cdf[k-1] + (dcount[k+d_offset] + alpha)*(wcount[k+w_offset] + beta)/(tcount[k] + wbeta);
note: not vectorized: data ref analysis failed D.8597_92 = *D.8596_91;

I've done hours of googling, playing with restrict keywords, and splitting the 
cdf[k-1] addition into another loop, and nothing helps. The message itself is 
not very informative, and I can't find a decent explanation anywhere online of 
what it means or how to fix it.
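
For reference, the loop split mentioned above can be sketched roughly as 
follows. This is only an illustration under assumptions: build_cdf is a 
hypothetical helper name, the count arrays are assumed to be int, and alpha, 
beta and wbeta are assumed to be double; none of this is confirmed by the 
attached file. The first pass has no dependency between iterations, while the 
second pass is the serial running sum. Whether GCC 4.4.3 actually vectorizes 
either pass is not something this sketch guarantees.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper; names and types are assumptions, not taken from
 * topicsort.c. Fills cdf[0..n_topics-1] with the running sum of the
 * per-topic terms. */
static void build_cdf(double *restrict cdf,
                      const int *restrict dcount,
                      const int *restrict wcount,
                      const int *restrict tcount,
                      size_t d_offset, size_t w_offset,
                      int n_topics,
                      double alpha, double beta, double wbeta)
{
    assert(n_topics > 0);

    /* Pass 1: each term depends only on index k, so there is no
     * loop-carried dependency here. */
    for (int k = 0; k < n_topics; k++)
        cdf[k] = (dcount[k + d_offset] + alpha)
               * (wcount[k + w_offset] + beta)
               / (tcount[k] + wbeta);

    /* Pass 2: serial prefix sum; cdf[k] needs the finished cdf[k-1]. */
    for (int k = 1; k < n_topics; k++)
        cdf[k] += cdf[k - 1];
}
```

Unlike the original loop, which starts at k = 1 and relies on cdf[0] being 
set beforehand, this sketch also computes the k = 0 term itself.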

Thanks for whatever insight you can give!

-Mark
-------------- next part --------------
A non-text attachment was scrubbed...
Name: topicsort.c
Type: text/x-csrc
Size: 12480 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc-help/attachments/20100515/d26bfa33/attachment.bin>

