what optimization can be expected?
Burlen Loring
burlen.loring@gmail.com
Fri Apr 24 12:24:00 GMT 2009
Tim Prince wrote:
> burlen wrote:
>
>
>> Can loops with a non-unit stride be automagically optimized by the
>> compiler with SSE?
>>
>> template <int nComp>
>> void norm(double *result, double *data, size_t n)
>> {
>>   double *pDat = data;
>>   double *pRes = result;
>>
>>   for (size_t i = 0; i < n; ++i)
>>   {
>>     *pRes = *pDat * *pDat;
>>     for (int j = 1; j < nComp; ++j)
>>     {
>>       *pRes += pDat[j] * pDat[j];
>>     }
>>     *pRes = sqrt(*pRes);
>>
>>     pRes += 1;
>>     pDat += nComp;
>>   }
>> }
>>
>
> Your inner loop appears to have unit stride, and might be optimized easily
> if you didn't write it with potential aliases. If you meant
> inner_product(), why not use that?
>
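For reference, a minimal sketch of what the suggestion above might look
like. The name norm_noalias is hypothetical, and __restrict is a compiler
extension (accepted by g++), not standard C++. The sum is kept in a local
variable and formed with std::inner_product, so the compiler does not have
to assume that a store through the result pointer changes the input data.
Whether a given g++ release then vectorizes the short inner loop is not
guaranteed.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>

// Hypothetical rewrite of norm(): same arithmetic, but without the
// read-modify-write through *pRes on every inner iteration.
template <int nComp>
void norm_noalias(double *__restrict result,
                  const double *__restrict data,
                  std::size_t n)
{
    static_assert(nComp >= 1, "need at least one component");
    // __restrict promises the arrays do not overlap; cheap debug check.
    assert(static_cast<const double *>(result) != data);

    for (std::size_t i = 0; i < n; ++i)
    {
        const double *p = data + i * nComp;  // start of tuple i
        // Dot product of the tuple with itself, accumulated locally.
        const double sum = std::inner_product(p, p + nComp, p, 0.0);
        result[i] = std::sqrt(sum);
    }
}
```

Called as norm_noalias<3>(result, data, n) for n tuples of 3 components
stored back to back in data.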
The inner loop does have unit stride, but it's usually short, between 1 and
12 iterations, while the outer loop is usually long, in the tens to
hundreds of thousands. That example is just one simple situation that I
encounter. I want to understand how the compiler applies SSE optimization.
What can g++ automatically optimize with SSE? Is this documented somewhere?
I want to write code in a way that takes advantage of g++'s capabilities.
It's important for me to let g++ do the optimization because the code
needs to be cross-platform.
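
As a rough illustration of the kind of loop in question (a sketch under
stated assumptions, not a statement of what any particular g++ release
does): auto-vectorizers generally do best with countable loops that access
memory with unit stride and cannot alias. One hypothetical way to get that
shape here is to store each component in its own array, so that the long
loop over i is the unit-stride one. The name norm_soa and the data layout
are assumptions for illustration, and __restrict is a compiler extension.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical structure-of-arrays variant: comp[j] points to the n values
// of component j, so every loop over i reads and writes with unit stride.
template <int nComp>
void norm_soa(double *__restrict result,
              const double *const *comp,
              std::size_t n)
{
    static_assert(nComp >= 1, "need at least one component");
    assert(comp != nullptr);

    // Pass 1: start the sum of squares with component 0.
    const double *__restrict c0 = comp[0];
    for (std::size_t i = 0; i < n; ++i)
        result[i] = c0[i] * c0[i];

    // Pass 2: add the squares of the remaining components.
    for (int j = 1; j < nComp; ++j)
    {
        const double *__restrict c = comp[j];
        for (std::size_t i = 0; i < n; ++i)
            result[i] += c[i] * c[i];
    }

    // Pass 3: square root of each accumulated sum.
    for (std::size_t i = 0; i < n; ++i)
        result[i] = std::sqrt(result[i]);
}
```

To see what g++ actually did, compile with -O3 (which enables
-ftree-vectorize) and -msse2, and ask for a vectorization report:
-ftree-vectorizer-verbose=2 on older 4.x releases, -fopt-info-vec on newer
ones. The sqrt loop may additionally need -fno-math-errno before it is
vectorized.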
> I know gmail is fashionable, but there's plenty of reason for it going in
> the spam box, and no effort at google to improve the situation.
>
Sorry, but that's all I've got at the moment.
Thanks
Burlen
More information about the Gcc-help mailing list