what optimization can be expected?

Burlen Loring burlen.loring@gmail.com
Fri Apr 24 12:24:00 GMT 2009


Tim Prince wrote:
> burlen wrote:
>
>   
>> Can loops with a non-unit stride be automagically optimized by compiler
>> with SSE?
>>
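>> #include <cmath>    // std::sqrt
>> #include <cstddef>  // std::size_t
>>
>> // Euclidean norm of each nComp-component tuple stored contiguously in data.
>> template <int nComp>
>> void norm(double *result, double *data, std::size_t n)
>> {
>>  double *pDat=data;
>>  double *pRes=result;
>>
>>  for (std::size_t i=0; i<n; ++i)
>>  {
>>    *pRes = (*pDat) * (*pDat);
>>    for (int j=1; j<nComp; ++j)
>>    {
>>      *pRes += pDat[j]*pDat[j];
>>    }
>>    *pRes = std::sqrt(*pRes);
>>
>>    pRes += 1;
>>    pDat += nComp;
>>  }
>> }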
>>     
>
> Your inner loop appears to have unit stride, and might be optimized easily
> if you didn't write it with potential aliases.  If you meant
> inner_product(), why not use that?
>   
The inner loop does have unit stride, but its trip count is usually small, 
between 1 and 12, while the outer loop is usually large, in the tens to 
hundreds of thousands of iterations. That example is just one simple 
situation that I encounter. I want to understand how the compiler applies 
SSE optimization. What can g++ automatically optimize with SSE? Is this 
documented somewhere?

I want to write code in a way that takes advantage of g++'s capabilities. 
It's important for me to let g++ do the optimization because the code needs 
to be cross-platform.
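
For illustration, here is a minimal sketch of the same kernel written to 
give the vectorizer fewer obstacles, assuming result and data never 
overlap. The name norm_restrict is hypothetical, and __restrict is a GCC 
extension rather than standard C++. The sum is kept in a local variable so 
the compiler does not have to assume each store through result might change 
data. Whether g++ actually emits SSE for this depends on the compiler 
version and flags. -O3 turns on -ftree-vectorize, and the vectorizer 
reports (-ftree-vectorizer-verbose=N in GCC 4.x, -fopt-info-vec in later 
releases) show which loops were vectorized.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical variant of norm(). __restrict (a GCC extension) promises
// that result and data never overlap; this is an assumption the caller
// must guarantee.
template <int nComp>
void norm_restrict(double *__restrict result,
                   const double *__restrict data,
                   std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
  {
    const double *p = data + i * nComp;  // start of the i-th tuple
    double sum = 0.0;                    // local accumulator, no store per term
    for (int j = 0; j < nComp; ++j)      // trip count known at compile time
    {
      sum += p[j] * p[j];
    }
    result[i] = std::sqrt(sum);
  }
}
```

Called as norm_restrict<3>(result, data, n), this computes the same values 
as the original norm<3> as long as the two arrays are distinct.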

> I know gmail is fashionable, but there's plenty of reason for it going in
> the spam box, and no effort at google to improve the situation.
>   
Sorry, but that's all I've got at the moment.

Thanks

Burlen


