Bug 28030 - missed optimization with load in a loop (restrict gets lost)
Status: RESOLVED DUPLICATE of bug 14187
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.2.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: alias, missed-optimization
Depends on:
Blocks:
 
Reported: 2006-06-14 15:03 UTC by Peter Doerfler
Modified: 2008-10-01 14:34 UTC
CC List: 8 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2008-03-14 21:16:58


Description Peter Doerfler 2006-06-14 15:03:14 UTC
The following code is vectorized only if the "size_" member is explicitly copied to the local variable "sz". This is the case for 4.1.1, 4.2.0 rev. 114610, and the autovect branch.

=========================================
template <class T>
class vec 
{
public:
  vec& multiply(const vec& other)
    {
      // do something to make sure restrict is valid...

      const T* __restrict__ op = other.data_;
      T* __restrict__ tp = data_;
      
      unsigned int sz = size_; // NEEDED!
      
      for (unsigned int i=0; i<sz; ++i) {
        tp[i] *= op[i];
      }
      return *this;
    }

private:
  unsigned int size_;
  T* data_;
};

template class vec<int>;
=========================================

Without the local variable I get the following output:

g++ -O3 -ftree-vectorize -ftree-vectorizer-verbose=7 -march=pentium-m -c vectorizer.cpp

vectorizer.cpp:16: note: ===== analyze_loop_nest =====
vectorizer.cpp:16: note: === vect_analyze_loop_form ===
vectorizer.cpp:16: note: split exit edge.
vectorizer.cpp:16: note: === get_loop_niters ===
vectorizer.cpp:16: note: not vectorized: number of iterations cannot be computed.
vectorizer.cpp:16: note: bad loop form.
vectorizer.cpp:16: note: vectorized 0 loops in function.
--------------------------------------------------------------------

It works fine, without the local variable, for T in [float, double, long int].

Using
typedef int aint __attribute__ ((__aligned__(16)));
as in the examples doesn't help.

Replacing the template declaration with int and using the class in main() leads to vectorization of the loop without needing the local variable.
Comment 1 Andrew Pinski 2006-06-14 15:09:03 UTC
I doubt this has anything to do with templates; it more likely has to do with the aliasing info about the load of this->size_.
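
For illustration, a minimal sketch of the two loop forms being compared. The class name vec_sketch, the constructor, the accessor at(), and the two method names are hypothetical and not part of the report. One plausible reading, consistent with the observation that float, double and long int work while int does not: if restrict is not honoured, a store through an int pointer may be assumed to alias the unsigned int member size_, so the bound has to be reloaded on every iteration and the iteration count cannot be computed.

```cpp
// Hypothetical reduction of the reported class. Everything named here
// (vec_sketch, at(), multiply_member_bound, multiply_local_bound) is an
// illustrative assumption, not code from the report.
template <class T>
class vec_sketch
{
public:
  vec_sketch(T* data, unsigned int size) : size_(size), data_(data) {}

  // Loop bound read from the member. If the restrict qualifiers are
  // ignored, the store to tp[i] may be assumed to modify this->size_
  // (for T = int), so size_ is reloaded each iteration.
  vec_sketch& multiply_member_bound(const vec_sketch& other)
  {
    const T* __restrict__ op = other.data_;
    T* __restrict__ tp = data_;
    for (unsigned int i = 0; i < size_; ++i) {
      tp[i] *= op[i];
    }
    return *this;
  }

  // Loop bound copied to a local first: the workaround described in
  // the report. The bound is now loop-invariant by construction.
  vec_sketch& multiply_local_bound(const vec_sketch& other)
  {
    const T* __restrict__ op = other.data_;
    T* __restrict__ tp = data_;
    const unsigned int sz = size_;
    for (unsigned int i = 0; i < sz; ++i) {
      tp[i] *= op[i];
    }
    return *this;
  }

  // Read-only accessor, added only so the sketch can be checked.
  T at(unsigned int i) const { return data_[i]; }

private:
  unsigned int size_;
  T* data_;
};
```

Both methods compute the same result; the only difference is where the loop bound comes from, which is what the vectorizer's "number of iterations cannot be computed" note points at.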
Comment 2 Richard Biener 2008-03-14 21:16:58 UTC
This issue boils down to restrict not working or getting lost.
Comment 3 Andrew Pinski 2008-03-14 21:24:37 UTC
This is most likely a dup of PR 16306.
Comment 4 Richard Biener 2008-10-01 14:34:08 UTC

*** This bug has been marked as a duplicate of 14187 ***