This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug c++/35117] Vectorization on power PC
- From: "victork at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 11 Feb 2008 14:21:36 -0000
- Subject: [Bug c++/35117] Vectorization on power PC
- References: <bug-35117-15734@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #28 from victork at gcc dot gnu dot org 2008-02-11 14:21 -------
> As for the last email, Victor:
> 1. Using a smaller number of iterations doesn't help me. This is not what the
> real-world code runs.
It looks like the memory subsystem is the performance bottleneck in your
example, so vectorization alone does not help. You probably need to think about
how to partition your arrays so that the working set fits in the data cache.
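As a rough illustration of partitioning arrays to fit the data cache, here is a
minimal sketch. The two-pass workload, the function names, and the `block`
parameter are all hypothetical and are not taken from the test case in this bug;
a suitable block size depends on the cache of the target machine and has to be
found by measurement.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical workload: two element-wise passes over the same data.
// Unblocked: each pass streams the whole array through the memory subsystem.
void two_passes_unblocked(std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    for (std::size_t i = 0; i < n; ++i) a[i] += b[i];
    for (std::size_t i = 0; i < n; ++i) a[i] *= b[i];
}

// Blocked: run both passes on one block before moving to the next, so the
// second pass finds its data still in cache. `block` (elements per block) is
// an assumed tuning parameter, chosen so that one block of `a` plus one block
// of `b` fits in the data cache.
void two_passes_blocked(std::vector<float>& a, const std::vector<float>& b,
                        std::size_t block) {
    assert(a.size() == b.size());
    assert(block > 0);
    const std::size_t n = a.size();
    for (std::size_t start = 0; start < n; start += block) {
        const std::size_t end = std::min(start + block, n);
        for (std::size_t i = start; i < end; ++i) a[i] += b[i];
        for (std::size_t i = start; i < end; ++i) a[i] *= b[i];
    }
}
```

Both versions perform the same operations on each element in the same order, so
they should produce identical results; only the memory access pattern differs.
The inner loops stay simple, so they remain candidates for vectorization.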
> 2. new/malloc almost didn't do anything, maybe a gain of 20%
With data allocated by malloc, the compiler is able to prove independence
statically. So it would be better to allocate memory with malloc.
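A minimal sketch of what malloc-based allocation could look like follows. The
function name and the kernel are hypothetical and are not the test case from
this bug. The allocation and the loop are kept in one function here on the
assumption that this makes it easiest for the compiler to see that the three
pointers come from separate malloc calls; whether the loop is actually
vectorized depends on the GCC version and on options such as -ftree-vectorize
and -maltivec.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical kernel: computes c[i] = a[i] + b[i] on malloc-allocated arrays
// and returns the sum of c. Returns -1.0f if an allocation fails.
float add_and_sum(std::size_t n) {
    if (n == 0) return 0.0f;

    float* a = static_cast<float*>(std::malloc(n * sizeof(float)));
    float* b = static_cast<float*>(std::malloc(n * sizeof(float)));
    float* c = static_cast<float*>(std::malloc(n * sizeof(float)));
    if (!a || !b || !c) {
        std::free(a);  // free(nullptr) is a no-op, so this is safe
        std::free(b);
        std::free(c);
        return -1.0f;
    }

    for (std::size_t i = 0; i < n; ++i) {
        a[i] = 1.0f;
        b[i] = 2.0f;
    }

    // a, b and c are distinct malloc results, so they cannot overlap.
    // This loop is the candidate for vectorization.
    for (std::size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }

    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) sum += c[i];

    std::free(a);
    std::free(b);
    std::free(c);
    return sum;
}
```

If the pointers are instead passed in from another translation unit, the
compiler generally cannot see where they came from and may fall back to a
runtime overlap check or leave the loop scalar.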
> 3. The difference between 1.738sec and 0.781sec can either be a 2 times
> performance gain or simply a 1 second gain that would remain 1 second for more
> intensive calculations. Therefore I can't use/rely on the test you did.
See the example in my previous comment. It shows about a 2.4 times performance
gain.
-- Victor
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35117