This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.


Re: Fwd: vectorization with negative offsets


On 8/30/2013 2:00 PM, Nagaraju Mekala wrote:
I am working with gcc-4.7.2 on x86. I found that gcc is not able to
vectorize small, simple loops with negative offsets.


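Assume ntimes and LEN are constants. In my case both are 200000.

/* a and b are assumed to be global float arrays of length LEN */
float a[LEN], b[LEN];

int abc()
{
  for (int nl = 0; nl < ntimes*3; nl++) {
    /* inner loop walks the arrays from the last element down to the first */
    for (int i = LEN - 1; i >= 0; i--) {
      a[i] = b[i] + (float) 1.;
    }
  }
  return 0;
}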

If we modify the above code as below, GCC vectorizes it.

int abc()
{
  for (int nl = 0; nl < ntimes*3; nl++) {
    /* same computation, but the index now runs upward */
    for (int i = 0; i <= LEN - 1; i++) {
      a[i] = b[i] + (float) 1.;
    }
  }
  return 0;
}

Can anyone explain why GCC is not able to vectorize the first version?

Thanks,
Nagaraju

I would call this negative stride. I don't see any offsets in your example. If this is 32-bit mode, you must be specifying a -march or you would not have vectorization even with the loop reversed (and you must be specifying -std=c99 or a C++ mode).

This is well known. There are compilers, e.g. Sun Studio, which will vectorize negative 1 stride, but without the peeling for alignment which is desired for several common platforms (including the defaults for x86_64 and older vector ones for 387). I don't know how fundamental this is to gcc. For targets with gather instructions (not supported by gcc-4.7), those can be used to vectorize without reversing the loop (not necessarily with full efficiency).

Your example also brings up the question of whether you expect the compiler to eliminate duplicated code automatically (the compiler may see that your outer loop can be collapsed to a single iteration).
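
As an illustration of the loop reversal discussed above, here is a minimal sketch in C. The function names, the restrict qualifiers, the small LEN and the self-check helpers are assumptions made for this example and are not from the original post. The point is only that each iteration touches a single element i, so when the arrays do not overlap, the descending (stride -1) loop can be rewritten by hand as an ascending (stride +1) loop with the same result.

```c
#include <assert.h>

#define LEN 1024  /* hypothetical size; the original post used 200000 */

/* Descending form: the index runs from n-1 down to 0 (stride -1). */
void add_one_descending(float *restrict dst, const float *restrict src, int n)
{
    for (int i = n - 1; i >= 0; i--)
        dst[i] = src[i] + 1.0f;
}

/* Ascending form: the index runs from 0 up to n-1 (stride +1).
   Equivalent to the loop above because iterations are independent,
   assuming dst and src do not overlap (stated here with restrict). */
void add_one_ascending(float *restrict dst, const float *restrict src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = src[i] + 1.0f;
}

/* Returns 1 if both loop forms produce identical output, 0 otherwise. */
int loops_agree(void)
{
    static float src[LEN], down[LEN], up[LEN];

    for (int i = 0; i < LEN; i++)
        src[i] = (float)i * 0.5f;

    add_one_descending(down, src, LEN);
    add_one_ascending(up, src, LEN);

    for (int i = 0; i < LEN; i++)
        if (down[i] != up[i])
            return 0;
    return 1;
}

/* Aborts if the two forms ever disagree. */
void self_check(void)
{
    assert(loops_agree());
}
```

With gcc-4.7, compiling with something like gcc -std=c99 -O3 -ftree-vectorizer-verbose=2 (plus a suitable -march or -msse2 in 32-bit mode) should report which of the two loops was vectorized. Newer gcc releases report this through -fopt-info-vec instead.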

--
Tim Prince

