RFC patch for #pragma ivdep

Tue Oct 8 06:51:00 GMT 2013

Attached is an early version (C only) of a patch for #pragma ivdep, which
aids vectorization by setting loop->safelen to INT_MAX for the for loop
that follows the pragma. [In the final version, I will also add parsing
support for C++ and use it for Fortran's "do concurrent".]

As suggested by Richard and Jakub (thanks!), it is implemented as follows:
* An ANNOTATE_EXPR with ANNOTATE_EXPR_ID == annot_expr_ivdep_kind is 
attached to the condition of the loop
* In gimplify.c, it is converted into an internal function (ANNOTATE)
* When the "struct loops" is created, the internal function is removed 
and loop->safelen is set to INT_MAX
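
To make the last step concrete, here is a toy model in plain C. It is not
GCC code: every type and function name (toy_block, toy_loop,
toy_consume_ivdep) is a hypothetical stand-in, and it only mirrors the idea
of "find the annotation just before the loop condition, remove it, and set
safelen to INT_MAX".

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for statement kinds; not GCC's gimple codes. */
enum stmt_kind { STMT_OTHER, STMT_ANNOTATE_IVDEP, STMT_COND };

/* Hypothetical basic block: a short, fixed-size statement list. */
struct toy_block {
   enum stmt_kind stmts[8];
   size_t n_stmts;
};

/* Hypothetical loop descriptor; safelen == 0 means "no guarantee". */
struct toy_loop {
   struct toy_block *cond_block;  /* block that ends in the loop condition */
   int safelen;
};

/* If the statement directly before the trailing condition is an ivdep
   annotation, drop the annotation and record the guarantee on the loop. */
void toy_consume_ivdep(struct toy_loop *loop) {
   struct toy_block *bb = loop->cond_block;
   size_t last;

   if (bb == NULL || bb->n_stmts < 2)
      return;
   last = bb->n_stmts - 1;
   if (bb->stmts[last] != STMT_COND)
      return;
   if (bb->stmts[last - 1] != STMT_ANNOTATE_IVDEP)
      return;

   bb->stmts[last - 1] = bb->stmts[last];  /* shift the condition down */
   bb->n_stmts--;                          /* annotation is gone */
   loop->safelen = INT_MAX;
}
```

The model bakes in the assumption that the annotation sits immediately
before the condition; if that does not hold, the annotation is left in
place.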

RFC:
* The replacement of the internal function is done in cfgloop.c's 
flow_loops_find. The code path I am interested in is: pass_build_cfg -> 
execute_build_cfg -> loop_optimizer_init (AVOID_CFG_MODIFICATIONS) -> 
flow_loops_find. But flow_loops_find is also called elsewhere.
Thus, is this the best spot? Is the slowdown of walking the gimple 
statements acceptable? Additionally, there are some assumptions, which 
may or may not be valid:
- There is only one latch edge (if there are more, loop->latch is set to
NULL and my code is not reached; expand_ANNOTATE might then get
called, and one gets an ICE [gcc_unreachable()]).
- The IFN_ANNOTATE is just before the GIMPLE_COND
- The loop condition is in loop->latch->next_bb.
* Parsing: Currently, #pragma ivdep and #pragma omp for each require that a
for loop follows directly. Other compilers permit #pragma ivdep followed by
#pragma omp for, and vice versa [which gets complicated when OpenMP 4's
safelen is also used]. Is it fine to restrict a loop to either ivdep or
another pragma, but not both?
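
For illustration, the stacked form in question would look like the sketch
below. The function name bar is made up. Under the restriction asked about
above, this form would be rejected; a compiler that knows neither pragma
will normally just ignore them and compile a plain loop.

```c
/* Hypothetical example of stacking both pragmas on one loop. */
void bar(int n, int *a, int *b, int *c) {
   int i;
#pragma ivdep
#pragma omp for
   for (i = 0; i < n; ++i) {
     a[i] = b[i] + c[i];
   }
}
```

The reverse order (#pragma omp for first, then #pragma ivdep) is the "vice
versa" case.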

Example:

void foo(int n, int *a, int *b, int *c) {
   int i;
#pragma ivdep
   for (i = 0; i < n; ++i) {
     a[i] = b[i] + c[i];
   }
}

Without the pragma, "gcc -O3 -fopt-info-vec-optimized -c foo.c" gives:
foo.c:4:3: note: loop vectorized
foo.c:4:3: note: loop versioned for vectorization because of possible aliasing
foo.c:4:3: note: loop peeled for vectorization to enhance alignment

With the pragma, as expected, no loop versioning is done (i.e. there is 
no "loop versioned for vectorization because of possible aliasing").
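
As a rough picture of what "versioned because of possible aliasing" means,
the hand-written sketch below guards a fast copy of the loop with a runtime
overlap check and falls back to a second copy otherwise. This is only a
conceptual model: the helper ranges_overlap and the name foo_versioned are
invented here, and the code the vectorizer actually generates will differ.

```c
#include <stdint.h>

/* Hypothetical helper: nonzero if the n-element int ranges starting at
   p and q share any bytes. */
static int ranges_overlap(const int *p, const int *q, int n) {
   uintptr_t ps = (uintptr_t)p;
   uintptr_t qs = (uintptr_t)q;
   uintptr_t len = (uintptr_t)n * sizeof(int);
   return ps < qs + len && qs < ps + len;
}

/* Conceptual model of a loop versioned for aliasing. */
void foo_versioned(int n, int *a, int *b, int *c) {
   int i;
   if (!ranges_overlap(a, b, n) && !ranges_overlap(a, c, n)) {
      /* No overlap: this copy may be vectorized safely. */
      for (i = 0; i < n; ++i)
         a[i] = b[i] + c[i];
   } else {
      /* Possible overlap: keep strict element-by-element order. */
      for (i = 0; i < n; ++i)
         a[i] = b[i] + c[i];
   }
}
```

With #pragma ivdep the programmer asserts that the first case always holds,
so neither the check nor the second copy is needed.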

(I successfully did an ada,c,c++,fortran,go,java,lto,objc,obj-c++ 
bootstrap on x86-64-gnu-linux; regtesting is on-going.)

Tobias
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ivdep.diff
Type: text/x-patch
Size: 10865 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20131008/22f8c932/attachment.bin>