RFC patch for #pragma ivdep
Tobias Burnus
burnus@net-b.de
Tue Oct 8 06:51:00 GMT 2013
Attached is an early version (C only) for #pragma ivdep, which aids
vectorization by setting (for the following for-loop) loop->safelen to
INT_MAX. [In the final version, I will also add parsing support for C++
and use it for Fortran's "do concurrent".]
As suggested by Richard and Jakub (thanks!), it is implemented as follows:
* An ANNOTATE_EXPR with ANNOTATE_EXPR_ID == annot_expr_ivdep_kind is
  attached to the condition of the loop.
* In gimplify.c, it is converted into an internal function (ANNOTATE).
* When the "struct loops" is created, the internal function is removed
  and loop->safelen is set to INT_MAX.
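To make the last step concrete, here is a toy model of it in plain C. It is
only a sketch: the types and names (toy_block, toy_loop, toy_consume_ivdep,
the stmt_kind values) are invented for illustration and are not GCC's data
structures or the code in the attached patch. It assumes, as the patch does,
that the annotation sits immediately before the loop condition.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical statement kinds; GCC's real GIMPLE codes differ. */
enum stmt_kind { STMT_ASSIGN, STMT_ANNOTATE_IVDEP, STMT_COND };

/* Hypothetical basic block: a short, fixed-size statement list. */
struct toy_block {
  enum stmt_kind stmts[8];
  size_t n_stmts;
};

/* Hypothetical loop descriptor. In this toy model, safelen == 0 means
   that nothing is known about loop-carried dependences. */
struct toy_loop {
  struct toy_block *cond_block;  /* block that ends in the loop condition */
  int safelen;
};

/* If the statement directly before the trailing condition is an ivdep
   annotation, delete it and record the promise on the loop.
   Returns 1 if an annotation was consumed, 0 otherwise. */
int toy_consume_ivdep(struct toy_loop *loop)
{
  struct toy_block *bb = loop->cond_block;
  size_t last;

  if (bb == NULL || bb->n_stmts < 2)
    return 0;

  last = bb->n_stmts - 1;
  if (bb->stmts[last] != STMT_COND
      || bb->stmts[last - 1] != STMT_ANNOTATE_IVDEP)
    return 0;

  /* Drop the annotation by moving the condition one slot down. */
  bb->stmts[last - 1] = STMT_COND;
  bb->n_stmts = last;
  loop->safelen = INT_MAX;
  return 1;
}
```

If the annotation is anywhere else in the block, the function leaves both the
block and safelen untouched, which mirrors the case where a leftover ANNOTATE
call would later reach expansion.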
RFC:
* The replacement of the internal function is done in cfgloop.c's
  flow_loops_find. The code path I am interested in is: pass_build_cfg ->
  execute_build_cfg -> loop_optimizer_init (AVOID_CFG_MODIFICATIONS) ->
  flow_loops_find. But flow_loops_find is also called elsewhere.
  Thus, is this the best spot? Is the slowdown of walking the gimple
  statements acceptable? Additionally, there are some assumptions, which
  may or may not be valid:
  - There is only one latch edge (if there are more, loop->latch is set to
    NULL and my code is not reached; expand_ANNOTATE might then get
    called and one gets an ICE [gcc_unreachable()]).
  - The IFN_ANNOTATE is just before the GIMPLE_COND.
  - The loop condition is in loop->latch->next_bb.
* Parsing: Currently, #pragma ivdep and #pragma omp for require that a
  for loop follows. Other compilers permit #pragma ivdep followed by
  #pragma omp for - and vice versa [which gets complicated when OpenMPv4's
  safelen is also used]. Is restricting to either ivdep xor another pragma
  fine?
Example:
void foo(int n, int *a, int *b, int *c) {
  int i;
#pragma ivdep
  for (i = 0; i < n; ++i) {
    a[i] = b[i] + c[i];
  }
}
Without the pragma, "gcc -O3 -fopt-info-vec-optimized -c foo.c" gives:
foo.c:4:3: note: loop vectorized
foo.c:4:3: note: loop versioned for vectorization because of possible aliasing
foo.c:4:3: note: loop peeled for vectorization to enhance alignment
With the pragma, as expected, no loop versioning is done (i.e. there is
no "loop versioned for vectorization because of possible aliasing").
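For context, the versioning exists because a caller may pass overlapping
arrays, and the scalar semantics must then be preserved. The sketch below
shows such a call with the same loop body and no pragma; the function name
add_arrays and the sample values in the note after it are mine, chosen only
for illustration.

```c
/* Same loop as foo() above, deliberately without the pragma, so the
   compiler must keep the sequential result even if a overlaps b or c. */
void add_arrays(int n, int *a, const int *b, const int *c)
{
  int i;
  for (i = 0; i < n; ++i) {
    a[i] = b[i] + c[i];
  }
}
```

Calling add_arrays(4, buf + 1, buf, zeros) with buf = {7, 0, 0, 0, 0} and
zeros all 0 makes every iteration read the element the previous iteration
wrote, so the sequential result is buf = {7, 7, 7, 7, 7}. With #pragma ivdep
the user asserts that no such dependence exists, so the pragma must not be
used on a loop that can be called like this.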
(I successfully did an ada,c,c++,fortran,go,java,lto,objc,obj-c++
bootstrap on x86-64-gnu-linux; regtesting is on-going.)
Tobias
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ivdep.diff
Type: text/x-patch
Size: 10865 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20131008/22f8c932/attachment.bin>