This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [RFC] [autovect patch] Implement vectorization hints


One question that loop-level pragma support needs to answer is how to handle code motion and other transformations done by various optimization passes. For example,

void foo()
{
  int i, j;
  for (i=0; i < 3; ++i)
#pragma ivdep
    for (j=0; j<2; ++j)
      {
        k = B[i];
        A[i][j] = -A[i][j];
      }
}


Once "k = B[i]" is moved into the outer loop, which loop is associated with #pragma ivdep? Various loop optimization passes can do lots of tricks before the vectorizer gets to do anything. What is the loop-identifying construct that the vectorizer can rely on?
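
To make the question concrete, here is a minimal self-contained sketch of the same loop nest before and after such a motion. The array sizes, the globals, and the names negate_as_written and negate_after_motion are assumptions for illustration only; the pragma is shown as a comment so the file builds with any compiler.

```c
enum { ROWS = 3, COLS = 2 };

int A[ROWS][COLS];
int B[ROWS];
int k;

/* Source form: the pragma sits directly in front of the inner loop.  */
void
negate_as_written (void)
{
  int i, j;
  for (i = 0; i < ROWS; ++i)
    /* #pragma ivdep was written here.  */
    for (j = 0; j < COLS; ++j)
      {
        k = B[i];
        A[i][j] = -A[i][j];
      }
}

/* Hypothetical form after invariant motion: "k = B[i]" now belongs to
   the outer loop body, so source lines no longer identify one loop.  */
void
negate_after_motion (void)
{
  int i, j;
  for (i = 0; i < ROWS; ++i)
    {
      k = B[i];
      for (j = 0; j < COLS; ++j)
        A[i][j] = -A[i][j];
    }
}
```

Both functions compute the same result, but in the second one a statement that was written under the pragma now executes in the outer loop.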


One approach is to use the loop structure to keep pragma info instead of relying on source location only. However, it does not help much in answering the above question, especially once you add the possibility of GCC doing loop distribution, loop fusion, etc. in the future. One additional difficulty with the loop structure approach is that loop info is not maintained wire-to-wire. During gimplification, the loop structure is dismantled in favor of if-gotos. Various passes optimize GIMPLE before the loop optimizer recreates loops. So the question is: how to map new loops to source-level loops?
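
As a rough illustration of what "dismantled in favor of if-gotos" means, the sketch below writes the same loop twice in plain C. It is not an actual GIMPLE dump; the function names sum_structured and sum_lowered are made up for this example.

```c
/* Structured source form: the loop is a single syntactic construct.  */
int
sum_structured (const int *a, int n)
{
  int s = 0;
  int i;
  for (i = 0; i < n; ++i)
    s += a[i];
  return s;
}

/* Roughly what is left after lowering: only labels and conditional
   jumps.  Nothing here says "this was a for loop with a pragma".  */
int
sum_lowered (const int *a, int n)
{
  int s = 0;
  int i = 0;
  goto cond;
body:
  s += a[i];
  ++i;
cond:
  if (i < n)
    goto body;
  return s;
}
```

Any pass that rewrites these jumps before loops are rediscovered can change which blocks end up in the recreated loop.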

A third approach is to insert, during gimplification, PRAGMA_EXPR_BEGIN and PRAGMA_EXPR_END in the IL stream. If the loop body falls within PRAGMA_EXPR_BEGIN/END, then the pragma is active. This requires the various passes to follow certain rules while modifying the IL.
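
A toy model of the marker idea, assuming the IL is just a flat array of statement kinds. Only the two marker names come from the text above; TOY_STMT, toy_stmt_kind and toy_range_in_pragma_p are hypothetical and exist only for this sketch.

```c
/* Hypothetical statement kinds for a flat, toy IL stream.  */
enum toy_stmt_kind { TOY_STMT, PRAGMA_EXPR_BEGIN, PRAGMA_EXPR_END };

/* Return 1 if every ordinary statement in IL[FIRST..LAST] lies inside
   a PRAGMA_EXPR_BEGIN/PRAGMA_EXPR_END region, 0 otherwise.  N is the
   number of entries in IL.  */
int
toy_range_in_pragma_p (const enum toy_stmt_kind *il, unsigned int n,
                       unsigned int first, unsigned int last)
{
  unsigned int i;
  int depth = 0;

  if (first > last || last >= n)
    return 0;

  for (i = 0; i <= last; i++)
    {
      if (il[i] == PRAGMA_EXPR_BEGIN)
        depth++;
      else if (il[i] == PRAGMA_EXPR_END)
        {
          if (depth > 0)
            depth--;
        }
      else if (i >= first && depth == 0)
        /* An ordinary statement of the range is outside any region.  */
        return 0;
    }
  return 1;
}
```

The check itself is simple; the cost is that every pass that moves or deletes statements has to keep the markers balanced and in the right place.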

And a fourth approach is to use basic block marking. The idea is (using ivdep as an example):

1) The parser notes down the location ranges where the loop pragma ivdep is active.
2) During gimplification, basic blocks that fall into an active ivdep location range are marked appropriately.
3) Various CFG manipulation utilities are updated to keep the pragma-ivdep-active bit up to date.
4) The vectorizer considers pragma ivdep active only if ALL loop basic blocks have the pragma-ivdep-active bit set.
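
Steps 3 and 4 can be sketched outside the compiler as follows. The BB_PRAGMA_* values match the patch below, but struct toy_bb and both toy_* functions are hypothetical stand-ins. Note that toy_split_edge_pragmas intersects the bits of both ends, which is a slightly stricter rule than the one the prototype uses in tree_split_edge.

```c
#define BB_PRAGMA_IVDEP        1
#define BB_PRAGMA_NOVECTOR     2

/* Hypothetical stand-in for a basic block: just the pragma bits.  */
struct toy_bb
{
  int pragmas;
};

/* Step 4: the pragma counts for a loop only if ALL N blocks of its
   body carry the ivdep bit.  An empty body never qualifies.  */
int
toy_loop_ivdep_p (const struct toy_bb *bbs, unsigned int n)
{
  unsigned int i;

  if (n == 0)
    return 0;

  for (i = 0; i < n; i++)
    if (!(bbs[i].pragmas & BB_PRAGMA_IVDEP))
      return 0;
  return 1;
}

/* Step 3: bits for a block created on the edge SRC->DEST.  It keeps
   only the pragmas that are active on both ends of the edge.  */
int
toy_split_edge_pragmas (const struct toy_bb *src, const struct toy_bb *dest)
{
  return src->pragmas & dest->pragmas;
}
```

A single unmarked block, for example one that a pass moved in from outside the pragma's range, is enough to turn the hint off for that loop, which errs on the safe side.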


Here is the prototype based on this fourth approach.
Thoughts?
-
Devang

2005-03-01 Devang Patel <dpatel@apple.com>

* basic-block.h (struct basic_block_def): Add new member, pragmas.
(BB_PRAGMA_IVDEP, BB_PRAGMA_NOVECTOR): New.
* c-common.c (c_common_ivdep_pragma, finish_ivdep_pragma,
pragma_ivdep_active_p, c_common_novector_pragma): New functions.
(pragma_ivdep, pragma_novector): New.
* c-common.h (c_common_ivdep_pragma, c_common_novector_pragma,
finish_ivdep_pragma): New.
* c-parser.c (c_parser_for_statement): Finish pragma.
* c-pragma.c (init_pragma): Register ivdep and novector pragma handlers.
* input.h (LOCATION_RANGE_OPEN_ENDED): New.
(struct location_range): New.
* tree-cfg.c (make_blocks): Note down pragmas for new blocks.
(tree_split_edge): Update pragmas for new block.
* tree-vect-analyze.c (vect_analyze_data_ref_dependence): Check ivdep
pragma.
* tree-vectorizer.c (pragma_ivdep_on, pragma_novector_on): New.
(vect_pragma_lookup): New.
(vectorize_loops): Check pragmas for candidate loop.
* tree.c (compare_location, new_location_range): New.
* tree.h (compare_location, new_location_range): New.



Index: gcc/basic-block.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/basic-block.h,v
retrieving revision 1.238
diff -Idpatel.pbxuser -c -3 -p -r1.238 basic-block.h
*** gcc/basic-block.h 15 Feb 2005 07:18:22 -0000 1.238
--- gcc/basic-block.h 2 Mar 2005 02:12:08 -0000
*************** struct basic_block_def GTY((chain_next (
*** 261,266 ****
--- 261,269 ----

    /* Various flags.  See BB_* below.  */
    int flags;
+
+   /* Various pragmas. See BB_PRAGMA_* below.  */
+   int pragmas;
  };

  typedef struct basic_block_def *basic_block;
*************** typedef struct reorder_block_def
*** 285,290 ****
--- 288,296 ----

#define BB_FREQ_MAX 10000

+ #define BB_PRAGMA_IVDEP        1
+ #define BB_PRAGMA_NOVECTOR     2
+
  /* Masks for basic_block.flags.

     BB_VISITED should not be used by passes, it is used internally by
Index: gcc/c-common.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-common.c,v
retrieving revision 1.606
diff -Idpatel.pbxuser -c -3 -p -r1.606 c-common.c
*** gcc/c-common.c      22 Feb 2005 20:10:45 -0000      1.606
--- gcc/c-common.c      2 Mar 2005 02:12:08 -0000
*************** Software Foundation, 59 Temple Place - S
*** 47,52 ****
--- 47,53 ----
  #include "tree-mudflap.h"
  #include "opts.h"
  #include "real.h"
+ extern bool pragma_ivdep_active_p (source_locus);

cpp_reader *parse_in; /* Declared in c-pragma.h. */

*************** lvalue_or_else (tree ref, enum lvalue_us
*** 5746,5749 ****
--- 5747,5838 ----
    return win;
  }

+ static varray_type pragma_ivdep;
+ static varray_type pragma_novector;
+
+ /* Handler for #pragma ivdep */
+
+ void
+ c_common_ivdep_pragma (cpp_reader * ARG_UNUSED (pfile))
+ {
+ location_range *new_range;
+ if (!pragma_ivdep)
+ VARRAY_GENERIC_PTR_INIT (pragma_ivdep, 20, "pragma_ivdep");
+ new_range = new_location_range ();
+ VARRAY_PUSH_GENERIC_PTR (pragma_ivdep, new_range);
+
+ }
+
+ void
+ finish_ivdep_pragma (void)
+ {
+ unsigned int i;
+ source_locus l;
+
+ if (!pragma_ivdep)
+ return;
+
+ #ifdef USE_MAPPED_LOCATION
+ l = input_location;
+ #else
+ l = ggc_alloc (sizeof (location_t));
+ l->file = input_filename;
+ l->line = input_line;
+ #endif
+
+ for (i = 0; i < VARRAY_ACTIVE_SIZE (pragma_ivdep); i++)
+ {
+ location_range *r = VARRAY_GENERIC_PTR (pragma_ivdep, i);
+ if (compare_location (l, r->begin))
+ {
+ #ifdef USE_MAPPED_LOCATION
+ if (r->end == LOCATION_RANGE_OPEN_ENDED)
+ r->end = l;
+ #else
+ if (r->end->line == LOCATION_RANGE_OPEN_ENDED)
+ {
+ r->end->line = l->line;
+ r->end->file = l->file;
+ }
+ #endif
+ }
+ return;
+ }
+ }
+
+ /* Return true iff #pragma ivdep is active at location L. */
+
+ bool
+ pragma_ivdep_active_p (source_locus l)
+ {
+ unsigned int i;
+
+ if (!pragma_ivdep)
+ return false;
+
+ for (i = 0; i < VARRAY_ACTIVE_SIZE (pragma_ivdep); i++)
+ {
+ location_range *r = VARRAY_GENERIC_PTR (pragma_ivdep, i);
+ if (compare_location (l, r->begin)
+ && compare_location (r->end, l))
+ return true;
+ }
+
+ return false;
+ }
+
+
+
+ /* Handler for #pragma novector */
+
+ void
+ c_common_novector_pragma (cpp_reader * ARG_UNUSED (pfile))
+ {
+ location_range *new_range;
+ if (!pragma_novector)
+ VARRAY_GENERIC_PTR_INIT (pragma_novector, 20, "pragma_novector");
+ new_range = new_location_range ();
+ VARRAY_PUSH_GENERIC_PTR (pragma_novector, new_range);
+ }
+
#include "gt-c-common.h"
Index: gcc/c-common.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-common.h,v
retrieving revision 1.277
diff -Idpatel.pbxuser -c -3 -p -r1.277 c-common.h
*** gcc/c-common.h 20 Feb 2005 17:01:15 -0000 1.277
--- gcc/c-common.h 2 Mar 2005 02:12:08 -0000
*************** extern void preprocess_file (cpp_reader
*** 972,975 ****
--- 972,979 ----
extern void pp_file_change (const struct line_map *);
extern void pp_dir_change (cpp_reader *, const char *);


+ extern void c_common_ivdep_pragma (cpp_reader *);
+ extern void c_common_novector_pragma (cpp_reader *);
+ extern void finish_ivdep_pragma (void);
+
#endif /* ! GCC_C_COMMON_H */
Index: gcc/c-parser.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-parser.c,v
retrieving revision 2.2
diff -Idpatel.pbxuser -c -3 -p -r2.2 c-parser.c
*** gcc/c-parser.c 28 Feb 2005 19:22:26 -0000 2.2
--- gcc/c-parser.c 2 Mar 2005 02:12:08 -0000
*************** c_parser_for_statement (c_parser *parser
*** 3839,3844 ****
--- 3839,3846 ----
body = c_parser_c99_block_statement (parser);
c_finish_loop (loc, cond, incr, body, c_break_label, c_cont_label, true);
add_stmt (c_end_compound_stmt (block, flag_isoc99));
+ finish_ivdep_pragma ();
+
c_break_label = save_break;
c_cont_label = save_cont;
}
Index: gcc/c-pragma.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-pragma.c,v
retrieving revision 1.82
diff -Idpatel.pbxuser -c -3 -p -r1.82 c-pragma.c
*** gcc/c-pragma.c 29 Nov 2004 18:53:53 -0000 1.82
--- gcc/c-pragma.c 2 Mar 2005 02:12:08 -0000
*************** init_pragma (void)
*** 697,703 ****
c_register_pragma (0, "extern_prefix", handle_pragma_extern_prefix);


    c_register_pragma ("GCC", "pch_preprocess", c_common_pch_pragma);
!
  #ifdef REGISTER_TARGET_PRAGMAS
    REGISTER_TARGET_PRAGMAS ();
  #endif
--- 697,704 ----
    c_register_pragma (0, "extern_prefix", handle_pragma_extern_prefix);

    c_register_pragma ("GCC", "pch_preprocess", c_common_pch_pragma);
!   c_register_pragma (0, "ivdep", c_common_ivdep_pragma);
!   c_register_pragma (0, "novector", c_common_novector_pragma);
  #ifdef REGISTER_TARGET_PRAGMAS
    REGISTER_TARGET_PRAGMAS ();
  #endif
Index: gcc/input.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/input.h,v
retrieving revision 1.20
diff -Idpatel.pbxuser -c -3 -p -r1.20 input.h
*** gcc/input.h 15 Jul 2004 00:02:29 -0000      1.20
--- gcc/input.h 2 Mar 2005 02:12:08 -0000
*************** extern location_t unknown_location;
*** 68,73 ****
--- 68,83 ----

#endif /* ! USE_MAPPED_LOCATION */

+ /* Describe the state where beginning of location range is known,
+    but end is still unknown.  */
+
+ #define LOCATION_RANGE_OPEN_ENDED 0
+ typedef struct
+ {
+   source_locus begin;
+   source_locus end;
+ } location_range;
+
  struct file_stack
  {
    struct file_stack *next;
Index: gcc/tree-cfg.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-cfg.c,v
retrieving revision 2.152
diff -Idpatel.pbxuser -c -3 -p -r2.152 tree-cfg.c
*** gcc/tree-cfg.c      28 Feb 2005 18:18:25 -0000      2.152
--- gcc/tree-cfg.c      2 Mar 2005 02:12:08 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 45,50 ****
--- 45,51 ----
  #include "cfgloop.h"
  #include "cfglayout.h"
  #include "hashtab.h"
+ extern bool pragma_ivdep_active_p (source_locus);

/* This file contains functions for building the Control Flow Graph (CFG)
for a function tree. */
*************** make_blocks (tree stmt_list)
*** 381,386 ****
--- 382,393 ----
codes. */
set_bb_for_stmt (stmt, bb);


+       if (EXPR_P (stmt) && EXPR_LOCUS (stmt))
+       {
+         if (pragma_ivdep_active_p (EXPR_LOCUS (stmt)))
+           bb->pragmas |= BB_PRAGMA_IVDEP;
+       }
+
        if (computed_goto_p (stmt))
        found_computed_goto = true;

*************** create_bb (void *h, void *e, basic_block
*** 432,437 ****
--- 439,445 ----
    last_basic_block++;

    initialize_bb_rbi (bb);
+   bb->pragmas = 0;
    return bb;
  }

*************** tree_split_edge (edge edge_in)
*** 3232,3237 ****
--- 3240,3247 ----
    new_edge->probability = REG_BR_PROB_BASE;
    new_edge->count = edge_in->count;

+   if (src->pragmas && dest->pragmas)
+     new_bb->pragmas = src->pragmas;
    e = redirect_edge_and_branch (edge_in, new_bb);
    gcc_assert (e);
    reinstall_phi_args (new_edge, e);
Index: gcc/tree-vect-analyze.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-vect-analyze.c,v
retrieving revision 2.4
diff -Idpatel.pbxuser -c -3 -p -r2.4 tree-vect-analyze.c
*** gcc/tree-vect-analyze.c     20 Feb 2005 13:47:28 -0000      2.4
--- gcc/tree-vect-analyze.c     2 Mar 2005 02:12:09 -0000
*************** vect_analyze_data_ref_dependence (struct
*** 720,726 ****

    if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
      return false;
!
    if (vect_print_dump_info (REPORT_UNVECTORIZED_LOOPS,
                            LOOP_LOC (loop_vinfo)))
      {
--- 720,737 ----

    if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
      return false;
!
!   if (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know
!       && pragma_ivdep_on)
!     {
!       if (vect_print_dump_info (REPORT_UNVECTORIZED_LOOPS,
!                               LOOP_LOC (loop_vinfo)))
!       {
!         fprintf (vect_dump,
!                  "unknown data dependence: #pragma ivdep seen 1");
!       }
!       return false;
!     }
    if (vect_print_dump_info (REPORT_UNVECTORIZED_LOOPS,
                            LOOP_LOC (loop_vinfo)))
      {
Index: gcc/tree-vectorizer.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-vectorizer.c,v
retrieving revision 2.75
diff -Idpatel.pbxuser -c -3 -p -r2.75 tree-vectorizer.c
*** gcc/tree-vectorizer.c       17 Feb 2005 16:19:49 -0000      2.75
--- gcc/tree-vectorizer.c       2 Mar 2005 02:12:09 -0000
*************** static bool need_imm_uses_for (tree);
*** 172,177 ****
--- 172,184 ----
  /* vect_dump will be set to stderr or dump_file if exist.  */
  FILE *vect_dump;

+
+ /* Vectorization pragma support */
+ bool pragma_ivdep_on = false;
+ bool pragma_novector_on = false;
+
+ static void vect_pragma_lookup (struct loop *);
+
  /* vect_verbosity_level set to an invalid value
     to mark that it's uninitialized.  */
  enum verbosity_levels vect_verbosity_level = MAX_VERBOSITY_LEVEL;
*************** need_imm_uses_for (tree var)
*** 1550,1555 ****
--- 1557,1582 ----
    return is_gimple_reg (var);
  }

+ /* Lookup any vectorization pragma active for this loop.  */
+
+ static void
+ vect_pragma_lookup (struct loop *loop)
+ {
+   unsigned int i;
+   basic_block *bbs;
+   bbs = get_loop_body (loop);
+
+   pragma_ivdep_on = false;
+   for (i = 0; i < loop->num_nodes; i++)
+     if (!(bbs[i]->pragmas & BB_PRAGMA_IVDEP))
+       break;
+
+   if (i == loop->num_nodes)
+     pragma_ivdep_on = true;
+
+   free (bbs);
+
+ }

/* Function vectorize_loops.

*************** vectorize_loops (struct loops *loops)
*** 1593,1598 ****
--- 1620,1632 ----
        if (!loop)
          continue;

+       /* Lookup any vectorization pragma active for this loop.  */
+       vect_pragma_lookup (loop);
+
+       /* If #pragma novector is ON then do not vectorize this loop.  */
+       if (pragma_novector_on)
+         continue;
+
        loop_vinfo = vect_analyze_loop (loop);
        loop->aux = loop_vinfo;

Index: gcc/tree-vectorizer.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-vectorizer.h,v
retrieving revision 2.14
diff -Idpatel.pbxuser -c -3 -p -r2.14 tree-vectorizer.h
*** gcc/tree-vectorizer.h       17 Feb 2005 08:47:28 -0000      2.14
--- gcc/tree-vectorizer.h       2 Mar 2005 02:12:09 -0000
*************** extern bool vect_print_dump_info (enum v
*** 322,325 ****
--- 322,330 ----
  extern void vect_set_verbosity_level (const char *);
  extern LOC find_loop_location (struct loop *);

+ extern bool pragma_novector_active_p (source_locus);
+
+ /* Vectorization pragma support */
+ extern bool pragma_ivdep_on;
+
  #endif  /* GCC_TREE_VECTORIZER_H  */
Index: gcc/tree.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree.c,v
retrieving revision 1.466
diff -Idpatel.pbxuser -c -3 -p -r1.466 tree.c
*** gcc/tree.c  12 Feb 2005 00:26:56 -0000      1.466
--- gcc/tree.c  2 Mar 2005 02:12:09 -0000
*************** annotate_with_locus (tree node, location
*** 2885,2890 ****
--- 2885,2928 ----
    annotate_with_file_line (node, locus.file, locus.line);
  }
  #endif
+
+ /* Return true if A is located after B.  */
+
+ bool
+ compare_location (source_locus a, source_locus b)
+ {
+ #ifdef USE_MAPPED_LOCATION
+   if (a > b)
+     return true;
+ #else
+   if (!strcmp (a->file, b->file)
+       && a->line > b->line)
+     return true;
+ #endif
+   return false;
+ }
+
+ /* Create new location range using current location as the
+    starting point. */
+ location_range *
+ new_location_range (void)
+ {
+   location_range *range = (location_range *)
+     ggc_alloc (sizeof (location_range));
+ #ifdef USE_MAPPED_LOCATION
+   range->begin = input_location;
+   range->end = LOCATION_RANGE_OPEN_ENDED;
+ #else
+   range->begin = ggc_alloc (sizeof (location_t));
+   range->begin->file = input_filename;
+   range->begin->line = input_line;
+   range->end = ggc_alloc (sizeof (location_t));
+   range->end->file = input_filename;
+   range->end->line = LOCATION_RANGE_OPEN_ENDED;
+ #endif
+   return range;
+ }
+


  /* Return a declaration like DDECL except that its DECL_ATTRIBUTES is ATTRIBUTE. */
Index: gcc/tree.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree.h,v
retrieving revision 1.694
diff -Idpatel.pbxuser -c -3 -p -r1.694 tree.h
*** gcc/tree.h 28 Feb 2005 18:18:26 -0000 1.694
--- gcc/tree.h 2 Mar 2005 02:12:09 -0000
*************** extern bool commutative_tree_code (enum
*** 3466,3471 ****
--- 3466,3474 ----
  extern tree upper_bound_in_type (tree, tree);
  extern tree lower_bound_in_type (tree, tree);
  extern int operand_equal_for_phi_arg_p (tree, tree);
+ extern bool compare_location (source_locus, source_locus);
+ extern location_range *new_location_range (void);
+


/* In stmt.c */


Index: gcc/doc/extend.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/extend.texi,v
retrieving revision 1.241
diff -Idpatel.pbxuser -c -3 -p -r1.241 extend.texi
*** gcc/doc/extend.texi 25 Feb 2005 18:29:28 -0000      1.241
--- gcc/doc/extend.texi 2 Mar 2005 02:12:10 -0000
*************** for further explanation.
*** 8567,8572 ****
--- 8567,8573 ----
  * Darwin Pragmas::
  * Solaris Pragmas::
  * Symbol-Renaming Pragmas::
+ * Auto Vectorization Pragmas::
  * Structure-Packing Pragmas::
  @end menu

*************** labels, but if @code{#pragma extern_pref
*** 8747,8752 ****
--- 8748,8778 ----
  way of knowing that that happened.)
  @end enumerate

+ @node Auto Vectorization Pragmas
+ @subsection Auto Vectorization Pragmas
+
+ The Auto Vectorization pass supports @code{#pragma ivdep} and
+ @code{#pragma novector}.
+
+ @table @code
+ @item ivdep
+ @cindex pragma, ivdep
+
+ This pragma is inserted before the target loop. When data dependence
+ analysis is not able to determine data dependence, this pragma
+ instructs the compiler to ignore possible data dependence. This pragma
+ does not instruct the compiler to vectorize a loop when data dependence
+ analysis can recognize data dependence. When the compiler is not able
+ to vectorize for reasons other than data dependence, this pragma does
+ not force the compiler to vectorize it.
+
+ @item novector
+ @cindex pragma, novector
+
+ This pragma instructs the compiler not to vectorize a loop.
+
+ @end table
+
  @node Structure-Packing Pragmas
  @subsection Structure-Packing Pragmas

