This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Spectre V1 diagnostic / mitigation


Hi,

in the past weeks I've been looking into prototyping both Spectre V1
(speculative array bound bypass) diagnostics and mitigation in an
architecture-independent manner to assess feasibility and to get some kind
of upper bound on the performance impact one can expect.
https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html is
an interesting read in this context as well.

For simplicity I have implemented mitigation on GIMPLE right before
RTL expansion and have chosen TLS to do mitigation across function
boundaries.  Diagnostics sit in the same place, but the two do not
depend on each other in any way.

The mitigation strategy chosen is that of tracking speculation
state via a mask that can be used to zero parts of the addresses
that leak the actual data.  That's similar to what aarch64 does
with -mtrack-speculation (but oddly there's no mitigation there).
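
As a rough illustration of the mask idea (a hand-written C sketch, not what
the prototype emits; spec_mask_update and masked_load are made-up names):
the mask starts out all-ones, every conditional edge ANDs in all-ones or
zero depending on whether the condition for that edge really holds, and
loads AND the mask into their index.  Architecturally the mask is always
all-ones, so a source-level version like this is only a model; an
optimizing compiler is free to fold it away, which is why the
instrumentation has to be done inside the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fold the outcome of a branch condition into the
   running speculation mask.  COND_HOLDS is the condition recomputed on
   the edge being executed; if it is 0 we can only be on that edge due to
   mis-speculation, and the mask drops to all-zeroes for good.  */
static inline uintptr_t
spec_mask_update (uintptr_t mask, int cond_holds)
{
  /* (cond_holds != 0) is 1 or 0; 0 - 1 wraps to all-ones, 0 - 0 is 0.  */
  return mask & ((uintptr_t) 0 - (uintptr_t) (cond_holds != 0));
}

/* Hypothetical helper: load TABLE[IDX] with the index forced to 0 when
   the mask says we are mis-speculating.  */
static inline unsigned char
masked_load (const unsigned char *table, size_t idx, uintptr_t mask)
{
  assert (table != NULL);
  return table[idx & (size_t) mask];
}
```

Zeroing the index rather than the loaded value means the first load still
happens under mis-speculation, but it can only ever touch element 0.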

I've optimized things as far as is reasonable when working
target-independently on GIMPLE, but I've only looked at x86 assembly
and performance.  I expect any "final" mitigation, if we choose to
implement and integrate one, would sit after RTL expansion, since
RTL expansion can end up introducing quite a bit of control flow whose
speculation state is not properly tracked by the prototype.

I'm cut&pasting single runs of SPEC INT 2006/2017 here; the runs
were done with -O2 [-fspectre-v1={2,3}], where =2 is function-local
mitigation and =3 does global mitigation, passing the state
via TLS memory.
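
To make the difference between the two levels concrete, here is a
hand-written C model of the =3 scheme (a sketch under assumptions, not
output of the patch; spec_mask_tls, demo_table, caller and callee are
made-up names).  With =2 each function would instead start from a fresh
all-ones local mask and never touch the TLS slot.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical thread-local slot that carries the speculation mask
   across calls; all-ones means "not mis-speculating".  */
_Thread_local uintptr_t spec_mask_tls = ~(uintptr_t) 0;

int demo_table[16];

/* Callee: picks up the caller's state on entry, publishes it on exit.  */
int
callee (unsigned idx)
{
  uintptr_t mask = spec_mask_tls;          /* load at function entry */
  assert (mask == 0 || idx < 16);          /* architectural precondition */
  int val = demo_table[idx & (unsigned) mask];
  spec_mask_tls = mask;                    /* store before returning */
  return val;
}

int
caller (unsigned idx, unsigned bound)
{
  uintptr_t mask = spec_mask_tls;          /* load at function entry */
  int val = 0;
  if (idx < bound)
    {
      /* Taken edge: architecturally this ANDs in all-ones; it only
         becomes zero when the branch was mis-speculated.  */
      mask &= (uintptr_t) 0 - (uintptr_t) (idx < bound);
      spec_mask_tls = mask;                /* store before the call */
      val = callee (idx);
      mask = spec_mask_tls;                /* reload after the call */
    }
  else
    mask &= (uintptr_t) 0 - (uintptr_t) !(idx < bound);
  spec_mask_tls = mask;                    /* store before returning */
  return val;
}
```

The extra TLS loads and stores around every call and at every function
boundary are the additional cost of =3 over =2 in the numbers below.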

The following was measured on a Haswell desktop CPU:

	-O2 vs. -O2 -fspectre-v1=2

                                  Estimated                       Estimated
                Base     Base       Base        Peak     Peak       Peak
Benchmarks      Ref.   Run Time     Ratio       Ref.   Run Time     Ratio
-------------- ------  ---------  ---------    ------  ---------  ---------
400.perlbench    9770        245       39.8 *    9770        452       21.6 *  184%
401.bzip2        9650        378       25.5 *    9650        726       13.3 *  192%
403.gcc          8050        236       34.2 *    8050        352       22.8 *  149%
429.mcf          9120        223       40.9 *    9120        656       13.9 *  294%
445.gobmk       10490        400       26.2 *   10490        666       15.8 *  167%
456.hmmer        9330        388       24.1 *    9330        536       17.4 *  138%
458.sjeng       12100        437       27.7 *   12100        661       18.3 *  151%
462.libquantum  20720        300       69.1 *   20720        384       53.9 *  128%
464.h264ref     22130        451       49.1 *   22130        586       37.8 *  130%
471.omnetpp      6250        291       21.5 *    6250        398       15.7 *  137%
473.astar        7020        334       21.0 *    7020        522       13.5 *  156%
483.xalancbmk    6900        182       37.9 *    6900        306       22.6 *  168%
 Est. SPECint_base2006                   --
 Est. SPECint2006                                                        --

   -O2 -fspectre-v1=3

                                  Estimated                       Estimated
                Base     Base       Base        Peak     Peak       Peak
Benchmarks      Ref.   Run Time     Ratio       Ref.   Run Time     Ratio
-------------- ------  ---------  ---------    ------  ---------  ---------
400.perlbench                                    9770        497       19.6 *  203%
401.bzip2                                        9650        772       12.5 *  204%
403.gcc                                          8050        427       18.9 *  181%
429.mcf                                          9120        696       13.1 *  312%
445.gobmk                                       10490        726       14.4 *  181%
456.hmmer                                        9330        537       17.4 *  138%
458.sjeng                                       12100        721       16.8 *  165%
462.libquantum                                  20720        446       46.4 *  149%
464.h264ref                                     22130        613       36.1 *  136%
471.omnetpp                                      6250        471       13.3 *  162%
473.astar                                        7020        579       12.1 *  173%
483.xalancbmk                                    6900        350       19.7 *  192%
 Est. SPECint(R)_base2006           Not Run
 Est. SPECint2006                                                        --


While the following was measured on a Zen Epyc server:

-O2 vs -O2 -fspectre-v1=2

                       Estimated                       Estimated
                 Base     Base        Base        Peak     Peak        Peak
Benchmarks       Copies  Run Time     Rate        Copies  Run Time     Rate
--------------- -------  ---------  ---------    -------  ---------  ---------
500.perlbench_r       1        499       3.19  *       1        621       2.56  * 124%
502.gcc_r             1        286       4.95  *       1        392       3.61  * 137%
505.mcf_r             1        331       4.88  *       1        456       3.55  * 138%
520.omnetpp_r         1        454       2.89  *       1        563       2.33  * 124%
523.xalancbmk_r       1        328       3.22  *       1        569       1.86  * 173%
525.x264_r            1        518       3.38  *       1        776       2.26  * 150%
531.deepsjeng_r       1        365       3.14  *       1        448       2.56  * 123%
541.leela_r           1        598       2.77  *       1        729       2.27  * 122%
548.exchange2_r       1        460       5.69  *       1        756       3.46  * 164%
557.xz_r              1        403       2.68  *       1        586       1.84  * 145%
 Est. SPECrate2017_int_base              3.55
 Est. SPECrate2017_int_peak                                               2.56    72%

-O2 -fspectre-v1=3

                       Estimated                       Estimated
                 Base     Base        Base        Peak     Peak        Peak
Benchmarks       Copies  Run Time     Rate        Copies  Run Time     Rate
--------------- -------  ---------  ---------    -------  ---------  ---------
500.perlbench_r                               NR       1        700       2.27  * 140%
502.gcc_r                                     NR       1        485       2.92  * 170%
505.mcf_r                                     NR       1        596       2.71  * 180%
520.omnetpp_r                                 NR       1        604       2.17  * 133%
523.xalancbmk_r                               NR       1        643       1.64  * 196%
525.x264_r                                    NR       1        797       2.20  * 154%
531.deepsjeng_r                               NR       1        542       2.12  * 149%
541.leela_r                                   NR       1        872       1.90  * 146%
548.exchange2_r                               NR       1        761       3.44  * 165%
557.xz_r                                      NR       1        595       1.81  * 148%
 Est. SPECrate2017_int_base           Not Run
 Est. SPECrate2017_int_peak                                               2.26    64%



As you can see, even though we're comparing apples and oranges, the
performance impact is quite dependent on the microarchitecture.
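
The unlabeled last column of the run-time tables reads as the mitigated
run time as a percentage of the plain -O2 run time, rounded to the nearest
integer.  That is my interpretation of the numbers, and slowdown_percent
is a made-up helper name; a small C sketch of the arithmetic:

```c
#include <assert.h>

/* Mitigated run time as a percentage of the base run time, rounded to
   the nearest integer using integer arithmetic only.  */
unsigned
slowdown_percent (unsigned base_seconds, unsigned peak_seconds)
{
  assert (base_seconds != 0);
  return (peak_seconds * 100u + base_seconds / 2) / base_seconds;
}
```

For example, slowdown_percent (245, 452) gives 184 for 400.perlbench on
Haswell with =2, and slowdown_percent (331, 596) gives 180 for 505.mcf_r
on Zen with =3.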

Just as interesting as performance is the effect on text size, which is
surprisingly high (the _best_ case is 13 bytes per conditional branch plus 3
bytes per instrumented memory reference).
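
Taking those two best-case figures at face value, a lower bound on the
text growth of a function can be estimated from its branch and
instrumented-access counts.  This is only a back-of-the-envelope sketch;
best_case_text_growth is a made-up name and real growth will be larger.

```c
#include <assert.h>
#include <stddef.h>

/* Best-case text growth in bytes: 13 per conditional branch plus 3 per
   instrumented memory access, the figures quoted above.  */
size_t
best_case_text_growth (size_t n_cond_branches, size_t n_instrumented_mems)
{
  size_t growth = 13 * n_cond_branches + 3 * n_instrumented_mems;
  assert (growth >= n_cond_branches);   /* crude overflow sanity check */
  return growth;
}
```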

CPU2006:
   BASE  -O2
   text	   data	    bss	    dec	    hex	filename
1117726	  20928	  12704	1151358	 11917e	400.perlbench
  56568	   3800	   4416	  64784	   fd10	401.bzip2
3419568	   7912	 751520	4179000	 3fc438	403.gcc
  12212	    712	  11984	  24908	   614c	429.mcf
1460694	2081772	2330096	5872562	 599bb2	445.gobmk
 284929	   5956	  82040	 372925	  5b0bd	456.hmmer
 130782	   2152	2576896	2709830	 295946	458.sjeng
  41915	    764	     96	  42775	   a717	462.libquantum
 505452	  11220	 372320	 888992	  d90a0	464.h264ref
 638188	   9584	  14664	 662436	  a1ba4	471.omnetpp
  38859	    900	   5216	  44975	   afaf	473.astar
4033878	 140248	  12168	4186294	 3fe0b6	483.xalancbmk
   PEAK -O2 -fspectre-v1=2
   text	   data	    bss	    dec	    hex	filename
1508032	  20928	  12704	1541664	 178620	400.perlbench	135%
  76098	   3800	   4416	  84314	  1495a	401.bzip2	135%
4483530	   7912	 751520	5242962	 500052	403.gcc		131%
  16006	    712	  11984	  28702	   701e	429.mcf		131%
1647384	2081772	2330096	6059252	 5c74f4	445.gobmk	112%
 377259	   5956	  82040	 465255	  71967	456.hmmer	132%
 164672	   2152	2576896	2743720	 29dda8	458.sjeng	126%
  47901	    764	     96	  48761	   be79	462.libquantum	114%
 649854	  11220	 372320	1033394	  fc4b2	464.h264ref	129%
 706908	   9584	  14664	 731156	  b2814	471.omnetpp	111%
  48493	    900	   5216	  54609	   d551	473.astar	125%
4862056	 140248	  12168	5014472	 4c83c8	483.xalancbmk	121%
   PEAK -O2 -fspectre-v1=3
   text	   data	    bss	    dec	    hex	filename
1742008	  20936	  12704	1775648	 1b1820	400.perlbench	156%
  83338	   3808	   4416	  91562	  165aa	401.bzip2	147%
5219850	   7920	 751520	5979290	 5b3c9a	403.gcc		153%
  17422	    720	  11984	  30126	   75ae	429.mcf		143%
1801688	2081780	2330096	6213564	 5ecfbc	445.gobmk	123%
 431827	   5964	  82040	 519831	  7ee97	456.hmmer	152%
 182200	   2160	2576896	2761256	 2a2228	458.sjeng	139%
  53773	    772	     96	  54641	   d571	462.libquantum	128%
 691798	  11228	 372320	1075346	 106892	464.h264ref	137%
 976692	   9592	  14664	1000948	  f45f4	471.omnetpp	153%
  54525	    908	   5216	  60649	   ece9	473.astar	140%
5808306	 140256	  12168	5960730	 5af41a	483.xalancbmk	144%

CPU2017:
   BASE -O2 -g
   text    data     bss     dec     hex filename
2209713    8576    9080 2227369  21fca9 500.perlbench_r
9295702   37432 1150664 10483798 9ff856 502.gcc_r
  21795     712     744   23251    5ad3 505.mcf_r
2067560    8984   46888 2123432  2066a8 520.omnetpp_r
5763577  142584   20040 5926201  5a6d39 523.xalancbmk_r
 508402    6102   29592  544096   84d60 525.x264_r
  84222     784 12138360 12223366 ba8386 531.deepsjeng_r
 223480    8544   30072  262096   3ffd0 541.leela_r
  70554     864    6384   77802   12fea 548.exchange2_r
 180640     884   17704  199228   30a3c 557.xz_r
   PEAK -fspectre-v1=2
   text    data     bss     dec     hex filename
2991161    8576    9080 3008817  2de931 500.perlbench_r	135%
12244886  37432 1150664 13432982 ccf896 502.gcc_r	132%
  28475     712     744   29931    74eb 505.mcf_r	131%
2397026    8984   46888 2452898  256da2 520.omnetpp_r	116%
6846853  142584   20040 7009477  6af4c5 523.xalancbmk_r	119%
 645730    6102   29592  681424   a65d0 525.x264_r	127%
 111166     784 12138360 12250310 baecc6 531.deepsjeng_r 132%
 260835    8544   30072  299451   491bb 541.leela_r     117%
  96874     864    6384  104122   196ba 548.exchange2_r	137%
 215288     884   17704  233876   39194 557.xz_r	119%
   PEAK -fspectre-v1=3
   text    data     bss     dec     hex filename
3365945    8584    9080 3383609  33a139 500.perlbench_r	152%
14790638  37440 1150664 15978742 f3d0f6 502.gcc_r	159%
  31419     720     744   32883    8073 505.mcf_r	144%
2867893    8992   46888 2923773  2c9cfd 520.omnetpp_r	139%
8183689  142592   20040 8346321  7f5ad1 523.xalancbmk_r	142%
 697434    6110   29592  733136   b2fd0 525.x264_r	137%
 123638     792 12138360 12262790 bb1d86 531.deepsjeng_r 147%
 315347    8552   30072  353971   566b3 541.leela_r	141%
  98578     872    6384  105834   19d6a 548.exchange2_r	140%
 239144     892   17704  257740   3eecc 557.xz_r	133%


The patch relies heavily on RTL optimizations for DCE purposes.  At the
same time we rely on RTL not statically computing the mask (RTL has no
conditional constant propagation).  Full instrumentation of the classic
Spectre V1 testcase

char a[1024];
int b[1024];
int foo (int i, int bound)
{
  if (i < bound)
    return b[a[i]];
}

is the following:

foo:
.LFB0:  
        .cfi_startproc
        xorl    %eax, %eax
        cmpl    %esi, %edi
        setge   %al
        subq    $1, %rax
        jne     .L4
        ret
        .p2align 4,,10
        .p2align 3
.L4:
        andl    %eax, %edi
        movslq  %edi, %rdi
        movsbq  a(%rdi), %rax
        movl    b(,%rax,4), %eax
        ret

so the generated GIMPLE was "tuned" for reasonable x86 assembler outcome.
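
For readers who prefer C to assembly, the instruction sequence above
corresponds roughly to the following hand-written transcription.  It is a
model of the data flow only: foo_instrumented is a made-up name, the
"return 0" on the fall-through path mirrors what the assembly happens to
leave in %eax (the original testcase does not return a value there), and
at the source level the masking is a no-op that a compiler may remove.

```c
#include <assert.h>
#include <stdint.h>

char a[1024];
int b[1024];

int
foo_instrumented (int i, int bound)
{
  /* xorl/cmpl/setge/subq: mask = (i >= bound) - 1, i.e. all-ones when
     the bounds check passes and zero when it fails.  */
  uint64_t mask = (uint64_t) (i >= bound) - 1;

  /* jne .L4: enter the guarded block only when the mask is nonzero.  */
  if (mask != 0)
    {
      /* Architectural precondition of the testcase's array accesses.  */
      assert (i >= 0 && i < 1024);
      /* andl %eax, %edi: architecturally this ANDs with all-ones; only
         on a mis-speculated path would the CPU see mask == 0 here and
         force the index to 0.  */
      i &= -(int) (mask & 1);
      /* movsbq a(%rdi), %rax; movl b(,%rax,4), %eax  */
      return b[a[i]];
    }
  return 0;
}
```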

Patch below for reference (and your own testing in case you are curious).
I do not plan to pursue this further at this point.

Richard.

>From 01e4a5a43e266065d32489daa50de0cf2425d5f5 Mon Sep 17 00:00:00 2001
From: Richard Guenther <rguenther@suse.de>
Date: Wed, 5 Dec 2018 13:17:02 +0100
Subject: [PATCH] warn-spectrev1


diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 7960cace16a..64d472d7fa0 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1334,6 +1334,7 @@ OBJS = \
 	gimple-ssa-sprintf.o \
 	gimple-ssa-warn-alloca.o \
 	gimple-ssa-warn-restrict.o \
+	gimple-ssa-spectrev1.o \
 	gimple-streamer-in.o \
 	gimple-streamer-out.o \
 	gimple-walk.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index 45d7f6189e5..1ae7fcfe177 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -702,6 +702,10 @@ Warn when one local variable shadows another local variable or parameter of comp
 Wshadow-compatible-local
 Common Warning Undocumented Alias(Wshadow=compatible-local)
 
+Wspectre-v1
+Common Var(warn_spectrev1) Warning
+Warn about code susceptible to spectre v1 style attacks.
+
 Wstack-protector
 Common Var(warn_stack_protect) Warning
 Warn when not issuing stack smashing protection for some reason.
@@ -2406,6 +2410,14 @@ fsingle-precision-constant
 Common Report Var(flag_single_precision_constant) Optimization
 Convert floating point constants to single precision constants.
 
+fspectre-v1
+Common Alias(fspectre-v1=, 2, 0)
+Insert code to mitigate spectre v1 style attacks.
+
+fspectre-v1=
+Common Report RejectNegative Joined UInteger IntegerRange(0, 3) Var(flag_spectrev1) Optimization
+Insert code to mitigate spectre v1 style attacks.
+
 fsplit-ivs-in-unroller
 Common Report Var(flag_split_ivs_in_unroller) Init(1) Optimization
 Split lifetimes of induction variables when loops are unrolled.
diff --git a/gcc/gimple-ssa-spectrev1.cc b/gcc/gimple-ssa-spectrev1.cc
new file mode 100644
index 00000000000..c2a5dc95324
--- /dev/null
+++ b/gcc/gimple-ssa-spectrev1.cc
@@ -0,0 +1,824 @@
+/* Spectre V1 diagnostics and mitigation.
+   Copyright (C) 2017-2018 Free Software Foundation, Inc.
+   Contributed by ARM Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "is-a.h"
+#include "tree.h"
+#include "gimple.h"
+#include "tree-pass.h"
+#include "ssa.h"
+#include "gimple-pretty-print.h"
+#include "gimple-iterator.h"
+#include "params.h"
+#include "tree-ssa.h"
+#include "cfganal.h"
+#include "gimple-walk.h"
+#include "tree-ssa-loop.h"
+#include "tree-dfa.h"
+#include "tree-cfg.h"
+#include "fold-const.h"
+#include "builtins.h"
+#include "alias.h"
+#include "cfgloop.h"
+#include "varasm.h"
+#include "cgraph.h"
+#include "gimple-fold.h"
+#include "diagnostic.h"
+
+/* The Spectre V1 situation is as follows:
+
+      if (attacker_controlled_idx < bound)  // speculated as true but is false
+        {
+	  // out-of-bound access, returns value interesting to attacker
+	  val = mem[attacker_controlled_idx];
+	  // access that causes a cache-line to be brought in - canary
+	  ... = attacker_controlled_mem[val];
+	}
+
+   The last load provides the side-channel.  The pattern can be split
+   into multiple functions or translation units.  Conservatively we'd
+   have to warn about
+
+      int foo (int *a) {  return *a; }
+
+   thus any indirect (or indexed) memory access.  That's obviously
+   not useful.
+
+   The next level would be to warn only when we see load of val as
+   well.  That then misses cases like
+
+      int foo (int *a, int *b)
+      {
+        int idx = load_it (a);
+	return load_it (&b[idx]);
+      }
+
+   Still we'd warn about cases like
+
+      struct Foo { int *a; };
+      int foo (struct Foo *a) { return *a->a; }
+
+   though dereferencing VAL isn't really an interesting case.  It's
+   hard to exclude this conservatively so the obvious solution is
+   to restrict the kind of loads that produce val, for example based
+   on its type or its number of bits.  It's tempting to do this at
+   the point of the load producing val but in the end what matters
+   is the number of bits that reach the second loads [as index] given
+   there are practical limits on the size of the canary.  For this
+   we have to consider
+
+      int foo (struct Foo *a, int *b)
+      {
+        int *c = a->a;
+	int idx = *b;
+	return *(c + idx);
+      }
+
+   where idx has too many bits to be an interesting attack vector(?).
+ */
+
+/* The pass does two things, first it performs data flow analysis
+   to be able to warn about the second load.  This is controlled
+   via -Wspectre-v1.
+
+   Second it instruments control flow in the program to track a
+   mask which is all-ones, or all-zeroes if the CPU speculated
+   a branch in the wrong direction.  This mask is then used to
+   mask the address[-part(s)] of loads with non-invariant addresses,
+   effectively mitigating the attack.  This is controlled by
+   -fspectre-v1[=N] where N defaults to 2 and
+     1  optimistically omit some instrumentations (currently
+        backedge control flow instructions do not update the
+	speculation mask)
+     2  instrument conservatively using a function-local speculation
+        mask
+     3  instrument conservatively using a global (TLS) speculation
+        mask.  This adds TLS loads/stores of the speculation mask
+	at function boundaries and before and after calls.
+ */
+
+/* We annotate statements whose defs cannot be used to leak data
+   speculatively via loads with SV1_SAFE.  This is used to optimize
+   masking of indices where masked indices (and derived by constant
+   ones) are not masked again.  Note this works only up to the points
+   that possibly change the speculation mask value.  */
+#define SV1_SAFE GF_PLF_1
+
+namespace {
+
+const pass_data pass_data_spectrev1 =
+{
+  GIMPLE_PASS, /* type */
+  "spectrev1", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_NONE, /* tv_id */
+  PROP_cfg|PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  TODO_update_ssa, /* todo_flags_finish */
+};
+
+class pass_spectrev1 : public gimple_opt_pass
+{
+public:
+  pass_spectrev1 (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_spectrev1, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_spectrev1 (m_ctxt); }
+  virtual bool gate (function *) { return warn_spectrev1 || flag_spectrev1; }
+  virtual unsigned int execute (function *);
+
+  static bool stmt_is_indexed_load (gimple *);
+  static bool stmt_mangles_index (gimple *, tree);
+  static bool find_value_dependent_guard (gimple *, tree);
+  static void mark_influencing_outgoing_flow (basic_block, tree);
+  static tree instrument_mem (gimple_stmt_iterator *, tree, tree);
+}; // class pass_spectrev1
+
+bitmap_head *influencing_outgoing_flow;
+
+static bool
+call_between (gimple *first, gimple *second)
+{
+  gcc_assert (gimple_bb (first) == gimple_bb (second));
+  /* ???  This is inefficient.  Maybe we can use gimple_uid to assign
+     unique IDs to stmts belonging to groups with the same speculation
+     mask state.  */
+  for (gimple_stmt_iterator gsi = gsi_for_stmt (first);
+       gsi_stmt (gsi) != second; gsi_next (&gsi))
+    if (is_gimple_call (gsi_stmt (gsi)))
+      return true;
+  return false;
+}
+
+basic_block ctx_bb;
+gimple *ctx_stmt;
+static bool
+gather_indexes (tree, tree *idx, void *data)
+{
+  vec<tree *> *indexes = (vec<tree *> *)data;
+  if (TREE_CODE (*idx) != SSA_NAME)
+    return true;
+  if (!SSA_NAME_IS_DEFAULT_DEF (*idx)
+      && gimple_bb (SSA_NAME_DEF_STMT (*idx)) == ctx_bb
+      && gimple_plf (SSA_NAME_DEF_STMT (*idx), SV1_SAFE)
+      && (flag_spectrev1 < 3
+	  || !call_between (SSA_NAME_DEF_STMT (*idx), ctx_stmt)))
+    return true;
+  if (indexes->is_empty ())
+    indexes->safe_push (idx);
+  else if (*(*indexes)[0] == *idx)
+    indexes->safe_push (idx);
+  else
+    return false;
+  return true;
+}
+
+tree
+pass_spectrev1::instrument_mem (gimple_stmt_iterator *gsi, tree mem, tree mask)
+{
+  /* First try to see if we can find a single index we can zero which
+     has the chance of repeating in other loads and also avoids separate
+     LEA and memory references decreasing code size and AGU occupancy.  */
+  auto_vec<tree *, 8> indexes;
+  ctx_bb = gsi_bb (*gsi);
+  ctx_stmt = gsi_stmt (*gsi);
+  if (PARAM_VALUE (PARAM_SPECTRE_V1_MAX_INSTRUMENT_INDICES) > 0
+      && for_each_index (&mem, gather_indexes, (void *)&indexes))
+    {
+      /* All indices are safe.  */
+      if (indexes.is_empty ())
+	return mem;
+      if (TYPE_PRECISION (TREE_TYPE (*indexes[0]))
+	  <= TYPE_PRECISION (TREE_TYPE (mask)))
+	{
+	  tree idx = *indexes[0];
+	  gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (idx))
+		      || POINTER_TYPE_P (TREE_TYPE (idx)));
+	  /* Instead of instrumenting IDX directly we could look at
+	     definitions with a single SSA use and instrument that
+	     instead.  But we have to do some work to make SV1_SAFE
+	     propagation updated then - this would really ask to first
+	     gather all indexes of all refs we want to instrument and
+	     compute some optimal set of instrumentations.  */
+	  gimple_seq seq = NULL;
+	  tree idx_mask = gimple_convert (&seq, TREE_TYPE (idx), mask);
+	  tree masked_idx = gimple_build (&seq, BIT_AND_EXPR,
+					  TREE_TYPE (idx), idx, idx_mask);
+	  /* Mark the instrumentation sequence as visited.  */
+	  for (gimple_stmt_iterator si = gsi_start (seq);
+	       !gsi_end_p (si); gsi_next (&si))
+	    gimple_set_visited (gsi_stmt (si), true);
+	  gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
+	  gimple_set_plf (SSA_NAME_DEF_STMT (masked_idx), SV1_SAFE, true);
+	  /* Replace downstream users in the BB which reduces register pressure
+	     and allows SV1_SAFE propagation to work (which stops at call/BB
+	     boundaries though).
+	     ???  This is really reg-pressure vs. dependence chains so not
+	     a generally easy thing.  Making the following propagate into
+	     all uses dominated by the insert slows down 429.mcf even more.
+	     ???  We can actually track SV1_SAFE across PHIs but then we
+	     have to propagate into PHIs here.  */
+	  gimple *use_stmt;
+	  use_operand_p use_p;
+	  imm_use_iterator iter;
+	  FOR_EACH_IMM_USE_STMT (use_stmt, iter, idx)
+	    if (gimple_bb (use_stmt) == gsi_bb (*gsi)
+		&& gimple_code (use_stmt) != GIMPLE_PHI
+		&& !gimple_visited_p (use_stmt))
+	      {
+		FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
+		  SET_USE (use_p, masked_idx);
+		update_stmt (use_stmt);
+	      }
+	  /* Modify MEM in place...  (our stmt is already marked visited).  */
+	  for (unsigned i = 0; i < indexes.length (); ++i)
+	    *indexes[i] = masked_idx;
+	  return mem;
+	}
+    }
+
+  /* ???  Can we handle TYPE_REVERSE_STORAGE_ORDER at all?  Need to
+     handle BIT_FIELD_REFs.  */
+
+  /* Strip a bitfield reference to re-apply it at the end.  */
+  tree bitfield = NULL_TREE;
+  tree bitfield_off = NULL_TREE;
+  if (TREE_CODE (mem) == COMPONENT_REF
+      && DECL_BIT_FIELD (TREE_OPERAND (mem, 1)))
+    {
+      bitfield = TREE_OPERAND (mem, 1);
+      bitfield_off = TREE_OPERAND (mem, 2);
+      mem = TREE_OPERAND (mem, 0);
+    }
+
+  tree ptr_base = mem;
+  /* VIEW_CONVERT_EXPRs do not change offset, strip them, they get folded
+     into the MEM_REF we create.  */
+  while (TREE_CODE (ptr_base) == VIEW_CONVERT_EXPR)
+    ptr_base = TREE_OPERAND (ptr_base, 0);
+
+  tree ptr = make_ssa_name (ptr_type_node);
+  gimple *new_stmt = gimple_build_assign (ptr, build_fold_addr_expr (ptr_base));
+  gimple_set_visited (new_stmt, true);
+  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+  ptr = make_ssa_name (ptr_type_node);
+  new_stmt = gimple_build_assign (ptr, BIT_AND_EXPR,
+				  gimple_assign_lhs (new_stmt), mask);
+  gimple_set_visited (new_stmt, true);
+  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+  tree type = TREE_TYPE (mem);
+  unsigned align = get_object_alignment (mem);
+  if (align != TYPE_ALIGN (type))
+    type = build_aligned_type (type, align);
+
+  tree new_mem = build2 (MEM_REF, type, ptr,
+			 build_int_cst (reference_alias_ptr_type (mem), 0));
+  if (bitfield)
+    new_mem = build3 (COMPONENT_REF, TREE_TYPE (bitfield), new_mem,
+		      bitfield, bitfield_off);
+  return new_mem;
+}
+
+bool
+check_spectrev1_2nd_load (tree, tree *idx, void *data)
+{
+  sbitmap value_from_indexed_load = (sbitmap)data;
+  if (TREE_CODE (*idx) == SSA_NAME
+      && bitmap_bit_p (value_from_indexed_load, SSA_NAME_VERSION (*idx)))
+    return false;
+  return true;
+}
+
+bool
+check_spectrev1_2nd_load (gimple *, tree, tree ref, void *data)
+{
+  return !for_each_index (&ref, check_spectrev1_2nd_load, data);
+}
+
+void
+pass_spectrev1::mark_influencing_outgoing_flow (basic_block bb, tree op)
+{
+  if (!bitmap_set_bit (&influencing_outgoing_flow[SSA_NAME_VERSION (op)],
+		       bb->index))
+    return;
+
+  /* Note we deliberately and non-conservatively stop at call and
+     memory boundaries here expecting earlier optimization to expose
+     value dependences via SSA chains.  */
+  gimple *def_stmt = SSA_NAME_DEF_STMT (op);
+  if (gimple_vuse (def_stmt)
+      || !is_gimple_assign (def_stmt))
+    return;
+
+  ssa_op_iter i;
+  FOR_EACH_SSA_TREE_OPERAND (op, def_stmt, i, SSA_OP_USE)
+    mark_influencing_outgoing_flow (bb, op);
+}
+
+bool
+pass_spectrev1::find_value_dependent_guard (gimple *stmt, tree op)
+{
+  bitmap_iterator bi;
+  unsigned i;
+  EXECUTE_IF_SET_IN_BITMAP (&influencing_outgoing_flow[SSA_NAME_VERSION (op)],
+			    0, i, bi)
+    /* ???  If control-dependent on.
+       ???  Make bits in influencing_outgoing_flow the index of the BB
+       in RPO order so we could walk bits from STMT "upwards" finding
+       the nearest one.  */
+    if (dominated_by_p (CDI_DOMINATORS,
+			gimple_bb (stmt), BASIC_BLOCK_FOR_FN (cfun, i)))
+      {
+	if (dump_enabled_p ())
+	  dump_printf_loc (MSG_NOTE, stmt, "Condition %G in block %d "
+			   "is related to indexes used in %G\n",
+			   last_stmt (BASIC_BLOCK_FOR_FN (cfun, i)),
+			   i, stmt);
+	return true;
+      }
+
+  /* Note we deliberately and non-conservatively stop at call and
+     memory boundaries here expecting earlier optimization to expose
+     value dependences via SSA chains.  */
+  gimple *def_stmt = SSA_NAME_DEF_STMT (op);
+  if (gimple_vuse (def_stmt)
+      || !is_gimple_assign (def_stmt))
+    return false;
+
+  ssa_op_iter it;
+  FOR_EACH_SSA_TREE_OPERAND (op, def_stmt, it, SSA_OP_USE)
+    if (find_value_dependent_guard (stmt, op))
+      /* Others may be "nearer".  */
+      return true;
+
+  return false;
+}
+
+bool
+pass_spectrev1::stmt_is_indexed_load (gimple *stmt)
+{
+  /* Given we ignore the function boundary for incoming parameters
+     let's ignore return values of calls as well for the purpose
+     of being the first indexed load (also ignore inline-asms).  */
+  if (!gimple_assign_load_p (stmt))
+    return false;
+
+  /* Exclude esp. pointers from the index load itself (but also floats,
+     vectors, etc. - quite a bit handwaving here).  */
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (stmt))))
+    return false;
+
+  /* If we do not have any SSA uses the load cannot be one indexed
+     by an attacker controlled value.  */
+  if (zero_ssa_operands (stmt, SSA_OP_USE))
+    return false;
+
+  return true;
+}
+
+/* Return true if the index in the use operand OP in STMT is
+   not transferred to STMT's defs.  */
+
+bool
+pass_spectrev1::stmt_mangles_index (gimple *stmt, tree op)
+{
+  if (gimple_assign_load_p (stmt))
+    return true;
+  if (gassign *ass = dyn_cast <gassign *> (stmt))
+    {
+      enum tree_code code = gimple_assign_rhs_code (ass);
+      switch (code)
+	{
+	case TRUNC_DIV_EXPR:
+	case CEIL_DIV_EXPR:
+	case FLOOR_DIV_EXPR:
+	case ROUND_DIV_EXPR:
+	case EXACT_DIV_EXPR:
+	case RDIV_EXPR:
+	case TRUNC_MOD_EXPR:
+	case CEIL_MOD_EXPR:
+	case FLOOR_MOD_EXPR:
+	case ROUND_MOD_EXPR:
+	case LSHIFT_EXPR:
+	case RSHIFT_EXPR:
+	case LROTATE_EXPR:
+	case RROTATE_EXPR:
+	  /* Division, modulus or shifts by the index do not produce
+	     something useful for the attacker.  */
+	  if (gimple_assign_rhs2 (ass) == op)
+	    return true;
+	  break;
+	default:;
+	  /* Comparisons do not produce an index value.  */
+	  if (TREE_CODE_CLASS (code) == tcc_comparison)
+	    return true;
+	}
+    }
+  /* ???  We could handle builtins here.  */
+  return false;
+}
+
+static GTY(()) tree spectrev1_tls_mask_decl;
+
+/* Main entry for spectrev1 pass.  */
+
+unsigned int
+pass_spectrev1::execute (function *fn)
+{
+  calculate_dominance_info (CDI_DOMINATORS);
+  loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
+
+  int *rpo = XNEWVEC (int, n_basic_blocks_for_fn (cfun));
+  int rpo_num = pre_and_rev_post_order_compute_fn (fn, NULL, rpo, false);
+
+  /* We track for each SSA name whether its value (may) depend(s) on
+     the result of an indexed load.
+     A set of operations will kill a value (enough).  */
+  auto_sbitmap value_from_indexed_load (num_ssa_names);
+  bitmap_clear (value_from_indexed_load);
+
+  unsigned orig_num_ssa_names = num_ssa_names;
+  influencing_outgoing_flow = XCNEWVEC (bitmap_head, num_ssa_names);
+  for (unsigned i = 1; i < num_ssa_names; ++i)
+    bitmap_initialize (&influencing_outgoing_flow[i], &bitmap_default_obstack);
+
+
+  /* Diagnosis.  */
+
+  /* Function arguments are not indexed loads unless we want to
+     be conservative to a level no longer useful.  */
+
+  for (int i = 0; i < rpo_num; ++i)
+    {
+      basic_block bb = BASIC_BLOCK_FOR_FN (fn, rpo[i]);
+
+      for (gphi_iterator gpi = gsi_start_phis (bb);
+	   !gsi_end_p (gpi); gsi_next (&gpi))
+	{
+	  gphi *phi = gpi.phi ();
+	  bool value_from_indexed_load_p = false;
+	  use_operand_p arg_p;
+	  ssa_op_iter it;
+	  FOR_EACH_PHI_ARG (arg_p, phi, it, SSA_OP_USE)
+	    {
+	      tree arg = USE_FROM_PTR (arg_p);
+	      if (TREE_CODE (arg) == SSA_NAME
+		  && bitmap_bit_p (value_from_indexed_load,
+				   SSA_NAME_VERSION (arg)))
+		value_from_indexed_load_p = true;
+	    }
+	  if (value_from_indexed_load_p)
+	    bitmap_set_bit (value_from_indexed_load,
+			    SSA_NAME_VERSION (PHI_RESULT (phi)));
+	}
+
+      for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	   !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple *stmt = gsi_stmt (gsi);
+	  if (is_gimple_debug (stmt))
+	    continue;
+
+	  if (walk_stmt_load_store_ops (stmt, value_from_indexed_load,
+					check_spectrev1_2nd_load,
+					check_spectrev1_2nd_load))
+	    warning_at (gimple_location (stmt), OPT_Wspectre_v1, "%Gspectrev1",
+			stmt);
+
+	  bool value_from_indexed_load_p = false;
+	  if (stmt_is_indexed_load (stmt))
+	    {
+	      /* We are interested in indexes to later loads so ultimately
+		 register values that all happen to separate SSA defs.
+		 Interesting aggregates will be decomposed by later loads
+		 which we then mark as producing an index.  Simply mark
+		 all SSA defs as coming from an indexed load.  */
+	      /* We are handling a single load in STMT right now.  */
+	      ssa_op_iter it;
+	      tree op;
+	      FOR_EACH_SSA_TREE_OPERAND (op, stmt, it, SSA_OP_USE)
+	        if (find_value_dependent_guard (stmt, op))
+		  {
+		    /* ???  Somehow record the dependence to point to it in
+		       diagnostics.  */
+		    value_from_indexed_load_p = true;
+		    break;
+		  }
+	    }
+
+	  tree op;
+	  ssa_op_iter it;
+	  FOR_EACH_SSA_TREE_OPERAND (op, stmt, it, SSA_OP_USE)
+	    if (bitmap_bit_p (value_from_indexed_load,
+			      SSA_NAME_VERSION (op))
+		&& !stmt_mangles_index (stmt, op))
+	      {
+		value_from_indexed_load_p = true;
+		break;
+	      }
+
+	  if (value_from_indexed_load_p)
+	    FOR_EACH_SSA_TREE_OPERAND (op, stmt, it, SSA_OP_DEF)
+	      /* ???  We could cut off single-bit values from the chain
+	         here or pretend that float loads will never be turned
+		 into integer indices, etc.  */
+	      bitmap_set_bit (value_from_indexed_load,
+			      SSA_NAME_VERSION (op));
+	}
+
+      if (EDGE_COUNT (bb->succs) > 1)
+	{
+	  gcond *stmt = safe_dyn_cast <gcond *> (last_stmt (bb));
+	  /* ???  What about switches?  What about badly speculated EH?  */
+	  if (!stmt)
+	    continue;
+	  /* We could constrain conditions here to those more likely
+	     being "bounds checks".  For example common guards for
+	     indirect accesses are NULL pointer checks.
+	     ???  This isn't fully safe, but it drops the number of
+	     spectre warnings for dwarf2out.i from cc1files from 70 to 16.  */
+	  if ((gimple_cond_code (stmt) == EQ_EXPR
+	       || gimple_cond_code (stmt) == NE_EXPR)
+	      && integer_zerop (gimple_cond_rhs (stmt))
+	      && POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt))))
+	    ;
+	  else
+	    {
+	      ssa_op_iter it;
+	      tree op;
+	      FOR_EACH_SSA_TREE_OPERAND (op, stmt, it, SSA_OP_USE)
+		mark_influencing_outgoing_flow (bb, op);
+	    }
+	}
+    }
+
+  for (unsigned i = 1; i < orig_num_ssa_names; ++i)
+    bitmap_release (&influencing_outgoing_flow[i]);
+  XDELETEVEC (influencing_outgoing_flow);
+
+
+
+  /* Instrumentation.  */
+  if (!flag_spectrev1)
+    return 0;
+
+  /* Create the default all-ones mask.  When doing IPA instrumentation
+     this should initialize the mask from TLS memory and outgoing edges
+     need to save the mask to TLS memory.  */
+  gimple *new_stmt;
+  if (!spectrev1_tls_mask_decl
+      && flag_spectrev1 >= 3)
+    {
+      /* Use a smaller variable in case sign-extending loads are
+	 available?  */
+      spectrev1_tls_mask_decl
+	  = build_decl (BUILTINS_LOCATION,
+			VAR_DECL, NULL_TREE, ptr_type_node);
+      TREE_STATIC (spectrev1_tls_mask_decl) = 1;
+      TREE_PUBLIC (spectrev1_tls_mask_decl) = 1;
+      DECL_VISIBILITY (spectrev1_tls_mask_decl) = VISIBILITY_HIDDEN;
+      DECL_VISIBILITY_SPECIFIED (spectrev1_tls_mask_decl) = 1;
+      DECL_INITIAL (spectrev1_tls_mask_decl)
+	  = build_all_ones_cst (ptr_type_node);
+      DECL_NAME (spectrev1_tls_mask_decl) = get_identifier ("__SV1MSK");
+      DECL_ARTIFICIAL (spectrev1_tls_mask_decl) = 1;
+      DECL_IGNORED_P (spectrev1_tls_mask_decl) = 1;
+      varpool_node::finalize_decl (spectrev1_tls_mask_decl);
+      make_decl_one_only (spectrev1_tls_mask_decl,
+			  DECL_ASSEMBLER_NAME (spectrev1_tls_mask_decl));
+      set_decl_tls_model (spectrev1_tls_mask_decl,
+			  decl_default_tls_model (spectrev1_tls_mask_decl));
+    }
+
+  /* We let the SSA rewriter cope with rewriting mask into SSA and
+     inserting PHI nodes.  */
+  tree mask = create_tmp_reg (ptr_type_node, "spectre_v1_mask");
+  new_stmt = gimple_build_assign (mask,
+				  flag_spectrev1 >= 3
+				  ? spectrev1_tls_mask_decl
+				  : build_all_ones_cst (ptr_type_node));
+  gimple_stmt_iterator gsi
+      = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN (fn)));
+  gsi_insert_before (&gsi, new_stmt, GSI_CONTINUE_LINKING);
+
+  /* We are using the visited flag to track stmts downstream in a BB.  */
+  for (int i = 0; i < rpo_num; ++i)
+    {
+      basic_block bb = BASIC_BLOCK_FOR_FN (fn, rpo[i]);
+      for (gphi_iterator gpi = gsi_start_phis (bb);
+	   !gsi_end_p (gpi); gsi_next (&gpi))
+	gimple_set_visited (gpi.phi (), false);
+      for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	   !gsi_end_p (gsi); gsi_next (&gsi))
+	gimple_set_visited (gsi_stmt (gsi), false);
+    }
+
+  for (int i = 0; i < rpo_num; ++i)
+    {
+      basic_block bb = BASIC_BLOCK_FOR_FN (fn, rpo[i]);
+
+      for (gphi_iterator gpi = gsi_start_phis (bb);
+	   !gsi_end_p (gpi); gsi_next (&gpi))
+	{
+	  gphi *phi = gpi.phi ();
+	  /* ???  We can merge SAFE state across BB boundaries in
+	     some cases, like when edges are not critical and the
+	     state was made SAFE in the tail of the predecessors
+	     and not invalidated by calls.   */
+	  gimple_set_plf (phi, SV1_SAFE, false);
+	}
+
+      bool instrumented_call_p = false;
+      for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	   !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple *stmt = gsi_stmt (gsi);
+	  gimple_set_visited (stmt, true);
+	  if (is_gimple_debug (stmt))
+	    continue;
+
+	  tree op;
+	  ssa_op_iter it;
+	  bool safe = is_gimple_assign (stmt);
+	  if (safe)
+	    FOR_EACH_SSA_TREE_OPERAND (op, stmt, it, SSA_OP_USE)
+	      {
+		if (safe
+		    && (SSA_NAME_IS_DEFAULT_DEF (op)
+			|| !gimple_plf (SSA_NAME_DEF_STMT (op), SV1_SAFE)
+			/* Once mask can have changed we cannot further
+			   propagate safe state.  */
+			|| gimple_bb (SSA_NAME_DEF_STMT (op)) != bb
+			/* That includes calls if we have instrumented one
+			   in this block.  */
+			|| (instrumented_call_p
+			    && call_between (SSA_NAME_DEF_STMT (op), stmt))))
+		  {
+		    safe = false;
+		    break;
+		  }
+	      }
+	  gimple_set_plf (stmt, SV1_SAFE, safe);
+
+	  /* Instrument bounded loads.
+	     We instrument non-aggregate loads with non-invariant address.
+	     The idea is to reliably instrument the bounded load while
+	     leaving the canary, be it load or store, aggregate or
+	     non-aggregate, alone.  */
+	  if (gimple_assign_single_p (stmt)
+	      && gimple_vuse (stmt)
+	      && !gimple_vdef (stmt)
+	      && !zero_ssa_operands (stmt, SSA_OP_USE))
+	    {
+	      tree new_mem = instrument_mem (&gsi, gimple_assign_rhs1 (stmt),
+					     mask);
+	      gimple_assign_set_rhs1 (stmt, new_mem);
+	      update_stmt (stmt);
+	      /* The value loaded by a masked load is "safe".  */
+	      gimple_set_plf (stmt, SV1_SAFE, true);
+	    }
+
+	  /* Instrument return store to TLS mask.  */
+	  if (flag_spectrev1 >= 3
+	      && gimple_code (stmt) == GIMPLE_RETURN)
+	    {
+	      new_stmt = gimple_build_assign (spectrev1_tls_mask_decl, mask);
+	      gsi_insert_before (&gsi, new_stmt, GSI_SAME_STMT);
+	    }
+	  /* Instrument calls with store/load to/from TLS mask.
+	     ???  Placement of the stores/loads can be optimized in an LCM
+	     way.  */
+	  else if (flag_spectrev1 >= 3
+		   && is_gimple_call (stmt)
+		   && gimple_vuse (stmt))
+	    {
+	      new_stmt = gimple_build_assign (spectrev1_tls_mask_decl, mask);
+	      gsi_insert_before (&gsi, new_stmt, GSI_SAME_STMT);
+	      if (!stmt_ends_bb_p (stmt))
+		{
+		  new_stmt = gimple_build_assign (mask,
+						  spectrev1_tls_mask_decl);
+		  gsi_insert_after (&gsi, new_stmt, GSI_NEW_STMT);
+		}
+	      else
+		{
+		  edge_iterator ei;
+		  edge e;
+		  FOR_EACH_EDGE (e, ei, bb->succs)
+		    {
+		      if (e->flags & EDGE_ABNORMAL)
+			continue;
+		      new_stmt = gimple_build_assign (mask,
+						      spectrev1_tls_mask_decl);
+		      gsi_insert_on_edge (e, new_stmt);
+		    }
+		}
+	      instrumented_call_p = true;
+	    }
+	}
+
+      if (EDGE_COUNT (bb->succs) > 1)
+	{
+	  gcond *stmt = safe_dyn_cast <gcond *> (last_stmt (bb));
+	  /* ???  What about switches?  What about badly speculated EH?  */
+	  if (!stmt)
+	    continue;
+
+	  /* Instrument conditional branches to track mis-speculation
+	     via a pointer-sized mask.
+	     ???  We could restrict to instrumenting those conditions
+	     that control interesting loads or apply simple heuristics
+	     like not instrumenting FP compares or equality compares
+	     which are unlikely bounds checks.  But we have to instrument
+	     bool != 0 because multiple conditions might have been
+	     combined.  */
+	  edge truee, falsee;
+	  extract_true_false_edges_from_block (bb, &truee, &falsee);
+	  /* Below -fspectre-v1=2 we do not instrument loop exit tests.  */
+	  if (flag_spectrev1 >= 2
+	      || !loop_exits_from_bb_p (bb->loop_father, bb))
+	    {
+	      gimple_stmt_iterator gsi = gsi_last_bb (bb);
+
+	      /* Instrument
+	           if (a_1 > b_2)
+		 as
+	           tem_mask_3 = a_1 > b_2 ? -1 : 0;
+		   if (tem_mask_3 != 0)
+		 this will result in a
+		   xor %eax, %eax; cmp|test; setCC %al; sub $0x1, %eax; jne
+		 sequence which is faster in practice than when retaining
+		 the original jump condition.  This is 10 bytes overhead
+		 on x86_64 plus 3 bytes for an AND on the true path and
+		 5 bytes for an AND and a NOT on the false path.  */
+	      tree tem_mask = make_ssa_name (ptr_type_node);
+	      new_stmt = gimple_build_assign (tem_mask, COND_EXPR,
+					      build2 (gimple_cond_code (stmt),
+						      boolean_type_node,
+						      gimple_cond_lhs (stmt),
+						      gimple_cond_rhs (stmt)),
+					      build_all_ones_cst (ptr_type_node),
+					      build_zero_cst (ptr_type_node));
+	      gsi_insert_before (&gsi, new_stmt, GSI_SAME_STMT);
+	      gimple_cond_set_code (stmt, NE_EXPR);
+	      gimple_cond_set_lhs (stmt, tem_mask);
+	      gimple_cond_set_rhs (stmt, build_zero_cst (ptr_type_node));
+	      update_stmt (stmt);
+
+	      /* On the false edge
+	           mask = mask & ~tem_mask_3;  */
+	      gimple_seq tems = NULL;
+	      tree tem_mask2 = make_ssa_name (ptr_type_node);
+	      new_stmt = gimple_build_assign (tem_mask2, BIT_NOT_EXPR,
+					      tem_mask);
+	      gimple_seq_add_stmt_without_update (&tems, new_stmt);
+	      new_stmt = gimple_build_assign (mask, BIT_AND_EXPR,
+					      mask, tem_mask2);
+	      gimple_seq_add_stmt_without_update (&tems, new_stmt);
+	      gsi_insert_seq_on_edge (falsee, tems);
+
+	      /* On the true edge
+	           mask = mask & tem_mask_3;  */
+	      new_stmt = gimple_build_assign (mask, BIT_AND_EXPR,
+					      mask, tem_mask);
+	      gsi_insert_on_edge (truee, new_stmt);
+	    }
+	}
+    }
+
+  gsi_commit_edge_inserts ();
+
+  return 0;
+}
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_spectrev1 (gcc::context *ctxt)
+{
+  return new pass_spectrev1 (ctxt);
+}
diff --git a/gcc/params.def b/gcc/params.def
index 6f98fccd291..19f7dbf4dad 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1378,6 +1378,11 @@ DEFPARAM(PARAM_LOOP_VERSIONING_MAX_OUTER_INSNS,
 	 " loops.",
 	 100, 0, 0)
 
+DEFPARAM(PARAM_SPECTRE_V1_MAX_INSTRUMENT_INDICES,
+	 "spectre-v1-max-instrument-indices",
+	 "Maximum number of indices to instrument before instrumenting the whole address.",
+	 1, 0, 0)
+
 /*
 
 Local variables:
diff --git a/gcc/passes.def b/gcc/passes.def
index 144df4fa417..2fe0cdcfa7e 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -400,6 +400,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_lower_resx);
   NEXT_PASS (pass_nrv);
   NEXT_PASS (pass_cleanup_cfg_post_optimizing);
+  NEXT_PASS (pass_spectrev1);
   NEXT_PASS (pass_warn_function_noreturn);
   NEXT_PASS (pass_gen_hsail);
 
diff --git a/gcc/testsuite/gcc.dg/Wspectre-v1-1.c b/gcc/testsuite/gcc.dg/Wspectre-v1-1.c
new file mode 100644
index 00000000000..3ac647e72fd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wspectre-v1-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-Wspectre-v1" } */
+
+unsigned char a[1024];
+int b[256];
+int foo (int i, int bound)
+{
+  if (i < bound)
+    return b[a[i]];  /* { dg-warning "spectrev1" } */
+}
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 9f9d85fdbc3..f5c164f465f 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -625,6 +625,7 @@ extern gimple_opt_pass *make_pass_local_fn_summary (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_update_address_taken (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_convert_switch (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_lower_vaarg (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_spectrev1 (gcc::context *ctxt);
 
 /* Current optimization pass.  */
 extern opt_pass *current_pass;
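
For readers who want to see the effect of the instrumentation without
reading GIMPLE, here is a hand-written C approximation of what the pass
roughly does to foo() from Wspectre-v1-1.c.  This is only a sketch: the
names mask and tem_mask mirror the patch, but foo_plain, foo_masked and
check_equivalence are hypothetical helpers, the explicit "return 0" is
added so the two versions can be compared, and masking the index (rather
than the whole address) is an assumption about what instrument_mem does
for this case.  An optimizing compiler is free to fold the masking away
in source form, so this shows the data flow, not a usable mitigation.

```c
#include <assert.h>
#include <stdint.h>

unsigned char a[1024];
int b[256];

/* The function from Wspectre-v1-1.c, with an explicit return on the
   out-of-bounds path so that both versions can be compared.  */
int
foo_plain (int i, int bound)
{
  if (i < bound)
    return b[a[i]];
  return 0;
}

/* Hand-written approximation of the instrumented function.  */
int
foo_masked (int i, int bound)
{
  /* Function entry: the default all-ones mask.  */
  uintptr_t mask = ~(uintptr_t) 0;

  /* if (i < bound) becomes
       tem_mask = i < bound ? -1 : 0;
       if (tem_mask != 0)  */
  uintptr_t tem_mask = i < bound ? ~(uintptr_t) 0 : 0;
  if (tem_mask != 0)
    {
      /* True edge: mask = mask & tem_mask.  */
      mask &= tem_mask;
      /* Loads with a non-invariant address get their index ANDed with
         the mask; on a mis-speculated path the mask would be zero.  */
      unsigned char idx = a[(uintptr_t) i & mask];
      return b[(uintptr_t) idx & mask];
    }
  /* False edge: mask = mask & ~tem_mask.  */
  mask &= ~tem_mask;
  (void) mask;
  return 0;
}

/* Architecturally both versions agree for in-range indices.  */
void
check_equivalence (void)
{
  for (int i = 0; i < 1024; ++i)
    a[i] = (unsigned char) (i * 7);
  for (int i = 0; i < 256; ++i)
    b[i] = i + 1;
  for (int i = 0; i < 1024; ++i)
    assert (foo_masked (i, 1024) == foo_plain (i, 1024));
  assert (foo_masked (5, 3) == 0);
}
```

On the architecturally correct path the mask stays all-ones, so the ANDs
do not change the result.  The point of the scheme is that when the
branch is mis-speculated, tem_mask computed from the compare is zero on
the taken path, so the dependent load is forced to index zero.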

