Update: [PATCH 5/X] libsanitizer: mid-end: Introduce stack variable handling for HWASAN

Hongtao Liu crazylht@gmail.com
Tue Nov 24 12:30:17 GMT 2020


Hi:
  I'm learning about this patch, and I see one place that might be
slightly improved.

+      poly_int64 size = (top - bot);
+
+      /* Assert the edge of each variable is aligned to the HWASAN tag granule
+        size.  */
+      gcc_assert (multiple_p (top, HWASAN_TAG_GRANULE_SIZE));
+      gcc_assert (multiple_p (bot, HWASAN_TAG_GRANULE_SIZE));
+      gcc_assert (multiple_p (size, HWASAN_TAG_GRANULE_SIZE));
+

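The last gcc_assert looks redundant?  If top and bot are both multiples
of HWASAN_TAG_GRANULE_SIZE, then size = top - bot must be one as well.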
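
To illustrate the reasoning, here is a minimal sketch using plain
integers instead of poly_int64.  The helper names and the granule of 16
are my own assumptions for illustration, not code from the patch; I
believe the same argument carries over to multiple_p on poly_int64, but
I have not checked every case.

```cpp
#include <cstdint>

// Hypothetical stand-in for multiple_p, restricted to plain integers.
constexpr bool is_multiple (int64_t value, int64_t granule)
{
  return value % granule == 0;
}

// If top = a * g and bot = b * g, then top - bot = (a - b) * g, so the
// size is a multiple of the granule whenever both edges are.
constexpr bool size_is_aligned (int64_t top, int64_t bot, int64_t granule)
{
  return is_multiple (top - bot, granule);
}

// Frame offsets are often negative when the frame grows downward.
static_assert (is_multiple (-32, 16) && is_multiple (-96, 16),
               "both edges are aligned");
static_assert (size_is_aligned (-32, -96, 16),
               "the size is aligned as a consequence");
```

So once the first two asserts hold, the third can never fire.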

On Sat, Nov 21, 2020 at 2:48 AM Matthew Malcomson via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
>
> Hi there,
>
> I was just doing some double-checks and noticed I'd placed the
> documentation in the wrong section of tm.texi.  The `MEMTAG` hooks were
> documented in the `Register Classes` section, so I've now moved it to
> the `Misc` section.
>
> That's the only change, Ok for trunk?
>
> Matthew
>
>
> ------------------------------------------------------------
>
>
>
> Handling stack variables has three features.
>
> 1) Ensure HWASAN required alignment for stack variables
>
> When tagging shadow memory, we need to ensure that each tag granule is
> only used by one variable at a time.
>
> This is done by ensuring that each tagged variable is aligned to the tag
> granule size and that the end of each object is also aligned, so that
> the start of any other data stored on the stack is in a different
> granule.
>
> This patch ensures the above by forcing the stack pointer to be aligned
> before and after allocating any stack objects. Since we are forcing
> alignment we also use `align_local_variable` to ensure this new alignment
> is advertised properly through SET_DECL_ALIGN.
>
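To check my understanding of point 1, here is a small sketch of the
alignment arithmetic.  It assumes a granule of 16 bytes and uses
hypothetical helper names; it is not how cfgexpand.c implements it.

```cpp
#include <cstdint>

// Round SIZE up to the next multiple of GRANULE (assumed a power of two).
constexpr uint64_t align_up (uint64_t size, uint64_t granule)
{
  return (size + granule - 1) & ~(granule - 1);
}

// Index of the tag granule that byte offset ADDR falls into.
constexpr uint64_t granule_index (uint64_t addr, uint64_t granule)
{
  return addr / granule;
}

// A 20-byte object at offset 0 has its last byte in granule 1.
static_assert (granule_index (19, 16) == 1, "last byte of the object");
// Without padding, the next object would start at offset 20, also in
// granule 1, and the two objects would have to share one tag.
static_assert (granule_index (20, 16) == 1, "unpadded neighbor collides");
// With the end aligned, the next object starts at 32, in granule 2.
static_assert (align_up (20, 16) == 32, "padded end of the object");
static_assert (granule_index (align_up (20, 16), 16) == 2,
               "padded neighbor gets its own granule");
```
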
> 2) Put tags into each stack variable pointer
>
> Make sure that every pointer to a stack variable includes a tag of some
> sort on it.
>
> The way tagging works is:
>   1) For every new stack frame, a random tag is generated.
>   2) A base register is formed from the stack pointer value and this
>      random tag.
>   3) References to stack variables are now formed with RTL describing an
>      offset from this base in both tag and value.
>
> The random tag generation is handled by a backend hook.  This hook
> decides whether to introduce a random tag or use the stack background
> based on the parameter hwasan-random-frame-tag.  Using the stack
> background is necessary for testing and bootstrap.  It is necessary
> during bootstrap to avoid breaking the `configure` test program for
> determining stack direction.
>
> Using the stack background means that every stack frame has the initial
> tag of zero and variables are tagged with incrementing tags from 1,
> which also makes debugging a bit easier.
>
> Backend hooks define the size of a tag, the layout of the HWASAN shadow
> memory, and handle emitting the code that inserts and extracts tags from a
> pointer.
>
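For my own notes, this is how I picture the pointer tagging in point 2.
It is only a sketch under assumptions: a 64-bit pointer with an 8-bit
tag in the top byte (the default described in asan.h), and helper names
I made up.  The real work is done in RTL by the targetm.memtag hooks.

```cpp
#include <cstdint>

// Assumption: the tag occupies the top byte of a 64-bit pointer.
constexpr unsigned tag_shift = 56;
constexpr uint64_t tag_mask = uint64_t{0xff} << tag_shift;

// Replace whatever tag PTR carries with TAG.
constexpr uint64_t set_tag (uint64_t ptr, uint8_t tag)
{
  return (ptr & ~tag_mask) | (uint64_t{tag} << tag_shift);
}

// Read the tag back out of PTR.
constexpr uint8_t extract_tag (uint64_t ptr)
{
  return static_cast<uint8_t> (ptr >> tag_shift);
}

// Form a variable's pointer as an offset from the frame base in both
// address and tag.  The tag arithmetic wraps modulo 256.
constexpr uint64_t add_tag (uint64_t base, int64_t offset, uint8_t tag_offset)
{
  return set_tag (base + static_cast<uint64_t> (offset),
                  static_cast<uint8_t> (extract_tag (base) + tag_offset));
}
```

With a frame base tagged 0xfe, a variable with tag offset 3 ends up with
tag 0x01, while its address bits are simply base + offset.
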
> 3) For each stack variable, tag and untag the shadow stack on function
>    prologue and epilogue.
>
> On entry to each function we tag the relevant shadow stack region for
> each stack variable. This stack region is tagged to match the tag added to
> each pointer to that variable.
>
> This is the first patch where we use the HWASAN shadow space, so we need
> to add in the libhwasan initialisation code that creates this shadow
> memory region into the binary we produce.  This instrumentation is done
> in `compile_file`.
>
> When exiting a function we need to ensure the shadow stack for this
> function has no remaining tags.  Without clearing the shadow stack area
> for this stack frame, later function calls could get false positives
> when those later function calls check untagged areas (such as parameters
> passed on the stack) against a shadow stack area with left-over tags.
>
> Hence we ensure that the entire stack frame is cleared on function exit.
>
> config/ChangeLog:
>
>         * bootstrap-hwasan.mk: Disable random frame tags for stack-tagging
>         during bootstrap.
>
> ChangeLog:
>
>         * gcc/asan.c (struct hwasan_stack_var): New.
>         (hwasan_sanitize_p): New.
>         (hwasan_sanitize_stack_p): New.
>         (hwasan_sanitize_allocas_p): New.
>         (initialize_sanitizer_builtins): Define new builtins.
>         (ATTR_NOTHROW_LIST): New macro.
>         (hwasan_current_frame_tag): New.
>         (hwasan_frame_base): New.
>         (stack_vars_base_reg_p): New.
>         (hwasan_maybe_emit_frame_base_init): New.
>         (hwasan_record_stack_var): New.
>         (hwasan_get_frame_extent): New.
>         (hwasan_increment_frame_tag): New.
>         (hwasan_record_frame_init): New.
>         (hwasan_emit_prologue): New.
>         (hwasan_emit_untag_frame): New.
>         (hwasan_finish_file): New.
>         (hwasan_truncate_to_tag_size): New.
>         * gcc/asan.h (hwasan_record_frame_init): New declaration.
>         (hwasan_record_stack_var): New declaration.
>         (hwasan_emit_prologue): New declaration.
>         (hwasan_emit_untag_frame): New declaration.
>         (hwasan_get_frame_extent): New declaration.
>         (hwasan_maybe_emit_frame_base_init): New declaration.
>         (hwasan_frame_base): New declaration.
>         (stack_vars_base_reg_p): New declaration.
>         (hwasan_current_frame_tag): New declaration.
>         (hwasan_increment_frame_tag): New declaration.
>         (hwasan_truncate_to_tag_size): New declaration.
>         (hwasan_finish_file): New declaration.
>         (hwasan_sanitize_p): New declaration.
>         (hwasan_sanitize_stack_p): New declaration.
>         (hwasan_sanitize_allocas_p): New declaration.
>         (HWASAN_TAG_SIZE): New macro.
>         (HWASAN_TAG_GRANULE_SIZE): New macro.
>         (HWASAN_STACK_BACKGROUND): New macro.
>         * gcc/builtin-types.def (BT_FN_VOID_PTR_UINT8_PTRMODE): New.
>         * gcc/builtins.def (DEF_SANITIZER_BUILTIN): Enable for HWASAN.
>         * gcc/cfgexpand.c (align_local_variable): When using hwasan ensure
>         alignment to tag granule.
>         (align_frame_offset): New.
>         (expand_one_stack_var_at): For hwasan use tag offset.
>         (expand_stack_vars): Record stack objects for hwasan.
>         (expand_one_stack_var_1): Record stack objects for hwasan.
>         (init_vars_expansion): Initialise hwasan state.
>         (expand_used_vars): Emit hwasan prologue and generate hwasan epilogue.
>         (pass_expand::execute): Emit hwasan base initialization if needed.
>         * gcc/doc/tm.texi (TARGET_MEMTAG_TAG_SIZE,TARGET_MEMTAG_GRANULE_SIZE,
>         TARGET_MEMTAG_INSERT_RANDOM_TAG,TARGET_MEMTAG_ADD_TAG,
>         TARGET_MEMTAG_SET_TAG,TARGET_MEMTAG_EXTRACT_TAG,
>         TARGET_MEMTAG_UNTAGGED_POINTER): Document new hooks.
>         * gcc/doc/tm.texi.in (TARGET_MEMTAG_TAG_SIZE,TARGET_MEMTAG_GRANULE_SIZE,
>         TARGET_MEMTAG_INSERT_RANDOM_TAG,TARGET_MEMTAG_ADD_TAG,
>         TARGET_MEMTAG_SET_TAG,TARGET_MEMTAG_EXTRACT_TAG,
>         TARGET_MEMTAG_UNTAGGED_POINTER): Document new hooks.
>         * gcc/explow.c (get_dynamic_stack_base): Take new `base` argument.
>         * gcc/explow.h (get_dynamic_stack_base): Take new `base` argument.
>         * gcc/sanitizer.def (BUILT_IN_HWASAN_INIT): New.
>         (BUILT_IN_HWASAN_TAG_MEM): New.
>         * gcc/target.def (target_memtag_tag_size,target_memtag_granule_size,
>         target_memtag_insert_random_tag,target_memtag_add_tag,
>         target_memtag_set_tag,target_memtag_extract_tag,
>         target_memtag_untagged_pointer): New hooks.
>         * gcc/targhooks.c (HWASAN_SHIFT): New.
>         (HWASAN_SHIFT_RTX): New.
>         (default_memtag_tag_size): New default hook.
>         (default_memtag_granule_size): New default hook.
>         (default_memtag_insert_random_tag): New default hook.
>         (default_memtag_add_tag): New default hook.
>         (default_memtag_set_tag): New default hook.
>         (default_memtag_extract_tag): New default hook.
>         (default_memtag_untagged_pointer): New default hook.
>         * gcc/targhooks.h (default_memtag_tag_size): New default hook.
>         (default_memtag_granule_size): New default hook.
>         (default_memtag_insert_random_tag): New default hook.
>         (default_memtag_add_tag): New default hook.
>         (default_memtag_set_tag): New default hook.
>         (default_memtag_extract_tag): New default hook.
>         (default_memtag_untagged_pointer): New default hook.
>         * gcc/toplev.c (compile_file): Call hwasan_finish_file when finished.
>
>
> ###############     Attachment also inlined for ease of reply    ###############
>
>
> diff --git a/config/bootstrap-hwasan.mk b/config/bootstrap-hwasan.mk
> index 4f60bed3fd6e98b47a3a38aea6eba2a7c320da25..91989f4bb1db6ccff564383777757b896645e541 100644
> --- a/config/bootstrap-hwasan.mk
> +++ b/config/bootstrap-hwasan.mk
> @@ -1,7 +1,11 @@
>  # This option enables -fsanitize=hwaddress for stage2 and stage3.
> +# We need to disable random frame tags for bootstrap since the autoconf check
> +# for which direction the stack is growing has UB that a random frame tag
> +# breaks.  Running with a random frame tag gives approx. 50% chance of
> +# bootstrap comparison diff in libiberty/alloca.c.
>
> -STAGE2_CFLAGS += -fsanitize=hwaddress
> -STAGE3_CFLAGS += -fsanitize=hwaddress
> +STAGE2_CFLAGS += -fsanitize=hwaddress --param hwasan-random-frame-tag=0
> +STAGE3_CFLAGS += -fsanitize=hwaddress --param hwasan-random-frame-tag=0
>  POSTSTAGE1_LDFLAGS += -fsanitize=hwaddress -static-libhwasan \
>                       -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/ \
>                       -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/ \
> diff --git a/gcc/asan.h b/gcc/asan.h
> index 114b457ef91c4479d43774bed58c24213196ce12..8d5271e6b575d74da277420798557f3274e966ce 100644
> --- a/gcc/asan.h
> +++ b/gcc/asan.h
> @@ -34,6 +34,22 @@ extern bool asan_expand_mark_ifn (gimple_stmt_iterator *);
>  extern bool asan_expand_poison_ifn (gimple_stmt_iterator *, bool *,
>                                     hash_map<tree, tree> &);
>
> +extern void hwasan_record_frame_init ();
> +extern void hwasan_record_stack_var (rtx, rtx, poly_int64, poly_int64);
> +extern void hwasan_emit_prologue ();
> +extern rtx_insn *hwasan_emit_untag_frame (rtx, rtx);
> +extern rtx hwasan_get_frame_extent ();
> +extern rtx hwasan_frame_base ();
> +extern void hwasan_maybe_emit_frame_base_init (void);
> +extern bool stack_vars_base_reg_p (rtx);
> +extern uint8_t hwasan_current_frame_tag ();
> +extern void hwasan_increment_frame_tag ();
> +extern rtx hwasan_truncate_to_tag_size (rtx, rtx);
> +extern void hwasan_finish_file (void);
> +extern bool hwasan_sanitize_p (void);
> +extern bool hwasan_sanitize_stack_p (void);
> +extern bool hwasan_sanitize_allocas_p (void);
> +
>  extern gimple_stmt_iterator create_cond_insert_point
>       (gimple_stmt_iterator *, bool, bool, bool, basic_block *, basic_block *);
>
> @@ -75,6 +91,26 @@ extern hash_set <tree> *asan_used_labels;
>
>  #define ASAN_USE_AFTER_SCOPE_ATTRIBUTE "use after scope memory"
>
> +/* NOTE: The values below and the hooks under targetm.memtag define an ABI and
> +   are hard-coded to these values in libhwasan, hence they can't be changed
> +   independently here.  */
> +/* How many bits are used to store a tag in a pointer.
> +   The default version uses the entire top byte of a pointer (i.e. 8 bits).  */
> +#define HWASAN_TAG_SIZE targetm.memtag.tag_size ()
> +/* Tag Granule of HWASAN shadow stack.
> +   This is the size in real memory that each byte in the shadow memory refers
> +   to.  I.e. if a variable is X bytes long in memory then its tag in shadow
> +   memory will span X / HWASAN_TAG_GRANULE_SIZE bytes.
> +   Most variables will need to be aligned to this amount since two variables
> +   that are neighbors in memory and share a tag granule would need to share the
> +   same tag (the shared tag granule can only store one tag).  */
> +#define HWASAN_TAG_GRANULE_SIZE targetm.memtag.granule_size ()
> +/* Define the tag for the stack background.
> +   This defines what tag the stack pointer will be and hence what tag all
> +   variables that are not given special tags are (e.g. spilled registers,
> +   and parameters passed on the stack).  */
> +#define HWASAN_STACK_BACKGROUND gen_int_mode (0, QImode)
> +
>  /* Various flags for Asan builtins.  */
>  enum asan_check_flags
>  {
> diff --git a/gcc/asan.c b/gcc/asan.c
> index 0b471afff64ea6a0ffbe0add71333ac688c472c6..d1ede3b62291eba698948e06208c482b6f197be5 100644
> --- a/gcc/asan.c
> +++ b/gcc/asan.c
> @@ -257,6 +257,58 @@ hash_set<tree> *asan_handled_variables = NULL;
>
>  hash_set <tree> *asan_used_labels = NULL;
>
> +/* Global variables for HWASAN stack tagging.  */
> +/* hwasan_frame_tag_offset records the offset from the frame base tag that the
> +   next object should have.  */
> +static uint8_t hwasan_frame_tag_offset = 0;
> +/* hwasan_frame_base_ptr is a pointer with the same address as
> +   `virtual_stack_vars_rtx` for the current frame, and with the frame base tag
> +   stored in it.  N.b. this global RTX does not need to be marked GTY, but is
> +   done so anyway.  The need is not there since all uses are in just one pass
> +   (cfgexpand) and there are no calls to ggc_collect between the uses.  We mark
> +   it GTY(()) anyway to allow the use of the variable later on if needed by
> +   future features.  */
> +static GTY(()) rtx hwasan_frame_base_ptr = NULL_RTX;
> +/* hwasan_frame_base_init_seq is the sequence of RTL insns that will initialize
> +   the hwasan_frame_base_ptr.  When the hwasan_frame_base_ptr is requested, we
> +   generate this sequence but do not emit it.  If the sequence was created it
> +   is emitted once the function body has been expanded.
> +
> +   This delay is because the frame base pointer may be needed anywhere in the
> +   function body, or needed by the expand_used_vars function.  Emitting once in
> +   a known place is simpler than requiring the code that emits the instructions
> +   to know where they should go depending on the first place the hwasan frame
> +   base is needed.  */
> +static GTY(()) rtx_insn *hwasan_frame_base_init_seq = NULL;
> +
> +/* Structure defining the extent of one object on the stack that HWASAN needs
> +   to tag in the corresponding shadow stack space.
> +
> +   The range this object spans on the stack is between `untagged_base +
> +   nearest_offset` and `untagged_base + farthest_offset`.
> +   `tagged_base` is an rtx containing the same value as `untagged_base` but
> +   with a random tag stored in the top byte.  We record both `untagged_base`
> +   and `tagged_base` so that `hwasan_emit_prologue` can use both without having
> +   to emit RTL into the instruction stream to re-calculate one from the other.
> +   (`hwasan_emit_prologue` needs to use both bases since the
> +   __hwasan_tag_memory call it emits uses an untagged value, and it calculates
> +   the tag to store in shadow memory based on the tag_offset plus the tag in
> +   tagged_base).  */
> +struct hwasan_stack_var
> +{
> +  rtx untagged_base;
> +  rtx tagged_base;
> +  poly_int64 nearest_offset;
> +  poly_int64 farthest_offset;
> +  uint8_t tag_offset;
> +};
> +
> +/* Variable recording all stack variables that HWASAN needs to tag.
> +   Does not need to be marked as GTY(()) since every use is in the cfgexpand
> +   pass and ggc_collect is not called in the middle of that pass.  */
> +static vec<hwasan_stack_var> hwasan_tagged_stack_vars;
> +
> +
>  /* Sets shadow offset to value in string VAL.  */
>
>  bool
> @@ -1359,6 +1411,28 @@ asan_redzone_buffer::flush_if_full (void)
>      flush_redzone_payload ();
>  }
>
> +/* Returns whether we are tagging pointers and checking those tags on memory
> +   access.  */
> +bool
> +hwasan_sanitize_p ()
> +{
> +  return sanitize_flags_p (SANITIZE_HWADDRESS);
> +}
> +
> +/* Are we tagging the stack?  */
> +bool
> +hwasan_sanitize_stack_p ()
> +{
> +  return (hwasan_sanitize_p () && param_hwasan_instrument_stack);
> +}
> +
> +/* Are we tagging alloca objects?  */
> +bool
> +hwasan_sanitize_allocas_p (void)
> +{
> +  return (hwasan_sanitize_stack_p () && param_hwasan_instrument_allocas);
> +}
> +
>  /* Insert code to protect stack vars.  The prologue sequence should be emitted
>     directly, epilogue sequence returned.  BASE is the register holding the
>     stack base, against which OFFSETS array offsets are relative to, OFFSETS
> @@ -2908,6 +2982,11 @@ initialize_sanitizer_builtins (void)
>      = build_function_type_list (void_type_node, uint64_type_node,
>                                 ptr_type_node, NULL_TREE);
>
> +  tree BT_FN_VOID_PTR_UINT8_PTRMODE
> +    = build_function_type_list (void_type_node, ptr_type_node,
> +                               unsigned_char_type_node,
> +                               pointer_sized_int_node, NULL_TREE);
> +
>    tree BT_FN_BOOL_VPTR_PTR_IX_INT_INT[5];
>    tree BT_FN_IX_CONST_VPTR_INT[5];
>    tree BT_FN_IX_VPTR_IX_INT[5];
> @@ -2958,6 +3037,8 @@ initialize_sanitizer_builtins (void)
>  #define BT_FN_I16_CONST_VPTR_INT BT_FN_IX_CONST_VPTR_INT[4]
>  #define BT_FN_I16_VPTR_I16_INT BT_FN_IX_VPTR_IX_INT[4]
>  #define BT_FN_VOID_VPTR_I16_INT BT_FN_VOID_VPTR_IX_INT[4]
> +#undef ATTR_NOTHROW_LIST
> +#define ATTR_NOTHROW_LIST ECF_NOTHROW
>  #undef ATTR_NOTHROW_LEAF_LIST
>  #define ATTR_NOTHROW_LEAF_LIST ECF_NOTHROW | ECF_LEAF
>  #undef ATTR_TMPURE_NOTHROW_LEAF_LIST
> @@ -3709,4 +3790,347 @@ make_pass_asan_O0 (gcc::context *ctxt)
>    return new pass_asan_O0 (ctxt);
>  }
>
> +/* For stack tagging:
> +
> +   Return the offset from the frame base tag that the "next" expanded object
> +   should have.  */
> +uint8_t
> +hwasan_current_frame_tag ()
> +{
> +  return hwasan_frame_tag_offset;
> +}
> +
> +/* For stack tagging:
> +
> +   Return the 'base pointer' for this function.  If that base pointer has not
> +   yet been created then we create a register to hold it and record the insns
> +   to initialize the register in `hwasan_frame_base_init_seq` for later
> +   emission.  */
> +rtx
> +hwasan_frame_base ()
> +{
> +  if (! hwasan_frame_base_ptr)
> +    {
> +      start_sequence ();
> +      hwasan_frame_base_ptr
> +       = force_reg (Pmode,
> +                    targetm.memtag.insert_random_tag (virtual_stack_vars_rtx,
> +                                                      NULL_RTX));
> +      hwasan_frame_base_init_seq = get_insns ();
> +      end_sequence ();
> +    }
> +
> +  return hwasan_frame_base_ptr;
> +}
> +
> +/* For stack tagging:
> +
> +   Check whether this RTX is a standard pointer addressing the base of the
> +   stack variables for this frame.  Returns true if the RTX is either
> +   virtual_stack_vars_rtx or hwasan_frame_base_ptr.  */
> +bool
> +stack_vars_base_reg_p (rtx base)
> +{
> +  return base == virtual_stack_vars_rtx || base == hwasan_frame_base_ptr;
> +}
> +
> +/* For stack tagging:
> +
> +   Emit frame base initialisation.
> +   If hwasan_frame_base has been used before here then
> +   hwasan_frame_base_init_seq contains the sequence of instructions to
> +   initialize it.  This must be put just before the hwasan prologue, so we emit
> +   the insns before parm_birth_insn (which will point to the first instruction
> +   of the hwasan prologue if it exists).
> +
> +   We update `parm_birth_insn` to point to the start of this initialisation
> +   since that represents the end of the initialisation done by
> +   expand_function_{start,end} functions and we want to maintain that.  */
> +void
> +hwasan_maybe_emit_frame_base_init ()
> +{
> +  if (! hwasan_frame_base_init_seq)
> +    return;
> +  emit_insn_before (hwasan_frame_base_init_seq, parm_birth_insn);
> +  parm_birth_insn = hwasan_frame_base_init_seq;
> +}
> +
> +/* Record a compile-time constant size stack variable that HWASAN will need to
> +   tag.  This record of the range of a stack variable will be used by
> +   `hwasan_emit_prologue` to emit the RTL at the start of each frame which will
> +   set tags in the shadow memory according to the assigned tag for each object.
> +
> +   The range that the object spans in stack space should be described by the
> +   bounds `untagged_base + nearest_offset` and
> +   `untagged_base + farthest_offset`.
> +   `tagged_base` is the base address which contains the "base frame tag" for
> +   this frame, and from which the value to address this object with will be
> +   calculated.
> +
> +   We record the `untagged_base` since the functions in the hwasan library we
> +   use to tag memory take pointers without a tag.  */
> +void
> +hwasan_record_stack_var (rtx untagged_base, rtx tagged_base,
> +                        poly_int64 nearest_offset, poly_int64 farthest_offset)
> +{
> +  hwasan_stack_var cur_var;
> +  cur_var.untagged_base = untagged_base;
> +  cur_var.tagged_base = tagged_base;
> +  cur_var.nearest_offset = nearest_offset;
> +  cur_var.farthest_offset = farthest_offset;
> +  cur_var.tag_offset = hwasan_current_frame_tag ();
> +
> +  hwasan_tagged_stack_vars.safe_push (cur_var);
> +}
> +
> +/* Return the RTX representing the farthest extent of the statically allocated
> +   stack objects for this frame.  If hwasan_frame_base_ptr has not been
> +   initialized then we are not storing any static variables on the stack in
> +   this frame.  In this case we return NULL_RTX to represent that.
> +
> +   Otherwise simply return virtual_stack_vars_rtx + frame_offset.  */
> +rtx
> +hwasan_get_frame_extent ()
> +{
> +  return (hwasan_frame_base_ptr
> +         ? plus_constant (Pmode, virtual_stack_vars_rtx, frame_offset)
> +         : NULL_RTX);
> +}
> +
> +/* For stack tagging:
> +
> +   Increment the frame tag offset modulo the size a tag can represent.  */
> +void
> +hwasan_increment_frame_tag ()
> +{
> +  uint8_t tag_bits = HWASAN_TAG_SIZE;
> +  gcc_assert (HWASAN_TAG_SIZE
> +             <= sizeof (hwasan_frame_tag_offset) * CHAR_BIT);
> +  hwasan_frame_tag_offset = (hwasan_frame_tag_offset + 1) % (1 << tag_bits);
> +  /* The "background tag" of the stack is zero by definition.
> +     This is the tag that objects like parameters passed on the stack and
> +     spilled registers are given.  It is handy to avoid this tag for objects
> +     whose tags we decide ourselves, partly to ensure that buffer overruns
> +     can't affect these important variables (e.g. saved link register, saved
> +     stack pointer etc) and partly to make debugging easier (everything with a
> +     tag of zero is space allocated automatically by the compiler).
> +
> +     This is not feasible when using random frame tags (the default
> +     configuration for hwasan) since the tag for the given frame is randomly
> +     chosen at runtime.  In order to avoid any tags matching the stack
> +     background we would need to decide tag offsets at runtime instead of
> +     compile time (and pay the resulting performance cost).
> +
> +     When not using random base tags for each frame (i.e. when compiled with
> +     `--param hwasan-random-frame-tag=0`) the base tag for each frame is zero.
> +     This means the tag that each object gets is equal to the
> +     hwasan_frame_tag_offset used in determining it.
> +     When this is the case we *can* ensure no object gets the tag of zero by
> +     simply ensuring no object has the hwasan_frame_tag_offset of zero.
> +
> +     There is the extra complication that we only record the
> +     hwasan_frame_tag_offset here (which is the offset from the tag stored in
> +     the stack pointer).  In the kernel, the tag in the stack pointer is 0xff
> +     rather than zero.  This does not cause problems since tags of 0xff are
> +     never checked in the kernel.  As mentioned at the beginning of this
> +     comment the background tag of the stack is zero by definition, which means
> +     that for the kernel we should skip offsets of both 0 and 1 from the stack
> +     pointer.  Avoiding the offset of 0 ensures we use a tag which will be
> +     checked, avoiding the offset of 1 ensures we use a tag that is not the
> +     same as the background.  */
> +  if (hwasan_frame_tag_offset == 0 && ! param_hwasan_random_frame_tag)
> +    hwasan_frame_tag_offset += 1;
> +  if (hwasan_frame_tag_offset == 1 && ! param_hwasan_random_frame_tag
> +      && sanitize_flags_p (SANITIZE_KERNEL_HWADDRESS))
> +    hwasan_frame_tag_offset += 1;
> +}
> +
> +/* Clear internal state for the next function.
> +   This function is called before variables on the stack get expanded, in
> +   `init_vars_expansion`.  */
> +void
> +hwasan_record_frame_init ()
> +{
> +  delete asan_used_labels;
> +  asan_used_labels = NULL;
> +
> +  /* If this isn't the case then some stack variable was recorded *before*
> +     hwasan_record_frame_init is called, yet *after* the hwasan prologue for
> +     the previous frame was emitted.  Such stack variables would not have
> +     their shadow stack filled in.  */
> +  gcc_assert (hwasan_tagged_stack_vars.is_empty ());
> +  hwasan_frame_base_ptr = NULL_RTX;
> +  hwasan_frame_base_init_seq = NULL;
> +
> +  /* When not using a random frame tag we can avoid the background stack
> +     color which gives the user a little better debug output upon a crash.
> +     Meanwhile, when using a random frame tag it will be nice to avoid adding
> +     tags for the first object since that is unnecessary extra work.
> +     Hence set the initial hwasan_frame_tag_offset to be 0 if using a random
> +     frame tag and 1 otherwise.
> +
> +     As described in hwasan_increment_frame_tag, in the kernel the stack
> +     pointer has the tag 0xff.  That means that to avoid 0xff and 0 (the tag
> +     which the kernel does not check and the background tag respectively) we
> +     start with a tag offset of 2.  */
> +  hwasan_frame_tag_offset = param_hwasan_random_frame_tag
> +    ? 0
> +    : sanitize_flags_p (SANITIZE_KERNEL_HWADDRESS) ? 2 : 1;
> +}
> +
> +/* For stack tagging:
> +   (Emits HWASAN equivalent of what is emitted by
> +   `asan_emit_stack_protection`).
> +
> +   Emits the extra prologue code to set the shadow stack as required for HWASAN
> +   stack instrumentation.
> +
> +   Uses the vector of recorded stack variables hwasan_tagged_stack_vars.  When
> +   this function has completed hwasan_tagged_stack_vars is empty and all
> +   objects it had pointed to are deallocated.  */
> +void
> +hwasan_emit_prologue ()
> +{
> +  /* We need untagged base pointers since libhwasan only accepts untagged
> +    pointers in __hwasan_tag_memory.  We need the tagged base pointer to obtain
> +    the base tag for an offset.  */
> +
> +  if (hwasan_tagged_stack_vars.is_empty ())
> +    return;
> +
> +  poly_int64 bot = 0, top = 0;
> +  for (hwasan_stack_var &cur : hwasan_tagged_stack_vars)
> +    {
> +      poly_int64 nearest = cur.nearest_offset;
> +      poly_int64 farthest = cur.farthest_offset;
> +
> +      if (known_ge (nearest, farthest))
> +       {
> +         top = nearest;
> +         bot = farthest;
> +       }
> +      else
> +       {
> +         /* Given how these values are calculated, one must be known greater
> +            than the other.  */
> +         gcc_assert (known_le (nearest, farthest));
> +         top = farthest;
> +         bot = nearest;
> +       }
> +      poly_int64 size = (top - bot);
> +
> +      /* Assert the edge of each variable is aligned to the HWASAN tag granule
> +        size.  */
> +      gcc_assert (multiple_p (top, HWASAN_TAG_GRANULE_SIZE));
> +      gcc_assert (multiple_p (bot, HWASAN_TAG_GRANULE_SIZE));
> +      gcc_assert (multiple_p (size, HWASAN_TAG_GRANULE_SIZE));
> +
> +      rtx fn = init_one_libfunc ("__hwasan_tag_memory");
> +      rtx base_tag = targetm.memtag.extract_tag (cur.tagged_base, NULL_RTX);
> +      rtx tag = plus_constant (QImode, base_tag, cur.tag_offset);
> +      tag = hwasan_truncate_to_tag_size (tag, NULL_RTX);
> +
> +      rtx bottom = convert_memory_address (ptr_mode,
> +                                          plus_constant (Pmode,
> +                                                         cur.untagged_base,
> +                                                         bot));
> +      emit_library_call (fn, LCT_NORMAL, VOIDmode,
> +                        bottom, ptr_mode,
> +                        tag, QImode,
> +                        gen_int_mode (size, ptr_mode), ptr_mode);
> +    }
> +  /* Clear the stack vars, we've emitted the prologue for them all now.  */
> +  hwasan_tagged_stack_vars.truncate (0);
> +}
> +
> +/* For stack tagging:
> +
> +   Return RTL insns to clear the tags between DYNAMIC and VARS pointers
> +   into the stack.  These instructions should be emitted at the end of
> +   every function.
> +
> +   If `dynamic` is NULL_RTX then no insns are returned.  */
> +rtx_insn *
> +hwasan_emit_untag_frame (rtx dynamic, rtx vars)
> +{
> +  if (! dynamic)
> +    return NULL;
> +
> +  start_sequence ();
> +
> +  dynamic = convert_memory_address (ptr_mode, dynamic);
> +  vars = convert_memory_address (ptr_mode, vars);
> +
> +  rtx top_rtx;
> +  rtx bot_rtx;
> +  if (FRAME_GROWS_DOWNWARD)
> +    {
> +      top_rtx = vars;
> +      bot_rtx = dynamic;
> +    }
> +  else
> +    {
> +      top_rtx = dynamic;
> +      bot_rtx = vars;
> +    }
> +
> +  rtx size_rtx = expand_simple_binop (ptr_mode, MINUS, top_rtx, bot_rtx,
> +                                     NULL_RTX, /* unsignedp = */0,
> +                                     OPTAB_DIRECT);
> +
> +  rtx fn = init_one_libfunc ("__hwasan_tag_memory");
> +  emit_library_call (fn, LCT_NORMAL, VOIDmode,
> +                    bot_rtx, ptr_mode,
> +                    HWASAN_STACK_BACKGROUND, QImode,
> +                    size_rtx, ptr_mode);
> +
> +  do_pending_stack_adjust ();
> +  rtx_insn *insns = get_insns ();
> +  end_sequence ();
> +  return insns;
> +}
> +
> +/* Needs to be GTY(()), because cgraph_build_static_cdtor may
> +   invoke ggc_collect.  */
> +static GTY(()) tree hwasan_ctor_statements;
> +
> +/* Insert module initialization into this TU.  This initialization calls the
> +   initialization code for libhwasan.  */
> +void
> +hwasan_finish_file (void)
> +{
> +  /* Do not emit constructor initialization for the kernel.
> +     (the kernel has its own initialization already).  */
> +  if (flag_sanitize & SANITIZE_KERNEL_HWADDRESS)
> +    return;
> +
> +  /* Avoid instrumenting code in the hwasan constructors/destructors.  */
> +  flag_sanitize &= ~SANITIZE_HWADDRESS;
> +  int priority = MAX_RESERVED_INIT_PRIORITY - 1;
> +  tree fn = builtin_decl_implicit (BUILT_IN_HWASAN_INIT);
> +  append_to_statement_list (build_call_expr (fn, 0), &hwasan_ctor_statements);
> +  cgraph_build_static_cdtor ('I', hwasan_ctor_statements, priority);
> +  flag_sanitize |= SANITIZE_HWADDRESS;
> +}
> +
> +/* For stack tagging:
> +
> +   Truncate `tag` to the number of bits that a tag uses (i.e. to
> +   HWASAN_TAG_SIZE).  Store the result in `target` if it's convenient.  */
> +rtx
> +hwasan_truncate_to_tag_size (rtx tag, rtx target)
> +{
> +  gcc_assert (GET_MODE (tag) == QImode);
> +  if (HWASAN_TAG_SIZE != GET_MODE_PRECISION (QImode))
> +    {
> +      gcc_assert (GET_MODE_PRECISION (QImode) > HWASAN_TAG_SIZE);
> +      rtx mask = gen_int_mode ((HOST_WIDE_INT_1U << HWASAN_TAG_SIZE) - 1,
> +                              QImode);
> +      tag = expand_simple_binop (QImode, AND, tag, mask, target,
> +                                /* unsignedp = */1, OPTAB_WIDEN);
> +      gcc_assert (tag);
> +    }
> +  return tag;
> +}
> +
>  #include "gt-asan.h"
> diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
> index 4a82ee421bef42154ccd88e52f7a19f48b340c73..1ad6657da45cc4976532e1b8bc233f67d8da9ccf 100644
> --- a/gcc/builtin-types.def
> +++ b/gcc/builtin-types.def
> @@ -639,6 +639,8 @@ DEF_FUNCTION_TYPE_3 (BT_FN_PTR_PTR_CONST_SIZE_BOOL,
>                      BT_PTR, BT_PTR, BT_CONST_SIZE, BT_BOOL)
>  DEF_FUNCTION_TYPE_3 (BT_FN_PTR_SIZE_SIZE_PTRMODE,
>                      BT_PTR, BT_SIZE, BT_SIZE, BT_PTRMODE)
> +DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_UINT8_PTRMODE, BT_VOID, BT_PTR, BT_UINT8,
> +                    BT_PTRMODE)
>
>  DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR,
>                      BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR)
> diff --git a/gcc/builtins.def b/gcc/builtins.def
> index b4494c712a1751fbb37378f38cc1411d11a37331..97bb5d0b0aee7fa9ee4c82e2d80eae866fc23829 100644
> --- a/gcc/builtins.def
> +++ b/gcc/builtins.def
> @@ -245,6 +245,7 @@ along with GCC; see the file COPYING3.  If not see
>    DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,    \
>                true, true, true, ATTRS, true, \
>               (flag_sanitize & (SANITIZE_ADDRESS | SANITIZE_THREAD \
> +                               | SANITIZE_HWADDRESS \
>                                 | SANITIZE_UNDEFINED \
>                                 | SANITIZE_UNDEFINED_NONDEFAULT) \
>                || flag_sanitize_coverage))
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index 1df6f4bc55a39230c98e58af6c2d765652db8324..231c2ee32362fc3967b1cd7b70bd330ce49648d3 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -376,15 +376,18 @@ align_local_variable (tree decl, bool really_expand)
>         align = GET_MODE_ALIGNMENT (mode);
>      }
>    else
> -    {
> -      align = LOCAL_DECL_ALIGNMENT (decl);
> -      /* Don't change DECL_ALIGN when called from estimated_stack_frame_size.
> -        That is done before IPA and could bump alignment based on host
> -        backend even for offloaded code which wants different
> -        LOCAL_DECL_ALIGNMENT.  */
> -      if (really_expand)
> -       SET_DECL_ALIGN (decl, align);
> -    }
> +    align = LOCAL_DECL_ALIGNMENT (decl);
> +
> +  if (hwasan_sanitize_stack_p ())
> +    align = MAX (align, (unsigned) HWASAN_TAG_GRANULE_SIZE * BITS_PER_UNIT);
> +
> +  if (TREE_CODE (decl) != SSA_NAME && really_expand)
> +    /* Don't change DECL_ALIGN when called from estimated_stack_frame_size.
> +       That is done before IPA and could bump alignment based on host
> +       backend even for offloaded code which wants different
> +       LOCAL_DECL_ALIGNMENT.  */
> +    SET_DECL_ALIGN (decl, align);
> +
>    return align / BITS_PER_UNIT;
>  }
>
> @@ -428,6 +431,14 @@ alloc_stack_frame_space (poly_int64 size, unsigned HOST_WIDE_INT align)
>    return offset;
>  }
>
> +/* Ensure that the stack is aligned to ALIGN bytes.
> +   Return the new frame offset.  */
> +static poly_int64
> +align_frame_offset (unsigned HOST_WIDE_INT align)
> +{
> +  return alloc_stack_frame_space (0, align);
> +}
> +
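
To convince myself about the alignment, here is a small model of a
downward-growing frame where both edges of an object are rounded to the
granule.  This is only a sketch under my own assumptions: `GRANULE`,
`align_down` and `alloc_tagged` are hypothetical names, and the real
`alloc_stack_frame_space` does more than this.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed granule size; matches the default of the granule_size hook.  */
enum { GRANULE = 16 };

/* Round OFFSET toward negative infinity to a multiple of ALIGN, which
   must be a power of two.  Models a frame that grows downward.  */
int64_t
align_down (int64_t offset, int64_t align)
{
  assert (align > 0 && (align & (align - 1)) == 0);
  return offset & -align;
}

/* Allocate SIZE bytes below *FRAME_OFFSET with both edges aligned to
   GRANULE.  Returns the lowest offset of the object; *TOP receives the
   aligned edge the allocation started from.  */
int64_t
alloc_tagged (int64_t *frame_offset, int64_t size, int64_t *top)
{
  /* Align before allocating so no earlier data shares our last granule.  */
  *frame_offset = align_down (*frame_offset, GRANULE);
  *top = *frame_offset;
  /* Allocate, then align the start so nothing shares our first granule.  */
  *frame_offset = align_down (*frame_offset - size, GRANULE);
  return *frame_offset;
}
```

Starting from a frame offset of -5 and allocating 20 bytes, this gives
top = -16 and bot = -48.  Both are multiples of 16, so their difference
is as well, which is why the third assert I quoted at the top looks
implied by the first two.
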
>  /* Accumulate DECL into STACK_VARS.  */
>
>  static void
> @@ -1004,7 +1015,12 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>    /* If this fails, we've overflowed the stack frame.  Error nicely?  */
>    gcc_assert (known_eq (offset, trunc_int_for_mode (offset, Pmode)));
>
> -  x = plus_constant (Pmode, base, offset);
> +  if (hwasan_sanitize_stack_p ())
> +    x = targetm.memtag.add_tag (base, offset,
> +                               hwasan_current_frame_tag ());
> +  else
> +    x = plus_constant (Pmode, base, offset);
> +
>    x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
>                    ? TYPE_MODE (TREE_TYPE (decl))
>                    : DECL_MODE (decl), x);
> @@ -1013,7 +1029,7 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
>       If it is we generate stack slots only accidentally so it isn't as
>       important, we'll simply set the alignment directly on the MEM.  */
>
> -  if (base == virtual_stack_vars_rtx)
> +  if (stack_vars_base_reg_p (base))
>      offset -= frame_phase;
>    align = known_alignment (offset);
>    align *= BITS_PER_UNIT;
> @@ -1056,13 +1072,13 @@ public:
>  /* A subroutine of expand_used_vars.  Give each partition representative
>     a unique location within the stack frame.  Update each partition member
>     with that location.  */
> -
>  static void
>  expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
>  {
>    size_t si, i, j, n = stack_vars_num;
>    poly_uint64 large_size = 0, large_alloc = 0;
>    rtx large_base = NULL;
> +  rtx large_untagged_base = NULL;
>    unsigned large_align = 0;
>    bool large_allocation_done = false;
>    tree decl;
> @@ -1113,7 +1129,7 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
>      {
>        rtx base;
>        unsigned base_align, alignb;
> -      poly_int64 offset;
> +      poly_int64 offset = 0;
>
>        i = stack_vars_sorted[si];
>
> @@ -1134,10 +1150,33 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
>        if (pred && !pred (i))
>         continue;
>
> +      base = (hwasan_sanitize_stack_p ()
> +             ? hwasan_frame_base ()
> +             : virtual_stack_vars_rtx);
>        alignb = stack_vars[i].alignb;
>        if (alignb * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
>         {
> -         base = virtual_stack_vars_rtx;
> +         poly_int64 hwasan_orig_offset;
> +         if (hwasan_sanitize_stack_p ())
> +           {
> +             /* There must be no tag granule "shared" between different
> +                objects.  This means that no HWASAN_TAG_GRANULE_SIZE byte
> +                chunk can have more than one object in it.
> +
> +                We ensure this by forcing the end of the last bit of data to
> +                be aligned to HWASAN_TAG_GRANULE_SIZE bytes here, and setting
> +                the start of each variable to be aligned to
> +                HWASAN_TAG_GRANULE_SIZE bytes in `align_local_variable`.
> +
> +                We can't align just one of the start or end, since there are
> +                untagged things stored on the stack which we do not align to
> +                HWASAN_TAG_GRANULE_SIZE bytes.  If we only aligned the start
> +                or the end of tagged objects then untagged objects could end
> +                up sharing the first granule of a tagged object or sharing the
> +                last granule of a tagged object respectively.  */
> +             hwasan_orig_offset = align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
> +             gcc_assert (stack_vars[i].alignb >= HWASAN_TAG_GRANULE_SIZE);
> +           }
>           /* ASAN description strings don't yet have a syntax for expressing
>              polynomial offsets.  */
>           HOST_WIDE_INT prev_offset;
> @@ -1148,7 +1187,7 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
>             {
>               if (data->asan_vec.is_empty ())
>                 {
> -                 alloc_stack_frame_space (0, ASAN_RED_ZONE_SIZE);
> +                 align_frame_offset (ASAN_RED_ZONE_SIZE);
>                   prev_offset = frame_offset.to_constant ();
>                 }
>               prev_offset = align_base (prev_offset,
> @@ -1216,6 +1255,24 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
>             {
>               offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
>               base_align = crtl->max_used_stack_slot_alignment;
> +
> +             if (hwasan_sanitize_stack_p ())
> +               {
> +                 /* Align again since the point of this alignment is to handle
> +                    the "end" of the object (i.e. smallest address after the
> +                    stack object).  For FRAME_GROWS_DOWNWARD that requires
> +                    aligning the stack before allocating, but for a frame that
> +                    grows upwards that requires aligning the stack after
> +                    allocation.
> +
> +                    Use `frame_offset` to record the offset value rather than
> +                    offset since the `frame_offset` describes the extent
> +                    allocated for this particular variable while `offset`
> +                    describes the address that this variable starts at.  */
> +                 align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
> +                 hwasan_record_stack_var (virtual_stack_vars_rtx, base,
> +                                          hwasan_orig_offset, frame_offset);
> +               }
>             }
>         }
>        else
> @@ -1236,14 +1293,33 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
>               loffset = alloc_stack_frame_space
>                 (rtx_to_poly_int64 (large_allocsize),
>                  PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
> -             large_base = get_dynamic_stack_base (loffset, large_align);
> +             large_base = get_dynamic_stack_base (loffset, large_align, base);
>               large_allocation_done = true;
>             }
> -         gcc_assert (large_base != NULL);
>
> +         gcc_assert (large_base != NULL);
>           large_alloc = aligned_upper_bound (large_alloc, alignb);
>           offset = large_alloc;
>           large_alloc += stack_vars[i].size;
> +         if (hwasan_sanitize_stack_p ())
> +           {
> +             /* An object with a large alignment requirement means that the
> +                alignment requirement is greater than the required alignment
> +                for tags.  */
> +             if (!large_untagged_base)
> +               large_untagged_base
> +                 = targetm.memtag.untagged_pointer (large_base, NULL_RTX);
> +             /* Ensure the end of the variable is also aligned correctly.  */
> +             poly_int64 align_again
> +               = aligned_upper_bound (large_alloc, HWASAN_TAG_GRANULE_SIZE);
> +             /* For large allocations we always allocate a chunk of space
> +                (which is addressed by large_untagged_base/large_base) and
> +                then use positive offsets from that.  Hence the farthest
> +                offset is `align_again` and the nearest offset from the base
> +                is `offset`.  */
> +             hwasan_record_stack_var (large_untagged_base, large_base,
> +                                      offset, align_again);
> +           }
>
>           base = large_base;
>           base_align = large_align;
> @@ -1254,9 +1330,10 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
>        for (j = i; j != EOC; j = stack_vars[j].next)
>         {
>           expand_one_stack_var_at (stack_vars[j].decl,
> -                                  base, base_align,
> -                                  offset);
> +                                  base, base_align, offset);
>         }
> +      if (hwasan_sanitize_stack_p ())
> +       hwasan_increment_frame_tag ();
>      }
>
>    gcc_assert (known_eq (large_alloc, large_size));
> @@ -1347,10 +1424,37 @@ expand_one_stack_var_1 (tree var)
>    /* We handle highly aligned variables in expand_stack_vars.  */
>    gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
>
> -  offset = alloc_stack_frame_space (size, byte_align);
> +  rtx base;
> +  if (hwasan_sanitize_stack_p ())
> +    {
> +      /* Allocate zero bytes to align the stack.  */
> +      poly_int64 hwasan_orig_offset
> +       = align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
> +      offset = alloc_stack_frame_space (size, byte_align);
> +      align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
> +      base = hwasan_frame_base ();
> +      /* Use `frame_offset` to automatically account for machines where the
> +        frame grows upwards.
> +
> +        `offset` will always point to the "start" of the stack object, which
> +        will be the smallest address, for ! FRAME_GROWS_DOWNWARD this is *not*
> +        the "furthest" offset from the base delimiting the current stack
> +        object.  `frame_offset` will always delimit the extent of the frame.
> +        */
> +      hwasan_record_stack_var (virtual_stack_vars_rtx, base,
> +                              hwasan_orig_offset, frame_offset);
> +    }
> +  else
> +    {
> +      offset = alloc_stack_frame_space (size, byte_align);
> +      base = virtual_stack_vars_rtx;
> +    }
>
> -  expand_one_stack_var_at (var, virtual_stack_vars_rtx,
> +  expand_one_stack_var_at (var, base,
>                            crtl->max_used_stack_slot_alignment, offset);
> +
> +  if (hwasan_sanitize_stack_p ())
> +    hwasan_increment_frame_tag ();
>  }
>
>  /* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
> @@ -1950,6 +2054,8 @@ init_vars_expansion (void)
>    /* Initialize local stack smashing state.  */
>    has_protected_decls = false;
>    has_short_buffer = false;
> +  if (hwasan_sanitize_stack_p ())
> +    hwasan_record_frame_init ();
>  }
>
>  /* Free up stack variable graph data.  */
> @@ -2277,10 +2383,26 @@ expand_used_vars (void)
>        expand_stack_vars (NULL, &data);
>      }
>
> +  if (hwasan_sanitize_stack_p ())
> +    hwasan_emit_prologue ();
>    if (asan_sanitize_allocas_p () && cfun->calls_alloca)
>      var_end_seq = asan_emit_allocas_unpoison (virtual_stack_dynamic_rtx,
>                                               virtual_stack_vars_rtx,
>                                               var_end_seq);
> +  else if (hwasan_sanitize_allocas_p () && cfun->calls_alloca)
> +    /* When using out-of-line instrumentation we only want to emit one function
> +       call for clearing the tags in a region of shadow stack.  When there are
> +       alloca calls in this frame we want to emit a call using the
> +       virtual_stack_dynamic_rtx, but when not we use the hwasan_frame_extent
> +       rtx we created in expand_stack_vars.  */
> +    var_end_seq = hwasan_emit_untag_frame (virtual_stack_dynamic_rtx,
> +                                          virtual_stack_vars_rtx);
> +  else if (hwasan_sanitize_stack_p ())
> +    /* If no variables were stored on the stack, `hwasan_get_frame_extent`
> +       will return NULL_RTX and hence `hwasan_emit_untag_frame` will return
> +       NULL (i.e. an empty sequence).  */
> +    var_end_seq = hwasan_emit_untag_frame (hwasan_get_frame_extent (),
> +                                          virtual_stack_vars_rtx);
>
>    fini_vars_expansion ();
>
> @@ -6641,6 +6763,9 @@ pass_expand::execute (function *fun)
>        emit_insn_after (var_ret_seq, after);
>      }
>
> +  if (hwasan_sanitize_stack_p ())
> +    hwasan_maybe_emit_frame_base_init ();
> +
>    /* Zap the tree EH table.  */
>    set_eh_throw_stmt_table (fun, NULL);
>
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 298fe4b295e2f81d679786f21f499183bc07078f..f06d5e8911241d3fa0f2c7a101a3a2468defd227 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -12230,3 +12230,60 @@ work.
>  At preset, this feature does not support address spaces.  It also requires
>  @code{Pmode} to be the same as @code{ptr_mode}.
>  @end deftypefn
> +
> +@deftypefn {Target Hook} uint8_t TARGET_MEMTAG_TAG_SIZE ()
> +Return the size of a tag (in bits) for this platform.
> +
> +The default returns 8.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} uint8_t TARGET_MEMTAG_GRANULE_SIZE ()
> +Return the size in real memory that each byte in shadow memory refers to.
> +I.e. if a variable is @var{X} bytes long in memory, then this hook should
> +return the value @var{Y} such that the tag in shadow memory spans
> +@var{X}/@var{Y} bytes.
> +
> +Most variables will need to be aligned to this amount since two variables
> +that are neighbors in memory and share a tag granule would need to share
> +the same tag.
> +
> +The default returns 16.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} rtx TARGET_MEMTAG_INSERT_RANDOM_TAG (rtx @var{untagged}, rtx @var{target})
> +Return an RTX representing the value of @var{untagged} but with a
> +(possibly) random tag in it.
> +Put that value into @var{target} if it is convenient to do so.
> +This function is used to generate a tagged base for the current stack frame.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} rtx TARGET_MEMTAG_ADD_TAG (rtx @var{base}, poly_int64 @var{addr_offset}, uint8_t @var{tag_offset})
> +Return an RTX that represents the result of adding @var{addr_offset} to
> +the address in pointer @var{base} and @var{tag_offset} to the tag in pointer
> +@var{base}.
> +The resulting RTX must either be a valid memory address or be able to get
> +put into an operand with @code{force_operand}.
> +
> +Unlike other memtag hooks, this must return an expression and not emit any
> +RTL.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} rtx TARGET_MEMTAG_SET_TAG (rtx @var{untagged_base}, rtx @var{tag}, rtx @var{target})
> +Return an RTX representing @var{untagged_base} but with the tag @var{tag}.
> +Try and store this in @var{target} if convenient.
> +@var{untagged_base} is required to have a zero tag when this hook is called.
> +The default of this hook is to set the top byte of @var{untagged_base} to
> +@var{tag}.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} rtx TARGET_MEMTAG_EXTRACT_TAG (rtx @var{tagged_pointer}, rtx @var{target})
> +Return an RTX representing the tag stored in @var{tagged_pointer}.
> +Store the result in @var{target} if it is convenient.
> +The default represents the top byte of the original pointer.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} rtx TARGET_MEMTAG_UNTAGGED_POINTER (rtx @var{tagged_pointer}, rtx @var{target})
> +Return an RTX representing @var{tagged_pointer} with its tag set to zero.
> +Store the result in @var{target} if convenient.
> +The default clears the top byte of the original pointer.
> +@end deftypefn
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index 8fbd36e2bf31e098f7827ce331fd7059c8a747bc..b08923c8f28455fe77e061625e78ed1bf538e792 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -8186,3 +8186,17 @@ maintainer is familiar with.
>  @hook TARGET_RUN_TARGET_SELFTESTS
>
>  @hook TARGET_MEMTAG_CAN_TAG_ADDRESSES
> +
> +@hook TARGET_MEMTAG_TAG_SIZE
> +
> +@hook TARGET_MEMTAG_GRANULE_SIZE
> +
> +@hook TARGET_MEMTAG_INSERT_RANDOM_TAG
> +
> +@hook TARGET_MEMTAG_ADD_TAG
> +
> +@hook TARGET_MEMTAG_SET_TAG
> +
> +@hook TARGET_MEMTAG_EXTRACT_TAG
> +
> +@hook TARGET_MEMTAG_UNTAGGED_POINTER
> diff --git a/gcc/explow.h b/gcc/explow.h
> index 0df8c62b82a8bf1d8d6baf0b6fb658e66361a407..581831cb19fdf9e8fd969bb30139e1358279a34d 100644
> --- a/gcc/explow.h
> +++ b/gcc/explow.h
> @@ -106,7 +106,7 @@ extern rtx allocate_dynamic_stack_space (rtx, unsigned, unsigned,
>  extern void get_dynamic_stack_size (rtx *, unsigned, unsigned, HOST_WIDE_INT *);
>
>  /* Returns the address of the dynamic stack space without allocating it.  */
> -extern rtx get_dynamic_stack_base (poly_int64, unsigned);
> +extern rtx get_dynamic_stack_base (poly_int64, unsigned, rtx);
>
>  /* Return an rtx doing runtime alignment to REQUIRED_ALIGN on TARGET.  */
>  extern rtx align_dynamic_address (rtx, unsigned);
> diff --git a/gcc/explow.c b/gcc/explow.c
> index 0fbc6d25b816457a3d13ed45d16b5dd0513cfacd..41c3f6ace49c0e55c080e10b917842b1b21d49eb 100644
> --- a/gcc/explow.c
> +++ b/gcc/explow.c
> @@ -1583,10 +1583,14 @@ allocate_dynamic_stack_space (rtx size, unsigned size_align,
>     OFFSET is the offset of the area into the virtual stack vars area.
>
>     REQUIRED_ALIGN is the alignment (in bits) required for the region
> -   of memory.  */
> +   of memory.
> +
> +   BASE is the rtx of the base of this virtual stack vars area.
> +   The only time this is not `virtual_stack_vars_rtx` is when tagging pointers
> +   on the stack.  */
>
>  rtx
> -get_dynamic_stack_base (poly_int64 offset, unsigned required_align)
> +get_dynamic_stack_base (poly_int64 offset, unsigned required_align, rtx base)
>  {
>    rtx target;
>
> @@ -1594,7 +1598,7 @@ get_dynamic_stack_base (poly_int64 offset, unsigned required_align)
>      crtl->preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
>
>    target = gen_reg_rtx (Pmode);
> -  emit_move_insn (target, virtual_stack_vars_rtx);
> +  emit_move_insn (target, base);
>    target = expand_binop (Pmode, add_optab, target,
>                          gen_int_mode (offset, Pmode),
>                          NULL_RTX, 1, OPTAB_LIB_WIDEN);
> diff --git a/gcc/sanitizer.def b/gcc/sanitizer.def
> index a32715ddb92e69b7ca7be28a8f17a369b891bd76..4f854fb994229fd4ed91d3b5cff7c7acff9a55bc 100644
> --- a/gcc/sanitizer.def
> +++ b/gcc/sanitizer.def
> @@ -180,6 +180,12 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_POINTER_COMPARE, "__sanitizer_ptr_cmp",
>  DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_POINTER_SUBTRACT, "__sanitizer_ptr_sub",
>                       BT_FN_VOID_PTR_PTRMODE, ATTR_NOTHROW_LEAF_LIST)
>
> +/* Hardware Address Sanitizer.  */
> +DEF_SANITIZER_BUILTIN(BUILT_IN_HWASAN_INIT, "__hwasan_init",
> +                     BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_HWASAN_TAG_MEM, "__hwasan_tag_memory",
> +                     BT_FN_VOID_PTR_UINT8_PTRMODE, ATTR_NOTHROW_LIST)
> +
>  /* Thread Sanitizer */
>  DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_INIT, "__tsan_init",
>                       BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
> diff --git a/gcc/target.def b/gcc/target.def
> index 25f0ae228210f926077020082f129fb2e599f062..44807438431488a5a7aa8f8125d256869e152b68 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -6874,6 +6874,71 @@ At preset, this feature does not support address spaces.  It also requires\n\
>  @code{Pmode} to be the same as @code{ptr_mode}.",
>   bool, (), default_memtag_can_tag_addresses)
>
> +DEFHOOK
> +(tag_size,
> + "Return the size of a tag (in bits) for this platform.\n\
> +\n\
> +The default returns 8.",
> +  uint8_t, (), default_memtag_tag_size)
> +
> +DEFHOOK
> +(granule_size,
> + "Return the size in real memory that each byte in shadow memory refers to.\n\
> +I.e. if a variable is @var{X} bytes long in memory, then this hook should\n\
> +return the value @var{Y} such that the tag in shadow memory spans\n\
> +@var{X}/@var{Y} bytes.\n\
> +\n\
> +Most variables will need to be aligned to this amount since two variables\n\
> +that are neighbors in memory and share a tag granule would need to share\n\
> +the same tag.\n\
> +\n\
> +The default returns 16.",
> +  uint8_t, (), default_memtag_granule_size)
> +
> +DEFHOOK
> +(insert_random_tag,
> + "Return an RTX representing the value of @var{untagged} but with a\n\
> +(possibly) random tag in it.\n\
> +Put that value into @var{target} if it is convenient to do so.\n\
> +This function is used to generate a tagged base for the current stack frame.",
> +  rtx, (rtx untagged, rtx target), default_memtag_insert_random_tag)
> +
> +DEFHOOK
> +(add_tag,
> + "Return an RTX that represents the result of adding @var{addr_offset} to\n\
> +the address in pointer @var{base} and @var{tag_offset} to the tag in pointer\n\
> +@var{base}.\n\
> +The resulting RTX must either be a valid memory address or be able to get\n\
> +put into an operand with @code{force_operand}.\n\
> +\n\
> +Unlike other memtag hooks, this must return an expression and not emit any\n\
> +RTL.",
> +  rtx, (rtx base, poly_int64 addr_offset, uint8_t tag_offset),
> +  default_memtag_add_tag)
> +
> +DEFHOOK
> +(set_tag,
> + "Return an RTX representing @var{untagged_base} but with the tag @var{tag}.\n\
> +Try and store this in @var{target} if convenient.\n\
> +@var{untagged_base} is required to have a zero tag when this hook is called.\n\
> +The default of this hook is to set the top byte of @var{untagged_base} to\n\
> +@var{tag}.",
> +  rtx, (rtx untagged_base, rtx tag, rtx target), default_memtag_set_tag)
> +
> +DEFHOOK
> +(extract_tag,
> + "Return an RTX representing the tag stored in @var{tagged_pointer}.\n\
> +Store the result in @var{target} if it is convenient.\n\
> +The default represents the top byte of the original pointer.",
> +  rtx, (rtx tagged_pointer, rtx target), default_memtag_extract_tag)
> +
> +DEFHOOK
> +(untagged_pointer,
> + "Return an RTX representing @var{tagged_pointer} with its tag set to zero.\n\
> +Store the result in @var{target} if convenient.\n\
> +The default clears the top byte of the original pointer.",
> +  rtx, (rtx tagged_pointer, rtx target), default_memtag_untagged_pointer)
> +
>  HOOK_VECTOR_END (memtag)
>  #undef HOOK_PREFIX
>  #define HOOK_PREFIX "TARGET_"
> diff --git a/gcc/targhooks.h b/gcc/targhooks.h
> index 0065c686978d7120978430013c73b1055aaf95c7..68e8688a32f18481ee61f06879aacff20163105b 100644
> --- a/gcc/targhooks.h
> +++ b/gcc/targhooks.h
> @@ -287,4 +287,12 @@ extern bool speculation_safe_value_not_needed (bool);
>  extern rtx default_speculation_safe_value (machine_mode, rtx, rtx, rtx);
>
>  extern bool default_memtag_can_tag_addresses ();
> +extern uint8_t default_memtag_tag_size ();
> +extern uint8_t default_memtag_granule_size ();
> +extern rtx default_memtag_insert_random_tag (rtx, rtx);
> +extern rtx default_memtag_add_tag (rtx, poly_int64, uint8_t);
> +extern rtx default_memtag_set_tag (rtx, rtx, rtx);
> +extern rtx default_memtag_extract_tag (rtx, rtx);
> +extern rtx default_memtag_untagged_pointer (rtx, rtx);
> +
>  #endif /* GCC_TARGHOOKS_H */
> diff --git a/gcc/targhooks.c b/gcc/targhooks.c
> index 46cb536041d396c32fd08042581d6d5cd5ad0395..e634df3f6c6837e422246a7736c0de4471ce1e77 100644
> --- a/gcc/targhooks.c
> +++ b/gcc/targhooks.c
> @@ -73,6 +73,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "varasm.h"
>  #include "flags.h"
>  #include "explow.h"
> +#include "expmed.h"
>  #include "calls.h"
>  #include "expr.h"
>  #include "output.h"
> @@ -86,6 +87,9 @@ along with GCC; see the file COPYING3.  If not see
>  #include "langhooks.h"
>  #include "sbitmap.h"
>  #include "function-abi.h"
> +#include "attribs.h"
> +#include "asan.h"
> +#include "emit-rtl.h"
>
>  bool
>  default_legitimate_address_p (machine_mode mode ATTRIBUTE_UNUSED,
> @@ -2415,10 +2419,115 @@ default_speculation_safe_value (machine_mode mode ATTRIBUTE_UNUSED,
>    return result;
>  }
>
> +/* How many bits to shift in order to access the tag bits.
> +   The default is to store the tag in the top 8 bits of a 64 bit pointer, hence
> +   shifting 56 bits will leave just the tag.  */
> +#define HWASAN_SHIFT (GET_MODE_PRECISION (Pmode) - 8)
> +#define HWASAN_SHIFT_RTX GEN_INT (HWASAN_SHIFT)
> +
>  bool
>  default_memtag_can_tag_addresses ()
>  {
>    return false;
>  }
>
> +uint8_t
> +default_memtag_tag_size ()
> +{
> +  return 8;
> +}
> +
> +uint8_t
> +default_memtag_granule_size ()
> +{
> +  return 16;
> +}
> +
> +/* The default implementation of TARGET_MEMTAG_INSERT_RANDOM_TAG.  */
> +rtx
> +default_memtag_insert_random_tag (rtx untagged, rtx target)
> +{
> +  gcc_assert (param_hwasan_instrument_stack);
> +  if (param_hwasan_random_frame_tag)
> +    {
> +      rtx fn = init_one_libfunc ("__hwasan_generate_tag");
> +      rtx new_tag = emit_library_call_value (fn, NULL_RTX, LCT_NORMAL, QImode);
> +      return targetm.memtag.set_tag (untagged, new_tag, target);
> +    }
> +  else
> +    {
> +      /* NOTE: The kernel API does not have __hwasan_generate_tag exposed.
> +        In the future we may add the option to emit random tags with inline
> +        instrumentation instead of function calls.  This would be the same
> +        between the kernel and userland.  */
> +      return untagged;
> +    }
> +}
> +
> +/* The default implementation of TARGET_MEMTAG_ADD_TAG.  */
> +rtx
> +default_memtag_add_tag (rtx base, poly_int64 offset, uint8_t tag_offset)
> +{
> +  /* Need to look into what the most efficient code sequence is.
> +     This is a code sequence that would be emitted *many* times, so we
> +     want it as small as possible.
> +
> +     There are two places where tag overflow is a question:
> +       - Tagging the shadow stack.
> +         (both tagging and untagging).
> +       - Tagging addressable pointers.
> +
> +     We need to ensure both behaviors are the same (i.e. that the tag that
> +     ends up in a pointer after "overflowing" the tag bits with a tag addition
> +     is the same that ends up in the shadow space).
> +
> +     The aim is that the behavior of tag addition should follow modulo
> +     wrapping in both instances.
> +
> +     The libhwasan code doesn't have any path that increments a pointer's tag,
> +     which means it has no opinion on what happens when a tag increment
> +     overflows (and hence we can choose our own behavior).  */
> +
> +  offset += ((uint64_t)tag_offset << HWASAN_SHIFT);
> +  return plus_constant (Pmode, base, offset);
> +}
> +
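
The modulo wrapping described in the comment seems to fall out of plain
unsigned addition when the tag sits in the top byte.  A sketch on
`uint64_t` values rather than RTL, assuming a 64-bit pointer;
`TAG_SHIFT` and `add_tag` are names I picked for illustration:

```c
#include <stdint.h>

/* Assumes a 64-bit pointer with the tag stored in the top byte.  */
enum { TAG_SHIFT = 56 };

/* Add ADDR_OFFSET to the address bits and TAG_OFFSET to the tag bits of
   BASE with one unsigned addition.  Any carry out of bit 63 is dropped,
   so the tag wraps modulo 256.  */
uint64_t
add_tag (uint64_t base, int64_t addr_offset, uint8_t tag_offset)
{
  uint64_t combined
    = (uint64_t) addr_offset + ((uint64_t) tag_offset << TAG_SHIFT);
  return base + combined;
}
```

For example, a base with tag 0xFF and a tag offset of 2 ends up with
tag 0x01.  One caveat of this sketch: it relies on the address part
staying inside the low 56 bits, since a carry or borrow across bit 56
would also change the tag.
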
> +/* The default implementation of TARGET_MEMTAG_SET_TAG.  */
> +rtx
> +default_memtag_set_tag (rtx untagged, rtx tag, rtx target)
> +{
> +  gcc_assert (GET_MODE (untagged) == Pmode && GET_MODE (tag) == QImode);
> +  tag = expand_simple_binop (Pmode, ASHIFT, tag, HWASAN_SHIFT_RTX, NULL_RTX,
> +                            /* unsignedp = */1, OPTAB_WIDEN);
> +  rtx ret = expand_simple_binop (Pmode, IOR, untagged, tag, target,
> +                                /* unsignedp = */1, OPTAB_DIRECT);
> +  gcc_assert (ret);
> +  return ret;
> +}
> +
> +/* The default implementation of TARGET_MEMTAG_EXTRACT_TAG.  */
> +rtx
> +default_memtag_extract_tag (rtx tagged_pointer, rtx target)
> +{
> +  rtx tag = expand_simple_binop (Pmode, LSHIFTRT, tagged_pointer,
> +                                HWASAN_SHIFT_RTX, target,
> +                                /* unsignedp = */0,
> +                                OPTAB_DIRECT);
> +  rtx ret = gen_lowpart (QImode, tag);
> +  gcc_assert (ret);
> +  return ret;
> +}
> +
> +/* The default implementation of TARGET_MEMTAG_UNTAGGED_POINTER.  */
> +rtx
> +default_memtag_untagged_pointer (rtx tagged_pointer, rtx target)
> +{
> +  rtx tag_mask = gen_int_mode ((HOST_WIDE_INT_1U << HWASAN_SHIFT) - 1, Pmode);
> +  rtx untagged_base = expand_simple_binop (Pmode, AND, tagged_pointer,
> +                                          tag_mask, target, true,
> +                                          OPTAB_DIRECT);
> +  gcc_assert (untagged_base);
> +  return untagged_base;
> +}
> +
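
For my own reference, the three default hooks above (set, extract,
untag) reduce to the following bit operations when modelled on plain
integers.  Again a sketch assuming a 64-bit pointer with the tag in the
top byte; `TOP_BYTE_SHIFT` and the function names are hypothetical and
not part of the patch:

```c
#include <stdint.h>

/* Assumes GET_MODE_PRECISION (Pmode) is 64, so the shift is 64 - 8.  */
enum { TOP_BYTE_SHIFT = 56 };

/* OR the tag into the top byte.  UNTAGGED must have a zero top byte.  */
uint64_t
set_tag (uint64_t untagged, uint8_t tag)
{
  return untagged | ((uint64_t) tag << TOP_BYTE_SHIFT);
}

/* Logical shift right so that only the tag byte remains.  */
uint8_t
extract_tag (uint64_t tagged)
{
  return (uint8_t) (tagged >> TOP_BYTE_SHIFT);
}

/* Mask off the top byte, leaving the low 56 address bits.  */
uint64_t
untagged_pointer (uint64_t tagged)
{
  return tagged & ((UINT64_C (1) << TOP_BYTE_SHIFT) - 1);
}
```

Setting tag 0x2A on 0x7FFD1000 and then extracting or untagging should
give back 0x2A and 0x7FFD1000 respectively.
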
>  #include "gt-targhooks.h"
> diff --git a/gcc/toplev.c b/gcc/toplev.c
> index 2a3e7c064a5fbb6913481104975ca85615e49f8e..9938b6afbd4fa22898dbc3c29b92061a71810b08 100644
> --- a/gcc/toplev.c
> +++ b/gcc/toplev.c
> @@ -512,6 +512,9 @@ compile_file (void)
>        if (flag_sanitize & SANITIZE_THREAD)
>         tsan_finish_file ();
>
> +      if (flag_sanitize & SANITIZE_HWADDRESS)
> +       hwasan_finish_file ();
> +
>        omp_finish_file ();
>
>        output_shared_constant_pool ();
>


-- 
BR,
Hongtao

