[PR103149] introduce asmnesia internal function
Richard Biener
rguenther@suse.de
Tue Dec 7 11:51:07 GMT 2021
On Fri, 3 Dec 2021, Alexandre Oliva wrote:
> On Dec 2, 2021, Richard Biener <rguenther@suse.de> wrote:
>
> > While adding ASMNESIA looks OK at this point, I'm not sure about the
> > asm handling stuff. You mention 'reload' above, do you mean LRA?
>
> I meant the allocation was first visible in -fdump-rtl-reload. I've
> added a note about this problem, and a modified testcase, to PR93027
> once I realized that the problem was present in an unpatched compiler,
> even at -O0.
Ah, I see.
> In the previous patch, I'd already abandoned the intent to use the +X
> constraint for ASMNESIA, because it was not useful and created other
> problems. I'd fixed numerous of those additional problems in the
> recog.c and cfgexpand.c changes, but those turned out to be not needed
> to fix the PR once I backpedaled to +g (or +m). ASMNESIA also turned
> out to be unnecessary, so I prepared and retested another version of the
> patch that uses the same switch-to-MEM logic I wrote for ASMNESIA in the
> previous patch, but now directly in the detach_value implementation
> within the harden-conditional passes, using +m and an addressable
> temporary if we find that +g won't do.
>
> If you find that ASMNESIA is still useful as a reusable internal
> primitive, I can reintroduce it, now or at a later time. It would bring
> some slight codegen benefit, but we could do without it at this stage.
I prefer to be minimalistic at this point. Maybe we can re-use
ASMNESIA for the FENC_ACCESS stuff if somebody picks up Marcs work
during next stage1.
>
> [PR103149] detach values through mem only if general regs won't do
>
> From: Alexandre Oliva <oliva@adacore.com>
>
> When hardening compares or conditional branches, we perform redundant
> tests, and to prevent them from being optimized out, we use asm
> statements that preserve a value used in a compare, but in a way that
> the compiler can no longer assume it's the same value, so it can't
> optimize the redundant test away.
>
> We used to use +g, but that requires general regs or mem. You might
> think that, if a reg constraint can't be satisfied, the register
> allocator will fall back to memory, but that's not so: we decide on
> matching MEMs very early on, by using the same addressable operand on
> both input and output, and only if the constraint does not allow
> registers. If it does, we use gimple registers and then pseudos as
> inputs and outputs, and then inputs can be substituted by equivalent
> expressions, and then, if no register contraint fits (e.g. because
> that mode won't fit in general regs, or won't fit in regs at all), the
> register allocator will give up before even trying to allocate some
> temporary memory to unify input and output.
>
> This patch arranges for us to create and use the temporary stack slot
> if we can tell the mode requires memory, or won't otherwise fit in
> general regs, and thus to use +m for that asm.
>
>
> Regstrapped on x86_64-linux-gnu. Verified that the mentioned aarch64 PR
> is fixed. Also bootstrapping along with a patch that enables both
> compare hardening passes. Ok to install?
OK.
Thanks,
Richard.
>
>
> for gcc/ChangeLog
>
> PR middle-end/103149
> * gimple-harden-conditionals.cc (detach_value): Use memory if
> general regs won't do.
>
> for gcc/testsuite/ChangeLog
>
> PR middle-end/103149
> * gcc.target/aarch64/pr103149.c: New.
> ---
> gcc/gimple-harden-conditionals.cc | 67 +++++++++++++++++++++++++--
> gcc/testsuite/gcc.target/aarch64/pr103149.c | 14 ++++++
> 2 files changed, 75 insertions(+), 6 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/aarch64/pr103149.c
>
> diff --git a/gcc/gimple-harden-conditionals.cc b/gcc/gimple-harden-conditionals.cc
> index cfa2361d65be0..81867d6e4275f 100644
> --- a/gcc/gimple-harden-conditionals.cc
> +++ b/gcc/gimple-harden-conditionals.cc
> @@ -22,6 +22,8 @@ along with GCC; see the file COPYING3. If not see
> #include "system.h"
> #include "coretypes.h"
> #include "backend.h"
> +#include "target.h"
> +#include "rtl.h"
> #include "tree.h"
> #include "fold-const.h"
> #include "gimple.h"
> @@ -132,25 +134,78 @@ detach_value (location_t loc, gimple_stmt_iterator *gsip, tree val)
> tree ret = make_ssa_name (TREE_TYPE (val));
> SET_SSA_NAME_VAR_OR_IDENTIFIER (ret, SSA_NAME_IDENTIFIER (val));
>
> - /* Output asm ("" : "=g" (ret) : "0" (val)); */
> + /* Some modes won't fit in general regs, so we fall back to memory
> + for them. ??? It would be ideal to try to identify an alternate,
> + wider or more suitable register class, and use the corresponding
> + constraint, but there's no logic to go from register class to
> + constraint, even if there is a corresponding constraint, and even
> + if we could enumerate constraints, we can't get to their string
> + either. So this will do for now. */
> + bool need_memory = true;
> + enum machine_mode mode = TYPE_MODE (TREE_TYPE (val));
> + if (mode != BLKmode)
> + for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
> + if (TEST_HARD_REG_BIT (reg_class_contents[GENERAL_REGS], i)
> + && targetm.hard_regno_mode_ok (i, mode))
> + {
> + need_memory = false;
> + break;
> + }
> +
> + tree asminput = val;
> + tree asmoutput = ret;
> + const char *constraint_out = need_memory ? "=m" : "=g";
> + const char *constraint_in = need_memory ? "m" : "0";
> +
> + if (need_memory)
> + {
> + tree temp = create_tmp_var (TREE_TYPE (val), "dtch");
> + mark_addressable (temp);
> +
> + gassign *copyin = gimple_build_assign (temp, asminput);
> + gimple_set_location (copyin, loc);
> + gsi_insert_before (gsip, copyin, GSI_SAME_STMT);
> +
> + asminput = asmoutput = temp;
> + }
> +
> + /* Output an asm statement with matching input and output. It does
> + nothing, but after it the compiler no longer knows the output
> + still holds the same value as the input. */
> vec<tree, va_gc> *inputs = NULL;
> vec<tree, va_gc> *outputs = NULL;
> vec_safe_push (outputs,
> build_tree_list
> (build_tree_list
> - (NULL_TREE, build_string (2, "=g")),
> - ret));
> + (NULL_TREE, build_string (strlen (constraint_out),
> + constraint_out)),
> + asmoutput));
> vec_safe_push (inputs,
> build_tree_list
> (build_tree_list
> - (NULL_TREE, build_string (1, "0")),
> - val));
> + (NULL_TREE, build_string (strlen (constraint_in),
> + constraint_in)),
> + asminput));
> gasm *detach = gimple_build_asm_vec ("", inputs, outputs,
> NULL, NULL);
> gimple_set_location (detach, loc);
> gsi_insert_before (gsip, detach, GSI_SAME_STMT);
>
> - SSA_NAME_DEF_STMT (ret) = detach;
> + if (need_memory)
> + {
> + gassign *copyout = gimple_build_assign (ret, asmoutput);
> + gimple_set_location (copyout, loc);
> + gsi_insert_before (gsip, copyout, GSI_SAME_STMT);
> + SSA_NAME_DEF_STMT (ret) = copyout;
> +
> + gassign *clobber = gimple_build_assign (asmoutput,
> + build_clobber
> + (TREE_TYPE (asmoutput)));
> + gimple_set_location (clobber, loc);
> + gsi_insert_before (gsip, clobber, GSI_SAME_STMT);
> + }
> + else
> + SSA_NAME_DEF_STMT (ret) = detach;
>
> return ret;
> }
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr103149.c b/gcc/testsuite/gcc.target/aarch64/pr103149.c
> new file mode 100644
> index 0000000000000..906bc9ae06612
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/pr103149.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=armv8-a+sve -O2 -fharden-conditional-branches -fno-tree-scev-cprop" } */
> +
> +/* -fharden-conditional-branches prevents optimization of its redundant
> + compares by detaching values from the operands with asm statements. They
> + used to require GENERAL_REGS, but the vectorized booleans, generated while
> + vectorizing this function, can't be held in GENERAL_REGS. */
> +
> +void
> +foo (int *p)
> +{
> + while (*p < 1)
> + ++*p;
> +}
>
>
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
More information about the Gcc-patches
mailing list