This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: PATCH: middle-end/58981: movmem/setmem use mode wider than Pmode for size
- From: Richard Sandiford <rsandifo at linux dot vnet dot ibm dot com>
- To: "H.J. Lu" <hjl dot tools at gmail dot com>
- Cc: gcc-patches at gcc dot gnu dot org, law at redhat dot com, rguenther at suse dot de
- Date: Mon, 04 Nov 2013 11:11:39 +0000
- Subject: Re: PATCH: middle-end/58981: movmem/setmem use mode wider than Pmode for size
- Authentication-results: sourceware.org; auth=none
- References: <20131104103216 dot GA13798 at lucon dot org>
"H.J. Lu" <hongjiu.lu@intel.com> writes:
> emit_block_move_via_movmem and set_storage_via_setmem have
>
> for (mode = GET_CLASS_NARROWEST_MODE (MODE_INT); mode != VOIDmode;
> mode = GET_MODE_WIDER_MODE (mode))
> {
> enum insn_code code = direct_optab_handler (movmem_optab, mode);
>
> if (code != CODE_FOR_nothing
> /* We don't need MODE to be narrower than BITS_PER_HOST_WIDE_INT
> here because if SIZE is less than the mode mask, as it is
> returned by the macro, it will definitely be less than the
> actual mode mask. */
> && ((CONST_INT_P (size)
> && ((unsigned HOST_WIDE_INT) INTVAL (size)
> <= (GET_MODE_MASK (mode) >> 1)))
> || GET_MODE_BITSIZE (mode) >= BITS_PER_WORD))
> {
>
> Backend may assume mode of size in movmem and setmem expanders is no
> widder than Pmode since size is within the Pmode address space. X86
> backend expand_set_or_movmem_prologue_epilogue_by_misaligned has
>
> rtx saveddest = *destptr;
>
> gcc_assert (desired_align <= size);
> /* Align destptr up, place it to new register. */
> *destptr = expand_simple_binop (GET_MODE (*destptr), PLUS, *destptr,
> GEN_INT (prolog_size),
> NULL_RTX, 1, OPTAB_DIRECT);
> *destptr = expand_simple_binop (GET_MODE (*destptr), AND, *destptr,
> GEN_INT (-desired_align),
> *destptr, 1, OPTAB_DIRECT);
> /* See how many bytes we skipped. */
> saveddest = expand_simple_binop (GET_MODE (*destptr), MINUS, saveddest,
> *destptr,
> saveddest, 1, OPTAB_DIRECT);
> /* Adjust srcptr and count. */
> if (!issetmem)
> *srcptr = expand_simple_binop (GET_MODE (*srcptr), MINUS, *srcptr, saveddest,
> *srcptr, 1, OPTAB_DIRECT);
> *count = expand_simple_binop (GET_MODE (*count), PLUS, *count,
> saveddest, *count, 1, OPTAB_DIRECT);
>
> saveddest is a negative number in Pmode and *count is in word_mode. For
> x32, when Pmode is SImode and word_mode is DImode, saveddest + *count
> leads to overflow. We could fix it by using mode of saveddest to compute
> saveddest + *count. But it leads to extra conversions and other backends
> may run into the same problem. A better fix is to limit mode of size in
> movmem and setmem expanders to Pmode. It generates better and correct
> memcpy and memset for x32.
>
> There is also a typo in comments. It should be BITS_PER_WORD, not
> BITS_PER_HOST_WIDE_INT.
I don't think it's a typo. It's explaining why we don't have to worry about:
GET_MODE_BITSIZE (mode) > BITS_PER_HOST_WIDE_INT
in the CONST_INT_P test (because in that case the GET_MODE_MASK macro
will be an all-1 HOST_WIDE_INT, even though that's narrower than the
real mask).
I don't think the current comment covers the BITS_PER_WORD test at all.
AIUI it's there because the pattern is defined as taking a length of
at most word_mode, so we should stop once we reach it.
FWIW, I agree Pmode makes more sense on face value. But shouldn't
we replace the BITS_PER_WORD test instead of adding to it? Having both
would only make a difference for Pmode > word_mode targets, which might
be able to handle full Pmode lengths.
Either way, the md.texi documentation should be updated too.
Thanks,
Richard