[PATCH v2] Add TARGET_MOVE_WITH_MODE_P

Richard Sandiford richard.sandiford@arm.com
Mon Mar 14 15:44:24 GMT 2022


Richard Biener <richard.guenther@gmail.com> writes:
> On Wed, Mar 9, 2022 at 7:04 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> > On Wed, Mar 2, 2022 at 10:18 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>> >>
>> >> On Wed, Mar 02, 2022 at 09:51:26AM +0100, Richard Biener wrote:
>> >> > On Tue, Mar 1, 2022 at 11:41 PM H.J. Lu via Gcc-patches
>> >> > <gcc-patches@gcc.gnu.org> wrote:
>> >> > >
>> >> > > Add TARGET_FOLD_MEMCPY_MAX for the maximum number of bytes to fold memcpy.
>> >> > > The default is
>> >> > >
>> >> > > MOVE_MAX * MOVE_RATIO (optimize_function_for_size_p (cfun))
>> >> > >
>> >> > > For x86, it is MOVE_MAX to restore the old behavior before
>> >> >
>> >> > I know we've discussed this to death in the PR, I just want to repeat here
>> >> > that the GIMPLE folding expects to generate a single load and a single
>> >> > store (that is what it does on the GIMPLE level) which is why MOVE_MAX
>> >> > was chosen originally (it's documented to what a "single instruction" does).
>> >> > In practice MOVE_MAX does not seem to cover vector register sizes
>> >> > so Richard pulled MOVE_RATIO which is really intended to cover
>> >> > the case of using multiple instructions for moving memory (but then I
>> >> > don't remember whether for the ARM case the single load/store GIMPLE
>> >> > will be expanded to multiple load/store instructions).
>> >> >
>> >> > TARGET_FOLD_MEMCPY_MAX sounds like a stop-gap solution,
>> >> > being very specific for memcpy folding (we also fold memmove btw).
>> >> >
>> >> > There is also MOVE_MAX_PIECES which _might_ be more appropriate
>> >> > than MOVE_MAX here and still honor the idea of single instructions.
>> >> > Now neither arm nor aarch64 define this and it defaults to MOVE_MAX,
>> >> > not MOVE_MAX * MOVE_RATIO.
>> >> >
>> >> > So if we need a new hook then that hook should at least get the
>> >> > 'speed' argument of MOVE_RATIO and it should get a better name.
>> >> >
>> >> > I still think that it should be possible to improve the insn check to
>> >> > avoid use of "disabled" modes, maybe that's also a point to add
>> >> > a new hook like .move_with_mode_p or so?  To quote, we do
>> >>
>> >> Here is the v2 patch to add TARGET_MOVE_WITH_MODE_P.
>> >
>> > Again I'd like to shine light on MOVE_MAX_PIECES which explicitely
>> > mentions "a load or store used TO COPY MEMORY" (emphasis mine)
>> > and whose x86 implementation would already be fine (doing larger moves
>> > and also not doing too large moves).  But appearantly the arm folks
>> > decided that that's not fit and instead (mis-?)used MOVE_MAX * MOVE_RATIO.
>>
>> It seems like there are old comments and old documentation that justify
>> both interpretations, so there are good arguments on both sides.  But
>> with this kind of thing I think we have to infer the meaning of the
>> macro from the way it's currently used, rather than trusting such old
>> and possibly out-of-date and contradictory information.
>>
>> FWIW, I agree that (if we exclude old reload, which we should!) the
>> only direct uses of MOVE_MAX before the patch were not specific to
>> integer registers and so MOVE_MAX should include vectors if the
>> target wants vector modes to be used for general movement.
>>
>> Even if people disagree that that's the current meaning, I think it's
>> at least a sensible meaning.  It provides information that AFAIK isn't
>> available otherwise, and it avoids overlap with MAX_FIXED_MODE_SIZE.
>>
>> So FWIW, I think it'd be reasonable to change non-x86 targets if they
>> want vector modes to be used for single-insn copies.
>
> Note a slight complication in the GIMPLE folding case is that we
> do not end up using vector modes but we're using "fake"
> integer modes like OImode which x86 has move patterns for.
> If we'd use vector modes we could use existing target hooks to
> eventually decide whether auto-using those is desired or not.

Hmm, yeah.  Certainly we shouldn't require the target to support
a scalar integer equivalent of every vector mode.

Richard


More information about the Gcc-patches mailing list