This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: RFA: tweak integer type used for memcpy folding
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Richard Sandiford <rdsandiford at googlemail dot com>
- Date: Tue, 22 Apr 2014 13:06:50 +0200
- Subject: Re: RFA: tweak integer type used for memcpy folding
- Authentication-results: sourceware.org; auth=none
- References: <87oazxhi4m dot fsf at talisman dot default> <CAFiYyc2f-ynPHuky8Xgbxu6ksKr9g7=zsHUt1yjiGsD54o7vBA at mail dot gmail dot com> <87y4yxg3eg dot fsf at talisman dot default> <CAFiYyc0e7u8AzjBS-u9SyPKUpART84gjHGTU8b4osGzZFPg70w at mail dot gmail dot com> <87tx9lg1xx dot fsf at talisman dot default> <CAFiYyc2aFGaFcu+qhH3M-gOtiVTtccCwwPvUm6kUKbCvs3-gEg at mail dot gmail dot com> <87ppk9g0aj dot fsf at talisman dot default>
On Tue, Apr 22, 2014 at 11:51 AM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Tue, Apr 22, 2014 at 11:15 AM, Richard Sandiford
>> <rdsandiford@googlemail.com> wrote:
>>> Richard Biener <richard.guenther@gmail.com> writes:
>>>> On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford
>>>> <rdsandiford@googlemail.com> wrote:
>>>>> Richard Biener <richard.guenther@gmail.com> writes:
>>>>>> On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford
>>>>>> <rdsandiford@googlemail.com> wrote:
>>>>>>> wide-int fails to build libitm because of a bad interaction between:
>>>>>>>
>>>>>>> /* Keep the OI and XI modes from confusing the compiler into thinking
>>>>>>> that these modes could actually be used for computation. They are
>>>>>>> only holders for vectors during data movement. */
>>>>>>> #define MAX_BITSIZE_MODE_ANY_INT (128)
>>>>>>>
>>>>>>> and the memcpy folding code:
>>>>>>>
>>>>>>> /* Make sure we are not copying using a floating-point mode or
>>>>>>> a type whose size possibly does not match its precision. */
>>>>>>> if (FLOAT_MODE_P (TYPE_MODE (desttype))
>>>>>>> || TREE_CODE (desttype) == BOOLEAN_TYPE
>>>>>>> || TREE_CODE (desttype) == ENUMERAL_TYPE)
>>>>>>> {
>>>>>>> /* A more suitable int_mode_for_mode would return a vector
>>>>>>> integer mode for a vector float mode or a integer complex
>>>>>>> mode for a float complex mode if there isn't a regular
>>>>>>> integer mode covering the mode of desttype. */
>>>>>>> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype));
>>>>>>> if (mode == BLKmode)
>>>>>>> desttype = NULL_TREE;
>>>>>>> else
>>>>>>> desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode),
>>>>>>> 1);
>>>>>>> }
>>>>>>> if (FLOAT_MODE_P (TYPE_MODE (srctype))
>>>>>>> || TREE_CODE (srctype) == BOOLEAN_TYPE
>>>>>>> || TREE_CODE (srctype) == ENUMERAL_TYPE)
>>>>>>> {
>>>>>>> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype));
>>>>>>> if (mode == BLKmode)
>>>>>>> srctype = NULL_TREE;
>>>>>>> else
>>>>>>> srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode),
>>>>>>> 1);
>>>>>>> }
>>>>>>>
>>>>>>> The failure occurs for complex long double, which we try to copy as
>>>>>>> a 256-bit integer type (OImode).
>>>>>>>
>>>>>>> This patch tries to do what the comment suggests by introducing a new
>>>>>>> form of int_mode_for_mode that replaces vector modes with vector modes
>>>>>>> and complex modes with complex modes. The fallback case of using a
>>>>>>> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above
>>>>>>> 128 bits on x86_64.
>>>>>>>
>>>>>>> The question then is what to do about 128-bit types for i386.
>>>>>>> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be
>>>>>>> used for optimisation. However, gcc.target/i386/pr49168-1.c only passes
>>>>>>> for -m32 -msse2 because we use int128_t to copy a float128_t.
>>>>>>>
>>>>>>> I handled that by allowing MODE_VECTOR_INT to be used instead of
>>>>>>> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE,
>>>>>>> even if the original type wasn't a vector.
>>>>>>
>>>>>> Hmm. Sounds reasonable unless there are very weird targets that
>>>>>> cannot efficiently load/store vectors unaligned but can handle
>>>>>> efficient load/store of unaligned scalars.
>>>>>
>>>>> Yeah, in general there's no guarantee that even int_mode_for_mode
>>>>> will return a mode with the same alignment as the original. Callers
>>>>> need to check that (like the memcpy folder does).
>>>>>
>>>>>>> It might be that other callers to int_mode_for_mode should use
>>>>>>> the new function too, but I'll look at that separately.
>>>>>>>
>>>>>>> I used the attached testcase (with printfs added to gcc) to check that
>>>>>>> the right modes and types were being chosen. The patch fixes the
>>>>>>> complex float and complex double cases, since the integer type that we
>>>>>>> previously picked had a larger alignment than the original complex type.
>>>>>>
>>>>>> As of complex int modes - are we positively sure that targets even
>>>>>> try to do sth "optimal" for loads/stores of those?
>>>>>
>>>>> Complex modes usually aren't handled directly by .md patterns,
>>>>> either int or float. They're really treated as a pair of values.
>>>>> So IMO it still makes sense to fold this case.
>>>>>
>>>>>>> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype))
>>>>>>> is that vectors are copied as integer vectors if the target supports
>>>>>>> them directly but are copied as float vectors otherwise, since in the
>>>>>>> latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the
>>>>>>> test are copied as vector floats and vector doubles both before and
>>>>>>> after the patch.
>>>>>>
>>>>>> That wasn't intended ... the folding should have failed if we can't
>>>>>> copy using an integer mode ...
>>>>>
>>>>> Does that mean that the fold give up if TYPE_MODE is BLKmode?
>>>>> I can do that as a separate patch if so.
>>>>
>>>> Looking at the code again it should always choose an integer mode/type
>>>> via setting desttype/srctype to NULL for BLKmode and
>>>>
>>>> if (!srctype)
>>>> srctype = desttype;
>>>> if (!desttype)
>>>> desttype = srctype;
>>>> if (!srctype)
>>>> return NULL_TREE;
>>>>
>>>> no? Thus if we can't get a integer type for either src or dest then
>>>> we fail. But we should never end up with srctype or desttype
>>>> being a float mode. No?
>>>
>>> Right, they don't have a float _mode_, because the target doesn't
>>> support 1024-bit modes. Like I say, they have BLKmode instead.
>>> But they have a float type.
>>
>> Ok. I clearly used FLOAT_MODE_P (TYPE_MODE ()) because
>> I assumed that only float-modes will eventually end up in FP
>> registers (that's what the bug was about - we load into a FP register
>> and that load can trigger a FP exception).
>
> Yeah. I just flagged it up because it wasn't obvious that native vector
> float types need to be copied as vector ints while non-native ones don't.
> A non-native vector float might be scalarised into individual float
> operations and thus might end up using an FP register load.
That's true - we should fail for that case.
Richard.
> Thanks,
> Richard