RFA: tweak integer type used for memcpy folding
Richard Sandiford
rdsandiford@googlemail.com
Tue Apr 22 09:59:00 GMT 2014
Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Apr 22, 2014 at 11:15 AM, Richard Sandiford
> <rdsandiford@googlemail.com> wrote:
>> Richard Biener <richard.guenther@gmail.com> writes:
>>> On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford
>>> <rdsandiford@googlemail.com> wrote:
>>>> Richard Biener <richard.guenther@gmail.com> writes:
>>>>> On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford
>>>>> <rdsandiford@googlemail.com> wrote:
>>>>>> wide-int fails to build libitm because of a bad interaction between:
>>>>>>
>>>>>> /* Keep the OI and XI modes from confusing the compiler into thinking
>>>>>> that these modes could actually be used for computation. They are
>>>>>> only holders for vectors during data movement. */
>>>>>> #define MAX_BITSIZE_MODE_ANY_INT (128)
>>>>>>
>>>>>> and the memcpy folding code:
>>>>>>
>>>>>> /* Make sure we are not copying using a floating-point mode or
>>>>>> a type whose size possibly does not match its precision. */
>>>>>> if (FLOAT_MODE_P (TYPE_MODE (desttype))
>>>>>> || TREE_CODE (desttype) == BOOLEAN_TYPE
>>>>>> || TREE_CODE (desttype) == ENUMERAL_TYPE)
>>>>>> {
>>>>>> /* A more suitable int_mode_for_mode would return a vector
>>>>>> integer mode for a vector float mode or a integer complex
>>>>>> mode for a float complex mode if there isn't a regular
>>>>>> integer mode covering the mode of desttype. */
>>>>>> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype));
>>>>>> if (mode == BLKmode)
>>>>>> desttype = NULL_TREE;
>>>>>> else
>>>>>> desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode),
>>>>>> 1);
>>>>>> }
>>>>>> if (FLOAT_MODE_P (TYPE_MODE (srctype))
>>>>>> || TREE_CODE (srctype) == BOOLEAN_TYPE
>>>>>> || TREE_CODE (srctype) == ENUMERAL_TYPE)
>>>>>> {
>>>>>> enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype));
>>>>>> if (mode == BLKmode)
>>>>>> srctype = NULL_TREE;
>>>>>> else
>>>>>> srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode),
>>>>>> 1);
>>>>>> }
>>>>>>
>>>>>> The failure occurs for complex long double, which we try to copy as
>>>>>> a 256-bit integer type (OImode).
>>>>>>
>>>>>> This patch tries to do what the comment suggests by introducing a new
>>>>>> form of int_mode_for_mode that replaces vector modes with vector modes
>>>>>> and complex modes with complex modes. The fallback case of using a
>>>>>> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above
>>>>>> 128 bits on x86_64.
>>>>>>
>>>>>> The question then is what to do about 128-bit types for i386.
>>>>>> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be
>>>>>> used for optimisation. However, gcc.target/i386/pr49168-1.c only passes
>>>>>> for -m32 -msse2 because we use int128_t to copy a float128_t.
>>>>>>
>>>>>> I handled that by allowing MODE_VECTOR_INT to be used instead of
>>>>>> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE,
>>>>>> even if the original type wasn't a vector.
>>>>>
>>>>> Hmm. Sounds reasonable unless there are very weird targets that
>>>>> cannot efficiently load/store vectors unaligned but can handle
>>>>> efficient load/store of unaligned scalars.
>>>>
>>>> Yeah, in general there's no guarantee that even int_mode_for_mode
>>>> will return a mode with the same alignment as the original. Callers
>>>> need to check that (like the memcpy folder does).
>>>>
>>>>>> It might be that other callers to int_mode_for_mode should use
>>>>>> the new function too, but I'll look at that separately.
>>>>>>
>>>>>> I used the attached testcase (with printfs added to gcc) to check that
>>>>>> the right modes and types were being chosen. The patch fixes the
>>>>>> complex float and complex double cases, since the integer type that we
>>>>>> previously picked had a larger alignment than the original complex type.
>>>>>
>>>>> As of complex int modes - are we positively sure that targets even
>>>>> try to do sth "optimal" for loads/stores of those?
>>>>
>>>> Complex modes usually aren't handled directly by .md patterns,
>>>> either int or float. They're really treated as a pair of values.
>>>> So IMO it still makes sense to fold this case.
>>>>
>>>>>> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype))
>>>>>> is that vectors are copied as integer vectors if the target supports
>>>>>> them directly but are copied as float vectors otherwise, since in the
>>>>>> latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the
>>>>>> test are copied as vector floats and vector doubles both before and
>>>>>> after the patch.
>>>>>
>>>>> That wasn't intended ... the folding should have failed if we can't
>>>>> copy using an integer mode ...
>>>>
>>>> Does that mean that the fold give up if TYPE_MODE is BLKmode?
>>>> I can do that as a separate patch if so.
>>>
>>> Looking at the code again it should always choose an integer mode/type
>>> via setting desttype/srctype to NULL for BLKmode and
>>>
>>> if (!srctype)
>>> srctype = desttype;
>>> if (!desttype)
>>> desttype = srctype;
>>> if (!srctype)
>>> return NULL_TREE;
>>>
>>> no? Thus if we can't get a integer type for either src or dest then
>>> we fail. But we should never end up with srctype or desttype
>>> being a float mode. No?
>>
>> Right, they don't have a float _mode_, because the target doesn't
>> support 1024-bit modes. Like I say, they have BLKmode instead.
>> But they have a float type.
>
> Ok. I clearly used FLOAT_MODE_P (TYPE_MODE ()) because
> I assumed that only float-modes will eventually end up in FP
> registers (that's what the bug was about - we load into a FP register
> and that load can trigger a FP exception).
Yeah. I just flagged it up because it wasn't obvious that native vector
float types need to be copied as vector ints while non-native ones don't.
A non-native vector float might be scalarised into individual float
operations and thus might end up using an FP register load.
Thanks,
Richard
More information about the Gcc-patches
mailing list