This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Add a simulate_builin_function_decl langhook
On 10/5/19 5:29 AM, Richard Sandiford wrote:
>
> Sure. This message is going to go to the other extreme, sorry, but I'm
> not sure which part will be the most convincing (if any).
No worries. Worst case going to the other extreme is I have to read it
more than once after nodding off halfway through :-)
Seriously though. THanks for the more expansive write up.
>
> I guess I should start by saying that SVE intrinsics have three types of
> predication (zeroing, merging, and "don't care") that combine with the usual
> type suffixes seen in arm_neon.h. This gives 3,736 functions for baseline
> SVE (SVE2 adds more).
Ugh. That'd be a lot of stuff to add into a header. As you noted
later, there's significant costs to doing so.
>
> The vast majority of those functions can't be open-coded using the
> core parts of C and C++, even with GNU extensions. Some could perhaps
> be coded using new library extensions, but that just shifts the question
> from "how do we implement this feature in arm_sve.h?" to "how we implement
> this feature in the library extension?".
True. But one could ask if those extensions are something that we're
likely to need beyond what you're doing now. But I don't necessarily
think that would override the desire not to have so much stuff in the
header.
I'm guessing that even though you can't describe what you need at the
C/C++ level, but you can describe at least some of what you want in
gimple/rtl? Otherwise I'm not sure what you get above and beyond asms.
>
> So that leaves us using built-in functions for almost all of those 3,736
> functions. With the traditional approach, the target would need to
> register the functions at start-up and then define the header file in
> terms of them.
Yup. And there's a cost you missed -- those things tend to end up in
the debugging output as well. That's caused problems with system tools
in the past.
>
> There are two approaches to doing that. One is to define the built-in
> functions under their header file name so that they become active once
> a prototype is seen. But that's only appropriate for functions like
> printf that have linkage. The arm_sve.h functions don't have linkage
> and there's a chance that non-SVE code could be using the same names for
> something else (perhaps even with the same prototype, in the case of
> things like
>
> uint64_t svcntb (void);
>
> that don't mention SVE types).
>
> The other alternative is to define builtins in the "__builtin_"
> namespace and wrap them in inline wrappers, which I think is what
> most intrinsics header files do. E.g., from arm_neon.h:
>
> __extension__ extern __inline float64x2_t
> __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> vabsq_f64 (float64x2_t __a)
> {
> return __builtin_aarch64_absv2df (__a);
> }
>
> But that's 6 lines per function. Counting a blank line between
> each one, we'd end up with a header file of at least 26,151 lines.
> (OK, so arm_neon.h is already longer than that. But with SVE2 and with
> other considerations mentioned below, arm_sve.h could easily end up into
> 6 figures this way.)
Yea, that's kind of what I would expect for exposing this stuff. But I
didn't realize how many of these you were going to need.
[ ... ]
>
> It's very hard to maintain header files like that by hand without
> introducing errors, such as forgetting to put the arguments safely
> in the implementation namespace ("__a" rather than "a" etc.). Kyrill
> fixed some arm_neon.h instances of this the other week. And using macros
> to handle common patterns just makes the error messages worse, since the
> macros then show up in error backtraces.
Yup. We've seen this across multiple targets. Y'all aren't alone here.
>
> An alternative to maintaining the header file by hand is to generate
> it via a script. Ideally the script would use the same metadata as
> the compiler itself uses when registering the built-in functions.
> But this means writing two pieces of code to process the metadata,
> one to generate text for the inline wrappers and one to register the
> built-ins. And we still end up with the same very long header file.
Yup. It's not ideal.
>
> A more fundamental problem with inline wrappers is that they make it
> harder to honour the spec for constant arguments. If an instruction
> requires a constant operand, Arm has traditionally required the
> associated intrinsic argument to be an integer constant expression
> (in the C and C++ sense). GCC has tended to fudge this and instead only
> requires an integer constant at expand time, after inlining and constant
> propagation have taken place. But this means that all sorts of other
> optimisations have happened too, and so what's constant at expand time
> wasn't necessarily constant at the language level.
>
> Admittedly some people like that behaviour :-), but it means that what's
> acceptable depends on the vagaries compiler optimisation. It also means
> that code is often not portable between GCC and clang, which implements
> the spec more closely.
I go back and forth on this kind of thing. Sometimes I'd rather see us
be more strict, other times such strictness seems to be getting in the
way for no good reason, particularly when after inlining/optimization,
etc we're going to get the form we need. I'm sure from a user
standpoint the inconsistency is annoying as hell.
>
> So the pragma approach seemed better for several reasons:
[ ... ]
I tend to agree, particularly due to the number of builtins you'd end up
needing. It's not so bad when the number is small, but at the scale
you're doing the old fashioned way seems inferior.
Let's go with with it. Again, thanks for the write-up.
jeff