Re: Add option for whether ceil etc. can raise "inexact", adjust x86 conditions

On Wed, Aug 16, 2017 at 12:51 PM, Uros Bizjak <> wrote:
> On Wed, Aug 16, 2017 at 12:48 PM, Uros Bizjak <> wrote:
>> On Wed, Aug 16, 2017 at 12:43 PM, Richard Biener
>> <> wrote:
>>> On Tue, Aug 15, 2017 at 9:21 PM, Uros Bizjak <> wrote:
>>>> On Tue, Aug 15, 2017 at 4:59 PM, Richard Biener
>>>> <> wrote:
>>>>> So I'd try the "easy" way of expanding if (__builtin_cpu_supports ("sse4.1"))
>>>>> as the sse4.1 sequence is just a single instruction.  The interesting part
>>>>> of the story will be to make sure we can emit that even if ! TARGET_ROUND ...
>>>>> Uros, any idea how to accomplish this?  Or is the idea of a "local" ifunc
>>>>> better?  Note the ABI boundary will be expensive but I guess the conditional
>>>>> sequence as well (and it will disturb RA even if predicted to have SSE 4.1).
>>>> TARGET_ROUND is just:
>>>> /* SSE4.1 defines round instructions */
>>>> #define    TARGET_ISA_ROUND    ((ix86_isa_flags & OPTION_MASK_ISA_ROUND) != 0)
>>>> I don't remember the history around the #define, once upon a time
>>>> probably made sense, but nowadays it looks that it can be simply
>>>> substituted with TARGET_SSE4_1.
>>> Sure but we want the backend to use a TARGET_ROUND guarded define_insn
>>> when TARGET_ROUND is false but inside a runtime conditional ensuring that
>>> TARGET_ROUND is satisfied.  With doing this with ifuncs we'd mark the function
>>> with a proper target attribute but within a function?
>> How about something intrinsic headers are using?
> (... somehow managed to press send too early ...)
> There we use GCC_push_options and GCC_target pragmas. Maybe we also
> need corresponding __ROUND__ define defined by the compiler.

Those don't work inside a function.  Remember I want to change the expander
of ceil () to

 if (__builtin_cpu_supports ("sse4.1"))
   ceil_for_sse4.1 ();
 else
   ceil ();

from the x86 target code that expands ceil for ! TARGET_ROUND.  I suppose
we could simply use a separate pattern for SSE 4.1 roundsd here (does it
have to be an unspec?  I suppose so to prevent it from being generated by
other means and to prevent code motion out of the conditional?)
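
For illustration only, a user-level analogue of that conditional could look
like the sketch below. It is not the expander change itself: my_ceil,
ceil_sse41 and ceil_generic are hypothetical names, ceil_generic is a
simplified stand-in for the libm call, and the SSE4.1 instruction lives in a
separate function carrying a target attribute, since that is the granularity
at which user code can switch ISA today.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <smmintrin.h>   /* _mm_round_sd and the _MM_FROUND_* constants */
#define HAVE_X86_DISPATCH 1
#endif

/* Hypothetical stand-in for the libm ceil () fallback, kept here so the
   sketch is self-contained.  It does not preserve the sign of zero for
   inputs in (-1, 0).  */
static double ceil_generic (double x)
{
  /* NaN and values with magnitude >= 2^52 are returned unchanged.  */
  if (!(x > -4503599627370496.0 && x < 4503599627370496.0))
    return x;
  double t = (double) (long long) x;   /* truncates toward zero */
  return t < x ? t + 1.0 : t;
}

#ifdef HAVE_X86_DISPATCH
/* Hypothetical helper: SSE4.1 is enabled for this function only.  */
static double __attribute__ ((target ("sse4.1"))) ceil_sse41 (double x)
{
  __m128d v = _mm_set_sd (x);
  /* roundsd, rounding toward +inf, with the precision (inexact)
     exception suppressed.  */
  v = _mm_round_sd (v, v, _MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC);
  return _mm_cvtsd_f64 (v);
}
#endif

/* Runtime-conditional dispatch, mirroring the pseudocode above.  */
double my_ceil (double x)
{
#ifdef HAVE_X86_DISPATCH
  if (__builtin_cpu_supports ("sse4.1"))
    return ceil_sse41 (x);
#endif
  return ceil_generic (x);
}
```

Here the call into ceil_sse41 is an ordinary function call, so this sketch
pays the ABI-boundary cost mentioned earlier in the thread; an inline
expansion in the backend would avoid that.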

Or forgo the idea of using inline conditional code and emit an ifunc
dispatcher, a function with the sse4.1 instruction, and a call to the dispatcher
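
As a rough sketch of the ifunc variant, written by hand at the source level
rather than emitted by the compiler: the names my_ceil, resolve_my_ceil,
ceil_sse41 and ceil_generic are hypothetical, ceil_generic is a simplified
stand-in for the libm fallback, and the ifunc path assumes an ELF target with
a C library that supports IFUNC relocations (x86-64 GNU/Linux is assumed
here). Elsewhere the sketch falls back to the generic code.

```c
#if defined(__linux__) && defined(__x86_64__) && defined(__has_attribute)
#if __has_attribute(ifunc)
#include <smmintrin.h>   /* _mm_round_sd and the _MM_FROUND_* constants */
#define HAVE_IFUNC_DISPATCH 1
#endif
#endif

/* Hypothetical stand-in for the libm ceil () fallback.  It does not
   preserve the sign of zero for inputs in (-1, 0).  */
static double ceil_generic (double x)
{
  /* NaN and values with magnitude >= 2^52 are returned unchanged.  */
  if (!(x > -4503599627370496.0 && x < 4503599627370496.0))
    return x;
  double t = (double) (long long) x;   /* truncates toward zero */
  return t < x ? t + 1.0 : t;
}

#ifdef HAVE_IFUNC_DISPATCH
/* The function containing the SSE4.1 instruction.  */
static double __attribute__ ((target ("sse4.1"))) ceil_sse41 (double x)
{
  __m128d v = _mm_set_sd (x);
  v = _mm_round_sd (v, v, _MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC);
  return _mm_cvtsd_f64 (v);
}

/* The dispatcher (resolver): run when the symbol is resolved, it picks
   the implementation that every later call to my_ceil will use.  */
static double (*resolve_my_ceil (void)) (double)
{
  __builtin_cpu_init ();   /* required before CPU queries in a resolver */
  return __builtin_cpu_supports ("sse4.1") ? ceil_sse41 : ceil_generic;
}

double my_ceil (double x) __attribute__ ((ifunc ("resolve_my_ceil")));
#else
/* No ifunc support assumed: always use the generic code.  */
double my_ceil (double x)
{
  return ceil_generic (x);
}
#endif
```

With this shape the CPU check happens once at symbol resolution instead of on
every call, at the price of an indirect call for each use of my_ceil.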


> Uros.

