[PATCH, testsuite] Fix sse4_1-round* inline asm statements

Uros Bizjak ubizjak@gmail.com
Wed Dec 9 07:19:00 GMT 2015


On Tue, Dec 8, 2015 at 9:13 PM, Bernd Edlinger
<bernd.edlinger@hotmail.de> wrote:
> Hi,
>
> Am 08.12.2015 um 19:23 schrieb Uros Bizjak:
>> On Tue, Dec 8, 2015 at 7:10 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>> Hello!
>>>
>>>> this patch fixes the asm statements in the gcc.target/i386/sse4_1-round* test cases.
>>>>
>>>> They do lots of things that are just absolutely forbidden, like clobber registers
>>>> that are not mentioned in the clobber list, and create a hidden data flow.
>>>>
>>>> The test cases work just by chance, and You can see the asm statements
>>>> ripped completely apart by the loop optimizer if you try to do the assembler
>>>> part in a loop:
>>>>
>>>>    for (i = 0; i < 10; i++) {
>>>>    __asm__ ("fld" ASM_SUFFIX " %0" : : "m" (*&f));
>>>>
>>>>    __asm__ ("fstcw %0" : "=m" (*&saved_cw));
>>>>    new_cw = saved_cw & clr_mask;
>>>>    new_cw |= type;
>>>>    __asm__ ("fldcw %0" : : "m" (*&new_cw));
>>>>
>>>>    __asm__ ("frndint\n"
>>>>             "fstp" ASM_SUFFIX " %0\n" : "=m" (*&ret));
>>>>    __asm__ ("fldcw %0" : : "m" (*&saved_cw));
>>>>    }
>>>>    return ret;
>>>>
>>>> So this patch avoids the hidden data flow, and
>>>> adds "st" to the clobber list.
>>>>
>>>> Boot-strapped and reg-tested on x86_64-pc-linux-gnu
>>>> OK for trunk?
>>> Uh, no.
>>>
>>> x87 is a strange beast, and even your patch is wrong. The assembly
>>> should be written in this way:
>>>
>>>    __asm__ ("fnstcw %0" : "=m" (saved_cw));
>>>
>>>    new_cw = saved_cw & clr_mask;
>>>    new_cw |= type;
>>>
>>>    __asm__ ("fldcw %2\n\t"
>>>         "frndint\n\t"
>>>         "fldcw %3" : "=t" (ret)
>>>                : "0" (f), "m" (new_cw), "m" (saved_cw));
>>>
>>> I'm testing the attached patch.
>
> Thanks.
>
> That is certainly a good idea to use the "=t"(ret) : "0" (f) !
>
> But could you explain, just for my curiosity, why my approach
> was wrong?  It may have to do with the semantic of "st" clobber, right?
> I thought that allows the inline-assembler to push one value on the
> fpu-stack,
> and it is responsible for removing that value again.

Not "wrong" in the sense that it won't work, but it is not written in
the optimal way. The st(0) register is not clobbered, it is loaded
with the input argument, and the value is returned there. So, when asm
input arguments are described in the right way, there is no need to
explicitly move values to the stack, the compiler will do it for you.

Saying that, I see we don't need to define ASM_SUFFIX anymore. I'll
prepare the patch that removes these #defines.

Uros.



More information about the Gcc-patches mailing list