This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] New *truncsfdf2_i387_1 pattern for i386.md


Hello Roger!

With the patch below we now generate:

foo:    pushl   %edx
        fldl    d
        fstps   f
        popl    %eax
        ret

We still (unfortunately) allocate the stack slot, but combine.c is
now able to recognize the result of merging the float_truncate with
the write to memory.  Interestingly, we already have the equivalent
*truncxfsf2_387_1 and *truncxfdf2_387_1 patterns, so it's a surprise
that the df->sf pattern is missing, but I suspect this is due to the
labyrinthine interactions between x87 and sse2 math.  To play it safe,
the new pattern is only enabled if we know we can't use sse2, but
that might be overly conservative.  Uros or RTH?



It is the job of "*truncdfsf_fast_i387" to handle writes to memory. However, this pattern is at the moment enabled only when flag_unsafe_math_optimizations is set. If your example is compiled with the additional -ffast-math flag, the result is indeed the same as with your patch.
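
The example itself is not reproduced in this message. Judging only from the quoted assembly (fldl d followed by fstps f), it is presumably something along the lines of the sketch below; the declarations and the function body are a reconstruction, not the original test case:

```c
/* Hypothetical reconstruction of the test case; the original source
   is not shown in this thread.  The names d, f and foo are taken
   from the quoted assembly output.  */
double d;
float f;

void
foo (void)
{
  /* A DFmode -> SFmode truncation whose result is stored straight
     to memory, i.e. a float_truncate merged with a memory write.  */
  f = (float) d;
}
```

Compiling this for 32-bit x87 math (for instance with something like -O2 -m32 -mfpmath=387 -S), once with and once without -ffast-math, should be enough to compare the two code sequences.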

To fix this oversight, I suggest changing the insn condition of "*truncdfsf_fast_i387" to:

"TARGET_80387 && (MEM_P (operands[0]) || flag_unsafe_math_optimizations)"

to achieve the same functionality as with your proposed patch.
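
To make the effect of the change easier to see, here is a small C sketch of the two conditions as plain predicates. The three bool parameters are hypothetical stand-ins for TARGET_80387, MEM_P (operands[0]) and flag_unsafe_math_optimizations, and the "old" form is only my assumption of what the current condition amounts to, based on the description above:

```c
#include <stdbool.h>

/* Hypothetical stand-ins, not GCC code:
     target_80387  ~ TARGET_80387
     dest_is_mem   ~ MEM_P (operands[0])
     unsafe_math   ~ flag_unsafe_math_optimizations  */

/* Assumed current condition: the pattern matches only under
   unsafe math optimizations, whatever the destination is.  */
static bool
old_condition (bool target_80387, bool dest_is_mem, bool unsafe_math)
{
  (void) dest_is_mem;  /* The destination does not matter here.  */
  return target_80387 && unsafe_math;
}

/* Suggested condition: a memory destination is accepted even
   without unsafe math optimizations.  */
static bool
new_condition (bool target_80387, bool dest_is_mem, bool unsafe_math)
{
  return target_80387 && (dest_is_mem || unsafe_math);
}
```

The only case where the two differ is an x87 target with a memory destination and no -ffast-math, which is exactly the case in your example.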

Thanks,
Uros.

