This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: S/390: Fix string-opt-17.c testsuite failure
- From: Roger Sayle <roger at eyesopen dot com>
- To: Ulrich Weigand <ulrich dot weigand at de dot ibm dot com>
- Cc: <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 10 Sep 2002 19:58:32 -0600 (MDT)
- Subject: Re: S/390: Fix string-opt-17.c testsuite failure
Hi Ulrich,
> Incidentally, this fixes the string-opt-17.c test case, which failed
> on 64-bit because it tried to build a constant using a TImode
> multiplication. Not only do we not support that, but even if we
> would, I wouldn't consider replacing a 16-byte memset with a
> __multi3 call to be an optimization ...
It's strange that the s390 backend would want to call __multi3.
The multiplication is by a simple integer constant, and on most
targets this should be turned into a sequence of shifts and adds.
So GCC should generate a sequence of eight shift and add instructions
to build the value to write in a TI mode register, and then a single
store of that TI register to memory (which should be faster than 16
consecutive byte stores). On some platforms with fast hardware
multiply, the optimal sequence may actually include a real multiply.
But it sounds like the RTX_COST of a TI mode multiplication on
S/390 is underestimated. The function call to __multi3 must
be more expensive than the shift-add sequence that would
otherwise be preferred by GCC.
memset16:
    reg &= 0xff;
    reg |= reg<<8;
    reg |= reg<<16;
    reg |= reg<<32;
    reg |= reg<<64;
    *ptr = reg;
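As a rough illustration of the sequence above, here is a minimal C sketch. It assumes a 64-bit target where GCC provides the `unsigned __int128` extension as a stand-in for a TImode register. The names `u128`, `splat_shift`, `splat_mul`, `memset16` and `self_check` are hypothetical and chosen only for this example. It shows that the shift-or sequence and a multiplication by the constant 0x0101...01 build the same 16-byte value, and that storing it matches a plain 16-byte memset.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: unsigned __int128 is a GCC extension on 64-bit targets;
   it stands in for a TImode register here. */
typedef unsigned __int128 u128;

/* Replicate one byte into all 16 bytes using shifts and ORs,
   mirroring the memset16 pseudocode. */
static u128 splat_shift(unsigned char c)
{
    u128 reg = c;          /* reg &= 0xff is implied by the byte type */
    reg |= reg << 8;
    reg |= reg << 16;
    reg |= reg << 32;
    reg |= reg << 64;
    return reg;
}

/* The same value expressed as a multiplication by 0x0101...01. */
static u128 splat_mul(unsigned char c)
{
    const u128 ones = ((u128)0x0101010101010101ULL << 64)
                    | 0x0101010101010101ULL;
    return (u128)c * ones;
}

/* Hypothetical 16-byte memset: build the value once, store it once. */
static void memset16(void *ptr, unsigned char c)
{
    u128 reg = splat_shift(c);
    memcpy(ptr, &reg, sizeof reg);   /* store without alignment assumptions */
}

/* Check both constructions against the library memset for every byte. */
void self_check(void)
{
    for (int c = 0; c < 256; c++) {
        unsigned char a[16], b[16];
        memset(a, c, sizeof a);
        memset16(b, (unsigned char)c);
        assert(memcmp(a, b, sizeof a) == 0);
        assert(splat_shift((unsigned char)c) == splat_mul((unsigned char)c));
    }
}
```

Because every byte of the result is identical, the stored bytes do not depend on the target's endianness. Whether a compiler emits the multiplication, the shift-or sequence, or a library call for `splat_mul` depends on the target's cost model, which this sketch does not attempt to reproduce.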
GCC only considers the TImode operations because the backend claims to
support TImode reads and writes to and from memory. Most architectures
allow at most SImode sequences, performing memset 4 bytes at a time.
Anyway, I hope this explains the motivation for the "optimization".
Roger
--
Roger Sayle, E-mail: roger@eyesopen.com
OpenEye Scientific Software, WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road, Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507. Fax: (+1) 505-473-0833