This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Don't emit cmpstrsi and movstr* insns with -Os
- From: "Ulrich Weigand" <weigand at i1 dot informatik dot uni-erlangen dot de>
- To: roger at eyesopen dot com (Roger Sayle)
- Cc: rth at redhat dot com (Richard Henderson), gcc-patches at gcc dot gnu dot org, weigand at informatik dot uni-erlangen dot de (Ulrich Weigand)
- Date: Tue, 9 Sep 2003 17:36:47 +0200 (CEST)
- Subject: Re: [PATCH] Don't emit cmpstrsi and movstr* insns with -Os
Roger Sayle wrote:
> Hence it looks like disabling movstr* when the user specifies -Os
> is also a "size" win on IBM's s390, for this single test at least.
> Checking the -S output, the call to memcpy isn't being sibcalled,
> which would bias/invalidate this particular test.
Well, the current s390 movstr implementation generates a code
fragment that is optimized for speed, not size; in fact it is
rather long ...
However, you can change this via the -mmvcle option; if this is
set, movstr should generate a single instruction (modulo setting
up the inputs in the correct registers etc.). -Os should probably
imply -mmvcle on s390, but currently it doesn't.
In any case, this trivial testcase isn't really suitable to
decide the general question because most likely secondary
effects will dominate the total size: for example, the
arguments 'come in' just in the right registers to be passed
on to another function call, while they'd have to be reloaded
into different registers to make use of the mvcle instruction
-- in real code, however, this difference would likely vanish
due to different register allocation choices made earlier.
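As a sketch of the distinction drawn above (the test case under discussion is not shown in this thread, so both functions and the struct below are made up for illustration): in copy_wrapper the three arguments arrive exactly where a call to memcpy expects them, so the call needs no register setup, whereas in copy_field the operands have to be computed first, so either choice pays for some setup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical trivial test: dst, src and n arrive in the same
   argument registers that the call to memcpy expects, so the call
   needs no register shuffling at all. */
void copy_wrapper(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Hypothetical record type, used only for this illustration. */
struct record {
    size_t len;
    char payload[64];
};

/* Hypothetical, more realistic case: the addresses and the length
   are computed inside the function, so both a library call and an
   inline move need their operands placed in registers first. */
void copy_field(struct record *out, const struct record *in)
{
    assert(out != NULL && in != NULL);
    size_t n = in->len < sizeof out->payload ? in->len : sizeof out->payload;
    memcpy(out->payload, in->payload, n);
    out->len = n;
}
```

Comparing the -S output for both functions, rather than for the wrapper alone, should give a less biased picture of the size difference.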
Also, the question whether to generate a call or inline code
might depend on the particular instance; for example, if the
length is a small compile-time constant, certainly the best
choice (for both speed and size) on s390 would be an mvc
instruction with the length as immediate operand encoded in
the instruction itself. Maybe we should call the movstr
expanders always, but allow them to FAIL if a regular
function call is preferable in the particular instance?
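A rough sketch of such a per-instance decision, in plain C rather than in the form of an actual expander. All names here are hypothetical and this is not GCC's interface; the rule "fall back to a call when optimizing for size" is only an assumption for illustration. The 256-byte bound is the most a single mvc can move (its 8-bit length field holds the length minus one); where "small" should really end is a tuning question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; not GCC's expander interface. */
enum move_strategy {
    MOVE_MVC_IMMEDIATE,   /* one mvc, length encoded in the instruction */
    MOVE_INLINE_GENERIC,  /* the expander's general inline sequence */
    MOVE_LIBCALL          /* expander "FAILs"; a memcpy call is emitted */
};

/* A single s390 mvc moves between 1 and 256 bytes. */
#define MVC_MAX_BYTES 256

/* Decide per instance how a block move should be expanded.
   Assumed policy: a constant length that fits one mvc always wins;
   otherwise prefer a call for size and inline code for speed. */
enum move_strategy
choose_move_strategy(bool length_is_constant, size_t length,
                     bool optimize_size)
{
    if (length_is_constant && length >= 1 && length <= MVC_MAX_BYTES)
        return MOVE_MVC_IMMEDIATE;
    if (optimize_size)
        return MOVE_LIBCALL;
    return MOVE_INLINE_GENERIC;
}

/* Value stored in the mvc length field: the byte count minus one. */
unsigned mvc_length_field(size_t length)
{
    assert(length >= 1 && length <= MVC_MAX_BYTES);
    return (unsigned)(length - 1);
}
```

With this shape, the caller would always invoke the expander and emit the library call only when MOVE_LIBCALL comes back, which is the "allow them to FAIL" idea.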
> Ulrich, if you can, please could you evaluate my patch on your s390?
> If you haven't tried it before, the CSiBE benchmark is remarkably
> easy to use. Very many thanks in advance.
Unfortunately, I'm currently on vacation and don't have easy
access to an s390 from here.
However, on s390 the usual concern is optimization for speed,
not size, and so I'm sure there are a lot of things where -Os
doesn't work as well as it could on our platform. Thus,
I don't have particularly strong objections to your patch
right now -- we can always improve upon it later ...
Bye,
Ulrich
--
Dr. Ulrich Weigand
weigand@informatik.uni-erlangen.de