This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH, i386]: Insert "cld" via optimize mode switching
Michael Matz wrote:
Going a bit further, and based on the following assumptions:
It does, on function entry and exit (i386 and x86-64 ABI).
Then we in fact need the cld insn only after asm statements. MODE_ENTRY can
be set to CLD_FLAG_SET (actually CLD_FLAG_CLEARED...) and call insns
wouldn't set MODE_UNINITIALIZED anymore. If we specify MODE_EXIT as
CLD_FLAG_CLEARED, then optimize_mode_switching pass will automatically
emit cld after every asm statement, fulfilling ABI requirement about
Is this thinking pushing cld optimization too far?
No, I think that's exactly the right approach. The only problem I can
see is with functions written completely in asm, where the author did STD
and forgot about the ABI (i.e. didn't CLD before exit). They happen to work
right now, but will stop working with those changes. IMHO such routines
would be extremely rare, if they exist at all, so that shouldn't hinder
full progress by using mode switching for the direction flag.
1) function entry and exit modes are mandated by ABI, so we _know_ that
direction bit is cleared there
2) asm should take care by itself to issue CLD before exit
3) we never emit STD anywhere
it is possible to argue that gcc _does not need to emit any_ CLD.
Currently all string operations require a cleared direction flag. Under
assumptions 1) and 2), there is currently no other way to emit STD, so
we are sure that all string instructions _always_ operate with dirflag
cleared. A CLD in front of the instruction is not needed.
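
To illustrate assumption 2), here is a minimal sketch of an asm block that
sets the direction flag for a backward copy and restores it with cld before
control returns to compiler-generated code. The function name copy_backward
is hypothetical and not taken from the patch; the inline asm is only used
with GNU-compatible compilers on x86, with a plain C fallback elsewhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy n bytes starting from the last byte, so that
   an overlapping move towards higher addresses is safe. */
static void copy_backward(unsigned char *dst, const unsigned char *src,
                          size_t n)
{
    assert((dst != NULL && src != NULL) || n == 0);
    if (n == 0)
        return;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned char *d = dst + n - 1;        /* last destination byte */
    const unsigned char *s = src + n - 1;  /* last source byte */
    /* std makes rep movsb walk downwards; the trailing cld restores the
       state the ABI (and gcc) expects after the asm statement. */
    __asm__ volatile ("std\n\t"
                      "rep movsb\n\t"
                      "cld"
                      : "+D" (d), "+S" (s), "+c" (n)
                      :
                      : "memory", "cc");
#else
    /* Portable fallback with the same backward-copy semantics. */
    while (n--)
        dst[n] = src[n];
#endif
}
```

If the trailing cld were dropped, any string instruction that gcc emits
later in the same function without its own cld would run in the wrong
direction, which is the "disaster waiting to happen" case.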
This situation is similar to asm that uses mmx regs and "forgets" to
emms on exit. Disaster waiting to happen, but asms are out of gcc's
control. There is no point in emitting cld after _every_ asm "just in
case". With MODE_EXIT required to be CLD_CLEARED and by having asm
statements set the mode to CLD_UNINITIALIZED, mode switching
infrastructure would issue cld's between the last asm statement and
function exit. For every function that uses asm...
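
As a rough illustration of that scheme, the toy model below counts the cld
insns that would be inserted into a straight-line function body when the
entry mode is "cleared", every asm statement makes the mode unknown, and
string operations, calls and the function exit all need the flag cleared.
All names here (df_mode, insn_kind, count_cld_mode_switching) are made up
for this sketch and are not GCC identifiers; the real pass works on the
control-flow graph, which this linear walk does not attempt to model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical direction-flag modes for this sketch. */
enum df_mode { DF_CLEARED, DF_UNKNOWN };

/* Hypothetical insn classification for this sketch. */
enum insn_kind { INSN_OTHER, INSN_STRING_OP, INSN_ASM, INSN_CALL };

/* Count the cld insns needed for a straight-line body of n insns. */
static size_t count_cld_mode_switching(const enum insn_kind *insns, size_t n)
{
    enum df_mode mode = DF_CLEARED;  /* entry mode, guaranteed by the ABI */
    size_t clds = 0;

    assert(insns != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (insns[i]) {
        case INSN_STRING_OP:  /* needs the flag cleared */
        case INSN_CALL:       /* callee expects the flag cleared */
            if (mode != DF_CLEARED) {
                clds++;       /* a cld would be inserted here */
                mode = DF_CLEARED;
            }
            break;
        case INSN_ASM:        /* asm may have executed std */
            mode = DF_UNKNOWN;
            break;
        case INSN_OTHER:      /* does not touch or need the flag */
            break;
        }
    }
    if (mode != DF_CLEARED)   /* exit mode must be "cleared" */
        clds++;
    return clds;
}
```

In this model a function without asm statements gets no cld at all, while
every function ending in an asm statement pays for one cld before exit,
whether or not the asm actually touched the flag.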
ad 3) _If_ there is ever a need for string operations in reverse
direction, we could use mode switching infrastructure to emit CLD_SET
before such an instruction and CLD_CLEARED before function exit (or before
a function call or asm statement) to be compatible with the ABI. Perhaps Jan
can tell if there is any need for dirflag to be set for string instructions?
Finally, according to the Pentium optimization guide by Agner Fog, std and
cld have an astonishing latency of 48 and 52 clks (I still hope for the
possibility that there is some kind of error). Considering that string
operations immediately follow these insns, we are throwing away cycles...