This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Better Optimization


Hi all,

When inspecting the assembler output of some code that should be
_highly_ optimized, I found an opportunity for producing better
code. I don't want to become a gcc developer; I would just kindly ask
whether someone could look at the idea and possibly implement it.

I often found code sequences like the following (gcc 3.3.3 for
AMD64):

	call	XXX
	jmp	.L106

IMHO, this could/should be replaced by something like:

	pushq	$.L106
	jmp	XXX

Although this does not produce shorter code, the runtime behaviour
should be better, because the ret of the called subroutine will
jump directly to .L106 and thus eliminate the intermediate jump.
The branch target cache etc. should also be less polluted.
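
For illustration only, here is a minimal C sketch of a source shape
that could plausibly lead to such a call/jmp pair. All names
(dispatch, handle_even, handle_odd, finish, record) are hypothetical
and not taken from the code I inspected. Whether gcc really emits
"call; jmp" here depends on the version and optimization flags.

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */
static int trace[8];
static int trace_len = 0;

/* Record a tag so the order of calls can be observed. */
static void record(int tag) {
    assert(trace_len < 8);
    trace[trace_len++] = tag;
}

/* noinline (a GCC attribute) keeps these as real calls. */
__attribute__((noinline)) static void handle_even(int x) { record(x); }
__attribute__((noinline)) static void handle_odd(int x) { record(-x); }
__attribute__((noinline)) static void finish(void) { record(0); }

/* The 'then' arm ends in a call and must then skip the 'else' arm to
   reach the common tail; that is where a 'call XXX; jmp .L106' pair
   may appear in the assembler output. */
void dispatch(int x) {
    if (x % 2 == 0)
        handle_even(x);   /* may become: call handle_even ; jmp <join> */
    else
        handle_odd(x);
    finish();             /* <join>: common tail */
}
```

Compiling something like this with -S and looking at the end of the
first arm is one way to check whether a given gcc version produces
the sequence.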

Is it possible to implement that at a higher level (SSA) for
all targets? Or is it _necessarily_ a low-level peephole
optimization?

Thanks,

Thomas

