This is the mail archive of the
mailing list for the GCC project.
Re: Honnor ix86_accumulate_outgoing_args again
- From: Jan Hubicka <hubicka at ucw dot cz>
- To: Vladimir Makarov <vmakarov at redhat dot com>
- Cc: Jan Hubicka <hubicka at ucw dot cz>, gcc-patches at gcc dot gnu dot org, hjl dot tools at gmail dot com, ubizjak at gmail dot com, rth at redhat dot com, Ganesh dot Gopalasubramanian at amd dot com
- Date: Thu, 3 Oct 2013 15:05:24 +0200
- Subject: Re: Honnor ix86_accumulate_outgoing_args again
- Authentication-results: sourceware.org; auth=none
- References: <20131002173249 dot GB12304 at kam dot mff dot cuni dot cz> <20131002224516 dot GA26046 at kam dot mff dot cuni dot cz> <524CC41F dot 5090801 at redhat dot com>
> and because LRA still misses some reload functionality for
> elimination. I am a bit embarrassed: I have had this on my to-do list for 4
> months and I still have not started working on it. There are too
> many things on my plate.
> As we are going to use outgoing arg accumulation, this problem is
> becoming a higher-priority one.
We currently always use outgoing arg accumulation on x86_64. I plan to
re-disable arg accumulation on CPUs that handle push/pop well (i.e. have a stack
engine). This brings nice code size savings.
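A minimal sketch of a test case for comparing the two strategies (hypothetical,
not part of the patch; the names sum8 and caller are made up for illustration).
Compiling it to assembly once with -maccumulate-outgoing-args and once with
-mno-accumulate-outgoing-args should show the difference: with accumulation one
would expect the stack arguments to be stored with moves into a pre-allocated
area, and without it to be pushed.

```c
/* Hypothetical test case: a call that needs stack-passed arguments.
   On x86_64 the first six integer arguments go in registers, so the
   seventh and eighth are the ones passed on the stack.  With -m32 all
   eight are passed on the stack. */
__attribute__((noinline))
int sum8(int a, int b, int c, int d, int e, int f, int g, int h)
{
    return a + b + c + d + e + f + g + h;
}

/* The call site whose argument setup is worth comparing in the
   generated assembly (e.g. gcc -O2 -S with either flag). */
int caller(int x)
{
    return sum8(x, x + 1, x + 2, x + 3, x + 4, x + 5, x + 6, x + 7);
}
```

Only the argument setup around the call to sum8 is of interest here; the
arithmetic just keeps the compiler from discarding the arguments.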
I wonder how much of this actually comes from not omitting the frame pointer in
non-leaf functions with IRA. EBP-based addressing is more compact than ESP-based
addressing, and thus -fomit-frame-pointer is disabled with -Os.
Perhaps frame pointer elimination could actually be decided on by the register
allocator?
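To illustrate the size difference: in 32-bit code an ESP-relative memory operand
needs an extra SIB byte that an EBP-relative one does not. The sketch below
(illustrative only; the array and function names are made up) records the
encodings of the same load through each register.

```c
/* IA-32 encodings of the same 32-bit load, once relative to EBP and
   once relative to ESP.  ESP as a base register always requires a SIB
   byte, which makes the instruction one byte longer. */
static const unsigned char load_via_ebp[] = {
    0x8B, 0x45, 0x08            /* mov eax, [ebp+8] : opcode, ModRM, disp8 */
};
static const unsigned char load_via_esp[] = {
    0x8B, 0x44, 0x24, 0x08      /* mov eax, [esp+8] : opcode, ModRM, SIB, disp8 */
};

/* Number of extra bytes the ESP-relative form costs per access. */
int esp_extra_bytes(void)
{
    return (int)sizeof load_via_esp - (int)sizeof load_via_ebp;
}
```

One byte per stack access adds up in functions with many locals or spills,
which is presumably why the frame pointer pays off for size.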
On a similar note, I just benchmarked -mfpmath=sse for 32-bit code: it is quite a
big performance win and again causes about a 5% code size regression. I want to
propose defaulting to -mfpmath=sse for 32-bit code with -ffast-math and -Ofast.
(In a way I would like to see -mfpmath=sse by default for 32-bit on CPUs
supporting SSE2, but that was voted down a long time ago because it loses the
80-bit precision for temporaries in double/float computations.)
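The precision difference can be observed from C. The following is a small
sketch (function names are made up; it assumes the x87 unit is in its usual
extended-precision mode): built with -m32 -mfpmath=387 one would expect
FLT_EVAL_METHOD to be 2 and the probe to report excess precision, while with
-mfpmath=sse one would expect 0 and no excess precision.

```c
#include <float.h>

/* Reports how the compiler evaluates floating-point expressions:
   0 = in the type's own precision, 2 = in long double precision. */
int eval_method(void)
{
    return FLT_EVAL_METHOD;
}

/* Returns 1 if a double temporary keeps bits beyond double precision.
   1 + DBL_EPSILON/2 is not representable as a double and rounds back
   to 1, so the difference is 0 unless the intermediate sum was kept
   in a wider format.  The volatile qualifiers stop the compiler from
   folding the expression at compile time. */
int has_excess_precision(void)
{
    volatile double one = 1.0;
    volatile double tiny = DBL_EPSILON / 2;
    double r = (one + tiny) - one;
    return r != 0.0;
}
```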
I wonder if we can eventually make -mfpmath=sse,387 work well (I have not
benchmarked it yet, but statically it still produces more spilling than
-mfpmath=sse) and/or if we could decide on fpmath based on the hotness of a
function (at least when a profile is available).
Thanks for all the hard work on IRA!