This is the mail archive of the gcc-patches at gcc dot gnu dot org
mailing list for the GCC project.
Re: Enable SSE math on i386 with -Ofast
- From: Jan Hubicka <hubicka at ucw dot cz>
- To: Richard Biener <rguenther at suse dot de>
- Cc: Jan Hubicka <hubicka at ucw dot cz>, gcc-patches at gcc dot gnu dot org
- Date: Mon, 7 Oct 2013 11:32:38 +0200
- Subject: Re: Enable SSE math on i386 with -Ofast
- References: <20131004105656 dot GA25297 at kam dot mff dot cuni dot cz> <alpine dot LNX dot 2 dot 00 dot 1310071049170 dot 5759 at zhemvz dot fhfr dot qr> <20131007092243 dot GA358 at kam dot mff dot cuni dot cz> <alpine dot LNX dot 2 dot 00 dot 1310071124520 dot 5759 at zhemvz dot fhfr dot qr>
> > In the meantime I (partially,
> > since megrez stopped producing 32-bit spec2k6 results) benchmarked
> > -mfpmath=sse,387 and it does not seem to be a loss anymore. So perhaps we can
> > give it a try?
> Not sure ... I would guess that it's not a win on any recent architecture
> (and LRA is probably not well-prepared here either).
I think it has a chance to win when the input/output registers are forced to be
in 387 (because of the return value ABI), and perhaps under register pressure in
cases where two independent computations are going on and LRA can home one in
SSE and the other in 387 registers. I don't really know.
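
To make the return-value case concrete, here is a minimal sketch, assuming the
System V i386 calling convention, where a double result comes back in x87
st(0). The function names are hypothetical and not taken from any benchmark.
With -mfpmath=sse the accumulation runs in SSE registers, so each call result
has to go from the x87 stack to an SSE register through memory.

```c
/* Hypothetical example: on 32-bit x86 (System V i386 ABI) the double
   returned by scale() arrives in x87 st(0), whatever -mfpmath says. */
double scale(double x, double k)
{
    return x * k;
}

/* Sums the scaled elements of v. When built with -m32 -mfpmath=sse and
   scale() is not inlined, each call result is stored from st(0) to the
   stack and reloaded into an SSE register before the addition. */
double sum_scaled(const double *v, int n, double k)
{
    double acc = 0.0;
    for (int i = 0; i < n; i++)
        acc += scale(v[i], k);
    return acc;
}
```

Comparing the assembly from -m32 -O2 -mfpmath=387 and -m32 -O2 -msse2
-mfpmath=sse (with inlining disabled) should show the extra store/load pair
in the second case.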
The main advantage of 387 is that it is significantly more compact than SSE. The
last hardware really favouring 387 was probably the original Pentium 4 (where
additions were better pipelined on the 387 path, if I recall correctly). I wonder
how AVX changed this. I was thus thinking about adding a mode where we choose 387
or SSE based on whether the function is optimized for size.
The size difference is quite high: around 5% on specfp.
> > > change the ABI ... (do we change the local functions ABI with
> > > -mfpmath=sse?)
> > We don't. It is probably quite easy to default to sseregparm and change the return value type.
> > I will look into it.
> Thanks. That's independent of enabling -mfpmath=sse at -Ofast of course.
Yep, my plan is to enable -mfpmath=sse with -Ofast today and look into those two items incrementally.
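
For reference, the per-function opt-in that the ABI item above would turn into
a default for local functions is presumably the sseregparm attribute, which GCC
documents for x86-32 targets with SSE support as passing up to three
floating-point arguments in SSE registers instead of on the stack. A minimal
sketch follows; the macro and function names are hypothetical, and the guard is
there only so the example also builds on other targets.

```c
/* Hypothetical helper macro: apply sseregparm only where it is
   meaningful (32-bit x86 with SSE2 enabled), otherwise expand to
   nothing so the code still compiles elsewhere. */
#if defined(__GNUC__) && defined(__i386__) && defined(__SSE2__)
#define SSE_ARGS __attribute__((sseregparm))
#else
#define SSE_ARGS
#endif

/* With the attribute active, a, x and y are passed in SSE registers
   rather than on the stack. */
SSE_ARGS double axpy(double a, double x, double y)
{
    return a * x + y;
}
```

Building with -m32 -msse2 -mfpmath=sse activates the attribute; on other
targets the function uses the normal calling convention.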