This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH, rs6000] Use new __builtin_pack_longdouble within libgcc's ibm-ldouble.c
- From: Peter Bergner <bergner at vnet dot ibm dot com>
- To: Mike Stump <mikestump at comcast dot net>
- Cc: David Edelsohn <dje dot gcc at gmail dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Alan Modra <amodra at gmail dot com>
- Date: Thu, 31 Jul 2014 11:30:02 -0500
- Subject: Re: [PATCH, rs6000] Use new __builtin_pack_longdouble within libgcc's ibm-ldouble.c
- Authentication-results: sourceware.org; auth=none
- References: <1406645783 dot 6052 dot 52 dot camel at otta> <B2951772-0AFD-46EA-8FD1-0A682992185B at comcast dot net>
On Tue, 2014-07-29 at 10:11 -0700, Mike Stump wrote:
> On Jul 29, 2014, at 7:56 AM, Peter Bergner <firstname.lastname@example.org> wrote:
> > Currently, the IBM long double routines in libgcc use a union to construct
> > a long double from two double values. This causes horrific code generation
> > that copies the two doubles from the FP registers over to GPRs and back
> > again, giving us two loads and two stores, which leads to two load-hit-store
> > hazards.
> Gosh, it's too bad we don't have any sort of technology to optimize moving data around.
Well, the problem is we're trying to move it around, when we'd really like
the data to stay in the FP registers the entire time. The problem is that
unions and structs that are the same size as a TImode/TFmode/TDmode are
always converted to TImode and that is what ends up causing the whole
fp -> int -> fp shuffle which leads to crappy code. On power8 where we
have int <-> fp reg copy instructions, it's better than the copy through
the stack frame, but even that is unnecessary.
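
To make the two approaches concrete, here is a rough sketch rather than the actual libgcc code. The names ld_parts, pack_via_union, pack_via_builtin and packs_agree are made up for illustration. The builtin path is only compiled when long double is the IBM double-double format (LDBL_MANT_DIG == 106) and the compiler reports that __builtin_pack_longdouble is available; on any other target the comparison is skipped.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical union with the same shape as the one described above:
   one long double overlaid on its two double halves. */
typedef union {
    long double ldval;
    double dval[2];
} ld_parts;

/* Union-based construction. Because the union is the same size as a
   TFmode value, the halves can end up going FP -> int -> FP. */
long double pack_via_union(double hi, double lo)
{
    ld_parts u;
    u.dval[0] = hi;   /* high double comes first in memory */
    u.dval[1] = lo;
    return u.ldval;
}

/* Builtin-based construction, only where it is known to exist. */
#if LDBL_MANT_DIG == 106 && defined(__has_builtin)
# if __has_builtin(__builtin_pack_longdouble)
#  define HAVE_PACK_BUILTIN 1
long double pack_via_builtin(double hi, double lo)
{
    /* Intended to keep both halves in FP registers. */
    return __builtin_pack_longdouble(hi, lo);
}
# endif
#endif

/* Returns 1 when both constructions agree, or when there is nothing
   to compare on this target. */
int packs_agree(double hi, double lo)
{
#ifdef HAVE_PACK_BUILTIN
    return pack_via_union(hi, lo) == pack_via_builtin(hi, lo);
#else
    (void)hi;
    (void)lo;
    return 1;
#endif
}
```

Both functions should produce the same value; the difference being discussed is only in the code generated to get there.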