This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: followup on ia64 HARD_REGNO_NREGS change


On Fri, 2004-11-12 at 16:13, Richard Henderson wrote:
> While looking at that test case, I noticed that we had similar
> problems with TCmode.  Also, it looks like our formulation of
> MODES_TIEABLE_P is completely confused.

You are right about TCmode of course.  I didn't think of that.  TCmode
has other problems on a Linux host though.  We don't have many testcases
for this non-standard type, and the ones that we do have fail on my
machine because of missing functions, mostly related to unordered
comparisons.  So I haven't been worrying about TFmode/TCmode problems.

As for MODES_TIEABLE_P, it depends on the semantics.  I just made the
conservative choice: since XFmode and XCmode have the same register
allocation constraints, they are safe to tie to each other.

Looking at the docs, the first paragraph says this is a one way
relationship, which is what you implemented.  The second paragraph says
this is a bi-directional relationship, which is what I implemented.  So
the docs aren't clear here.

However, I suspect that it is really supposed to be a bi-directional
relationship.  If you tie XFmode to say DFmode, and the DFmode
subsequently gets allocated to an integer register, then we need an
expensive reload to put the value in an FP reg for XFmode operations. 
Since we have reload support for this, the code will certainly work, but
it may be less efficient than if we did not tie the registers together
in the first place.  Also, since long double operations tend to be rare,
it may be hard to notice a performance problem here unless you look for
it.

I am not sure if this is the right interpretation, but it does seem to
match what other ports have done.  The x86 MODES_TIEABLE_P for instance
seems to agree that this needs to be a bi-directional relationship.
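
To make the bi-directional reading concrete, here is a minimal sketch of
a symmetric tie predicate.  It is only an illustration, not the actual
ia64.h macro: the enum, fp_reg_only, modes_tieable and check_symmetric
are hypothetical names, and the set of FP-register-only modes is an
assumption for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a few machine modes; not GCC's real enum.  */
enum mode { M_SI, M_DI, M_DF, M_XF, M_XC, M_COUNT };

/* Assumption for this sketch: only these modes are restricted to
   FP registers.  */
static bool
fp_reg_only (enum mode m)
{
  return m == M_XF || m == M_XC;
}

/* Bi-directional formulation: two modes are tieable only when both are
   FP-register-only or neither is, so swapping the arguments cannot
   change the answer.  */
static bool
modes_tieable (enum mode a, enum mode b)
{
  return fp_reg_only (a) == fp_reg_only (b);
}

/* Check that the predicate gives the same answer in both directions
   for every pair of modes in the sketch.  */
static void
check_symmetric (void)
{
  for (int a = 0; a < M_COUNT; a++)
    for (int b = 0; b < M_COUNT; b++)
      assert (modes_tieable ((enum mode) a, (enum mode) b)
              == modes_tieable ((enum mode) b, (enum mode) a));
}
```

With this formulation XFmode ties to XCmode but not to DFmode, which
avoids the case where a tied DFmode value lands in an integer register.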

By the way, the remaining IA-64 long double failures in compat are due
to a Fariborz Jahanian/David Edelsohn patch that you approved.
   http://gcc.gnu.org/ml/gcc-patches/2003-11/msg00167.html
The problematic part is 2).  If we pass a large long double structure as
an argument, then we have partial == 8 because the integer registers are
full, and reg is a PARALLEL of 4 XFmode values.  The new code then
computes 8 * 128-bits = 128-bytes, but we have stored only 64-bytes in
the register file (8 * 64-bits).  This can result in a memcpy call with
a negative size for some structure sizes.  For others, it just results
in a confused stack frame.  I haven't looked at this problem in detail
yet, but I have a few ideas.  Maybe using
   MAX (GET_MODE_SIZE..., UNITS_PER_WORD)
is OK here.  I think the PowerPC case only involves SImode values when
UNITS_PER_WORD is 64, in which case this should work for both ppc and
IA-64.  Or maybe instead of using partial, we should just add up the
size of the elements in the PARALLEL.  However, I am not sure that is
safe, since a PARALLEL is supposed to be able to represent arbitrary
objects, including objects with holes, in which case we cannot assume
that we can compute the size of the object by looking at the elements of
the parallel.  I attached a simplified testcase in case you care.  I
expect I will be looking at this some more next week.
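
The size mismatch can be worked through with plain arithmetic.  This is
only a sketch of the numbers quoted above, not the code in the patch:
all names are hypothetical, the structure size assumes a 16-byte long
double, and the "structure size minus register part" formula for the
stack portion is my assumption about how the remainder gets computed.

```c
#include <assert.h>

/* Figures taken from the message; the names are hypothetical.  */
enum
{
  PARTIAL_REGS = 8,       /* partial == 8 */
  XF_MODE_BYTES = 16,     /* one 128-bit XFmode element */
  WORD_BYTES = 8,         /* one 64-bit integer register */
  STRUCT_BYTES = 5 * 16   /* struct foo, assuming 16-byte long double */
};

/* Register-part size as the new code computes it.  */
static long
computed_reg_bytes (void)
{
  return (long) PARTIAL_REGS * XF_MODE_BYTES;
}

/* Bytes actually stored in the register file.  */
static long
actual_reg_bytes (void)
{
  return (long) PARTIAL_REGS * WORD_BYTES;
}

/* Assumed formula: what is left for the stack is the structure size
   minus the part that went into registers.  */
static long
stack_bytes (long reg_bytes)
{
  return STRUCT_BYTES - reg_bytes;
}

/* Confirm the two figures quoted in the message.  */
static void
check_message_figures (void)
{
  assert (computed_reg_bytes () == 128);
  assert (actual_reg_bytes () == 64);
}
```

Under these assumptions the computed register part leaves a negative
remainder for the 80-byte structure, while the actual register part
leaves 16 bytes, which is the a[4] element the testcase checks.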
-- 
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com

#include <stdlib.h>	/* For abort.  */

struct foo { long double a[5]; } foo;

void
sub (struct foo bar)
{
  if (bar.a[4] != 1.0)
    abort ();
}

int
main (void)
{
  foo.a[4] = 1.0;
  sub (foo);
  return 0;
}
