This is the mail archive of the
mailing list for the GCC project.
Re: PATCH: Add pa8000 scheduling
- To: law at cygnus dot com
- Subject: Re: PATCH: Add pa8000 scheduling
- From: "Jerry Quinn" <jquinn at nortelnetworks dot com>
- Date: Thu, 18 Mar 1999 10:46:26 -0500
- CC: egcs-patches at egcs dot cygnus dot com
- Organization: Nortel Technology
After submitting it, I actually went back and did something similar to
the scheduling change you put in. My version made a few percent
improvement on our software, and was a few percent better or worse on the
benchmark suite, depending on the benchmark.
I thought perhaps for fmpyadd and fmpysub, if we make them grab both alu
units, that would effectively keep them from being used - sort of
equivalent to hogging two slots in the reorder buffer. Does that make
sense?
Should all the combinations in pa_combine_instructions be prevented or
just the fmpyadd/sub?
I also have a couple more questions:
I've sent in an (incomplete) patch for 2.0 assembler support for the
extra floating point instructions. So at least in theory, an
architecture switch makes sense now. I created an -march= switch based
on discussion on the egcs list a month and a half ago centered around
the ppc port and the arch and schedule switches. The discussion sounded
like people were in favor of a more standard approach to switch naming.
I can do a -mpa-risc-2-0 switch if you prefer.
I figured that TARGET_SNAKE would be set even for 2.0. Does this sound
reasonable?
Also, should we be worried about 32 bit vs. 64 bit?
Finally, I'm trying to add entries to pa.md to generate fmpyfadd
instructions. What I have so far is something like:
  (define_insn ""
    [(set (match_operand:DF 0 "register_operand" "=f")
          (plus:DF (match_operand:DF 1 "register_operand" "f")
                   (mult:DF (match_operand:DF 2 "register_operand" "f")
                            (match_operand:DF 3 "register_operand" "f"))))]
    "TARGET_PA20 && ! TARGET_SOFT_FLOAT"
    "fmpyfadd,dbl %2,%3,%1,%0"   ; assumed output template: multiplicands, addend, target
    [(set_attr "type" "fpalu")
     (set_attr "length" "4")])
with a second pattern reversing the order of the addend operand and the
mult operator (and two more for single precision). My main problem
(maybe there are others :-) is that I don't understand how the pattern
matching is done. Does this make sense? Or should there be two
separate set operators with the mult target being a match_dup of one of
the plus operator's operands? I've been wandering through the docs
reading on RTL and several .md files trying to find a similar
multiply-accumulate pattern, but admit I'm still confused as to how
patterns are applied.
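For reference, here is a rough sketch of the second variant, with the mult
as the first operand of the plus. It assumes the canonicalization rule
described in the GCC internals documentation, under which a mult operand of
a commutative operator is placed first, so this may be the form the
combiner actually produces. The output template and its operand order
(two multiplicands, then the addend, then the target) are assumptions, and
the pattern is untested:

  (define_insn ""
    [(set (match_operand:DF 0 "register_operand" "=f")
          (plus:DF (mult:DF (match_operand:DF 1 "register_operand" "f")
                            (match_operand:DF 2 "register_operand" "f"))
                   (match_operand:DF 3 "register_operand" "f")))]
    "TARGET_PA20 && ! TARGET_SOFT_FLOAT"
    "fmpyfadd,dbl %1,%2,%3,%0"   ; assumed operand order, not verified
    [(set_attr "type" "fpalu")
     (set_attr "length" "4")])

Written this way it is a single set whose source nests the mult inside the
plus, rather than two separate sets linked by a match_dup.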
Jerry Quinn Tel: (514) 761-8737
firstname.lastname@example.org Fax: (514) 761-8505
Speech Recognition Research