This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Reorg a reorg.c comment
- From: Steven Bosscher <stevenb dot gcc at gmail dot com>
- To: John David Anglin <dave dot anglin at bell dot net>
- Cc: GCC Mailing List <gcc at gcc dot gnu dot org>, John David Anglin <danglin at gcc dot gnu dot org>
- Date: Sun, 2 Dec 2012 00:45:51 +0100
- Subject: Re: Reorg a reorg.c comment
- References: <CABu31nMmY+qii1eaa8t6pUKXPo8Tj405G+547RCT3R8qAwr58g@mail.gmail.com> <BLU0-SMTP6FE421AE7AC9B406560B197580@phx.gbl>
On Sun, Nov 25, 2012 at 4:44 PM, John David Anglin wrote:
> On 24-Nov-12, at 9:19 PM, Steven Bosscher wrote:
>
>> +;; This machine description is inspired by sparc.md and (to a lesser
>> +;; extend) mips.md.
>
>
> Change "extend" to "extent". Don't need parentheses.
>
>
>> +
>> +;; Possible improvements, if anyone is still interested in working on
>> +;; improving this machine description in 2012:
>
>
> Please remove the "if anyone ..." part.
>
>
>> +;;
>> +;; * With PA2.0, most computational instructions can conditionally nullify
>> +;; the execution of the following instruction. Nullification is performed
>
>
> Add statement to the effect that nullification is very efficient, for
> it does not cause the instruction pipeline to stall.
>
>
>> +;; conditionally based on the outcome of a test specified in the opcode.
>> +;; The test result is stored in PSW[N] and can only be used to nullify the
>> +;; instruction following immediately after the test. For example:
>> +;;
>> +;;   ldi 10,%r26                ldi 10,%r26
>> +;;   ldi 5,%r25                 ldi 5,%r25
>> +;;   sub,< %r26,%r25,%r28       sub,> %r26,%r25,%r28
>> +;;   sub %r28,%r25,%r28         sub %r28,%r25,%r28
>> +;;   ; %r28 == 0                ; %r28 == 5
>
>
> Find the parallel layout somewhat confusing. Maybe we just need one.
>
>
>> +;;
>> +;; This could be tricky to implement because the result of the test has
>> +;; to be propagated one instruction forward, which, in the worst case,
>> +;; would involve (1) adding a fake register for PSW[N]; (2) adding the
>> +;; variants of the computational instructions that set or consume this
>> +;; fake register. The cond_exec infrastructure is probably not helpful
>> +;; for this.
>> +;;
>
>
> Another improvement might be to implement the static branch prediction
> hints for conditional branches (Section I-3 in PA-RISC 2.0 Architecture).
OK, I didn't know GCC doesn't already use the branch hints. That
shouldn't be very hard to implement; AFAIU, other targets also provide
such hints.
How about this?
Ciao!
Steven
Index: config/pa/pa.md
===================================================================
--- config/pa/pa.md (revision 194037)
+++ config/pa/pa.md (working copy)
@@ -1,6 +1,5 @@
;;- Machine description for HP PA-RISC architecture for GCC compiler
-;; Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001,
-;; 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2010
+;; Copyright (C) 1992-2012
;; Free Software Foundation, Inc.
;; Contributed by the Center for Software Science at the University
;; of Utah.
@@ -21,8 +20,52 @@
;; along with GCC; see the file COPYING3. If not see
;; <http://www.gnu.org/licenses/>.
-;; This gcc Version 2 machine description is inspired by sparc.md and
-;; mips.md.
+;; This machine description is inspired by sparc.md and to a lesser
+;; extent mips.md.
+
+;; Possible improvements:
+;;
+;; * With PA1.1, most computational instructions can conditionally nullify
+;; the execution of the following instruction. A nullified instruction
+;; does not cause the instruction pipeline to stall, making it a very
+;; efficient alternative to e.g. branching or conditional moves.
+;;
+;; Nullification is performed conditionally based on the outcome of a
+;; test specified in the opcode. The test result is stored in PSW[N]
+;; and can only be used to nullify the instruction following immediately
+;; after the test. For example:
+;;
+;; ldi 10,%r26
+;; ldi 5,%r25
+;; sub,< %r26,%r25,%r28
+;; sub %r28,%r25,%r28 ; %r28 == 0
+;; sub,> %r26,%r25,%r29
+;; sub %r29,%r25,%r29 ; nullified, so %r29 == 5
+;;
+;; This could be tricky to implement because the result of the test has
+;; to be propagated one instruction forward, which, in the worst case,
+;; would involve (1) adding a fake register for PSW[N]; (2) adding the
+;; variants of the computational instructions that set or consume this
+;; fake register. The cond_exec infrastructure is probably not helpful
+;; for this.
+;;
+;; * PA-RISC includes a set of conventions for branch instruction usage
+;; to indicate whether a particular branch is more likely to be taken
+;; or not taken. For example, the prediction for CMPB instructions
+;; (CMPB,cond,n r1,r2,target) depends on the direction of the branch
+;; (forward or backward) and on the order of the operands:
+;;
+;; | branch | operand | branch |
+;; | direction | compare | prediction |
+;; +-----------+----------+------------+
+;; | backward | r1 < r2 | taken |
+;; | backward | r1 >= r2 | not taken |
+;; | forward | r1 < r2 | not taken |
+;; | forward | r1 >= r2 | taken |
+;;
+;; By choosing instructions and operand order carefully, the compiler
+;; could give the CPU branch predictor some help.
+;;
;;- See file "rtl.def" for documentation on define_insn, match_*, et. al.
Index: reorg.c
===================================================================
--- reorg.c (revision 194037)
+++ reorg.c (working copy)
@@ -100,16 +100,7 @@ along with GCC; see the file COPYING3.
delay slot. In that case, we point each insn at the other with REG_CC_USER
and REG_CC_SETTER notes. Note that these restrictions affect very few
machines because most RISC machines with delay slots will not use CC0
- (the RT is the only known exception at this point).
-
- Not yet implemented:
-
- The Acorn Risc Machine can conditionally execute most insns, so
- it is profitable to move single insns into a position to execute
- based on the condition code of the previous insn.
-
- The HP-PA can conditionally nullify insns, providing a similar
- effect to the ARM, differing mostly in which insn is "in charge". */
+ (the RT is the only known exception at this point). */
#include "config.h"
#include "system.h"
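
For readers without a PA-RISC manual at hand, the nullification example in the pa.md comment can be traced with a small C model. This is only a sketch under stated assumptions: `struct cpu`, `ldi`, `sub` and `run_example` are hypothetical names made up for illustration, not GCC or simulator code, and the model covers only the `<` and `>` conditions on `sub`. It assumes that `sub,cond r1,r2,t` computes `t = r1 - r2` and sets PSW[N] when `cond` holds for `r1` and `r2`, and that a set PSW[N] suppresses exactly the next instruction.

```c
#include <assert.h>

/* Toy model of PA-RISC nullification (illustrative only). */
struct cpu {
    long r[32];  /* general registers */
    int psw_n;   /* PSW[N]: when set, the next instruction is nullified */
};

enum cond { COND_NEVER, COND_LT, COND_GT };

/* ldi imm,t: load an immediate, unless this instruction is nullified. */
static void ldi(struct cpu *c, long imm, int t)
{
    if (c->psw_n) {       /* nullified: consume PSW[N] and do nothing */
        c->psw_n = 0;
        return;
    }
    c->r[t] = imm;
}

/* sub,cond r1,r2,t: t = r1 - r2; set PSW[N] if cond holds for r1, r2. */
static void sub(struct cpu *c, enum cond cond, int r1, int r2, int t)
{
    if (c->psw_n) {       /* nullified: consume PSW[N] and do nothing */
        c->psw_n = 0;
        return;
    }
    long a = c->r[r1];
    long b = c->r[r2];
    c->r[t] = a - b;
    c->psw_n = (cond == COND_LT && a < b) || (cond == COND_GT && a > b);
}

/* The instruction sequence from the pa.md comment. */
static void run_example(struct cpu *c)
{
    ldi(c, 10, 26);
    ldi(c, 5, 25);
    sub(c, COND_LT, 26, 25, 28);     /* 10 < 5 is false: next insn runs */
    sub(c, COND_NEVER, 28, 25, 28);  /* executes: %r28 = 5 - 5 */
    sub(c, COND_GT, 26, 25, 29);     /* 10 > 5 is true: next insn nullified */
    sub(c, COND_NEVER, 29, 25, 29);  /* skipped: %r29 keeps 10 - 5 */
}
```

Under this model, calling `run_example` on a zero-initialized `struct cpu` executes the instruction after `sub,<` and skips the one after `sub,>`. Keeping PSW[N] as explicit state between two instructions is the same dependency the "fake register" idea in the comment would have to expose to the compiler.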