This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: short integer multiplication problem on IA64
On Tue, Feb 26, 2002 at 01:30:13PM -0800, Reva Cuthbertson wrote:
> At the first if statement, t is set to ITANIUM_CLASS_MMMUL so the
> if statement will fail and we will never call nop_cycles_until().
No, you're looking at this wrong: T is the type of
the consumer, that is, a following add instruction.
As for the example you're looking at,
pmpy2.r r8 = r8, r33
sxt2 r8 = r8
note that SXT2 is type XTD, and note that section 4.4 of
ftp://download.intel.com/design/Itanium/Downloads/24547402.pdf
lists only IALU, ILOG, ISHF, ST and LD as consumers for
which the pipeline flush applies.
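To make the rule concrete, here is a minimal sketch of the check as a predicate on the consumer's class. The enum and function names are hypothetical, chosen for illustration only, and are not the identifiers used in ia64.c. The set of classes is exactly the list quoted above from section 4.4.

```c
#include <stdbool.h>

/* Hypothetical instruction classes for illustration; these are not
   the names used in ia64.c.  */
enum insn_class
{
  CLASS_IALU,
  CLASS_ILOG,
  CLASS_ISHF,
  CLASS_ST,
  CLASS_LD,
  CLASS_XTD,
  CLASS_OTHER
};

/* Return true if a consumer of this class is one of those listed as
   subject to the pipeline flush (IALU, ILOG, ISHF, ST, LD), so that
   padding is wanted between the multiply and the consumer.  */
bool
consumer_needs_padding (enum insn_class consumer)
{
  switch (consumer)
    {
    case CLASS_IALU:
    case CLASS_ILOG:
    case CLASS_ISHF:
    case CLASS_ST:
    case CLASS_LD:
      return true;
    default:
      return false;
    }
}
```

With this predicate, an XTD consumer such as sxt2 gets no padding, while a consumer in the IALU class does.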
If we change your example to
return a * b + 1;
we do indeed see
// cycle 1
nop.m 0
pmpy2.r r8 = r8, r33
nop.i 0
;;
.mii
nop.m 0
nop.i 0
;;
nop.i 0
;;
.mii
nop.m 0
nop.i 0
nop.i 0
;;
;;
.mmi
// cycle 5
adds r8 = 1, r8
(There's actually a bug here with the double stop bit at the end,
but that's minor, and I've got a fix for it. It would also be
possible to fit the 3 stop bits into two bundles instead of three bundles:
.mii
nop.m 0
pmpy2.r r8 = r8, r33
;;
nop.i 0
;;
.mii
nop.m 0
nop.i 0
;;
nop.i 0
;;
.mmi
adds r8 = 1, r8
I've not tried to address this.)
So this code _is_ working as intended.
If you've experimentally determined that XTD insns are in fact
subject to the pipeline flush, but omitted from Intel's docs,
then that can be addressed trivially.
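For anyone wanting to reproduce the two cases, a minimal sketch of the test functions might look like the following. The function names and the use of short for both arguments are assumptions based on the subject line; the original test case is not shown in this message.

```c
/* Assumed shape of the original example: the product is only
   sign-extended (sxt2), so the consumer of pmpy2 is an XTD insn.  */
short
mul_only (short a, short b)
{
  return a * b;
}

/* The modified example: the product feeds an add, so the consumer
   of pmpy2 is the adds insn.  */
short
mul_add (short a, short b)
{
  return a * b + 1;
}
```

Comparing the assembly generated for the two functions should show the NOP padding only in the second.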
Fix for double stop bit follows.
r~
* config/ia64/ia64.c (nop_cycles_until): Do init_insn_group_barriers
if we emitted a stop bit.
Index: ia64.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/ia64/ia64.c,v
retrieving revision 1.139
diff -c -p -d -r1.139 ia64.c
*** ia64.c 2002/01/21 02:24:02 1.139
--- ia64.c 2002/02/26 22:30:47
*************** nop_cycles_until (clock_var, dump)
*** 6090,6101 ****
--- 6090,6103 ----
{
int prev_clock = prev_cycle;
int cycles_left = clock_var - prev_clock;
+ bool did_stop = false;
/* Finish the previous cycle; pad it out with NOPs. */
if (sched_data.cur == 3)
{
rtx t = gen_insn_group_barrier (GEN_INT (3));
last_issued = emit_insn_after (t, last_issued);
+ did_stop = true;
maybe_rotate (dump);
}
else if (sched_data.cur > 0)
*************** nop_cycles_until (clock_var, dump)
*** 6148,6153 ****
--- 6150,6156 ----
{
rtx t = gen_insn_group_barrier (GEN_INT (3));
last_issued = emit_insn_after (t, last_issued);
+ did_stop = true;
}
maybe_rotate (dump);
}
*************** nop_cycles_until (clock_var, dump)
*** 6171,6178 ****
--- 6174,6185 ----
last_issued = emit_insn_after (t, last_issued);
t = gen_insn_group_barrier (GEN_INT (3));
last_issued = emit_insn_after (t, last_issued);
+ did_stop = true;
cycles_left--;
}
+
+ if (did_stop)
+ init_insn_group_barriers ();
}
/* We are about to being issuing insns for this clock cycle.
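
The shape of the fix can be shown in isolation: record whether any stop bit was emitted while padding, and reset the barrier bookkeeping once at the end if so. The following is a toy sketch only. The counters, helper names, and loop condition are hypothetical stand-ins, not GCC's data structures or the real logic of nop_cycles_until.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the scheduler's state.  */
static int stops_emitted;
static int barrier_resets;

/* Stand-in for emitting an insn group barrier (a stop bit).  */
static void
emit_stop (void)
{
  stops_emitted++;
}

/* Stand-in for init_insn_group_barriers.  */
static void
reset_barrier_state (void)
{
  barrier_resets++;
}

/* Pad out empty cycles.  GROUP_OPEN says whether the current insn
   group still has to be closed with a stop bit first.  */
void
pad_cycles (int cycles_left, bool group_open)
{
  bool did_stop = false;

  if (group_open)
    {
      emit_stop ();
      did_stop = true;
    }

  while (cycles_left > 1)
    {
      emit_stop ();
      did_stop = true;
      cycles_left--;
    }

  /* Reset the bookkeeping once, and only if a stop was emitted.  */
  if (did_stop)
    reset_barrier_state ();
}
```

Doing the reset once after all the padding, rather than after each stop, keeps the change small and leaves the state untouched when no stop bit was emitted.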