This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: SMS in gcc4.0
- From: Canqun Yang <canqun at nudt dot edu dot cn>
- To: Steven Bosscher <stevenb at suse dot de>
- Cc: gcc at gcc dot gnu dot org, Mostafa Hagog <MUSTAFA at il dot ibm dot com>, Ayal Zaks <ZAKS at il dot ibm dot com>
- Date: Thu, 2 Jun 2005 09:29:17 +0800 (HKT)
- Subject: Re: SMS in gcc4.0
- References: <OFC824C841.6870607F-ON43256FFC.002C921B-43256FFC.002D11B6@il.ibm.com> <20050601144330.C7B835BB2D@ds20.nudt.edu.cn> <200506011635.21020.stevenb@suse.de>
- Reply-to: Canqun Yang <canqun at nudt dot edu dot cn>
Steven Bosscher <stevenb@suse.de>:
> On Wednesday 01 June 2005 16:43, Canqun Yang wrote:
> > Hi, all
> >
> > I've taken a look on modulo-sched.c recently, and found
> > that both new_cycles and orig_cycles are imprecise. The
> > reason is that kernel_number_of_cycles does not take the
> > data dependences of insns into account as the DFA
> > scheduler does in haifa-sched.c.
>
> How does this affect the cycles computation?
>
An insn is ready to be scheduled only when all the insns
it depends on have already been scheduled. In
haifa-sched.c, there is a queue that holds the insns
that are ready to be scheduled.
To see how data dependences affect the cycle
computation, the simplest way is to compare two
versions of the assembly code generated by GCC: one
with '-fmodulo-sched' turned on, the other without it.
Without SMS, the loop code has many stops ';;' to
separate the instructions that have data dependences.
With SMS, the kernel code of the loop has more
instructions but fewer stops ';;'.
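
To make the point about the ready queue concrete, here is a
minimal sketch in Python. It is a toy model, not the code in
modulo-sched.c or haifa-sched.c: the function names, the fixed
issue width, and the per-insn latencies are all assumptions made
for illustration. It compares a cycle estimate that only counts
issue slots with one that issues an insn only after everything it
depends on has been scheduled and has produced its result.

```python
def resource_only_cycles(num_insns, issue_width):
    """Estimate that ignores data dependences: pack insns into issue slots."""
    return -(-num_insns // issue_width)  # ceiling division


def list_schedule_cycles(deps, latency, issue_width):
    """Dependence-aware estimate using a ready list (hypothetical helper).

    deps maps each insn name to the list of insn names it depends on.
    latency maps each insn name to the cycles until its result is available.
    Returns the cycle at which the last result becomes available.
    """
    finish = {}              # insn -> cycle at which its result is available
    remaining = set(deps)
    cycle = 0
    while remaining:
        # An insn is ready only when all its predecessors have been
        # scheduled and their results are available in this cycle.
        ready = sorted(
            i for i in remaining
            if all(p in finish and finish[p] <= cycle for p in deps[i])
        )
        if not ready and all(t <= cycle for t in finish.values()):
            raise ValueError("dependence cycle or unknown predecessor")
        for insn in ready[:issue_width]:
            finish[insn] = cycle + latency[insn]
            remaining.remove(insn)
        cycle += 1
    return max(finish.values(), default=0)


# A chain a -> b -> c -> d on a machine that can issue 2 insns per cycle.
chain = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"]}
unit = {name: 1 for name in chain}
print(resource_only_cycles(len(chain), 2))   # slot-counting estimate
print(list_schedule_cycles(chain, unit, 2))  # dependence-aware estimate
```

For the chain above the two estimates differ, because the second
issue slot stays empty in every cycle. For four independent insns
they agree.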
> > On IA-64, three improvements are needed to let SMS work.
> > 1) Modify doloop_register_get or the similar function
> > defined in doloop.c to recognize the loop count
> > register. I have supplied a patch about this in April.
>
> Mustafa and I have a patch that has a similar effect, see
> http://gcc.gnu.org/ml/gcc-patches/2005-06/msg00035.html.
>
> > 2) Use more precise way to calculate the values of the
> > two kind of cycles, or just ignore this benefit assertion.
>
> Probably need to be more precise :-/
>
> When I manually hacked modulo-sched.c to ignore this test, I
> did see loops getting scheduled, but I also ran into ICEs in
> cfglayout.
>
In my tests I see no such ICEs for pi.f90, swim.f, and
mgrid.f. However, when swim.f and mgrid.f are compiled,
an internal compiler error, 'unrecognizable insn', is
triggered by the insn from 'gen_sub2_insn', which
subtracts from 'ar.lc' directly.
>
> > 3) The counted loop register 'ar.lc' of IA-64 can not be
> > updated directly. Another temporary register is needed
> > to evaluate the value of the actural loop count after
> > SMS schedule, and assign its value to 'ar.lc'.
>
> Actually, should SMS just not update the loop register in place?
> I never figured out why it tries to produce a sub insns (using
> gen_sub2_insn which is also wrong btw).
>
The current implementation of SMS does not use IA-64's
epilog count register (ar.ec). After SMS, the loop count
only controls how many times the kernel code executes,
and the kernel code will execute
    loop_count - (stage_count - 1)
times. The sub insn generated by gen_sub2_insn is used to
produce this value.
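
As a sanity check on that count, here is a small Python sketch of
the usual prolog / kernel / epilog shape of a software-pipelined
loop. It is an illustration only and does not reflect the exact
code SMS emits; the function names and the (iteration, stage)
bookkeeping are assumptions. It shows that running the kernel
loop_count - (stage_count - 1) times, with the prolog filling and
the epilog draining the pipeline, executes every stage of every
iteration exactly once.

```python
def kernel_iterations(loop_count, stage_count):
    """Number of times the kernel runs after modulo scheduling."""
    if loop_count < stage_count:
        raise ValueError("loop_count must be at least stage_count")
    return loop_count - (stage_count - 1)


def simulate(loop_count, stage_count):
    """Return the (iteration, stage) pairs executed, in program order."""
    executed = []
    # Prolog: fill the pipeline, one more stage active at each step.
    for step in range(stage_count - 1):
        for stage in range(step + 1):
            executed.append((step - stage, stage))
    # Kernel: all stages active, each working on a different iteration.
    for n in range(kernel_iterations(loop_count, stage_count)):
        for stage in range(stage_count):
            executed.append((n + stage_count - 1 - stage, stage))
    # Epilog: drain the pipeline, one fewer stage active at each step.
    for step in range(stage_count - 1):
        for stage in range(step + 1, stage_count):
            executed.append((loop_count + step - stage, stage))
    return executed


pairs = simulate(loop_count=5, stage_count=3)
expected = [(i, s) for i in range(5) for s in range(3)]
print(kernel_iterations(5, 3), sorted(pairs) == expected)
```

On IA-64 this is the value that would have to end up in 'ar.lc'
when 'ar.ec' is not used, which is why it has to be computed
somewhere before the kernel is entered.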
> Gr.
> Steven
>
>
Canqun Yang
Creative Compiler Research Group.
National University of Defense Technology, China.