This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: DFA Scheduler - unable to pipeline loads
- From: "Matt Lee" <reachmatt dot lee at gmail dot com>
- To: "Adam Nemet" <anemet at caviumnetworks dot com>
- Cc: gcc at gcc dot gnu dot org
- Date: Tue, 4 Sep 2007 17:03:19 -0700
- Subject: Re: DFA Scheduler - unable to pipeline loads
- References: <e94302560708311457l59598381tae5a2588fff63c43@mail.gmail.com> <87k5rbxp38.fsf@localhost.localdomain.i-did-not-set--mail-host-address--so-tickle-me>
On 8/31/07, Adam Nemet <anemet@caviumnetworks.com> wrote:
> "Matt Lee" <reachmatt.lee@gmail.com> writes:
>
> > I am seeing poor scheduling in Dhrystone where a memcpy call is
> > expanded inline.
> >
> > memcpy (&dst, &src, 16) ==>
> >
> > load 1, rA + 4
> > store 1, rB + 4
> > load 2, rA + 8
> > store 2, rB + 8
> > ...
>
> Are you sure that there are no dependencies due to aliasing here? The
> only similar thing that Dhrystone has to what you quote is between a
> pointer and a global variable and in fact there is an aliasing
> conflict there.
>
> If that is the case you can define a movmem pattern where you first
> load everything in one chunk and then store it later. See MIPS's
> movmemsi pattern and the function mips_block_move_straight.
>
> Adam
>
Adam, you are right. There is an aliasing conflict in my test case.
However, I get the same effect when I use the restrict keyword on the
pointers. Here is an even more reduced test case that shows the same
problem:

#include <string.h>

struct foo {
  int a[4];
};

void func (struct foo * restrict p, struct foo * restrict q)
{
  memcpy (p, q, sizeof (struct foo));
}

Perhaps restrict doesn't work. In any case, I am trying to optimize
the case where there is clearly no aliasing. Your suggestion regarding
movmemsi is interesting. I have not used this pattern before and
assumed that it was required only when something special must be done
to do block moves. In my architecture, a block move is not special and
is equivalent to a series of loads and stores. Why do I need this pattern
and why/how does the aliasing conflict become a non-issue when
defining this pattern? I apologize if I am missing something basic
here, but the GCC documentation regarding this pattern doesn't tell me
much about why it is required.
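
For concreteness, the ordering Adam suggests (all loads first, then all
stores) can be sketched at the C level as below. This is only an
illustration of the instruction order a movmem expansion might emit, not
GCC output: the function name copy_loads_first and the temporaries t0..t3
are hypothetical, and the compiler is free to reorder this C code again.

```c
/* Hypothetical illustration of "load everything, then store everything".
   Not taken from GCC or from any real backend. */
struct foo {
    int a[4];
};

void copy_loads_first(struct foo *dst, const struct foo *src)
{
    /* Issue all four loads first, so none of them has to wait
       behind a store that might alias the source. */
    int t0 = src->a[0];
    int t1 = src->a[1];
    int t2 = src->a[2];
    int t3 = src->a[3];

    /* Only then issue the four stores. */
    dst->a[0] = t0;
    dst->a[1] = t1;
    dst->a[2] = t2;
    dst->a[3] = t3;
}
```

Because every value is read before anything is written, the result does
not depend on whether dst and src overlap, so the loads never need to be
ordered against the stores.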
--
thanks,
Matt