This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug rtl-optimization/17264] [hppa] Missing address increment optimization for fp load/stores
- From: "dave at hiauly1 dot hia dot nrc dot ca" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 24 Sep 2006 22:15:38 -0000
- Subject: [Bug rtl-optimization/17264] [hppa] Missing address increment optimization for fp load/stores
- References: <bug-17264-581@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #2 from dave at hiauly1 dot hia dot nrc dot ca 2006-09-24 22:15 -------
Subject: Re: [hppa] Missing address increment optimization for fp load/stores
> For this test case:
>
> void f(double *pds, double *pdd, unsigned long len) {
>   while (len >= 8*sizeof(double)) {
>     register double r1,r2,r3,r4;
>     r1 = *pds++;
>     r2 = *pds++;
>     r3 = *pds++;
>     r4 = *pds++;
>     *pdd++ = r1;
>     *pdd++ = r2;
>     *pdd++ = r3;
>     *pdd++ = r4;
>   }
> }
>
> gcc starting from 4.0 produces this:
>
> .L3:
>         fldds -16(%r26),%fr22
>         fldds -8(%r26),%fr23
>         fldds 0(%r26),%fr24
>         fldds 8(%r26),%fr25
>         ldo 32(%r26),%r26
>         fstds %fr22,-16(%r25)
>         fstds %fr23,-8(%r25)
>         fstds %fr24,0(%r25)
>         fstds %fr25,8(%r25)
>         b .L3
>
> which I suspect is actually better, since it avoids dependencies between the
> loads. But I'm not familiar with hppa, can anybody comment?
It looks close to optimal to me. The code is better than that generated
by 3.4.x or HP cc. Using the auto-increment forms would eliminate the
two ldo instructions that increment r25 and r26.
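
For illustration, here is a minimal C sketch of the two addressing styles
being compared. The function names and the `blocks` parameter are
hypothetical; the count is added only so the loops terminate, unlike the
quoted test case, which never updates len. Whether a given compiler turns
the second form into base-modifying loads and stores depends on the target
and the GCC version, so treat this as a source-level picture only.

```c
#include <stddef.h>

/* Offset addressing, as in the GCC 4.0 output quoted above: every access
   uses a fixed displacement from the base pointer, and each pointer is
   bumped once per iteration (the job done by the ldo instructions). */
void copy4_offset(const double *src, double *dst, size_t blocks)
{
    while (blocks-- > 0) {
        double r1 = src[0], r2 = src[1], r3 = src[2], r4 = src[3];
        dst[0] = r1;
        dst[1] = r2;
        dst[2] = r3;
        dst[3] = r4;
        src += 4;   /* one explicit increment per pointer */
        dst += 4;
    }
}

/* Post-increment addressing: every access advances its pointer, so no
   separate pointer update is needed, but each access depends on the
   pointer value left by the previous one. */
void copy4_postinc(const double *src, double *dst, size_t blocks)
{
    while (blocks-- > 0) {
        double r1 = *src++;
        double r2 = *src++;
        double r3 = *src++;
        double r4 = *src++;
        *dst++ = r1;
        *dst++ = r2;
        *dst++ = r3;
        *dst++ = r4;
    }
}
```

Both functions copy the same data; they differ only in how the addresses
are formed, which is the trade-off discussed above (fewer instructions
versus a dependency chain through the pointer register).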
Dave
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17264