Bug 56897 - unaligned memory access on alpha
Summary: unaligned memory access on alpha
Status: WAITING
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: unknown
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-04-10 03:53 UTC by Martynas Venckus
Modified: 2022-01-09 04:32 UTC
CC List: 1 user

See Also:
Host:
Target: alpha
Build:
Known to work:
Known to fail:
Last reconfirmed: 2013-04-15 00:00:00


Attachments
gcc-unalign-alpha.diff (430 bytes, patch)
2013-04-10 03:55 UTC, Martynas Venckus
ab-pre.tgz (64.92 KB, application/x-compressed-tar)
2013-04-16 04:16 UTC, Martynas Venckus
ab-post.tgz (74.03 KB, application/x-compressed-tar)
2013-04-16 04:18 UTC, Martynas Venckus

Description Martynas Venckus 2013-04-10 03:53:39 UTC
Hi,

After looking at the RTL dumps here and there, I've noticed that
GCC generates unaligned loads in the following manner:

foo = ([reg:X & -8] << (64 - (((reg:FP+reg:Y) & 0x7) << 3))) >> 56

Obviously, we're losing every eighth byte here (consider FP-0, FP-8):
whenever (reg:FP+reg:Y) & 0x7 is zero, the left-shift count becomes 64.
Since FP is aligned, GCC optimizes this to foo = 0.  The right way
to do this is:

foo = ([reg:X & -8] << (56 - (((reg:FP+reg:Y-1) & 0x7) << 3))) >> 56

To reproduce, try this out:

unsigned long foo = 0x010203040a0b0c0d;
printf("%02x",   *((char *)&foo + 7));

With -O (and higher) it turns out to be zero;  the same happens at
&X+15, &X+23, etc., depending on the offset from the frame pointer.

Attached the diff against trunk.

Martynas.
Comment 1 Martynas Venckus 2013-04-10 03:55:11 UTC
Created attachment 29845
gcc-unalign-alpha.diff
Comment 2 Uroš Bizjak 2013-04-10 06:43:14 UTC
> Attached the diff against trunk.

Please post patches to gcc-patches mailing list, as described in [1].

[1] http://gcc.gnu.org/contribute.html
Comment 3 Uroš Bizjak 2013-04-15 07:53:58 UTC
(In reply to comment #0)

> To reproduce, try this out:
> 
> unsigned long foo = 0x010203040a0b0c0d;
> printf("%02x",   *((char *)&foo + 7));
> 
> With -O (and higher) it turns out to be zero;  the same happens at
> &X+15, &X+23, etc., depending on the offset from the frame pointer.

Please create a self-sufficient executable testcase, following the instructions at [1]. I was not able to confirm the problem from the lines you posted.

[1] http://gcc.gnu.org/bugs/
Comment 4 Martynas Venckus 2013-04-16 04:11:52 UTC
Hi,

(In reply to comment #3)
> Please create a self-sufficient executable testcase, following the instructions
> at [1]. I was not able to confirm the problem from the lines you posted.

Thanks for the feedback, Uros.  Did you try it together with the frame 
growing downwards diff posted in #56898?  If so, the locals are actually
at the negative offsets and unaligned loads like foo%8-5 will expose this,
instead of foo%8-1.

I'm attaching the ab-pre.tgz (before the diff) and ab-post.tgz (after 
the diff) which exercise both frame growing upwards and downwards;  RTL
dumps included.  Feel free to turn it into a testcase (I can't do that
at the moment).

I'm 100% busy at work this week, so I would appreciate it if you took
care of this (and #56898).  Otherwise, I'll follow up per the official
contribution guidelines over the weekend.

Martynas.
Comment 5 Martynas Venckus 2013-04-16 04:16:39 UTC
Created attachment 29878
ab-pre.tgz

> gcc a.c; ./a.out

0d0a
0d0a0401
0d0a0401

> gcc -O a.c; ./a.out

0d0a
0d0a0400
0d0a0400

> cp a.c b.c
> gcc -S -dall a.c
> gcc -O -S -dall b.c
Comment 6 Martynas Venckus 2013-04-16 04:18:16 UTC
Created attachment 29879
ab-post.tgz

> gcc a.c; ./a.out

0d0a
0d0a0401
0d0a0401

> gcc -O a.c; ./a.out

0d0a
0d0a0401
0d0a0401

> cp a.c b.c
> gcc -S -dall a.c
> gcc -O -S -dall b.c
Comment 7 Uroš Bizjak 2013-04-16 15:15:14 UTC
(In reply to comment #4)
> Hi,
> 
> (In reply to comment #3)
> > Please create a self-sufficient executable testcase, following the instructions
> > at [1]. I was not able to confirm the problem from the lines you posted.
> 
> Thanks for the feedback, Uros.  Did you try it together with the frame 
> growing downwards diff posted in #56898?  If so, the locals are actually
> at the negative offsets and unaligned loads like foo%8-5 will expose this,
> instead of foo%8-1.

No, I am using an unpatched compiler. Compiling your ab-pre.tgz test, I got:

~/gcc-build-47/gcc/xgcc -B ~/gcc-build-47/gcc -O a.c
uros@monolith ~/test $ ./a.out
0d0a
0d0a0401
0d0a0401

~/gcc-build-47/gcc/xgcc -B ~/gcc-build-47/gcc -O -mcpu=ev4 a.c
uros@monolith ~/test $ ./a.out
0d0a
0d0a0401
0d0a0401

with:

GNU C (GCC) version 4.7.3 20130228 (prerelease) [gcc-4_7-branch revision 196343] (alphaev68-unknown-linux-gnu)
 
and the same result (with the same flags) with:

GNU C (GCC) version 4.9.0 20130407 (experimental) [trunk revision 197551] (alphaev68-unknown-linux-gnu)

Both compilers imply -mcpu=ev67 when invoked without an explicit -mcpu option.