56897 – unaligned memory access on alpha

Summary: unaligned memory access on alpha

Status:	WAITING

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	target
Version:	unknown

Importance:	P3 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:

Depends on:
Blocks:

Reported:	2013-04-10 03:53 UTC by Martynas Venckus
Modified:	2022-01-09 04:32 UTC
CC List:	1 user

See Also:
Host:
Target:	alpha
Build:
Known to work:
Known to fail:
Last reconfirmed:	2013-04-15 00:00:00

Attachments
gcc-unalign-alpha.diff (430 bytes, patch) 2013-04-10 03:55 UTC, Martynas Venckus
ab-pre.tgz (64.92 KB, application/x-compressed-tar) 2013-04-16 04:16 UTC, Martynas Venckus
ab-post.tgz (74.03 KB, application/x-compressed-tar) 2013-04-16 04:18 UTC, Martynas Venckus

Description Martynas Venckus 2013-04-10 03:53:39 UTC

Hi,

After looking at the RTL dumps here and there, I've noticed that
GCC generates unaligned loads in the following manner:

foo = ([reg:X & -8] << (64 - (((reg:FP+reg:Y) & 0x7) << 3))) >> 56

When (reg:FP+reg:Y) & 0x7 is 0, the left-shift count becomes 64, so we
lose every eighth byte here (consider FP-0, FP-8).  Since FP is aligned,
GCC optimizes this to foo = 0.  The right way to do this is:

foo = ([reg:X & -8] << (56 - (((reg:FP+reg:Y-1) & 0x7) << 3))) >> 56
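
The difference between the two expressions comes down to the left-shift
count.  A minimal C sketch of just that arithmetic follows.  It assumes
that reg:FP+reg:Y in the expressions above stands for the address of the
wanted byte plus one, which is what the -1 in the second form suggests;
the function names are hypothetical and are not taken from GCC.

```c
#include <stdint.h>

/* p models reg:FP+reg:Y, ASSUMED here to be the byte address plus one. */

/* Shift count in the first expression: 64 - ((p & 7) << 3). */
unsigned shift_count_current(uint64_t p)
{
    return 64u - (unsigned)((p & 7u) << 3);
}

/* Shift count in the proposed expression: 56 - (((p - 1) & 7) << 3). */
unsigned shift_count_proposed(uint64_t p)
{
    return 56u - (unsigned)(((p - 1u) & 7u) << 3);
}

/* Little-endian byte extraction: move the wanted byte to the top of the
   quadword, then shift it down by 56.  count must be below 64, because
   shifting a 64-bit value by 64 is undefined in C. */
unsigned extract_byte(uint64_t quad, unsigned count)
{
    return (unsigned)((quad << count) >> 56);
}
```

For byte offsets 0 through 6 within the quadword both counts agree (56
down to 8).  For offset 7, p & 7 is 0, so the first form yields a count
of 64 while the proposed form yields 0, and
extract_byte(0x010203040a0b0c0d, 0) gives 0x01.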

To reproduce, try this out:

unsigned long foo = 0x010203040a0b0c0d;
printf("%02x",   *((char *)&foo + 7));

With -O and above the result is zero;  the same happens at every &X+15,
&X+23, etc., depending on the offset from the frame pointer.
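
A self-contained form of the two lines above might look like the sketch
below.  It uses unsigned long long in place of unsigned long so that foo
is 64 bits wide on any host, and the helper names are hypothetical.
Whether it exposes the problem depends on where the compiler places foo
relative to the frame pointer, so a passing run does not rule the bug out.

```c
#include <string.h>

/* Read the byte at offset 7 of a local 64-bit value through a char
   pointer, as in the two-line snippet above. */
unsigned read_last_byte(void)
{
    unsigned long long foo = 0x010203040a0b0c0dULL;
    return *((unsigned char *)&foo + 7);
}

/* Obtain the same byte by copying the object representation out with
   memcpy, so the comparison does not depend on host endianness. */
unsigned expected_last_byte(void)
{
    unsigned long long foo = 0x010203040a0b0c0dULL;
    unsigned char bytes[sizeof foo];
    memcpy(bytes, &foo, sizeof foo);
    return bytes[7];
}

/* Returns 0 when both reads agree, 1 when they differ. */
int check_last_byte(void)
{
    return read_last_byte() == expected_last_byte() ? 0 : 1;
}
```

Calling check_last_byte() from main and aborting on a nonzero result
would turn this into an executable test.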

Attached the diff against trunk.

Martynas.

Comment 1 Martynas Venckus 2013-04-10 03:55:11 UTC

Created attachment 29845
gcc-unalign-alpha.diff

Comment 2 Uroš Bizjak 2013-04-10 06:43:14 UTC

> Attached the diff against trunk.

Please post patches to gcc-patches mailing list, as described in [1].

[1] http://gcc.gnu.org/contribute.html

Comment 3 Uroš Bizjak 2013-04-15 07:53:58 UTC

(In reply to comment #0)

> To reproduce, try this out:
> 
> unsigned long foo = 0x010203040a0b0c0d;
> printf("%02x",   *((char *)&foo + 7));
> 
> With -O (and onwards) it will turn out to be zero;  at every &X+15,
> &X+23, etc.  Depending on the offset from the frame pointer.

Please create a self-sufficient executable testcase, following the instructions at [1]. I was not able to confirm the problem from the lines you posted.

[1] http://gcc.gnu.org/bugs/

Comment 4 Martynas Venckus 2013-04-16 04:11:52 UTC

Hi,

(In reply to comment #3)
> Please create a self-sufficient executable testcase, following the instructions
> at [1]. I was not able to confirm the problem from the lines you posted.

Thanks for the feedback, Uros.  Did you try it together with the frame 
growing downwards diff posted in #56898?  If so, the locals are actually
at the negative offsets and unaligned loads like foo%8-5 will expose this,
instead of foo%8-1.

I'm attaching the ab-pre.tgz (before the diff) and ab-post.tgz (after 
the diff) which exercise both frame growing upwards and downwards;  RTL
dumps included.  Feel free to turn it into a testcase (I can't do that
at the moment).

I'm 100% busy at work this week, so I would appreciate it if you took
care of this (and #56898).  Otherwise, I'll follow up per the official
contribution guidelines over the weekend.

Martynas.

Comment 5 Martynas Venckus 2013-04-16 04:16:39 UTC

Created attachment 29878
ab-pre.tgz

> gcc a.c; ./a.out

0d0a
0d0a0401
0d0a0401

> gcc -O a.c; ./a.out

0d0a
0d0a0400
0d0a0400

> cp a.c b.c
> gcc -S -dall a.c
> gcc -O -S -dall b.c

Comment 6 Martynas Venckus 2013-04-16 04:18:16 UTC

Created attachment 29879
ab-post.tgz

> gcc a.c; ./a.out

0d0a
0d0a0401
0d0a0401

> gcc -O a.c; ./a.out

0d0a
0d0a0401
0d0a0401

> cp a.c b.c
> gcc -S -dall a.c
> gcc -O -S -dall b.c

Comment 7 Uroš Bizjak 2013-04-16 15:15:14 UTC

(In reply to comment #4)
> Hi,
> 
> (In reply to comment #3)
> > Please create a self-sufficient executable testcase, following the instructions
> > at [1]. I was not able to confirm the problem from the lines you posted.
> 
> Thanks for the feedback, Uros.  Did you try it together with the frame 
> growing downwards diff posted in #56898?  If so, the locals are actually
> at the negative offsets and unaligned loads like foo%8-5 will expose this,
> instead of foo%8-1.

No, I am using an unpatched compiler. Compiling your ab-pre.tgz test, I got:

~/gcc-build-47/gcc/xgcc -B ~/gcc-build-47/gcc -O a.c
uros@monolith ~/test $ ./a.out
0d0a
0d0a0401
0d0a0401

~/gcc-build-47/gcc/xgcc -B ~/gcc-build-47/gcc -O -mcpu=ev4 a.c
uros@monolith ~/test $ ./a.out
0d0a
0d0a0401
0d0a0401

with:

GNU C (GCC) version 4.7.3 20130228 (prerelease) [gcc-4_7-branch revision 196343] (alphaev68-unknown-linux-gnu)
 
and the same result (with the same flags) with:

GNU C (GCC) version 4.9.0 20130407 (experimental) [trunk revision 197551] (alphaev68-unknown-linux-gnu)

The compilers imply -mcpu=ev67 when invoked without an -mcpu option.