Bug 54821 - Microblaze: Position independent code for byte access is incorrect.
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.7.2
Importance: P3 major
Target Milestone: ---
Assignee: Not yet assigned to anyone
Depends on:
Reported: 2012-10-05 08:45 UTC by qball
Modified: 2013-11-30 02:12 UTC

See Also:
Known to work:
Known to fail:
Last reconfirmed:


Description qball 2012-10-05 08:45:36 UTC
When position independent code is enabled, gcc generates incorrect assembly for byte accesses. It works fine when generating word (32-bit) loads and stores.

A word load results in the following correct assembly:
             lwi          r3, r20, 20            # load the address at .got.plt + 20 (address of a variable in the .got section)
             lwi          r4, r3, 0                 # load the value of the variable in r4

However, a byte load results in:

            imm       4                          
            lbui        r3, r20, 23884     # load the value of the variable in r3. The address used is r20 + 0x45d4c.
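The 0x45d4c in the comment above follows from how MicroBlaze combines an imm prefix with the 16-bit immediate of the next instruction: the prefix supplies the upper 16 bits and the instruction supplies the lower 16. A minimal sketch of that arithmetic (the function name mb_imm32 is made up for illustration):

```c
#include <stdint.h>

/* Hypothetical helper: combine an `imm` prefix (upper 16 bits) with the
   16-bit immediate of the following instruction (lower 16 bits). */
static uint32_t mb_imm32(uint16_t imm_prefix, uint16_t imm_low)
{
    return ((uint32_t)imm_prefix << 16) | imm_low;
}
```

With the values from the listing, mb_imm32(4, 23884) gives 0x45d4c, so the byte load reads from r20 + 0x45d4c rather than going through a GOT entry.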

There are two things going wrong:
1.) It loads from r20 plus the absolute offset of the variable.
2.) It should first load the address from the GOT, then load the value from that address.
I would expect something like:
lwi         r3, r20, 16
lbui        r4, r3, 0
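The expected two-instruction sequence is a plain double indirection. As a rough C model, assuming a made-up table got_table that stands in for the memory r20 points to (all names here are hypothetical and not taken from GCC or the linker):

```c
#include <stddef.h>

/* Hypothetical stand-ins: `temp_byte` plays the role of the variable and
   `got_table` the role of the GOT that r20 points to. */
static char temp_byte = 3;
static void *got_table[] = { &temp_byte };

static unsigned char load_byte_via_got(void *const *got_base, size_t slot)
{
    /* Step 1 (lwi): fetch the variable's address from the GOT slot. */
    const unsigned char *addr = got_base[slot];
    /* Step 2 (lbui): load the byte stored at that address. */
    return *addr;
}
```

The incorrect code skips step 1 and treats the variable's offset as if it were relative to r20.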

A similar thing goes wrong for half-word access.

This problem can be reproduced with GCC 4.1.2 (from xilinx) up to 4.7.2. (It is visible when looking at the object files; generating ELF files fails because of bug #54819.)
A script to build the cross-compiler can be found here: https://github.com/DaveDavenport/CrossCompilerGCCScript/blob/master/gcc_cross_compiler.sh

C code that shows the issue:
char temp = 3;
int temp2 = 12;
short int temp4 = 13;
void main()
{
	/* One byte, one word and one half-word load from global variables. */
	int a = temp + temp2 + temp4;
}

Output generated for this (by 4.1.2):
000001b0 <main>:
 1b0:	3021fff0 	addik	r1, r1, -16
 1b4:	fa610008 	swi	r19, r1, 8
 1b8:	fa81000c 	swi	r20, r1, 12
 1bc:	12610000 	addk	r19, r1, r0
 1c0:	e0740444 	lbui	r3, r20, 1092
 1c4:	90630060 	sext8	r3, r3
 1c8:	10830000 	addk	r4, r3, r0
 1cc:	b0000000 	imm	0
 1d0:	e874000c 	lwi	r3, r20, 12
 1d4:	e8630000 	lwi	r3, r3, 0
 1d8:	10841800 	addk	r4, r4, r3
 1dc:	e474044c 	lhui	r3, r20, 1100
 1e0:	90630061 	sext16	r3, r3
 1e4:	10641800 	addk	r3, r4, r3
 1e8:	f8730004 	swi	r3, r19, 4
 1ec:	10330000 	addk	r1, r19, r0
 1f0:	ea610008 	lwi	r19, r1, 8
 1f4:	ea81000c 	lwi	r20, r1, 12
 1f8:	30210010 	addik	r1, r1, 16
 1fc:	b60f0008 	rtsd	r15, 8
 200:	80000000 	or	r0, r0, r0

Only the word load is done correctly.
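The three load instructions in the listing can be checked against their encodings. The sketch below assumes the MicroBlaze type-B layout (6-bit opcode, 5-bit rD, 5-bit rA, 16-bit immediate, from the most significant bit down); the struct and function names are made up for illustration:

```c
#include <stdint.h>

/* Fields of a type-B instruction word. */
struct mb_type_b {
    unsigned opcode;  /* bits 31..26 */
    unsigned rd;      /* bits 25..21 */
    unsigned ra;      /* bits 20..16 */
    uint16_t imm;     /* bits 15..0  */
};

static struct mb_type_b mb_decode_type_b(uint32_t word)
{
    struct mb_type_b f;
    f.opcode = (word >> 26) & 0x3f;
    f.rd     = (word >> 21) & 0x1f;
    f.ra     = (word >> 16) & 0x1f;
    f.imm    = (uint16_t)(word & 0xffff);
    return f;
}
```

Decoding e0740444 (lbui), e874000c (lwi) and e474044c (lhui) gives rA = 20 in all three cases, with immediates 1092, 12 and 1100. Only the lwi at offset 12 is followed by a second load through r3; the byte and half-word loads use r20 plus a raw offset directly.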
Comment 1 qball 2012-10-05 13:17:17 UTC
For 4.6.2 (xilinx) the following change seems to fix the code generation; the assembly generated with the change applied is shown after the diff:
@@ -558,7 +558,7 @@
 	    return !(flag_pic && pic_address_needs_scratch (x));
-	else if (flag_pic == 2)
+	else if (flag_pic)
 	    return false;

    addik    r1,r1,-16
    swi    r19,r1,8
    swi    r20,r1,12
    addk    r19,r1,r0
    lwi    r3,r20,temp@GOT
    lbui    r3,r3,0
    sext8    r3,r3
    addk    r4,r3,r0
    lwi    r3,r20,temp2@GOT
    lwi    r3,r3,0
    addk    r4,r4,r3
    lwi    r3,r20,temp4@GOT
    lhui    r3,r3,0
    sext16    r3,r3
    addk    r3,r4,r3
    swi    r3,r19,4
    addk    r1,r19,r0
    lwi    r19,r1,8
    lwi    r20,r1,12
    addik    r1,r1,16
    rtsd    r15,8
    nop        # Unfilled delay slot
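The effect of the one-line change can be illustrated in isolation. In GCC, flag_pic is 1 for -fpic and 2 for -fPIC, so a test for flag_pic == 2 does not fire for -fpic. The sketch below models only the visible branch of the diff; the function names are hypothetical and the surrounding GCC code is not reproduced:

```c
#include <stdbool.h>

/* Hypothetical model of the patched branch only: does the branch
   "return false" fire for a given flag_pic value? */
static bool branch_fires_before_patch(int flag_pic)
{
    return flag_pic == 2;   /* original condition: only -fPIC */
}

static bool branch_fires_after_patch(int flag_pic)
{
    return flag_pic != 0;   /* patched condition: -fpic and -fPIC */
}
```

Under this reading the two versions differ only when flag_pic is 1, which would be consistent with the byte and half-word loads being mishandled in a -fpic build, though the report does not state which option was used.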