Bug 96236 - __builtin_mma_disassemble_acc() doesn't store elements correctly in LE mode
Summary: __builtin_mma_disassemble_acc() doesn't store elements correctly in LE mode
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 11.0
Importance: P3 normal
Target Milestone: 10.3
Assignee: Peter Bergner
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2020-07-17 20:08 UTC by Peter Bergner
Modified: 2020-07-23 19:35 UTC

See Also:
Host:
Target: powerpc64le-linux
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-07-17 00:00:00


Description Peter Bergner 2020-07-17 20:08:07 UTC
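The __builtin_mma_disassemble_acc built-in doesn't correctly account for little-endian byte ordering if the pointer type passed to it is not a __vector_quad pointer, as the following test case shows. With a void pointer (buggy), the four vectors are stored in register order, whereas the __vector_quad pointer version (foo) stores them in the order expected for little-endian: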

bergner@pike:~/$ cat disassemble.c 
void
buggy (void *dst)
{
  __vector_quad acc;
  __builtin_mma_xxsetaccz (&acc);
  __builtin_mma_disassemble_acc (dst, &acc);
}

void
foo (__vector_quad *dst)
{
  __vector_quad acc;
  __builtin_mma_xxsetaccz (&acc);
  __builtin_mma_disassemble_acc (dst, &acc);
}
bergner@pike:~/$ gcc -S -O2 -mcpu=power10 disassemble.c 
bergner@pike:~/$ cat disassemble.s 
buggy:
	xxsetaccz 0
	xxmfacc 0
	stxv 0,0(3)
	stxv 1,16(3)
	stxv 2,32(3)
	stxv 3,48(3)
	blr

foo:
	xxsetaccz 0
	xxmfacc 0
	stxvp 2,0(3)
	stxvp 0,32(3)
	blr
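
The difference between the two store sequences can be modelled in portable C. The following is only a sketch, under the assumption that the `foo` output above is the intended little-endian layout, i.e. that accumulator vector i belongs in 16-byte slot 3 - i of the destination, while big-endian keeps slot i. The names `acc_store_offset` and `model_disassemble_acc` are hypothetical helpers for illustration; they are not GCC internals or built-ins, and the code does not use MMA at all.

```c
#include <stddef.h>
#include <string.h>

enum { ACC_VECTORS = 4, VECTOR_BYTES = 16 };

/* Hypothetical helper: byte offset in the destination buffer at which
   accumulator vector I should be stored.  Assumption: little-endian
   reverses the order of the four 16-byte vectors, big-endian does not.  */
static size_t
acc_store_offset (unsigned i, int big_endian)
{
  unsigned slot = big_endian ? i : (ACC_VECTORS - 1) - i;
  return (size_t) slot * VECTOR_BYTES;
}

/* Hypothetical model of the disassemble step: copy each of the four
   vectors of an accumulator (given here as plain byte arrays) to its
   endian-dependent slot in DST, which must hold at least 64 bytes.  */
static void
model_disassemble_acc (unsigned char *dst,
                       unsigned char vsr[ACC_VECTORS][VECTOR_BYTES],
                       int big_endian)
{
  for (unsigned i = 0; i < ACC_VECTORS; i++)
    memcpy (dst + acc_store_offset (i, big_endian), vsr[i], VECTOR_BYTES);
}
```

Under this model, the `buggy` sequence corresponds to calling `model_disassemble_acc` with `big_endian` set on a little-endian target: vector 0 lands at offset 0 instead of offset 48.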
Comment 1 Peter Bergner 2020-07-17 20:09:16 UTC
Mine.  This is broken in the FSF GCC 10 branch as well.
Comment 2 GCC Commits 2020-07-22 18:37:06 UTC
The master branch has been updated by Peter Bergner <bergner@gcc.gnu.org>:

https://gcc.gnu.org/g:ae575662833d70cb7d74b9538096c7becc79af14

commit r11-2278-gae575662833d70cb7d74b9538096c7becc79af14
Author: Peter Bergner <bergner@linux.ibm.com>
Date:   Wed Jul 22 11:44:35 2020 -0500

    rs6000: __builtin_mma_disassemble_acc() doesn't store elements correctly in LE mode
    
    PR96236 shows a problem where we don't store our 512-bit accumulators
    correctly in little-endian mode.  The patch below detects when we're doing a
    little-endian memory access and stores to the correct memory locations.
    
    2020-07-22  Peter Bergner  <bergner@linux.ibm.com>
    
    gcc/
            PR target/96236
            * config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Handle
            little-endian memory ordering.
    
    gcc/testsuite/
            PR target/96236
            * gcc.target/powerpc/mma-double-test.c: Update storing results for
            correct little-endian ordering.
            * gcc.target/powerpc/mma-single-test.c: Likewise.
Comment 3 Peter Bergner 2020-07-22 18:39:50 UTC
Fixed on trunk.  I will backport to the GCC 10 release branch once it reopens.

I would have set the target milestone to 10.3, but that version isn't an option right now.
Comment 4 GCC Commits 2020-07-23 17:17:41 UTC
The releases/gcc-10 branch has been updated by Peter Bergner <bergner@gcc.gnu.org>:

https://gcc.gnu.org/g:5497677b497b95a261089d19f5295cc80f99a2b6

commit r10-8522-g5497677b497b95a261089d19f5295cc80f99a2b6
Author: Peter Bergner <bergner@linux.ibm.com>
Date:   Wed Jul 22 11:44:35 2020 -0500

    rs6000: __builtin_mma_disassemble_acc() doesn't store elements correctly in LE mode
    
    PR96236 shows a problem where we don't store our 512-bit accumulators
    correctly in little-endian mode.  The patch below detects when we're doing a
    little-endian memory access and stores to the correct memory locations.
    
    2020-07-22  Peter Bergner  <bergner@linux.ibm.com>
    
    gcc/
            PR target/96236
            * config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Handle
            little-endian memory ordering.
    
    gcc/testsuite/
            PR target/96236
            * gcc.target/powerpc/mma-double-test.c: Update storing results for
            correct little-endian ordering.
            * gcc.target/powerpc/mma-single-test.c: Likewise.
    
    (cherry picked from commit ae575662833d70cb7d74b9538096c7becc79af14)
Comment 5 Peter Bergner 2020-07-23 19:35:43 UTC
Fixed everywhere.