Bug 16570 - missing _mm_malloc and _mm_free functions in xmmintrin.h
Summary: missing _mm_malloc and _mm_free functions in xmmintrin.h
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 3.4.0
Importance: P2 normal
Target Milestone: 4.0.0
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-07-15 17:18 UTC by Tanguy Fautré
Modified: 2024-03-26 16:00 UTC
CC List: 3 users

See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2004-07-15 20:08:36


Attachments
Contains the code that should be added to xmmintrin.h (172 bytes, text/plain)
2004-07-15 17:21 UTC, Tanguy Fautré
Implementation following Falk's idea + a small test case. (495 bytes, text/plain)
2004-07-15 20:41 UTC, Tanguy Fautré
corrected an evil bug in _mm_free if called with ptr = 0. (497 bytes, text/plain)
2004-07-15 20:53 UTC, Tanguy Fautré
ok, fixed more bugs. (537 bytes, text/plain)
2004-07-15 21:09 UTC, Tanguy Fautré
An incomplete patch (1.77 KB, patch)
2004-07-20 19:51 UTC, H.J. Lu
An updated patch (1.71 KB, patch)
2004-07-21 21:41 UTC, H.J. Lu
Generic i386 _mm_malloc (1.17 KB, patch)
2004-07-23 13:04 UTC, Danny Smith
A complete patch for _mm_malloc/_mm_free (2.00 KB, patch)
2004-07-23 15:36 UTC, H.J. Lu

Description Tanguy Fautré 2004-07-15 17:18:00 UTC
To be 100% compatible with the Intel C++ Compiler and MS VC++, xmmintrin.h
should provide the functions void * _mm_malloc(size_t size, size_t alignment)
and void _mm_free(void * ptr).
However, both are missing.

Suggestion: add the piece of code in the attachment to xmmintrin.h.
Comment 1 Tanguy Fautré 2004-07-15 17:21:39 UTC
Created attachment 6754
Contains the code that should be added to xmmintrin.h
Comment 2 H.J. Lu 2004-07-15 19:12:23 UTC
Is there a portable way to implement _mm_malloc?
Comment 3 Falk Hueffner 2004-07-15 19:30:48 UTC
Sure. Allocate ALIGN bytes more than needed, clear the first ALIGN bytes, mark
the first byte with 1, and round up to proper alignment. On freeing you can then
scan back to the mark byte.
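
The marker scheme described above can be sketched roughly as follows. This is only an illustration of the idea, not the code from any attachment on this bug: the names marker_malloc and marker_free are hypothetical, and the sketch assumes the alignment is a nonzero power of two.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name: marker-based aligned allocation.
   Assumption: align must be a nonzero power of two. */
static void *marker_malloc(size_t size, size_t align)
{
    unsigned char *base;
    uintptr_t aligned;

    if (align == 0 || (align & (align - 1)) != 0)
        return NULL;
    if (size > SIZE_MAX - align)        /* size + align would overflow */
        return NULL;

    base = malloc(size + align);        /* ALIGN bytes more than needed */
    if (base == NULL)
        return NULL;

    memset(base, 0, align);             /* clear the first ALIGN bytes */
    base[0] = 1;                        /* mark the first byte */

    /* Round up to the next multiple of align strictly above base, so at
       least one byte (the mark) always lies below the returned pointer. */
    aligned = ((uintptr_t)base + align) & ~((uintptr_t)align - 1);
    assert(aligned > (uintptr_t)base);
    return (void *)aligned;
}

/* Hypothetical name: scan back over the zeroed padding to the mark byte. */
static void marker_free(void *ptr)
{
    unsigned char *p;

    if (ptr == NULL)
        return;
    p = (unsigned char *)ptr - 1;       /* start at ptr - 1, not at ptr */
    while (*p == 0)
        --p;
    free(p);                            /* p now points at the mark byte */
}
```

Freeing costs a backward scan of up to ALIGN - 1 bytes, and every allocation carries ALIGN bytes of overhead, so the scheme is cheap for small alignments and gets worse as the alignment grows.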
Comment 4 Tanguy Fautré 2004-07-15 19:47:39 UTC
Isn't posix_memalign() portable enough?

Comment 5 Andrew Pinski 2004-07-15 20:08:36 UTC
No, posix_memalign is not portable enough, as Windows does not provide it.
Comment 6 Tanguy Fautré 2004-07-15 20:41:46 UTC
Created attachment 6760
Implementation following Falk's idea + a small test case.

Well, I implemented Falk's idea with the marker. I tested the implementation
with a small test case, and so far it's working.

IMHO the marker is not a good solution for big alignments (too slow and takes
a lot of memory), but since _mm_malloc() is generally used for alignments like
16, it's not a real issue here.

For Windows: MS' malloc.h provides _aligned_malloc() and _aligned_free().
Thus we can either go for the marker idea, or use preprocessor conditional
paths using either _aligned_malloc/_aligned_free or posix_memalign/free.
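
A minimal sketch of the preprocessor-conditional variant mentioned above, assuming _WIN32 is the macro used to select the Windows path. The names sketch_mm_malloc, sketch_mm_free and sketch_usage are hypothetical and chosen so they do not clash with the real _mm_malloc/_mm_free. This is not the code that was eventually committed.

```c
/* Ask for posix_memalign's declaration under strict ISO C modes too. */
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200112L
#endif

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#if defined(_WIN32)
#include <malloc.h>                     /* _aligned_malloc, _aligned_free */
#endif

/* Hypothetical wrapper: pick the platform's aligned allocator. */
static void *sketch_mm_malloc(size_t size, size_t align)
{
#if defined(_WIN32)
    return _aligned_malloc(size, align);
#else
    void *ptr;

    /* posix_memalign requires a power-of-two multiple of sizeof(void *),
       so small alignments are raised to that minimum. */
    if (align < sizeof(void *))
        align = sizeof(void *);
    if (posix_memalign(&ptr, align, size) != 0)
        return NULL;
    return ptr;
#endif
}

/* Hypothetical wrapper: release memory from sketch_mm_malloc. */
static void sketch_mm_free(void *ptr)
{
#if defined(_WIN32)
    _aligned_free(ptr);
#else
    free(ptr);                          /* posix_memalign memory is freed with free() */
#endif
}

/* Usage: four floats on a 16-byte boundary, the typical SSE case. */
static void sketch_usage(void)
{
    float *v = sketch_mm_malloc(4 * sizeof(float), 16);

    if (v == NULL)
        return;
    assert(((uintptr_t)v & 15) == 0);
    sketch_mm_free(v);
}
```

The trade-off against the marker idea is that this relies on the platform to track the allocation, so there is no per-allocation padding to scan, but every supported target needs its own branch.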
Comment 7 Tanguy Fautré 2004-07-15 20:53:08 UTC
Created attachment 6761
corrected an evil bug in _mm_free if called with ptr = 0.
Comment 8 Tanguy Fautré 2004-07-15 21:09:57 UTC
Created attachment 6762
ok, fixed more bugs.

Fixed corner case: alignment = 0.
Fixed _mm_free starting at ptr and not (ptr - 1).
Extended the test cases.

Now, it should be ok (I certainly hope so). Tested the whole thing also on
Visual C++ 7.1 in Debug mode to be sure.
Comment 9 Danny Smith 2004-07-16 11:25:33 UTC
Hi,
A version of _aligned_malloc and friends was submitted to the mingw project and
declared as public domain.  I have been using it for some time with no problems.
It works on i686 linux and mingw. You may want to look at it for a portable
implementation that could be used as _mm_malloc.  It would also be useful for an
overloaded (aligned) new/delete implementation.


See the latest file attachments at this link:
https://sourceforge.net/tracker/index.php?func=detail&aid=668224&group_id=2435&atid=102435

Danny
Comment 10 H.J. Lu 2004-07-16 18:38:53 UTC
I think we should have 2 versions. One uses posix_memalign and the other one
doesn't. A target can pick the best one.
Comment 11 Tanguy Fautré 2004-07-20 15:19:04 UTC
> I think we should have 2 versions. One uses posix_memalign and the other one
> doesn't. A target can pick the best one.

Agreed.
As for the "other one", I like the _aligned_malloc implementation posted by
Danny, even though it's using more memory (from 32 to 64 bits) than the marker idea.
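
The "32 to 64 bits" of overhead mentioned above is consistent with a scheme that stores the original malloc pointer in the pointer-sized slot just below the aligned block. The following is a sketch under that assumption only. It is not taken from the mingw submission, and the names header_malloc and header_free are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical name: aligned allocation that keeps the original pointer
   in the slot immediately below the returned block.
   Assumptions: align is a nonzero power of two, and malloc's result is
   aligned to at least sizeof(void *). */
static void *header_malloc(size_t size, size_t align)
{
    void *base;
    uintptr_t aligned;

    if (align == 0 || (align & (align - 1)) != 0)
        return NULL;
    /* Guarantee room for one pointer below the aligned address. */
    if (align < 2 * sizeof(void *))
        align = 2 * sizeof(void *);
    if (size > SIZE_MAX - align)        /* size + align would overflow */
        return NULL;

    base = malloc(size + align);
    if (base == NULL)
        return NULL;

    /* Next multiple of align strictly above base. */
    aligned = ((uintptr_t)base + align) & ~((uintptr_t)align - 1);
    assert(aligned - (uintptr_t)base >= sizeof(void *));

    ((void **)aligned)[-1] = base;      /* remember what malloc returned */
    return (void *)aligned;
}

/* Hypothetical name: read the saved pointer back and free it. */
static void header_free(void *ptr)
{
    if (ptr != NULL)
        free(((void **)ptr)[-1]);
}
```

Compared with the marker idea, freeing is a single load instead of a backward scan, at the cost of one stored pointer per allocation (32 bits on a 32-bit target, 64 bits on a 64-bit one).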

Comment 12 H.J. Lu 2004-07-20 19:51:36 UTC
Created attachment 6787
An incomplete patch

This patch provides the posix_memalign version of <mm_malloc.h>.
Someone can fill in the generic version.
Comment 13 H.J. Lu 2004-07-21 21:41:03 UTC
Created attachment 6794
An updated patch

I fixed a few minor problems in my previous patch.
Comment 14 Danny Smith 2004-07-23 13:04:36 UTC
Created attachment 6808
Generic i386 _mm_malloc
Comment 15 H.J. Lu 2004-07-23 15:36:24 UTC
Created attachment 6812
A complete patch for _mm_malloc/_mm_free

Please give it a try. Can someone please come up with a testcase? Thanks.
Comment 16 H.J. Lu 2004-08-02 16:56:07 UTC
An updated patch with testcase is posted at

http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00053.html
Comment 17 GCC Commits 2004-08-03 19:52:54 UTC
Subject: Bug 16570

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	hjl@gcc.gnu.org	2004-08-03 19:52:52

Modified files:
	gcc            : ChangeLog config.gcc 
	gcc/config/i386: xmmintrin.h 
	gcc/testsuite  : ChangeLog 
Added files:
	gcc/config/i386: gmm_malloc.h pmm_malloc.h t-gmm_malloc 
	                 t-pmm_malloc 
	gcc/testsuite/gcc.dg: i386-sse-9.c 

Log message:
	gcc/
	
	2004-08-03  H.J. Lu  <hongjiu.lu@intel.com>
	
	PR target/16570
	* config.gcc (i[34567]86-*-* | x86_64-*-*): Add i386/t-gmm_malloc
	to tmake_file.
	(i[34567]86-*-linux*aout* | i[34567]86-*-linux*libc1): Likewise.
	(i[34567]86-*-linux* | x86_64-*-linux*): Add i386/t-pmm_malloc
	to tmake_file.
	
	* config/i386/t-gmm_malloc: New file.
	* config/i386/t-pmm_malloc: Likewise.
	
	* config/i386/xmmintrin.h: Include <mm_malloc.h>.
	
	2004-08-03  H.J. Lu  <hongjiu.lu@intel.com>
	Tanguy Fautré <tfautre@pandora.be>
	
	* config/i386/pmm_malloc.h: New file.
	
	2004-08-03  Danny Smith  <dannysmith@users.sourceforge.net>
	
	* config/i386/gmm_malloc.h: New file.
	
	gcc/testsuite/
	
	2004-08-03  H.J. Lu  <hongjiu.lu@intel.com>
	
	PR target/16570
	* gcc.dg/i386-sse-9.c: New test.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.4780&r2=2.4781
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config.gcc.diff?cvsroot=gcc&r1=1.475&r2=1.476
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/gmm_malloc.h.diff?cvsroot=gcc&r1=NONE&r2=1.1
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/pmm_malloc.h.diff?cvsroot=gcc&r1=NONE&r2=1.1
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/t-gmm_malloc.diff?cvsroot=gcc&r1=NONE&r2=1.1
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/t-pmm_malloc.diff?cvsroot=gcc&r1=NONE&r2=1.1
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/xmmintrin.h.diff?cvsroot=gcc&r1=1.29&r2=1.30
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/ChangeLog.diff?cvsroot=gcc&r1=1.4088&r2=1.4089
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/gcc.dg/i386-sse-9.c.diff?cvsroot=gcc&r1=NONE&r2=1.1

Comment 18 Andrew Pinski 2004-08-03 20:33:33 UTC
Fixed.