This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/53712] Does not combine unaligned load with _mm_cmpistri, redundant instruction at -O0
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 18 Jun 2012 08:45:51 +0000
- Subject: [Bug target/53712] Does not combine unaligned load with _mm_cmpistri, redundant instruction at -O0
- Auto-submitted: auto-generated
- References: <bug-53712-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53712
Richard Guenther <rguenth at gcc dot gnu.org> changed:
               What|Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2012-06-18
                 CC|                            |uros at gcc dot gnu.org
     Ever Confirmed|0                           |1
            Summary|SEGV in generated code for  |Does not combine unaligned
                   |_mm_cmpistri with unaligned |load with _mm_cmpistri,
                   |operand when using -O0      |redundant instruction at
                   |                            |-O0
      Known to fail|                            |4.8.0
--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-06-18 08:45:51 UTC ---
You have a load from an unaligned address in the _mm_cmpistri arguments:
  *(const __m128i *)(s1)
Dereferencing a __m128i pointer requires 16-byte alignment, and s1 is not
properly aligned.
At -O0 _mm_cmpistri is a macro, while with optimization it is an inline
function. I am not sure where the pcmpistrm instruction comes from.
Using
#include <nmmintrin.h>
#include <stdio.h>

int test( const char* s1, const char * s2 )
{
  __m128i s1chars = _mm_loadu_si128( (const __m128i *) s2 );
  __m128i s2chars = _mm_loadu_si128( (const __m128i *) (s1));
  return _mm_cmpistri( s1chars, s2chars, _SIDD_CMP_EQUAL_ANY );
}

int main( int argc, char * argv[] )
{
  const char* s1 = "1234567890b1234567890";
  const char* s2 = "abcdefghijklmnop";
  int r = test( s1, s2 );
  fprintf( stderr, "\nResult: %d", r );
  r = test( s1, s2+1 ); // misaligned s2
  fprintf( stderr, "\nResult: %d", r );
  return 0;
}
the testcase works as expected, though still with the "redundant"(?)
instruction. Thus your source is invalid, but the missed optimization looks
odd (though it only occurs at -O0). GCC also fails to combine the
unaligned load into the cmpistri instruction.
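
For reference when checking what "works as expected" means, below is a rough scalar model of what _mm_cmpistri computes with _SIDD_CMP_EQUAL_ANY and otherwise default flags (unsigned bytes, least significant index), as I read the Intel description. The function names are hypothetical and the model is only a sketch: the first operand is treated as a character set, the second as the string to search, each ending at the first NUL within 16 bytes, and the result is the index of the first byte of the string that occurs in the set, or 16 if there is none.

```c
#include <assert.h>
#include <stddef.h>

/* Implicit length of a 16-byte operand: bytes before the first NUL, at most 16.
   Unlike the real instruction, this stops reading at the NUL. */
static int implicit_len(const char *p)
{
    int n = 0;
    while (n < 16 && p[n] != '\0')
        n++;
    return n;
}

/* Hypothetical scalar model of _mm_cmpistri(set, text, _SIDD_CMP_EQUAL_ANY):
   index of the first valid byte of text that equals any valid byte of set,
   or 16 when no byte matches. */
int cmpistri_equal_any_model(const char *set, const char *text)
{
    assert(set != NULL && text != NULL);
    int set_len = implicit_len(set);
    int text_len = implicit_len(text);
    for (int j = 0; j < text_len; j++)
        for (int i = 0; i < set_len; i++)
            if (text[j] == set[i])
                return j;
    return 16;
}
```

Note that test() above passes the bytes of s2 as the first operand and the bytes of s1 as the second, so under this model both calls in main() would give 10, the position of the 'b' in s1.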