Summary: | [5 Regression] Bswap load miscompilation | ||
---|---|---|---|
Product: | gcc | Reporter: | Jakub Jelinek <jakub> |
Component: | tree-optimization | Assignee: | Jakub Jelinek <jakub> |
Status: | RESOLVED FIXED | ||
Severity: | normal | Keywords: | wrong-code |
Priority: | P1 | ||
Version: | 5.0 | ||
Target Milestone: | 5.0 | ||
Host: | Target: | ||
Build: | Known to work: | ||
Known to fail: | Last reconfirmed: | 2015-02-25 00:00:00 | |
Attachments: |
gcc5-pr65215.patch
gcc5-pr65215.patch |
Description
Jakub Jelinek
2015-02-25 22:48:17 UTC
Created attachment 34879 [details] gcc5-pr65215.patch Untested fix. There are still issues left, e.g. I can't understand the "bswap &&" part in if (bswap && align < GET_MODE_ALIGNMENT (TYPE_MODE (load_type)) && SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align)) return false; Don't you use the new MEM_REF even for the !bswap (aka nop) case? So, I don't see how it would be safe to generate that. And the testsuite coverage of this is definitely suboptimal, from endianity POV, bitfields etc. I mean something like: unsigned int foo (unsigned char *p) { return ((unsigned int) p[0] << 24) | (p[1] << 16) | (p[2] << 8) | p[3]; } on strict align big endian machines, let me add another testcase. Created attachment 34880 [details] gcc5-pr65215.patch Updated patch that performs the alignment check unconditionally. (In reply to Jakub Jelinek from comment #1) > Created attachment 34879 [details] > gcc5-pr65215.patch > > Untested fix. There are still issues left, e.g. I can't understand the > "bswap &&" part in > if (bswap > && align < GET_MODE_ALIGNMENT (TYPE_MODE (load_type)) > && SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align)) > return false; > Don't you use the new MEM_REF even for the !bswap (aka nop) case? So, I > don't see how it would be safe to generate that. > And the testsuite coverage of this is definitely suboptimal, from endianity > POV, bitfields etc. I suggested that change (remove bswap &&) multiple times, but it got lost appearantly. (I even remember applying that change myself!?) (In reply to Richard Biener from comment #4) > (In reply to Jakub Jelinek from comment #1) > > Created attachment 34879 [details] > > gcc5-pr65215.patch > > > > Untested fix. There are still issues left, e.g. I can't understand the > > "bswap &&" part in > > if (bswap > > && align < GET_MODE_ALIGNMENT (TYPE_MODE (load_type)) > > && SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align)) > > return false; > > Don't you use the new MEM_REF even for the !bswap (aka nop) case? So, I > > don't see how it would be safe to generate that. > > And the testsuite coverage of this is definitely suboptimal, from endianity > > POV, bitfields etc. > > I suggested that change (remove bswap &&) multiple times, but it got lost > appearantly. (I even remember applying that change myself!?) I remember the contrary as this was the reason PR61320 was investigated in the first place. This idea is that if it's unaligned the expander will take care of this and that they would be at least as efficient as what the user code was doing to avoid doing an unaligned load. I'm happy to revisit that position though. Best regards, Thomas I can certainly remove that hunk from the patch, if the expander and other passes handle it well. The test can stay I guess. (In reply to Jakub Jelinek from comment #6) > I can certainly remove that hunk from the patch, if the expander and other > passes handle it well. The test can stay I guess. Things are at least working on x86 (obviously), ARM and SPARC (after PR61320's fix). Also this code is there since a long time without any bug report problems due to unalignment. One question about the patch: is there a reason not to use n->range instead of GET_MODE_BITSIZE (TYPE_MODE (load_type))? Thanks for finding and fixing this. Best regards. (In reply to Thomas Preud'homme from comment #7) > (In reply to Jakub Jelinek from comment #6) > > I can certainly remove that hunk from the patch, if the expander and other > > passes handle it well. The test can stay I guess. > > Things are at least working on x86 (obviously), ARM and SPARC (after > PR61320's fix). Also this code is there since a long time without any bug > report problems due to unalignment. Ok. > One question about the patch: is there a reason not to use n->range instead > of GET_MODE_BITSIZE (TYPE_MODE (load_type))? Supposedly that can be used too, that should probably always be the case. (In reply to Thomas Preud'homme from comment #5) > (In reply to Richard Biener from comment #4) > > (In reply to Jakub Jelinek from comment #1) > > > Created attachment 34879 [details] > > > gcc5-pr65215.patch > > > > > > Untested fix. There are still issues left, e.g. I can't understand the > > > "bswap &&" part in > > > if (bswap > > > && align < GET_MODE_ALIGNMENT (TYPE_MODE (load_type)) > > > && SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align)) > > > return false; > > > Don't you use the new MEM_REF even for the !bswap (aka nop) case? So, I > > > don't see how it would be safe to generate that. > > > And the testsuite coverage of this is definitely suboptimal, from endianity > > > POV, bitfields etc. > > > > I suggested that change (remove bswap &&) multiple times, but it got lost > > appearantly. (I even remember applying that change myself!?) > > > I remember the contrary as this was the reason PR61320 was investigated in > the first place. This idea is that if it's unaligned the expander will take > care of this and that they would be at least as efficient as what the user > code was doing to avoid doing an unaligned load. Ah indeed. The idea is that the expander should be able to produce the most optimal and canonical form for an unaligned load on those targets, especially for say loading a 4-byte value from a 2-byte aligned source where the source had 4 individual char loads. > I'm happy to revisit that position though. No - I just stand corrected. > Best regards, > > Thomas (In reply to Richard Biener from comment #9) > (In reply to Thomas Preud'homme from comment #5) > > (In reply to Richard Biener from comment #4) > > > (In reply to Jakub Jelinek from comment #1) > > > > Created attachment 34879 [details] > > > > gcc5-pr65215.patch > > > > > > > > Untested fix. There are still issues left, e.g. I can't understand the > > > > "bswap &&" part in > > > > if (bswap > > > > && align < GET_MODE_ALIGNMENT (TYPE_MODE (load_type)) > > > > && SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align)) > > > > return false; > > > > Don't you use the new MEM_REF even for the !bswap (aka nop) case? So, I > > > > don't see how it would be safe to generate that. > > > > And the testsuite coverage of this is definitely suboptimal, from endianity > > > > POV, bitfields etc. > > > > > > I suggested that change (remove bswap &&) multiple times, but it got lost > > > appearantly. (I even remember applying that change myself!?) And what I changed was Index: tree-ssa-math-opts.c =================================================================== --- tree-ssa-math-opts.c (revision 212067) +++ tree-ssa-math-opts.c (revision 212068) @@ -2179,7 +2179,9 @@ bswap_replace (gimple cur_stmt, gimple_s unsigned align; align = get_object_alignment (src); - if (bswap && SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align)) + if (bswap + && align < GET_MODE_ALIGNMENT (TYPE_MODE (load_type)) + && SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align)) return false; gsi_move_before (&gsi, &gsi_ins); > > > > I remember the contrary as this was the reason PR61320 was investigated in > > the first place. This idea is that if it's unaligned the expander will take > > care of this and that they would be at least as efficient as what the user > > code was doing to avoid doing an unaligned load. > > Ah indeed. The idea is that the expander should be able to produce the > most optimal and canonical form for an unaligned load on those targets, > especially for say loading a 4-byte value from a 2-byte aligned source > where the source had 4 individual char loads. > > > I'm happy to revisit that position though. > > No - I just stand corrected. > > > Best regards, > > > > Thomas Author: jakub Date: Thu Feb 26 21:01:59 2015 New Revision: 221033 URL: https://gcc.gnu.org/viewcvs?rev=221033&root=gcc&view=rev Log: PR tree-optimization/65215 * tree-ssa-math-opts.c (find_bswap_or_nop_load): Return false for PDP endian targets. (perform_symbolic_merge, find_bswap_or_nop_1, find_bswap_or_nop): Fix up formatting issues. (bswap_replace): Likewise. For BYTES_BIG_ENDIAN, if the final access size is smaller than the original, adjust MEM_REF offset by the difference of sizes. Use is_gimple_mem_ref_addr instead of is_gimple_min_invariant test to avoid adding address temporaries. * gcc.c-torture/execute/pr65215-1.c: New test. * gcc.c-torture/execute/pr65215-2.c: New test. * gcc.c-torture/execute/pr65215-3.c: New test. * gcc.c-torture/execute/pr65215-4.c: New test. * gcc.c-torture/execute/pr65215-5.c: New test. Added: trunk/gcc/testsuite/gcc.c-torture/execute/pr65215-1.c trunk/gcc/testsuite/gcc.c-torture/execute/pr65215-2.c trunk/gcc/testsuite/gcc.c-torture/execute/pr65215-3.c trunk/gcc/testsuite/gcc.c-torture/execute/pr65215-4.c trunk/gcc/testsuite/gcc.c-torture/execute/pr65215-5.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-math-opts.c Fixed. |