This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: bswap PRs 69714, 67781
- From: Thomas Preud'homme <thomas dot preudhomme at foss dot arm dot com>
- To: Bernd Schmidt <bschmidt at redhat dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, John David Anglin <dave dot anglin at bell dot net>
- Date: Mon, 15 Feb 2016 11:47:00 +0800
- Subject: Re: bswap PRs 69714, 67781
- Authentication-results: sourceware.org; auth=none
- References: <56BE083B dot 3090806 at redhat dot com>
Hi Bernd,
First of all, my apologize for the late reply. I was in holidays the past week
to celebrate Chinese new year.
On Friday, February 12, 2016 05:28:43 PM Bernd Schmidt wrote:
> PR69714 is an issue where the bswap pass makes an incorrect
> transformation on big-endian targets. The source has a 32-bit bswap, but
> PA doesn't have a pattern for that. Still, we recognize that there is a
> 16-bit bswap involved, and generate code for that - loading the halfword
> at offset 2 from the original memory, as per the proper big-endian
> correction.
>
> The problem is that we recognized the rotation of the _high_ part, which
> is at offset 0 on big-endian. The symbolic number is 0x0304, rather than
> 0x0102 as it should be. Only the latter form should ever be matched.
Which is exactly what the patch for PR67781 was set out to do (see the if
(BYTES_BIG_ENDIAN) block in find_bswap_or_nop.
The reason why the offset is wrong is due to another if (BYTES_BIG_ENDIAN)
block in bswap_replace. I will check the testcase added with that latter
block, my guess is that the change was trying to fix a similar issue to
PR67781 and PR69714. When removing it the load in avcrc is done without an
offset. I should have run the full testsuite also on a big endian system
instead of a few selected testcases and a bootstrap in addition to the little
endian bootstrap+testsuite. Lesson learned.
> The
> problem is caused by the patch for PR67781, which was intended to solve
> a different big-endian problem. Unfortunately, I think it is based on an
> incorrect analysis.
>
> The real issue with the PR67781 testcase is in fact the masking loop,
> identified by Thomas in comment #7 for 67781.
>
> for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, rsize++)
> ;
> n->range = rsize;
>
> If we have a value of 0x01020304, but a range of 5, it means that
> there's an "invisible" high-order byte that we don't care about. On
> little-endian, we can just ignore it. On big-endian, this implies that
> the data we're interested in is located at an offset. The code that does
> the replacements does not use the offset or bytepos fields, it assumes
> that the bytepos always matches that of the load instruction.
Yes, but the change in find_bswap_or_nop aims at checking that we have
0x05040302 or 0x02030405 for big endian targets and 0x04030201 or 0x01020304
for little endian targets.
Before the "if (rsize < n->range)" block, cmpnop and cmpxchg are respectively
0x0504030201 and 0x0102030405. Then for big endian it will only keep the 4
least significant symbolic bytes of cmpxchg (if performs a bitwise and) and
the 4 most significant symbolic bytes of cmpnop (it performs a right shift) so
you'd get 0x05040302 for cmpnop and 0x02030405 for cmpxchg. Both would
translate to a load at offset 0, and then a byteswap for the latter.
As said earlier, the problem is in bswap_replace which tries to adjust the
address of the load for big endian targets by adding a load offset. With the
change in find_bswap_or_nop, an offset is never needed because only pattern
that correspond to a load at offset 0 are recognized. I kept for GCC 7 to
change that to allow offset and recognize all sub-load and sub-bswap.
> The only
> offset we can introduce is the big-endian correction, but that assumes
> we're always dealing with lowparts.
>
> So, I think the correct/conservative fix for both bugs is to revert the
> earlier change for PR67781, and then apply the following on top:
>
> --- revert.tree-ssa-math-opts.c 2016-02-12 15:22:57.098895058 +0100
> +++ tree-ssa-math-opts.c 2016-02-12 15:23:08.482228474 +0100
> @@ -2473,10 +2473,14 @@ find_bswap_or_nop (gimple *stmt, struct
> /* Find real size of result (highest non-zero byte). */
> if (n->base_addr)
> {
> - int rsize;
> + unsigned HOST_WIDE_INT rsize;
> uint64_t tmpn;
>
> for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER,
> rsize++);
> + if (BYTES_BIG_ENDIAN && n->range != rsize)
> + /* This implies an offset, which is currently not handled by
> + bswap_replace. */
> + return NULL;
> n->range = rsize;
> }
This works too yes with less optimizations for big endian. I'm fine with
either solutions. This one is indeed a bit more conservative so I see the
appeal to use it for GCC 5 and 6.
Best regards,
Thomas