This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [RS6000] PR60737, expand_block_clear uses word stores


On Fri, May 2, 2014 at 6:20 AM, Alan Modra <amodra@gmail.com> wrote:
> In cases where the compiler has no alignment info, powerpc64le-linux
> gcc generates byte at a time copies for -mstrict-align (which is on
> for little-endian power7).  That's awful code, a problem shared by
> other strict-align targets, see pr50417.  However, we also have a case
> when -mno-strict-align generates less than ideal code, which I believe
> stems from using alignment as a proxy for testing an address offset.
> See http://gcc.gnu.org/ml/gcc-patches/1999-09n/msg01072.html.
>
> So my first attempt at fixing this problem looked at address offsets
> directly.  That worked fine too, but on thinking some more, I believe
> we no longer have the movdi restriction.  Nowadays we'll reload the
> address if we have an offset that doesn't satisfy the "Y" constraint
> (ie. a multiple of 4 offset).  Which led to this simpler patch.
> Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux
> and powerpc-linux.  OK to apply?

Hi, Alan

Thanks for finding and addressing this.

As you mention, recent server-class processors, at least POWER8, do
not suffer performance degradation for common misaligned loads and
stores of wider modes. But the patch should not impose this default on
the large installed base of processors, where misaligned loads can
incur a severe performance penalty. This heuristic has become
processor-dependent and should not be hard-coded in the block_move and
block_clear algorithms.

PROCESSOR_DEFAULT is POWER8 for ELFv2 (and should be updated as the
default for PowerLinux in general). Please update the patch to test
rs6000_cpu, probably through another boolean flag set in
rs6000_option_override_internal(). Because of the processor defaults,
the preferred instruction sequence will then be the default, without
encoding an assumption about the heuristics in the algorithm itself.
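
As a rough illustration of the shape of that change (a standalone
sketch, not code from the patch or from the rs6000 back end), the
processor test can be reduced to one flag computed during option
processing and consulted when choosing a store width. Every sketch_*
name below is hypothetical, the CPU list is a stand-in for the real
processor enumeration, and -mstrict-align handling is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the back end's processor enumeration.  */
enum sketch_cpu
{
  SKETCH_POWER6,
  SKETCH_POWER7,
  SKETCH_POWER8,
  SKETCH_CPU_COUNT
};

/* Hypothetical tuning flag: true when misaligned wide loads and stores
   are assumed to be cheap on the selected processor.  */
static bool sketch_fast_unaligned_wide;

/* Stand-in for the point in option processing where the flag would be
   derived from the selected CPU.  Only POWER8 is treated as fast here,
   following the message above; that choice is an assumption.  */
static void
sketch_option_override (enum sketch_cpu cpu)
{
  assert (cpu < SKETCH_CPU_COUNT);
  sketch_fast_unaligned_wide = (cpu == SKETCH_POWER8);
}

/* Pick the store width in bytes for clearing a block whose known
   alignment is ALIGN_BYTES.  The algorithm consults the flag rather
   than hard-coding a processor assumption.  */
static unsigned
sketch_clear_store_bytes (unsigned align_bytes)
{
  if (sketch_fast_unaligned_wide || align_bytes >= 8)
    return 8;
  if (align_bytes >= 4)
    return 4;
  if (align_bytes >= 2)
    return 2;
  return 1;
}
```

With the flag set for the default processor, callers get the wide
sequence without the expander itself knowing which CPUs tolerate
misaligned accesses; other processors fall back to alignment-limited
stores.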

Thanks, David

