Finish up PR rtl-optimization/44194

H.J. Lu hjl.tools@gmail.com
Wed Sep 12 17:20:00 GMT 2012


On Wed, Sep 12, 2012 at 8:37 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
> This is the PR about the useless spilling to memory of structures that are
> returned in registers.  It was essentially addressed last year by Easwaran with
> an enhancement of the RTL DSE pass, but Easwaran also noted that we still spill
> to memory in the simplest cases, e.g. gcc.dg/pr44194-1.c, because expand_call
> creates a temporary on the stack to store the value returned in registers...
>
> The attached patch solves this problem by copying the value into pseudos
> instead by means of emit_group_move_into_temps.  This is sufficient to get rid
> of the remaining memory accesses for gcc.dg/pr44194-1.c on x86-64 for example,
> but not on strict-alignment platforms like SPARC64.
>
> The problem is that, on strict-alignment platforms, emit_group_store will use
> bitfield techniques (store_bit_field) to store the returned value, and the
> bitfield routines (store_bit_field and extract_bit_field) have these lines:
>
>   /* We may be accessing data outside the field, which means
>      we can alias adjacent data.  */
>   if (MEM_P (op0))
>     {
>       op0 = shallow_copy_rtx (op0);
>       set_mem_alias_set (op0, 0);
>       set_mem_expr (op0, 0);
>     }
>
> Now the enhancement implemented in the RTL DSE pass by Easwaran is precisely
> based on the MEM_EXPR of MEM objects.
>
> The patch solves this problem by implementing a variant of adjust_address along
> the lines of the comment at the end of adjust_address_1:
>
>   /* At some point, we should validate that this offset is within the object,
>      if all the appropriate values are known.  */
>   return new_rtx;
>
> i.e. adjust_bitfield_address will drop the underlying object of the MEM if it
> cannot prove that the adjusted memory access is still within its bounds.
> The bitfield manipulation routines in expmed.c are then changed to invoke
> adjust_bitfield_address instead of adjust_address and the above special lines
> in store_bit_field and extract_bit_field are eliminated.
>
> While I was at it, I also fixed a probable oversight in extract_bit_field_1
> that has bothered me for a while: in the multi-word case, extract_bit_field_1
> recurses on extract_bit_field instead of itself (unlike store_bit_field_1),
> which short-circuits the FALLBACK_P parameter.
>
> Tested on x86-64/Linux and SPARC64/Solaris.  Comments?
>
>
> 2012-09-12  Eric Botcazou  <ebotcazou@adacore.com>
>
>         PR rtl-optimization/44194
>         * calls.c (expand_call): In the PARALLEL case, copy the return value
>         into pseudos instead of spilling it onto the stack.
>         * emit-rtl.c (adjust_address_1): Rename ADJUST into ADJUST_ADDRESS and
>         add new ADJUST_OBJECT parameter.
>         If ADJUST_OBJECT is set, drop the underlying object if it cannot be
>         proved that the adjusted memory access is still within its bounds.
>         (adjust_automodify_address_1): Adjust call to adjust_address_1.
>         (widen_memory_access): Likewise.
>         * expmed.c (store_bit_field_1): Call adjust_bitfield_address instead
>         of adjust_address.  Do not drop the underlying object of a MEM.
>         (store_fixed_bit_field): Likewise.
>         (extract_bit_field_1): Likewise.  Fix oversight in recursion.
>         (extract_fixed_bit_field): Likewise.
>         * expr.h (adjust_address_1): Adjust prototype.
>         (adjust_address): Adjust call to adjust_address_1.
>         (adjust_address_nv): Likewise.
>         (adjust_bitfield_address): New macro.
>         (adjust_bitfield_address_nv): Likewise.
>         * expr.c (expand_assignment): Handle a PARALLEL in more cases.
>         (store_expr): Likewise.
>         (store_field): Likewise.
>
>         * dse.c: Fix typos in the head comment.
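
The bounds check described above for adjust_bitfield_address can be
sketched roughly as follows.  This is only a toy model, not the code
from the patch: struct toy_mem and toy_adjust_bitfield_address are
hypothetical names, byte offsets and sizes stand in for the real MEM
attributes, and the actual routine works on RTL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the attributes of a MEM; not GCC's real data
   structure.  */
struct toy_mem
{
  const char *expr;   /* underlying object, NULL if unknown */
  bool offset_known;  /* is OFFSET meaningful? */
  long offset;        /* byte offset of the access within EXPR */
  bool size_known;    /* is OBJ_SIZE meaningful? */
  long obj_size;      /* size in bytes of the object EXPR */
};

/* Return a copy of MEM moved by DELTA bytes for an access of
   ACCESS_SIZE bytes.  Keep the underlying object only if the new
   access provably stays inside it; otherwise drop the object.  */
struct toy_mem
toy_adjust_bitfield_address (struct toy_mem mem, long delta,
                             long access_size)
{
  struct toy_mem new_mem = mem;

  assert (access_size > 0);

  if (new_mem.offset_known)
    new_mem.offset += delta;

  bool in_bounds = (new_mem.expr != NULL
                    && new_mem.offset_known
                    && new_mem.size_known
                    && new_mem.offset >= 0
                    && access_size <= new_mem.obj_size - new_mem.offset);

  if (!in_bounds)
    {
      /* Cannot prove the access is within the object: forget it.  */
      new_mem.expr = NULL;
      new_mem.offset_known = false;
    }

  return new_mem;
}
```

With a 16-byte object, moving an 8-byte access to offset 8 keeps the
object, while moving it to offset 12 (or to a negative offset, or when
the object size is unknown) drops it.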

Will it help with these PRs?

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54315
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28831

Thanks.

-- 
H.J.


