This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Defer address legitimization for expanded ARRAY_REF, COMPONENT_REF, etc. til the final address is computed
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Yufeng Zhang <Yufeng dot Zhang at arm dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 22 Nov 2013 14:48:30 +0100
- Subject: Re: [PATCH] Defer address legitimization for expanded ARRAY_REF, COMPONENT_REF, etc. til the final address is computed
- Authentication-results: sourceware.org; auth=none
- References: <528F5565 dot 60701 at arm dot com>
On Fri, Nov 22, 2013 at 2:00 PM, Yufeng Zhang <Yufeng.Zhang@arm.com> wrote:
> Hi,
>
> Currently the address legitimization (by calling memory_address_addr_space)
> is carried out twice during the RTL expansion of ARRAY_REF, COMPONENT_REF,
> etc. when their OFFSET is not NULL. It is done once for the BASE and once
> for the summed address in offset_address. This may cause part, if not all,
> of the generated BASE RTL to be forced into reg(s), preventing the RTL
> generator from carrying out effective re-association across BASE and OFFSET
> (via simplify_gen_binary).
>
> For example, given the following test case:
>
> typedef int arr_2[20][20];
> void foo (arr_2 a2, int i, int j)
> {
> a2[i+10][j] = 1;
> }
>
> the RTL code for the BASE (i.e. a2[i+10]) on arm (-mcpu=cortex-a15, -mthumb)
> is
TER makes us see *(a2_5(D) + i * 80 + 800)[j_8] here
> * before the legitimization of BASE:
>
> (plus:SI (plus:SI (mult:SI (reg/v:SI 115 [ i ])
> (const_int 80 [0x50]))
> (reg/v/f:SI 114 [ a2 ]))
> (const_int 800 [0x320]))
> (reg/f:SI 122)
>
> * after the legitimization of BASE:
>
> (reg/f:SI 122)
>
> "Thanks to" the initial legitimization, the RTL for the final address is
> turned into:
>
> (plus:SI (mult:SI (reg/v:SI 116 [ j ])
> (const_int 4 [0x4]))
> (reg/f:SI 122))
>
> while with the legitimization deferred, the RTL for the final address could
> be:
>
> (plus:SI (plus:SI (plus:SI (mult:SI (reg/v:SI 115 [ i ])
> (const_int 80 [0x50]))
> (mult:SI (reg/v:SI 116 [ j ])
> (const_int 4 [0x4])))
> (reg/v/f:SI 114 [ a2 ]))
> (const_int 800 [0x320]))
>
> which has more complete info in the RTL and is much more canonicalized;
> later on it could open up more opportunities for CSE.
>
> The effect of this duplicated legitimization effort varies across different
> targets, as it is strongly related to the available addressing modes on a
> target. On RISC machines where in general there are fewer addressing modes
> (which are in general less complicated as well), the RTL code quality can be
> affected more adversely.
>
> The patch passes bootstrapping on arm and x86_64 and regtest on
> arm-none-eabi, aarch64-none-elf and x86_64. There is no regression in
> spec2000 on arm or x86_64.
>
> OK for the mainline?
But this patch makes the path through expansion even harder
to follow ... :/
I wonder if something along a "validate pass" after expansion
would be a better solution (and thus never validate during expansion
itself).
That is, expand is a beast - don't add to it, try to simplify it instead ;)
Thanks,
Richard.
> Thanks,
> Yufeng
>
>
> gcc/
>
> * cfgexpand.c (expand_call_stmt): Update the call to
> expand_expr_real_1.
> * expr.c (expand_assignment): Add new local variable validate_p and
> set
> it; call expand_expr or expand_expr_nv depending on validate_p for
> to_rtx; call adjust_address_1 instead of adjust_address.
> (store_expr): Update the call to expand_expr_real.
> (expand_expr_real): Add new parameter 'validate_p'; update the call
> to
> expand_expr_real_1.
> (expand_expr_real_1): Add new parameter 'validate_p'; update the
> call
> to expand_expr_real; depending on validate_p, call
> memory_address_addr_space or convert_memory_address_addr_space;
> likewise for expand_expr or expand_expr_nv; call adjust_address_1
> instead of adjust_address.
> * expr.h (expand_expr_real): Update.
> (expand_expr_real_1): Update.
> (expand_expr): Update.
> (expand_expr_nv): New function.
> (expand_normal): Update.