This is the mail archive of the
mailing list for the GCC project.
Re: [RFC/RFT] Improving SMS by data dependence export
- From: "Daniel Berlin" <dberlin at dberlin dot org>
- To: "Alexander Monakov" <amonakov at ispras dot ru>
- Cc: gcc.gcc.gnu.org <gcc at gcc dot gnu dot org>, "Revital1 Eres" <ERES at il dot ibm dot com>, "Andrey Belevantsev" <abel at ispras dot ru>, "Ayal Zaks" <zaks at il dot ibm dot com>
- Date: Fri, 7 Dec 2007 15:49:28 -0500
- Subject: Re: [RFC/RFT] Improving SMS by data dependence export
- References: <email@example.com>
On 12/7/07, Alexander Monakov <firstname.lastname@example.org> wrote:
> Attached is the patch that allows to save dependence info obtained on tree
> level by data-reference analysis for usage on RTL level (for RTL memory
> disambiguation and dependence graph construction for modulo scheduling).
> It helps for RTL disambiguation on platforms without base+offset memory
> addressing modes, and impact on SMS is described below. We would like to
> see it in 4.4 mainline.
> We have tested this patch with modulo scheduling on ia64, using SPEC
> CPU2000 benchmark suite. It allows to apply software pipelining to more
> loops, resulting in ~1-2% speedup (compared to SMS without exported
> info). The most frequent improvements are removal of cross-iteration
> memory dependencies, as currently SMS adds such dependencies for all pair
> of memory references, even in cases when they cannot alias (for example,
> for different arrays or different fields of a struct). As I understand,
> SMS does not use RTL alias analysis here because pairs that do not alias
> within one iteration, but may alias when cross-iteration movement is
> performed (like a[i] and a[i+1]), should be marked as dependent. So, SMS
> data dependence analysis can be greatly improved even without
> data-dependence export patch by using RTL-like memory disambiguation, but
> without pointer arithmetic analysis.
> There are currently two miscompiled SPEC tests with this patch; in one of
> them, the problem is related to generation of register moves in the
> prologue of software pipelined loop (which was not pipelined without the
> patch). The problem is reported and discussed with Revital Eres from IBM
> We would like to ask people interested in SMS performance on PowerPC and
> Cell SPU to conduct tests with this patch. Any feedback is greatly
I see a few random unrelated changes, like, for example:
if (may_eliminate_iv (data, use, cand, &bound))
- elim_cost = force_var_cost (data, bound, &depends_on_elim);
- /* The bound is a loop invariant, so it will be only computed
- once. */
- elim_cost /= AVG_LOOP_NITER (data->current_loop);
+ elim_cost = force_var_cost (data, bound, &depends_on_elim);
elim_cost = INFTY;
Please pull these out into separate patches or don't do them :)
also, i see
+ /* We do not use operand_equal_p for ORIG_EXPRs because we need to
+ distinguish memory references at different points of the loop (which
+ would have different indices in SSA form, like a[i_1] and a[i_2], but
+ were later rewritten to same a[i]). */
+ && (p->orig_expr == q->orig_expr));
This doesn't do enough to distinguish memory references at different
points of the loop, while also eliminating from consideration that
*are* the same.
What if they are regular old VAR_DECL?
This will still return true, but they may be different accesses at
different points in the loop.
In any case, this doesn't belong in mem_attrs_htab_eq, because if they
are operand_equal_p, for purposes of memory attributes, they *are*
equal. They may still be different accesses, which is something you
have to discover later on.
IE You should be doing this check somewhere else, not in a hashtable
equality function :)
DDR will mark them as data refs
> Alexander Monakov