This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Use simplify_replace_rtx rather than wrap_constant


Bernd Schmidt <bernds_cb1@t-online.de> writes:
> Richard Sandiford wrote:
>
>> Do you think the changes are unsafe, or are you worried about the
>> performance impact?
>
> The latter (performance and memory usage), and I don't like the idea of
> having this change buried in a larger patch.  So, if the other changes
> depend on it, split it off and make sure it gets in first.
>
> I'd also like more information about which existing places you found
> that made you think this change is needed (wrt. MEMs or possibly invalid
> sharing).

OK, well, let's go through each caller in turn:

fwprop.c:forward_propagate_and_simplify()

    Here we're replacing a MEM with a constant.  The extra rtx_equal_p()
    is safe but redundant.  The copy_rtx() will be a no-op for sharable
    constants, but strictly speaking, it might be needed for unsharable
    constants.

gcse.c:try_replace_reg()

    Here's we're replacing a register with either a new register or
    a gcse_constant_p().  The extra rtx_equal_p() is an operational
    no-op because simplify_replace_rtx() already uses rtx_equal_p()
    for registers.  The copy_rtx() will be a no-op for registers.

    The gcse_constant_p() case is interesting because we have:

      /* Determine whether the rtx X should be treated as a constant for
         the purposes of GCSE's constant propagation.  */

      static bool
      gcse_constant_p (const_rtx x)
      {
        /* Consider a COMPARE of two integers constant.  */
        if (GET_CODE (x) == COMPARE
            && CONST_INT_P (XEXP (x, 0))
            && CONST_INT_P (XEXP (x, 1)))
          return true;

        /* Consider a COMPARE of the same registers is a constant
           if they are not floating point registers.  */
        if (GET_CODE(x) == COMPARE
            && REG_P (XEXP (x, 0)) && REG_P (XEXP (x, 1))
            && REGNO (XEXP (x, 0)) == REGNO (XEXP (x, 1))
            && ! FLOAT_MODE_P (GET_MODE (XEXP (x, 0)))
            && ! FLOAT_MODE_P (GET_MODE (XEXP (x, 1))))
          return true;

        /* Since X might be inserted more than once we have to take care that it
           is sharable.  */
        return CONSTANT_P (x) && (GET_CODE (x) != CONST || shared_const_p (x));
      }

    The first two cases look a little hackish, but strictly speaking,
    those COMPAREs aren't shareable.  We would need the copy_rtx()
    for them.  (Or I suppose we could just say "well, it shouldn't
    matter if we share those particular unsharable expressions,
    because they'll probably be reduced or rejected".  But that isn't
    being correct by design.)

    Also note that the comment above try_replace_reg() says:

      /* Try to replace all non-SET_DEST occurrences of FROM in INSN with TO.
         Returns nonzero is successful.  */

    so there's no guarantee that FROM occurs only once.

    In the usual case, we only consider sharable constants to be
    gcse_constant_p(), so the copy_rtx() will be a no-op.  But
    try_simplify_rtx() nevertheless has:

      /* Usually we substitute easy stuff, so we won't copy everything.
         We however need to take care to not duplicate non-trivial CONST
         expressions.  */
      to = copy_rtx (to);

    TBH, I think this is exactly the sort of thing that happens when you
    leave it up to the caller rather than the callee to handle sharing
    (which is after all a correctness decision).  Optimisation decisions
    like gcse_constant_p() shouldn't be affected by such an internal
    decision as sharability.

gcse.c:cprop_jump()

    The second call replaces a register with a gcse_constant_p(),
    so this case is the same as try_replace_reg().

    The first call replaces a setcc destination with a setcc source
    pattern, or with a REG_EQUAL note, if present.  This can create
    invalid sharing in cases where no REG_EQUAL note is present.
    There's also no guarantee that the register is used only once in the
    jump, although that is of course the usual case.  (Or again we could
    just say "well, it shouldn't matter if we share those particular
    expressions, because they'll probably be reduced or rejected".
    But that again isn't being correct by design.)

gcse.c:bypass_block()

    Like cprop_jump(), the first call replaces a setcc destination
    with a setcc source, while the second call replaces a register
    with a gcse_constant_p().  The same analysis applies.

loop-iv.c:replace_single_def_regs()

    This replaces a register with a function_invariant_p().
    These invariants include (plus (fp|ap) (const_int ...)), which isn't
    sharable, so a copy_rtx() would be needed here for each use of
    the register.  (Or each use minus 1 if we can guarantee that the
    original value is no longer needed.)  Again, there is no guarantee
    that the "from" register is used only once in the argument to
    simplify_replace_rtx(), so the decision shouldn't be left up
    to the caller.

loop-iv.c:replace_in_expr()

    This replaces a register with a simple_rhs_p().  These simple rhses
    include various kinds of unsharable binary operation, so the same
    copying requirements apply as for replace_single_def_regs().
    
loop-iv.c:implies_p()

    This function uses simplify_replace_rtx() to see whether something
    simplifies to "true".  Sharing isn't a concern here, but the
    function can create plenty of throw-away rtx when used in
    this way, regardless of whether we use copy_rtx().

loop-iv.c:simplify_using_condition()

    This function replaces a REG with a CONSTANT_P.  Some CONSTANT_Ps
    aren't sharable, so the copy here could be needed.  If the users
    of the returned value are mindful that the returned value could
    have invalid sharing, then I think there should be a comment
    to say so.  Otherwise the same "correctness by design" concerns
    apply as before.

In short, I think implies_p() is the only case where both the following apply:

  (a) copy_rtx() might create new rtl
  (b) the creation of that new rtl is always unnecessary

And like I say, the creation of _any_ new rtl in the implies_p() case
is unnecessary, so I don't think it's a good example.

Really, I think any assumption along the lines of "this sharing isn't
strictly correct, but it shouldn't matter in this case" is as bad as
the problem I'm trying to fix.  It feels like premature optimisation.

Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]