This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug rtl-optimization/29349] New: mode switching is inefficient both at compile time and at run time


Although my analysis of the i386 mode switching in PR 28764 was incorrect
(see PR target/29347), in my analysis in commnent #13 of PR target/28764
still stands with regard to superflous/unnecessarily repeated computation
that we do in the mode switching code, and additional optimizations that could
be done:

- make_preds_opaque is not needed.
- the pre_edge_lcm computation for all modes of an entity can be done
  simultanously.  Since pre_edge_lcm will not insert additional computations
  into a path, and different modes of the same entity kill each other, it
  won't result in multiple modes of the same entity being live at the same
  point.  Hence, a single pre_edge_lcm call is sufficient.
- The SH fpchg is currently not used.  What we are missing is knowledge
  when we have a transition from one known mode to another.  We can get this
  by adding new array pointer pointer parameters to pre_edge_lcm so that at
  the end, if these parameters are non-zero, instead of freeing the avin /
  avout arrays, it will assign the pointers to these arrays to the
  pointers pointed to be the new parameters.
  avin is useful for exception handlers that nominally start out in no_mode.
  If avin is set for the same mode as for the mode required at the start
  of an exception handler, the mode setting can be elided; this can be done
  in the machine-dependent code.  If avin is set for the opposite mode,
  fpchg can be used.  This can be implemented by making the
  machine-independent code pass any mode known from the matching avin
  as a parameter to EMIT_MODE_SET.
  Likewise, avout can be used to determine any known mode for an edge
  insertion.
- while pre_edge_lcm won't insert calculations in paths where they have not
  been originally, doing so can be beneficial considering the probability of
  paths and the different costs of set from a known mode as compared to a set
  from an unknown mode.
  For any transparent sub-graph, considering a mode anticipatable
  in this graph is beneficial if the sum of the frequencies of outgoing edges
  on which the mode is anticipatable, times the cost of setting from an
  unknown mode, plus the sum of the frequencies of outgoing edges on which any
  other mode is anticipatable, times the cost savings when setting from a
  known mode rather than an unknown mode, is larger than the sum of the
  products of frequencies of incoming edges on which the expression is
  not available, time the cost of setting the mode on this edge (considering
  any other available mode known there).

  So the main problem for a code motion pass that takes edge frequencies into
  account appears to be to find heuristics how to identify transparent
subgraphs
  which should be checked - a naiive algorithm to try all subsets of nodes
would
  be exponential. On the other hand, checking only subgraphs which are
  interconnected as a list and have no connection to other transparent nodes
  except in head and tail, and could be expanded at their tail (alternatively:
  at their head) only in the form of a longer list, can be done with a greedy
  algorithm which checks only O(number of nodes) nodes. (Traverse all nodes to
  find nodes which are tail / head of maximum lists, drop nodes at the bottom
if
  it increases the tally, for a positive tally, if adding a new node would
  decrease it, remember current list.  When current tally becomes negative or
  the list can't be extended, check for remembered list.)


-- 
           Summary: mode switching is inefficient both at compile time and
                    at run time
           Product: gcc
           Version: 4.2.0
            Status: UNCONFIRMED
          Keywords: missed-optimization, compile-time-hog
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: amylaar at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29349


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]