This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: using scratchpads to enhance RTL-level if-conversion: revised patch


On 10/08/2015 01:29 AM, Abe wrote:
Attached please find my revised patch to the RTL if converter.  This
patch enables the
if-conversion of half-hammocks with a store in them that the internal
GCC machinery
otherwise considers too hazardous to if-convert.  This is made safe by
using the
"scratchpad" technique, i.e. throwing away the store into a safe
location where nothing
of any importance is currently stored.  The scratchpads are allocated in
the stack frame.

So, one conceptual issue first. Obviously this increases the size of the stack frame, which makes the transformation more expensive. The patch does not appear to attempt to estimate costs. However, why do we need to allocate anything in the first place? If you want to store something that will be thrown away, just pick an address below the stack pointer. I think some ports may need different strategies due to stack bias or red zones, so a target hook is in order, with one safe default to fail, and one default implementation that can be used by most targets, and then specialized versions in target-dependent code where necessary:

rtx
default_get_scratchpad_fail (HOST_WIDE_INT size)
{
  return NULL_RTX;
}

rtx
default_get_scratchpad (HOST_WIDE_INT size)
{
  /* Possibly also take STACK_BOUNDARY into account so as to not
     make unaligned locations.  */
  if (size >= param (SCRATCHPAD_MAX_SIZE))
    return NULL_RTX;
  return plus_constant (stack_pointer_rtx, gen_int_mode (-size, Pmode));
}

With that, I think all the code to keep track of scratchpads can just be deleted.

There's this preexisting comment:

/* ??? This is overconservative. Storing to two different mems is
as easy as conditionally computing the address. Storing to a
single mem merely requires a scratch memory to use as one of the
destination addresses; often the memory immediately below the
stack pointer is available for this. */

suggesting that it ought to be possible to generalize the technique to stores to different addresses.


To the patch itself. The code still has many stylistic problems and does not follow the required guidelines.

+#include "diagnostic-color.h"

Why?


-/* Return true if a write into MEM may trap or fault.  */
+/* Return true if a write into MEM may trap or fault
+   even in the presence of scratchpad support.  */


+/* Return true if a write into MEM may trap or fault
+   without scratchpad support.  */

Please explain the rationale for these changes. What exactly is different with scratchpads?

+	  /* The next "if": quoting "noce_emit_cmove":
+	       If we can't create new pseudos, though, don't bother.  */
+	  if (reload_completed)
+	    return FALSE;
+
+	  if (optimize<2)
+	    return FALSE;
+
+	  if (optimize_function_for_size_p (cfun))
+	    return FALSE;
+
+	  if (targetm.have_conditional_execution () || ! HAVE_conditional_move)
+	    return FALSE;

Merge the conditions into one if. Watch spacing around operators.

+
+	  const bool not_a_scratchpad_candidate =
+	    noce_mem_write_may_trap_or_fault_p_1 (orig_x);
+	  if (! not_a_scratchpad_candidate)

The = should start a line, but what you really should do is just put the condition into the if and eliminate the variable.

+	      const size_t size_of_MEM = MEM_SIZE (orig_x);

Identifiers are still too verbose. This is typically just called size, or memsize if there are other sizes to keep track of.

+
+		for (rtx_insn* insn = BB_HEAD (then_bb); insn && insn != insn_a
+		  && insn != BB_END (then_bb); insn=NEXT_INSN (insn))
+		  {
+		     if (! (NOTE_INSN_BASIC_BLOCK_P (insn) || DEBUG_INSN_P (insn)))

There are six different coding style violations in this block. Please identify and fix them (elsewhere as well). In addition, I think it would be better to start each part of the for statement on its own line for clarity.

I still need to figure out what is going on in this insn-copying loop.

> +		/* Done copying the needed insns between the start of the
> +		   THEN block and the set of 'a', if any.  */
> +
> +		if (CONSTANT_P (XEXP (cond, 0)) && CONSTANT_P (XEXP (cond, 1)))
> +		  {
> +		    end_sequence ();
> +		    return FALSE;
> +		  }

This should be done earlier before you go to the effort of copying insns.

+		MEM_NOTRAP_P (mem) = true;

So I'm still not entirely sure which cases you are trying to optimize and which ones not, but couldn't this technique allow a trapping store here?


Bernd


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]