[PATCH v4 01/10] Initial TI PRU GCC port

Dimitar Dimitrov dimitar@dinux.eu
Tue Sep 25 05:16:00 GMT 2018


On Monday, 9/24/2018 11:38:23 EEST Richard Sandiford wrote:
> Dimitar Dimitrov <dimitar@dinux.eu> writes:
> > On Thursday, 9/13/2018 13:02:21 EEST Richard Sandiford wrote:
> >> Dimitar Dimitrov <dimitar@dinux.eu> writes:
> >> > +/* Callback for walk_gimple_seq that checks TP tree for TI ABI
> >> > compliance.  */ +static tree
> >> > +check_op_callback (tree *tp, int *walk_subtrees, void *data)
> >> > +{
> >> > +  struct walk_stmt_info *wi = (struct walk_stmt_info *) data;
> >> > +
> >> > +  if (RECORD_OR_UNION_TYPE_P (*tp) || TREE_CODE (*tp) ==
> >> > ENUMERAL_TYPE)
> >> > +    {
> >> > +      /* Forward declarations have NULL tree type.  Skip them.  */
> >> > +      if (TREE_TYPE (*tp) == NULL)
> >> > +	return NULL;
> >> > +    }
> >> > +
> >> > +  /* TODO - why C++ leaves INTEGER_TYPE forward declarations around? 
> >> > */
> >> > +  if (TREE_TYPE (*tp) == NULL)
> >> > +    return NULL;
> >> > +
> >> > +  const tree type = TREE_TYPE (*tp);
> >> > +
> >> > +  /* Direct function calls are allowed, obviously.  */
> >> > +  if (TREE_CODE (*tp) == ADDR_EXPR && TREE_CODE (type) ==
> >> > POINTER_TYPE)
> >> > +    {
> >> > +      const tree ptype = TREE_TYPE (type);
> >> > +      if (TREE_CODE (ptype) == FUNCTION_TYPE)
> >> > +	return NULL;
> >> > +    }
> >> 
> >> This seems like a bit of a dangerous heuristic.  Couldn't it also cause
> >> 
> >> us to skip things like:
> >>    (void *) func
> >> 
> >> ?  (OK, that's dubious C, but we do support it.)
> > 
> > The cast yields a "data pointer", which is 32 bits for both types of ABI.
> > Hence it is safe to skip "(void *) func".
> > 
> > The TI ABI's 16 bit function pointers become a problem when they change
> > the
> > layout of structures and function argument registers.
> 
> OK.  The reason this stood out is that the code doesn't obviously match
> the comment.  If the code is just trying to skip direct function calls,
> I think the gcall sequence I mentioned would be more obvious, and would
> match the existing comment.  If anything that takes the address of a
> function is OK then it might be worth changing the comment to include that.
I will use your suggested gcall snippet since it is safe and obvious. The 
comment matches my original intent.


> >> > +/* Implement TARGET_PREFERRED_RELOAD_CLASS.  */
> >> > +static reg_class_t
> >> > +pru_preferred_reload_class (rtx x ATTRIBUTE_UNUSED, reg_class_t
> >> > regclass)
> >> > +{
> >> > +  return regclass == NO_REGS ? GENERAL_REGS : regclass;
> >> > +}
> >> 
> >> This looks odd: PREFERRED_RELOAD_CLASS should return a subset of the
> >> original class rather than add new registers.  Can you remember why
> >> it was needed?
> > 
> > I'm not sure what target code is supposed to return when NO_REGS is passed
> > here. I saw how other ports handle NO_REGS (c-sky, m32c, nios2, rl78). So
> > I
> > followed suite.
> > 
> > Here is a backtrace of LRA when NO_REGS is passed:
> > 0xf629ae pru_preferred_reload_class
> > 
> >         ../../gcc/gcc/config/pru/pru.c:788
> > 
> > 0xa3d6e8 process_alt_operands
> > 
> >         ../../gcc/gcc/lra-constraints.c:2600
> > 
> > 0xa3ef7d curr_insn_transform
> > 
> >         ../../gcc/gcc/lra-constraints.c:3889
> > 
> > 0xa4301e lra_constraints(bool)
> > 
> >         ../../gcc/gcc/lra-constraints.c:4906
> > 
> > 0xa2726c lra(_IO_FILE*)
> > 
> >         ../../gcc/gcc/lra.c:2446
> > 
> > 0x9c97a9 do_reload
> > 
> >         ../../gcc/gcc/ira.c:5469
> > 
> > 0x9c97a9 execute
> > 
> >         ../../gcc/gcc/ira.c:5653
> 
> I think it should just pass NO_REGS through unmodified (which means
> spill to memory).  In some ways it would be good if it didn't have to
> handle this case, but again that's going to be work to fix.
> 
> The RA will pass ALL_REGS if it can handle any register, and wants to
> know what the target would prefer.  But for any input, the hook needs
> to stick within the registers it was given.

Thanks for the clarification. I will remove the PRU hook and will rely on the 
default implementation (i.e. return the given rclass).

> >> > +; LRA cannot cope with clobbered op2, hence the scratch register.
> >> > +(define_insn "ashr<mode>3"
> >> > +  [(set (match_operand:QISI 0 "register_operand"	    "=&r,r")
> >> > +	  (ashiftrt:QISI
> >> > +	    (match_operand:QISI 1 "register_operand"	    "0,r")
> >> > +	    (match_operand:QISI 2 "reg_or_const_1_operand"  "r,P")))
> >> > +   (clobber (match_scratch:QISI 3			    "=r,X"))]
> >> > +  ""
> >> > +  "@
> >> > +   mov %3, %2\;ASHRLP%=:\;qbeq ASHREND%=, %3, 0\; sub %3, %3, 1\;
> >> > lsr\\t%0, %0, 1\; qbbc ASHRLP%=, %0, (%S0 * 8) - 2\; set %0, %0, (%S0 *
> >> > 8) - 1\; jmp ASHRLP%=\;ASHREND%=: +   lsr\\t%0, %1, 1\;qbbc LSIGN%=,
> >> > %0,
> >> > (%S0 * 8) - 2\;set %0, %0, (%S0 * 8) - 1\;LSIGN%=:" +  [(set_attr
> >> > "type"
> >> > "complex,alu")
> >> > +   (set_attr "length" "28,4")])
> >> 
> >> What do you mean by LRA not coping?  What did you try originally and
> >> what went wrong?
> > 
> > Better assembly could be generated if the shift count register was used
> > for a> 
> > loop counter. Pseudo code:
> >    while (op2--)
> >    
> >      op0 >>= 1;
> > 
> > I originally tried to clobber operand 2:
> >  (define_insn "ashr<mode>3"
> >  
> >    [(set (match_operand:QISI 0 "register_operand" "=&r,r")
> >    
> >          (ashiftrt:QISI
> >          
> >            (match_operand:QISI 1 "register_operand" "0,r")
> >            (match_operand:QISI 2 "reg_or_const_1_operand" "+r,P"))
> >           
> >           )]
> > 
> > But with the above pattern operand 2 was not clobbered. Its value was
> > deemed untouched (i.e. live across the pattern). So I came up with 
> > clobbering a separate register to fix this, at the expense of slightly
> > bigger generated code.
> 
> Ah, yeah, "+" has to be on an output operand (the destination of a set
> or clobber).
> 
> In this kind of situation you might be better off expanding the loop
> as individual RTL instructions, which is what several targets do for
> things like memmove.  But if you'd prefer to keep a single pattern,
> you could use:
> 
>   [(set (match_operand:QISI 0 "register_operand" "=r")
> 	(ashiftrt:QISI
> 	  (match_operand:QISI 2 "register_operand" "0")
> 	  (match_operand:QISI 3 "register_operand" "1")))
>    (set (match_operand:QISI 1 "register_operand" "=r")
> 	(const_int -1))]
> 
> (assuming the optimised loop does leave operand 2 containing all-1s).
> 
> The shift by 1 would then need to be a separate pattern, with the
> define_expand choosing between them.
> 

Expanding the loop works like a charm. Thanks for the tip. I will include the 
rework in the next patch revision.


> >> Despite the long reply (sorry), this looks in really good shape to me
> >> FWIW.
> >> Thanks for the submission.
> > 
> > I take this sentence as an indication that the PRU port is welcome for
> > inclusion into GCC. I'll work on fixing the comments and will send a new
> > patch version.
> 
> Well, process-wise, it's up to the steering committee to decide whether
> to accept the port.  But one of the main requirements is of course
> reviewing the patches.
> 
> Which of the other patches in the series still need review?  I think Jeff
> approved some of them earlier.
Rest of patches are approved:
https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01448.html
https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01152.html

Thanks,
Dimitar




More information about the Gcc-patches mailing list