[PATCH v4 01/10] Initial TI PRU GCC port
Dimitar Dimitrov
dimitar@dinux.eu
Tue Sep 25 05:16:00 GMT 2018
On Monday, 9/24/2018 11:38:23 EEST Richard Sandiford wrote:
> Dimitar Dimitrov <dimitar@dinux.eu> writes:
> > On Thursday, 9/13/2018 13:02:21 EEST Richard Sandiford wrote:
> >> Dimitar Dimitrov <dimitar@dinux.eu> writes:
> >> > +/* Callback for walk_gimple_seq that checks TP tree for TI ABI
> >> > compliance. */ +static tree
> >> > +check_op_callback (tree *tp, int *walk_subtrees, void *data)
> >> > +{
> >> > + struct walk_stmt_info *wi = (struct walk_stmt_info *) data;
> >> > +
> >> > + if (RECORD_OR_UNION_TYPE_P (*tp) || TREE_CODE (*tp) ==
> >> > ENUMERAL_TYPE)
> >> > + {
> >> > + /* Forward declarations have NULL tree type. Skip them. */
> >> > + if (TREE_TYPE (*tp) == NULL)
> >> > + return NULL;
> >> > + }
> >> > +
> >> > + /* TODO - why C++ leaves INTEGER_TYPE forward declarations around?
> >> > */
> >> > + if (TREE_TYPE (*tp) == NULL)
> >> > + return NULL;
> >> > +
> >> > + const tree type = TREE_TYPE (*tp);
> >> > +
> >> > + /* Direct function calls are allowed, obviously. */
> >> > + if (TREE_CODE (*tp) == ADDR_EXPR && TREE_CODE (type) ==
> >> > POINTER_TYPE)
> >> > + {
> >> > + const tree ptype = TREE_TYPE (type);
> >> > + if (TREE_CODE (ptype) == FUNCTION_TYPE)
> >> > + return NULL;
> >> > + }
> >>
> >> This seems like a bit of a dangerous heuristic. Couldn't it also cause
> >>
> >> us to skip things like:
> >> (void *) func
> >>
> >> ? (OK, that's dubious C, but we do support it.)
> >
> > The cast yields a "data pointer", which is 32 bits for both types of ABI.
> > Hence it is safe to skip "(void *) func".
> >
> > The TI ABI's 16 bit function pointers become a problem when they change
> > the
> > layout of structures and function argument registers.
>
> OK. The reason this stood out is that the code doesn't obviously match
> the comment. If the code is just trying to skip direct function calls,
> I think the gcall sequence I mentioned would be more obvious, and would
> match the existing comment. If anything that takes the address of a
> function is OK then it might be worth changing the comment to include that.
I will use your suggested gcall snippet since it is safe and obvious. The
comment matches my original intent.
> >> > +/* Implement TARGET_PREFERRED_RELOAD_CLASS. */
> >> > +static reg_class_t
> >> > +pru_preferred_reload_class (rtx x ATTRIBUTE_UNUSED, reg_class_t
> >> > regclass)
> >> > +{
> >> > + return regclass == NO_REGS ? GENERAL_REGS : regclass;
> >> > +}
> >>
> >> This looks odd: PREFERRED_RELOAD_CLASS should return a subset of the
> >> original class rather than add new registers. Can you remember why
> >> it was needed?
> >
> > I'm not sure what target code is supposed to return when NO_REGS is passed
> > here. I saw how other ports handle NO_REGS (c-sky, m32c, nios2, rl78). So
> > I
> > followed suite.
> >
> > Here is a backtrace of LRA when NO_REGS is passed:
> > 0xf629ae pru_preferred_reload_class
> >
> > ../../gcc/gcc/config/pru/pru.c:788
> >
> > 0xa3d6e8 process_alt_operands
> >
> > ../../gcc/gcc/lra-constraints.c:2600
> >
> > 0xa3ef7d curr_insn_transform
> >
> > ../../gcc/gcc/lra-constraints.c:3889
> >
> > 0xa4301e lra_constraints(bool)
> >
> > ../../gcc/gcc/lra-constraints.c:4906
> >
> > 0xa2726c lra(_IO_FILE*)
> >
> > ../../gcc/gcc/lra.c:2446
> >
> > 0x9c97a9 do_reload
> >
> > ../../gcc/gcc/ira.c:5469
> >
> > 0x9c97a9 execute
> >
> > ../../gcc/gcc/ira.c:5653
>
> I think it should just pass NO_REGS through unmodified (which means
> spill to memory). In some ways it would be good if it didn't have to
> handle this case, but again that's going to be work to fix.
>
> The RA will pass ALL_REGS if it can handle any register, and wants to
> know what the target would prefer. But for any input, the hook needs
> to stick within the registers it was given.
Thanks for the clarification. I will remove the PRU hook and will rely on the
default implementation (i.e. return the given rclass).
> >> > +; LRA cannot cope with clobbered op2, hence the scratch register.
> >> > +(define_insn "ashr<mode>3"
> >> > + [(set (match_operand:QISI 0 "register_operand" "=&r,r")
> >> > + (ashiftrt:QISI
> >> > + (match_operand:QISI 1 "register_operand" "0,r")
> >> > + (match_operand:QISI 2 "reg_or_const_1_operand" "r,P")))
> >> > + (clobber (match_scratch:QISI 3 "=r,X"))]
> >> > + ""
> >> > + "@
> >> > + mov %3, %2\;ASHRLP%=:\;qbeq ASHREND%=, %3, 0\; sub %3, %3, 1\;
> >> > lsr\\t%0, %0, 1\; qbbc ASHRLP%=, %0, (%S0 * 8) - 2\; set %0, %0, (%S0 *
> >> > 8) - 1\; jmp ASHRLP%=\;ASHREND%=: + lsr\\t%0, %1, 1\;qbbc LSIGN%=,
> >> > %0,
> >> > (%S0 * 8) - 2\;set %0, %0, (%S0 * 8) - 1\;LSIGN%=:" + [(set_attr
> >> > "type"
> >> > "complex,alu")
> >> > + (set_attr "length" "28,4")])
> >>
> >> What do you mean by LRA not coping? What did you try originally and
> >> what went wrong?
> >
> > Better assembly could be generated if the shift count register was used
> > for a>
> > loop counter. Pseudo code:
> > while (op2--)
> >
> > op0 >>= 1;
> >
> > I originally tried to clobber operand 2:
> > (define_insn "ashr<mode>3"
> >
> > [(set (match_operand:QISI 0 "register_operand" "=&r,r")
> >
> > (ashiftrt:QISI
> >
> > (match_operand:QISI 1 "register_operand" "0,r")
> > (match_operand:QISI 2 "reg_or_const_1_operand" "+r,P"))
> >
> > )]
> >
> > But with the above pattern operand 2 was not clobbered. Its value was
> > deemed untouched (i.e. live across the pattern). So I came up with
> > clobbering a separate register to fix this, at the expense of slightly
> > bigger generated code.
>
> Ah, yeah, "+" has to be on an output operand (the destination of a set
> or clobber).
>
> In this kind of situation you might be better off expanding the loop
> as individual RTL instructions, which is what several targets do for
> things like memmove. But if you'd prefer to keep a single pattern,
> you could use:
>
> [(set (match_operand:QISI 0 "register_operand" "=r")
> (ashiftrt:QISI
> (match_operand:QISI 2 "register_operand" "0")
> (match_operand:QISI 3 "register_operand" "1")))
> (set (match_operand:QISI 1 "register_operand" "=r")
> (const_int -1))]
>
> (assuming the optimised loop does leave operand 2 containing all-1s).
>
> The shift by 1 would then need to be a separate pattern, with the
> define_expand choosing between them.
>
Expanding the loop works like a charm. Thanks for the tip. I will include the
rework in the next patch revision.
> >> Despite the long reply (sorry), this looks in really good shape to me
> >> FWIW.
> >> Thanks for the submission.
> >
> > I take this sentence as an indication that the PRU port is welcome for
> > inclusion into GCC. I'll work on fixing the comments and will send a new
> > patch version.
>
> Well, process-wise, it's up to the steering committee to decide whether
> to accept the port. But one of the main requirements is of course
> reviewing the patches.
>
> Which of the other patches in the series still need review? I think Jeff
> approved some of them earlier.
Rest of patches are approved:
https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01448.html
https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01152.html
Thanks,
Dimitar
More information about the Gcc-patches
mailing list