[PATCH 11/11] Increase MAX_MAX_OPERANDS limit

Thu Jun 28 03:30:00 GMT 2018

On 06/23/2018 06:26 AM, Dimitar Dimitrov wrote:
> On Friday, 22 June 2018 19:41:55 EEST Jakub Jelinek wrote:
>> On Fri, Jun 22, 2018 at 11:33:06AM -0600, Jeff Law wrote:
>>> On 06/13/2018 12:58 PM, Dimitar Dimitrov wrote:
>>>> The PRU load/store instructions can access memory with byte
>>>> granularity for all 30 of its 32-bit GP registers. Examples:
>>>>    # Load 17 bytes from address r0[0] into registers r10.b1-r14.b2
>>>>    lbbo r10.b1, r0, 0, 17
>>>>    
>>>>    # Load 100 bytes from address r28[0] into registers r0-r25
>>>>    lbbo r0.b0, r28, 0, 100
>>>>
>>>> The load/store multiple patterns declare all subsequent registers
>>>> as distinct operands. Hence the need to increase the limit.
>>
>> Can't you have a look at how other targets, e.g. arm, aarch64, s390x
>> etc. handle load/store multiple patterns, e.g. with match_parallel or
>> match_par_dup?
>> The instructions then don't have dozens of operands, and the predicate
>> is just supposed to check everything is the way it should be.
> I took arm/ldmstm.md as an inspiration. See attached machine description for 
> PRU that requires the increase. I omitted this machine-generated MD file from 
> my first patch set, but per comments will include it in v2.
> 
> PRU has a total of 32 32-bit registers with flexible subregister addressing. 
> The PRU GCC port represents the register file as 128 individual 8-bit 
> registers. Rationale: http://gcc.gnu.org/ml/gcc/2017-01/msg00217.html
> 
> Load/store instructions can load anywhere between 1 and 124 consecutive 8-bit 
> registers. The load/store-multiple patterns seem to require const_int_operand 
> offsets for each loaded register, hence the explosion of operands.
> 
> I make no distinction for class - patterns accept any GP register.
Right, but is that level of generality really all that useful?  Based on
what I know about the PRU I'd probably stick mostly to 32-bit registers
and only expose the byte-level addressability when it's clearly a win,
particularly for bitfield insertions/extractions.  I probably wouldn't
expose operations which cross 32-bit boundaries, except perhaps for
arithmetic through the carry.

I guess my point is I'd like to see a stronger justification for
exposing this much of the architecture before bumping up the maximum
operand limits.

jeff
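
For readers following the operand arithmetic in the quoted text, here is a
minimal sketch in C. It is not taken from the PRU port. It assumes that the
8-bit hard register for rN.bK has index 4*N + K, and that each transferred
byte register costs one register operand plus one const_int offset operand,
with a single base-address operand on top. The function names and this
accounting are hypothetical and only illustrate why a 124-byte burst pushes
the operand count so high.

```c
#include <assert.h>

/* Assumed layout: 32 registers of 4 bytes each, modelled as 128 8-bit
   hard registers.  The quoted text gives 124 as the largest burst.  */
enum { BYTES_PER_REG = 4, NUM_REGS = 32, MAX_BURST_BYTES = 124 };

/* Index of the 8-bit hard register for rN.bK, assuming index = 4*N + K.
   This numbering is an assumption made for illustration.  */
static int byte_reg_index(int reg, int sub)
{
    assert(reg >= 0 && reg < NUM_REGS);
    assert(sub >= 0 && sub < BYTES_PER_REG);
    return reg * BYTES_PER_REG + sub;
}

/* Index of the last 8-bit register written by a burst of nbytes that
   starts at 8-bit register `first`.  */
static int last_byte_reg(int first, int nbytes)
{
    assert(nbytes >= 1 && nbytes <= MAX_BURST_BYTES);
    return first + nbytes - 1;
}

/* Hypothetical operand count: one base-address operand, plus a register
   operand and a const_int offset operand per transferred byte.  */
static int operands_needed(int nbytes)
{
    assert(nbytes >= 1 && nbytes <= MAX_BURST_BYTES);
    return 1 + 2 * nbytes;
}
```

Under these assumptions, operands_needed(124) is 249, and a 17-byte burst
starting at r10.b1 (index 41) ends at index 57. Restricting the patterns to
whole 32-bit registers, as suggested above, would cap the number of
transferred registers at 31 instead of 124.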