This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass


The test case present a real embedded case, yes it is not functional 
benchmark, it is a test case, but on our days, most of the MCU's for IoT 
devices acts as controllers and less as computing/algorithm units. This 
days, functional algorithms are done on HW and managed by SW with 
snippet as the giving test-case.
This means that SW that accessing the memory on the SoC, for managing 
some HW feature, is commonly used. Therefor the benefit in this pass 
will serve everyone.

We (WD) will obviously support it as we are part of this joint effort to 
drive better density. As a next phase, we can improve this, taking Jim 
comments among others.
Also, as Jim said this, this is a simple case and should not cause 
maintenance problems.

On 31/10/2019 12:18, Andrew Waterman wrote:
> Nevertheless, as Jim observed, it's a great burden on the RISC-V
> backend maintainers to support all these passes.  Are you saying WD is
> willing to put its money where its mouth is and will provide active
> support for these passes?
>
> On Thu, Oct 31, 2019 at 2:42 AM Nidal Faour <Nidal.Faour@wdc.com> wrote:
>> A tests case for this patch has been written and pushed to WD github repository at the following link:
>> https://github.com/westerndigitalcorporation/riscv32-Code-density-test-bench/tree/master/common_load_store_base_address_offset
>>
>> the test case itself has been written based on a real product scenario.
>> We got a saving of about 10% in code size of the test itself.
>>
>> Best Regards,
>>
>> Nidal Faour
>> Staff Engineer, R&D Engineering – Firmware & Toolchain, CTO Group
>>
>> Western Digital®
>> Migdal Tefen 24959, P.O Box 3
>> Email: nidal.faour@wdc.com
>> Office: +972-4-9078756
>> Mobile: +972-50-8867756
>>
>>> -----Original Message-----
>>> From: Jim Wilson <jimw@sifive.com>
>>> Sent: Thursday, 31 October 2019 1:57
>>> To: Craig Blackmore <craig.blackmore@embecosm.com>
>>> Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Ofer Shinaar
>>> <Ofer.Shinaar@wdc.com>; Nidal Faour <Nidal.Faour@wdc.com>; Kito Cheng
>>> <kito.cheng@gmail.com>; Jeff Law <law@redhat.com>
>>> Subject: Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass
>>>
>>> CAUTION: This email originated from outside of Western Digital. Do not click
>>> on links or open attachments unless you recognize the sender and know that
>>> the content is safe.
>>>
>>>
>>> On Fri, Oct 25, 2019 at 10:40 AM Craig Blackmore
>>> <craig.blackmore@embecosm.com> wrote:
>>>> This patch aims to allow more load/store instructions to be compressed
>>>> by replacing a load/store of 'base register + large offset' with a new
>>>> load/store of 'new base + small offset'. If the new base gets stored
>>>> in a compressed register, then the new load/store can be compressed.
>>>> Since there is an overhead in creating the new base, this change is
>>>> only attempted when 'base register' is referenced in at least 4 load/stores in
>>> a basic block.
>>>> The optimization is implemented in a new RISC-V specific pass called
>>>> shorten_memrefs which is enabled for RVC targets. It has been
>>>> developed for the 32-bit lw/sw instructions but could also be extended to
>>> 64-bit ld/sd in future.
>>>
>>> The fact that this needs 4 load/stores in a block with the same base address
>>> means it won't trigger very often.  But it does seem to work.
>>> I see about a 0.05% code size reduction for a rv32gc newlib nano libc.a.  There
>>> might be other codes that benefit more.
>>>
>>> I'm concerned about the number of RISC-V specific optimization passes
>>> people are writing.  I've seen at least 3 so far.  This one is at least small and
>>> simple enough to understand that it doesn't look like it will cause maintenance
>>> problems.
>>>
>>> The config.gcc change conflicts with the save-restore optimization pass that
>>> Andrew Burgess added, but that is a trivial fix.
>>>
>>> The code only works for 32-bit load/stores.  rv64 has compressed 64-bit
>>> load/stores, and the F and D extensions have float and double compressed
>>> loads and stores.  The patch would be more useful if it handled all of these.
>>>
>>> The patch doesn't check the other operand, it only looks at the memory
>>> operand.  This results in some cases where the code rewrites instructions that
>>> can never be compressed.  For instance, given void store1z (int *array) {
>>>   array[200] = 0;
>>>   array[201] = 0;
>>>   array[202] = 0;
>>>   array[203] = 0;
>>> }
>>> the patch increases code size by 4 bytes because it rewrites the stores, but we
>>> don't have compressed store 0, and x0 is not a compressed reg, so there is no
>>> point in rewriting the stores.  I can also create examples that show a size
>>> increase with loads but it is a little trickier.
>>> extern int sub1 (int, int, int, int, int, int, int); int load1a (int a0, int a1, int a2, int
>>> a3, int a4, int *array) {
>>>   int a = 0;
>>>   a += array[200];
>>>   a += array[201];
>>>   a += array[202];
>>>   a += array[203];
>>>   return sub1 (a0, a1, a2, a3, a4, 0, a); } The register allocator will pass a0-a4
>>> directly through from args to the call, leaving only a5-a7 for the load base
>>> address and dest, and only one of those regs is a compressed reg, but we
>>> need two, so these loads can't be compressed.  The code still gets rewritten,
>>> resulting in a size increase for the extra add.  Not sure if you can do anything
>>> about that though, since you are doing this before register allocation.
>>>
>>> It isn't clear that the change riscv_address_cost is for.  That should be
>>> commented.
>>>
>>> I'd suggest parens in the riscv_compressed_lw_offset_p return statement,
>>> that makes it easier to align the code correctly (in emacs at least).
>>>
>>> The RISCV_MAX_COMPRESSED_LW_OFFSET should be moved down to the
>>> "ISA constants" section of riscv.h, and maybe renamed similar to the other
>>> constants.
>>>
>>> There is a REG_P check in get_si_mem_reg.  That should probably handle
>>> SUBREGs too.
>>>
>>> A testcase to verify the patch would be nice.  I have one I wrote for testing
>>> that shows some tests that work, some tests that don't work, and some that
>>> work but maybe shouldn't.
>>>
>>> I vaguely remember Micheal Meissner talking about doing something similar
>>> for PPC in a GNU Cauldron maybe 2 years ago.  But I don't know if he tried, or
>>> how far he got if he did try.  It might be useful to ask him about that work.
>>>
>>> Otherwise this looks OK to me.
>>>
>>> Jim



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]