This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [patch 0/2] PR49847: Add hook to place read-only lookup-tables in named address-space
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Georg-Johann Lay <avr at gjlay dot de>
- Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>, Denis Chertykov <chertykov at gmail dot com>
- Date: Fri, 28 Jul 2017 13:14:58 +0200
- Subject: Re: [patch 0/2] PR49847: Add hook to place read-only lookup-tables in named address-space
- Authentication-results: sourceware.org; auth=none
- References: <0c3f3987-38d4-6723-4c54-d183d2e65106@gjlay.de> <CAFiYyc2gR=DUpU_2xzkZMgwXYTDh0x8K71UfhBueVMhLUasP=w@mail.gmail.com> <2e179ada-1252-fbd7-00f4-aca8acef42c3@gjlay.de> <CAFiYyc3Nk6t4=So33zDck+dAfoi921MPkfXu7BfRdurh+ApFwA@mail.gmail.com> <597B0F82.5070000@gjlay.de>
On Fri, Jul 28, 2017 at 12:18 PM, Georg-Johann Lay <avr@gjlay.de> wrote:
> Richard Biener schrieb:
>
>> On Thu, Jul 27, 2017 at 3:32 PM, Georg-Johann Lay <avr@gjlay.de> wrote:
>>>
>>> On 27.07.2017 14:34, Richard Biener wrote:
>>>>
>>>> On Thu, Jul 27, 2017 at 2:29 PM, Georg-Johann Lay <avr@gjlay.de> wrote:
>>>>>
>>>>> For some targets, the best place to put read-only lookup tables as
>>>>> generated by -ftree-switch-conversion is not the generic address space
>>>>> but some target specific address space.
>>>>>
>>>>> This is the case for AVR, where for most devices .rodata must be
>>>>> located in RAM.
>>>>>
>>>>> Part #1 adds a new, optional target hook that queries the back-end
>>>>> about the desired address space. There is currently only one user:
>>>>> tree-switch-conversion.c::build_one_array() which re-builds value_type
>>>>> and array_type if the address space returned by the backend is not
>>>>> the generic one.
>>>>>
>>>>> Part #2 is the AVR part that implements the new hook and adds some
>>>>> sugar around it.
>>>>
>>>>
>>>> Given that switch-conversion just creates a constant initializer doesn't
>>>> AVR
>>>> benefit from handling those uniformly (at RTL expansion?). Not sure but
>>>> I think it goes through the regular constant pool handling.
>>>>
>>>> Richard.
>>>
>>>
>>> avr doesn't use constant pools because they are not profitable for
>>> several reasons.
>>>
>>> Moreover, it's not sufficient to just patch the pool's section, you'd
>>> also have to patch *all* accesses to also use the exact same address
>>> space so that they use the correct instruction (LPM instead of LD).
>>
>>
>> Sure. So then not handle it in constant pool handling but in expanding
>> the initializers of global vars. I think the entry for this is
>> varasm.c:assemble_variable - that sets DECL_RTL for the variable.
>
>
> Expand and RTL is too late. The CSWTCH tables are created by some tree
> pass, and it also generated accesses which use the address-space of the
> element types. If any tree variable or copy thereof uses generic space,
> then you will get wrong code. In the case of CSWTCH, you must get the
> AS into value_type as noted by Steven
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49857#c3
>
>>> Loop optimization, for example, may move addr-space pointers out of
>>> loops and actually does this for some CSWTCH tables. I didn't look
>>> into pool handling, but don't expect it allows to consistently
>>> patch all accesses in the aftermath.
>>>
>>> *If* that's possible, then it should also be possible to patch vtables
>>> and all of their accesses, aka.
>>>
>>> https://gcc.gnu.org/PR43745
>>>
>>> e.g. in a target-specific tree pass?
>>
>>
>> Ah, ok. I see what you mean. Once the address of a global var is
>> taken the pointers refering to it have to use the proper address-space.
>
>
> Yes, and all copies of these pointers etc. generated so far. That's the
> point of address spaces. The binary representation of a pointer may
> be exact the same for different ASes, but still accesses have to be
> handled differently: Different legitimate addresses, different
> instructions to access through these pointers, etc. Just converting
> a pointer to a different AS will give you wrong code.
>
>> So yeah, it looks like this would need to be done in a (target specific)
>> pass, probably an IPA pass in case the variables are used from multiple
>> functions.
>
>
> It would at least be an ABI change. If different modules are accessing
> the vtable, then they must use the same AS. Having a copy of the vtable
> in each module is pointless: If these modules don't agree about the AS
> and host vtables in different ASes, we lost from the optimization point:
> one module would have it in flash (consuming flash) and a different
> module would have it in RAM (consumes RAM and flash to initialize that
> RAM at startup). So this would not use 1x flash as intended, and not
> 1x flash + 1x RAM like in the non optimized version, but 2x flash +
> 1x RAM. Some comdat won't help because that gives wrong code.
>
>>> With vtables it's basically the same problem, you'd have to patch *all*
>>> accesses to match the vtable's address-space. c++ need not to expose
>>> address-spaces to the application, handling address-spaces internally
>>> would be sufficient.
>>
>>
>> The IPA pass should be able to handle this transparently (vtables are just
>> global decls with initializers).
>
>
> Global is bad. If the address of the object escapes, that's be wrong
> code because location + all accesses must agree about the AS.
>
>> Of course the question is whether the linker can handle relocations
>> between address-spaces for example?
>
>
> What's the linker to do with it? Maybe there's confusion about what
> I mean with "address space". I mean the named address spaces as of
>
> http://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html
>
> according to ISO/IEC DTR 18037. It's not in the C-standard and only
> supported by gcc as a GNU extension. Suppose the following GNU-C
>
> extern const char c;
>
> char readc (const char *pc)
> {
> return c + *pc;
> }
>
> The generic AS allows direct addressing and addressing via X=r26 for
> example:
>
> readc:
> ; (set (reg:HI 26)
> ; (reg:HI 24))
> movw r26,r24 ; 18 *movhi/1
> ; (set (reg:QI 25)
> ; (mem:QI (reg:HI 26) [S1 A8]))
> ld r25,X ; 8 movqi_insn/4
> ; (set (reg:QI 24)
> ; (mem:QI (symbol_ref:HI ("c")) [S1 A8]))
> lds r24,c ; 9 movqi_insn/4
> ; (set (reg:QI 24)
> ; (plus:QI (reg:QI 24)
> ; (reg:QI 25)))
> add r24,r25 ; 15 addqi3/1
> ret
>
> This locates c in RAM and *pc must also be located in RAM which
> is a waste. Now use AS __flash, i.e. non-volatile memory:
>
> extern const __flash char c;
>
> char readc (const __flash char *pc)
> {
> return c + *pc;
> }
>
> readc:
> ; (set (reg:HI 30)
> ; (reg:HI 24))
> movw r30,r24 ; 20 *movhi/1
> ; (set (reg:QI 25)
> ; (mem:QI (reg:HI 30) [S1 A8 AS1]))
> lpm r25,Z ; 7 movqi_insn/4
> ; (set (reg:HI 30)
> ; (symbol_ref:HI ("c")))
> ldi r30,lo8(c) ; 17 *movhi/5
> ldi r31,hi8(c)
> ; (set (reg:QI 24)
> ; (mem:QI (reg:HI 30) [S1 A8 AS1]))
> lpm r24,Z ; 8 movqi_insn/4
> ; (set (reg:QI 24)
> ; (plus:QI (reg:QI 24)
> ; (reg:QI 25)))
> add r24,r25 ; 14 addqi3/1
> ret
>
> And access to __flash must use instruction LPM *,Z.
> Direct addressing or using X or Y register are not allowed.
> Using Z+const addressing is not allowed. Using LD or LDS on
> the same binary address will lead to wrong code, for example will
> read from RAM address 0x42 instead of from flash address 0x42.
>
> How could the linker fix an address to generic AS to one that
> goes to non-generic AS? The linker doesn't even have that
> information.
>
>
>> As of the patch I think it is too specific (switch-conversion only)
>> for my taste.
>
>
> It's actually not restricted to switch-conversion. It can be used for
> any object provided
>
> 1) The object is in static storage and located in that AS.
>
> 2) All accesses to that object use that AS.
>
> 3) The object is read-only, i.e. not modified after load-time.
>
>> What you can do I think is in assemble_variable handle the case of
>> ! TREE_PUBLIC && ! TREE_ADDRESSABLE -- non-exported, non-address
>> taken variables can be put into a different address-space transparently
>> by adjusting their type to reflect the changed address-space.
>
>
> /* Assemble everything that is needed for a variable or function
> declaration.
> Not used for automatic variables, and not used for function definitions.
> Should not be called for variables of incomplete structure type.
>
> TOP_LEVEL is nonzero if this variable has file scope.
> AT_END is nonzero if this is the special handling, at end of compilation,
> to define things that have had only tentative definitions.
> DONT_OUTPUT_DATA if nonzero means don't actually output the
> initial value (that will be done by the caller). */
>
> void
> assemble_variable (tree decl, int top_level ATTRIBUTE_UNUSED,
> int at_end ATTRIBUTE_UNUSED, int dont_output_data)
> {
>
> So I can patch TYPE_ADDR_SPACE of /decl/ and this will by magic
> diffuse to all places that access it? Wow!
For the ! TREE_ADDRESSABLE case yes, I think so (of course you
need to double-check...). ! TREE_PUBLIC is because otherwise
you're not seeing all accesses and thus TREE_ADDRESSABLE cannot
be trusted. TREE_STATIC should be always set in this function.
>> This should catch the switch-conversion case that didn't run into the
>> loop case. It won't fix vtables I think because they are always exported
>> unless you use LTO, they also end up address-taken I think.
>>
>> For vtables (a bigger chunk than switch-conversion decls) adjusting
>> things in the C++ FE is probably a good idea if it helps AVR.
>>
>> Richard.
>
>
> It don't think the C++ FE has an understanding of ASes at all. So even
> if it doesn't ICE or error, it's likely non-generic ASes are not
> preserved / respected during FE transformations. For now I'd say
> that when you use C+ +/ vtables on such a µC one must live with the
> consequences.
Ah - I totally forgot C++ doesn't handle address-spaces...
Richard.
> Johann