This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch 0/2] PR49847: Add hook to place read-only lookup-tables in named address-space


On Fri, Jul 28, 2017 at 12:18 PM, Georg-Johann Lay <avr@gjlay.de> wrote:
> Richard Biener schrieb:
>
>> On Thu, Jul 27, 2017 at 3:32 PM, Georg-Johann Lay <avr@gjlay.de> wrote:
>>>
>>> On 27.07.2017 14:34, Richard Biener wrote:
>>>>
>>>> On Thu, Jul 27, 2017 at 2:29 PM, Georg-Johann Lay <avr@gjlay.de> wrote:
>>>>>
>>>>> For some targets, the best place to put read-only lookup tables as
>>>>> generated by -ftree-switch-conversion is not the generic address space
>>>>> but some target specific address space.
>>>>>
>>>>> This is the case for AVR, where for most devices .rodata must be
>>>>> located in RAM.
>>>>>
>>>>> Part #1 adds a new, optional target hook that queries the back-end
>>>>> about the desired address space.  There is currently only one user:
>>>>> tree-switch-conversion.c::build_one_array() which re-builds value_type
>>>>> and array_type if the address space returned by the backend is not
>>>>> the generic one.
>>>>>
>>>>> Part #2 is the AVR part that implements the new hook and adds some
>>>>> sugar around it.
>>>>
>>>>
>>>> Given that switch-conversion just creates a constant initializer doesn't
>>>> AVR
>>>> benefit from handling those uniformly (at RTL expansion?).  Not sure but
>>>> I think it goes through the regular constant pool handling.
>>>>
>>>> Richard.
>>>
>>>
>>> avr doesn't use constant pools because they are not profitable for
>>> several reasons.
>>>
>>> Moreover, it's not sufficient to just patch the pool's section, you'd
>>> also have to patch *all* accesses to also use the exact same address
>>> space so that they use the correct instruction (LPM instead of LD).
>>
>>
>> Sure.  So then not handle it in constant pool handling but in expanding
>> the initializers of global vars.  I think the entry for this is
>> varasm.c:assemble_variable - that sets DECL_RTL for the variable.
>
>
> Expand and RTL is too late.  The CSWTCH tables are created by some tree
> pass, and it also generated accesses which use the address-space of the
> element types.  If any tree variable or copy thereof uses generic space,
> then you will get wrong code.  In the case of CSWTCH, you must get the
> AS into value_type as noted by Steven
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49857#c3
>
>>> Loop optimization, for example, may move addr-space pointers out of
>>> loops and actually does this for some CSWTCH tables.  I didn't look
>>> into pool handling, but don't expect it allows to consistently
>>> patch all accesses in the aftermath.
>>>
>>> *If* that's possible, then it should also be possible to patch vtables
>>> and all of their accesses, aka.
>>>
>>> https://gcc.gnu.org/PR43745
>>>
>>> e.g. in a target-specific tree pass?
>>
>>
>> Ah, ok.  I see what you mean.  Once the address of a global var is
>> taken the pointers refering to it have to use the proper address-space.
>
>
> Yes, and all copies of these pointers etc. generated so far.  That's the
> point of address spaces.  The binary representation of a pointer may
> be exact the same for different ASes, but still accesses have to be
> handled differently:  Different legitimate addresses, different
> instructions to access through these pointers, etc.  Just converting
> a pointer to a different AS will give you wrong code.
>
>> So yeah, it looks like this would need to be done in a (target specific)
>> pass, probably an IPA pass in case the variables are used from multiple
>> functions.
>
>
> It would at least be an ABI change.  If different modules are accessing
> the vtable, then they must use the same AS.  Having a copy of the vtable
> in each module is pointless:  If these modules don't agree about the AS
> and host vtables in different ASes, we lost from the optimization point:
> one module would have it in flash (consuming flash) and a different
> module would have it in RAM (consumes RAM and flash to initialize that
> RAM at startup).  So this would not use 1x flash as intended, and not
> 1x flash + 1x RAM like in the non optimized version, but 2x flash +
> 1x RAM.  Some comdat won't help because that gives wrong code.
>
>>> With vtables it's basically the same problem, you'd have to patch *all*
>>> accesses to match the vtable's address-space.  c++ need not to expose
>>> address-spaces to the application, handling address-spaces internally
>>> would be sufficient.
>>
>>
>> The IPA pass should be able to handle this transparently (vtables are just
>> global decls with initializers).
>
>
> Global is bad.  If the address of the object escapes, that's be wrong
> code because location + all accesses must agree about the AS.
>
>> Of course the question is whether the linker can handle relocations
>> between address-spaces for example?
>
>
> What's the linker to do with it?  Maybe there's confusion about what
> I mean with "address space".  I mean the named address spaces as of
>
> http://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html
>
> according to ISO/IEC DTR 18037.  It's not in the C-standard and only
> supported by gcc as a GNU extension.  Suppose the following GNU-C
>
> extern const char c;
>
> char readc (const char *pc)
> {
>     return c + *pc;
> }
>
> The generic AS allows direct addressing and addressing via X=r26 for
> example:
>
> readc:
>  ; (set (reg:HI 26)
>  ;      (reg:HI 24))
>         movw r26,r24     ;  18  *movhi/1
>  ; (set (reg:QI 25)
>  ;      (mem:QI (reg:HI 26) [S1 A8]))
>         ld r25,X         ;  8   movqi_insn/4
>  ; (set (reg:QI 24)
>  ;      (mem:QI (symbol_ref:HI ("c")) [S1 A8]))
>         lds r24,c        ;  9   movqi_insn/4
>  ; (set (reg:QI 24)
>  ;      (plus:QI (reg:QI 24)
>  ;               (reg:QI 25)))
>         add r24,r25      ;  15  addqi3/1
>         ret
>
> This locates c in RAM and *pc must also be located in RAM which
> is a waste. Now use AS __flash, i.e. non-volatile memory:
>
> extern const __flash char c;
>
> char readc (const __flash char *pc)
> {
>     return c + *pc;
> }
>
> readc:
>  ; (set (reg:HI 30)
>  ;      (reg:HI 24))
>         movw r30,r24     ;  20  *movhi/1
>  ; (set (reg:QI 25)
>  ;      (mem:QI (reg:HI 30) [S1 A8 AS1]))
>         lpm r25,Z        ;  7   movqi_insn/4
>  ; (set (reg:HI 30)
>  ;      (symbol_ref:HI ("c")))
>         ldi r30,lo8(c)   ;  17  *movhi/5
>         ldi r31,hi8(c)
>  ; (set (reg:QI 24)
>  ;      (mem:QI (reg:HI 30) [S1 A8 AS1]))
>         lpm r24,Z        ;  8   movqi_insn/4
>  ; (set (reg:QI 24)
>  ;      (plus:QI (reg:QI 24)
>  ;               (reg:QI 25)))
>         add r24,r25      ;  14  addqi3/1
>         ret
>
> And access to __flash must use instruction LPM *,Z.
> Direct addressing or using X or Y register are not allowed.
> Using Z+const addressing is not allowed.  Using LD or LDS on
> the same binary address will lead to wrong code, for example will
> read from RAM address 0x42 instead of from flash address 0x42.
>
> How could the linker fix an address to generic AS to one that
> goes to non-generic AS?  The linker doesn't even have that
> information.
>
>
>> As of the patch I think it is too specific (switch-conversion only)
>> for my taste.
>
>
> It's actually not restricted to switch-conversion. It can be used for
> any object provided
>
> 1) The object is in static storage and located in that AS.
>
> 2) All accesses to that object use that AS.
>
> 3) The object is read-only, i.e. not modified after load-time.
>
>> What you can do I think is in assemble_variable handle the case of
>> ! TREE_PUBLIC && ! TREE_ADDRESSABLE -- non-exported, non-address
>> taken variables can be put into a different address-space transparently
>> by adjusting their type to reflect the changed address-space.
>
>
> /* Assemble everything that is needed for a variable or function
> declaration.
>    Not used for automatic variables, and not used for function definitions.
>    Should not be called for variables of incomplete structure type.
>
>    TOP_LEVEL is nonzero if this variable has file scope.
>    AT_END is nonzero if this is the special handling, at end of compilation,
>    to define things that have had only tentative definitions.
>    DONT_OUTPUT_DATA if nonzero means don't actually output the
>    initial value (that will be done by the caller).  */
>
> void
> assemble_variable (tree decl, int top_level ATTRIBUTE_UNUSED,
>                    int at_end ATTRIBUTE_UNUSED, int dont_output_data)
> {
>
> So I can patch TYPE_ADDR_SPACE of /decl/ and this will by magic
> diffuse to all places that access it? Wow!

For the ! TREE_ADDRESSABLE case yes, I think so (of course you
need to double-check...).  ! TREE_PUBLIC is because otherwise
you're not seeing all accesses and thus TREE_ADDRESSABLE cannot
be trusted.  TREE_STATIC should be always set in this function.

>> This should catch the switch-conversion case that didn't run into the
>> loop case.  It won't fix vtables I think because they are always exported
>> unless you use LTO, they also end up address-taken I think.
>>
>> For vtables (a bigger chunk than switch-conversion decls) adjusting
>> things in the C++ FE is probably a good idea if it helps AVR.
>>
>> Richard.
>
>
> It don't think the C++ FE has an understanding of ASes at all. So even
> if it doesn't ICE or error, it's likely non-generic ASes are not
> preserved / respected during FE transformations.  For now I'd say
> that when you use C+ +/ vtables on such a µC one must live with the
> consequences.

Ah - I totally forgot C++ doesn't handle address-spaces...

Richard.

> Johann


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]