This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch 0/2] PR49847: Add hook to place read-only lookup-tables in named address-space


Richard Biener schrieb:
On Thu, Jul 27, 2017 at 3:32 PM, Georg-Johann Lay <avr@gjlay.de> wrote:
On 27.07.2017 14:34, Richard Biener wrote:
On Thu, Jul 27, 2017 at 2:29 PM, Georg-Johann Lay <avr@gjlay.de> wrote:
For some targets, the best place to put read-only lookup tables as
generated by -ftree-switch-conversion is not the generic address space
but some target specific address space.

This is the case for AVR, where for most devices .rodata must be
located in RAM.

Part #1 adds a new, optional target hook that queries the back-end
about the desired address space.  There is currently only one user:
tree-switch-conversion.c::build_one_array() which re-builds value_type
and array_type if the address space returned by the backend is not
the generic one.

Part #2 is the AVR part that implements the new hook and adds some
sugar around it.

Given that switch-conversion just creates a constant initializer doesn't
AVR
benefit from handling those uniformly (at RTL expansion?).  Not sure but
I think it goes through the regular constant pool handling.

Richard.

avr doesn't use constant pools because they are not profitable for
several reasons.

Moreover, it's not sufficient to just patch the pool's section, you'd
also have to patch *all* accesses to also use the exact same address
space so that they use the correct instruction (LPM instead of LD).

Sure.  So then not handle it in constant pool handling but in expanding
the initializers of global vars.  I think the entry for this is
varasm.c:assemble_variable - that sets DECL_RTL for the variable.

Expand and RTL is too late.  The CSWTCH tables are created by some tree
pass, and it also generated accesses which use the address-space of the
element types.  If any tree variable or copy thereof uses generic space,
then you will get wrong code.  In the case of CSWTCH, you must get the
AS into value_type as noted by Steven

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49857#c3

Loop optimization, for example, may move addr-space pointers out of
loops and actually does this for some CSWTCH tables.  I didn't look
into pool handling, but don't expect it allows to consistently
patch all accesses in the aftermath.

*If* that's possible, then it should also be possible to patch vtables
and all of their accesses, aka.

https://gcc.gnu.org/PR43745

e.g. in a target-specific tree pass?

Ah, ok.  I see what you mean.  Once the address of a global var is
taken the pointers refering to it have to use the proper address-space.

Yes, and all copies of these pointers etc. generated so far.  That's the
point of address spaces.  The binary representation of a pointer may
be exact the same for different ASes, but still accesses have to be
handled differently:  Different legitimate addresses, different
instructions to access through these pointers, etc.  Just converting
a pointer to a different AS will give you wrong code.

So yeah, it looks like this would need to be done in a (target specific)
pass, probably an IPA pass in case the variables are used from multiple
functions.

It would at least be an ABI change.  If different modules are accessing
the vtable, then they must use the same AS.  Having a copy of the vtable
in each module is pointless:  If these modules don't agree about the AS
and host vtables in different ASes, we lost from the optimization point:
one module would have it in flash (consuming flash) and a different
module would have it in RAM (consumes RAM and flash to initialize that
RAM at startup).  So this would not use 1x flash as intended, and not
1x flash + 1x RAM like in the non optimized version, but 2x flash +
1x RAM.  Some comdat won't help because that gives wrong code.

With vtables it's basically the same problem, you'd have to patch *all*
accesses to match the vtable's address-space.  c++ need not to expose
address-spaces to the application, handling address-spaces internally
would be sufficient.

The IPA pass should be able to handle this transparently (vtables are just
global decls with initializers).

Global is bad.  If the address of the object escapes, that's be wrong
code because location + all accesses must agree about the AS.

Of course the question is whether the linker can handle relocations
between address-spaces for example?

What's the linker to do with it?  Maybe there's confusion about what
I mean with "address space".  I mean the named address spaces as of

http://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html

according to ISO/IEC DTR 18037.  It's not in the C-standard and only
supported by gcc as a GNU extension.  Suppose the following GNU-C

extern const char c;

char readc (const char *pc)
{
    return c + *pc;
}

The generic AS allows direct addressing and addressing via X=r26 for
example:

readc:
 ; (set (reg:HI 26)
 ;      (reg:HI 24))
	movw r26,r24	 ;  18	*movhi/1
 ; (set (reg:QI 25)
 ;      (mem:QI (reg:HI 26) [S1 A8]))
	ld r25,X	 ;  8	movqi_insn/4
 ; (set (reg:QI 24)
 ;      (mem:QI (symbol_ref:HI ("c")) [S1 A8]))
	lds r24,c	 ;  9	movqi_insn/4
 ; (set (reg:QI 24)
 ;      (plus:QI (reg:QI 24)
 ;               (reg:QI 25)))
	add r24,r25	 ;  15	addqi3/1
	ret

This locates c in RAM and *pc must also be located in RAM which
is a waste. Now use AS __flash, i.e. non-volatile memory:

extern const __flash char c;

char readc (const __flash char *pc)
{
    return c + *pc;
}

readc:
 ; (set (reg:HI 30)
 ;      (reg:HI 24))
	movw r30,r24	 ;  20	*movhi/1
 ; (set (reg:QI 25)
 ;      (mem:QI (reg:HI 30) [S1 A8 AS1]))
	lpm r25,Z	 ;  7	movqi_insn/4
 ; (set (reg:HI 30)
 ;      (symbol_ref:HI ("c")))
	ldi r30,lo8(c)	 ;  17	*movhi/5
	ldi r31,hi8(c)
 ; (set (reg:QI 24)
 ;      (mem:QI (reg:HI 30) [S1 A8 AS1]))
	lpm r24,Z	 ;  8	movqi_insn/4
 ; (set (reg:QI 24)
 ;      (plus:QI (reg:QI 24)
 ;               (reg:QI 25)))
	add r24,r25	 ;  14	addqi3/1
	ret

And access to __flash must use instruction LPM *,Z.
Direct addressing or using X or Y register are not allowed.
Using Z+const addressing is not allowed.  Using LD or LDS on
the same binary address will lead to wrong code, for example will
read from RAM address 0x42 instead of from flash address 0x42.

How could the linker fix an address to generic AS to one that
goes to non-generic AS?  The linker doesn't even have that
information.


As of the patch I think it is too specific (switch-conversion only)
for my taste.

It's actually not restricted to switch-conversion. It can be used for
any object provided

1) The object is in static storage and located in that AS.

2) All accesses to that object use that AS.

3) The object is read-only, i.e. not modified after load-time.

What you can do I think is in assemble_variable handle the case of
! TREE_PUBLIC && ! TREE_ADDRESSABLE -- non-exported, non-address
taken variables can be put into a different address-space transparently
by adjusting their type to reflect the changed address-space.

/* Assemble everything that is needed for a variable or function declaration.
   Not used for automatic variables, and not used for function definitions.
   Should not be called for variables of incomplete structure type.

   TOP_LEVEL is nonzero if this variable has file scope.
AT_END is nonzero if this is the special handling, at end of compilation,
   to define things that have had only tentative definitions.
   DONT_OUTPUT_DATA if nonzero means don't actually output the
   initial value (that will be done by the caller).  */

void
assemble_variable (tree decl, int top_level ATTRIBUTE_UNUSED,
		   int at_end ATTRIBUTE_UNUSED, int dont_output_data)
{

So I can patch TYPE_ADDR_SPACE of /decl/ and this will by magic
diffuse to all places that access it? Wow!

This should catch the switch-conversion case that didn't run into the
loop case.  It won't fix vtables I think because they are always exported
unless you use LTO, they also end up address-taken I think.

For vtables (a bigger chunk than switch-conversion decls) adjusting
things in the C++ FE is probably a good idea if it helps AVR.

Richard.

It don't think the C++ FE has an understanding of ASes at all. So even
if it doesn't ICE or error, it's likely non-generic ASes are not
preserved / respected during FE transformations.  For now I'd say
that when you use C+ +/ vtables on such a µC one must live with the
consequences.

Johann


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]