This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Compiling GCC with g++: a report

From: DJ Delorie <dj at redhat dot com>
To: schlie at comcast dot net
Cc: gcc at gcc dot gnu dot org
Date: Wed, 25 May 2005 21:29:37 -0400
Subject: Re: Compiling GCC with g++: a report
References: <BEBA91DB.A46F%schlie@comcast.net>
>   which is defined to correspond to some physical mode

Close.  Defined to correspond to one or more physical modes.

> - Huh?, can you provide a single example of where a char type would
>   be mapped by the target to two different target specified modes?

i386 can hold a char in %al (QImode) or %bx (HImode) or %esi (SImode).
It can also be stored in st0 (TFmode).  I can't imagine why a target
port would want to *force* a char into a wider register on the i386,
but I don't want to assume it won't happen either.  Imagine, for
example, a port with 9 bit chars, with an __attribute__ to get an 8
bit char when different overflow semantics are required.

But I can easily imagine a chip (8086) where "int *" can be one of two
different sizes (16 or 32 bits).

>   mapped to into an indexed subreg of a wider target specified mode, the

See also WORD_REGISTER_OPERATIONS.

> - fully agree, no physical mode name (such as SImode) as may be defined
>   by the target should ever be referred to by the MI portion of the

Note that using integer_mode is equally wrong, if you don't check to
see if it's the proper size for the application.  long_integer_mode or
short_integer_mode might be more appropriate.  Copying modes from one
of the operands is faster.

Consider adding two scalars of modes A and B.  Maybe the target has an
addAB3 insn (some do!).  If not, see if it has {zero|sign}_extendAB2
(or BA depending on bits-per-mode) and addB3.  If not, for
(bits=bits(B)+1; bits<MAX_MODE_BITS; bits++) see if it has
addMODE(bits)3.
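
A rough sketch of that search order in C, for illustration only.  The
struct, enum and function names (toy_target, toy_strategy, toy_pick_add)
are made up for this example and are not GCC's real interfaces; the
target is reduced to a few flags describing one particular (A, B) pair,
with B the wider mode.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_MODE_BITS 128

/* Hypothetical description of what a target offers for one (A, B) pair. */
struct toy_target {
  bool has_mixed_add;               /* an addAB3-style insn exists */
  bool has_extend;                  /* a {zero|sign}_extendAB2-style insn exists */
  bool has_add[TOY_MAX_MODE_BITS];  /* has_add[n]: an n-bit add insn exists */
};

enum toy_strategy {
  TOY_MIXED_ADD,        /* use the mixed-mode add directly */
  TOY_EXTEND_THEN_ADD,  /* extend A to B, then add in B */
  TOY_WIDER_ADD,        /* add in some mode wider than B */
  TOY_NO_INSN           /* nothing found; caller must fall back */
};

/* Choose how to add the two scalars.  b_bits is the width of the wider
   mode B.  *out_bits receives the width of the mode the add is done in.  */
enum toy_strategy toy_pick_add(const struct toy_target *t, int b_bits,
                               int *out_bits)
{
  assert(b_bits > 0 && b_bits < TOY_MAX_MODE_BITS);
  *out_bits = b_bits;

  if (t->has_mixed_add)
    return TOY_MIXED_ADD;
  if (t->has_extend && t->has_add[b_bits])
    return TOY_EXTEND_THEN_ADD;

  /* Walk every wider width, not just 2*b_bits. */
  for (int bits = b_bits + 1; bits < TOY_MAX_MODE_BITS; bits++)
    if (t->has_add[bits]) {
      *out_bits = bits;
      return TOY_WIDER_ADD;
    }
  return TOY_NO_INSN;
}
```

With only a 24-bit add available and B being 16 bits wide, this picks
the 24-bit add rather than assuming a 32-bit one exists.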

Note that this kind of thing happens a lot with multiply and divide
insns, which reminds me, I need to submit a patch to take out an
assumption that the "next wider" mode happens to be the "twice wider"
mode.  Doesn't work with partial int modes defined.
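
To make the "next wider" vs "twice wider" difference concrete, here is
a small C sketch.  The mode table (with a 24-bit partial int mode) and
the names widen_mode_bits, widen_next and widen_for_product are
assumptions for the example, not anything in the GCC sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical integer mode widths, narrowest first.  The 24-bit entry
   stands in for a partial int mode. */
static const int widen_mode_bits[] = { 8, 16, 24, 32 };
#define WIDEN_NUM_MODES (sizeof widen_mode_bits / sizeof widen_mode_bits[0])

/* Width of the next wider mode after `bits`, or 0 if there is none. */
int widen_next(int bits)
{
  assert(bits > 0);
  for (size_t i = 0; i < WIDEN_NUM_MODES; i++)
    if (widen_mode_bits[i] > bits)
      return widen_mode_bits[i];
  return 0;
}

/* Width of the narrowest mode that can hold a full bits x bits product
   (at least 2*bits wide), or 0 if there is none. */
int widen_for_product(int bits)
{
  assert(bits > 0);
  for (size_t i = 0; i < WIDEN_NUM_MODES; i++)
    if (widen_mode_bits[i] >= 2 * bits)
      return widen_mode_bits[i];
  return 0;
}
```

Here widen_next(16) is 24 while widen_for_product(16) is 32, so code
that takes the next wider mode and treats it as double width gets the
wrong mode on such a target.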

> - sorry, I was just attempting to respond given my best guess as to your
>   intent when you introduced BI mode into the discussion by with the above
>   to:
> 
> >>>>   target_unit_mode // presumably the target's smallest addressable datum.

On some chips, the smallest addressable datum is a bit, not a byte.
Given how often gcc makes invalid assumptions about bytes and words,
it was worth noting.

>   modes (i.e. function pointers may be defined to map to WILMA mode, where
>   label pointers may be defined to map to BARNEY mode physical target
>   pointer modes).

Sigh, more assumptions.  Don't group all functions into the same
pointer size either.  Consider the i386 "near" vs "far" attributes;
you need different sized pointers for EACH FUNCTION AND VARIABLE
depending on whether it's near or far.

> - fully agreed, my mistake in making the same incorrect assumption which
>   GCC seems to do on occasion, as alignment should likely be an attribute
>   of every target defined logical type -> physical mode mapping definition.

Worse.  Some chips have both aligned and unaligned addressing modes
for each physical data type.  Unaligned modes are normally slower, so
you'd want to tag those variables or pointers which may be unaligned,
and use a different type of pointer (although perhaps the same size)
for those, than the default.

> - given that the target may only "decide" based on the uniqueness of the
>   contextual of the pointer's use visible to it,

It knows what the pointer points to, too.  "pointer to int" may be
different than "pointer to __attribute__((far)) int".

>   rather than leaving it up to chance that they may be discriminated

i386 near vs far again.  Borland has supported that since the late
1980s; gcc still doesn't support it.

> - but only needs to associate a logical type mode with a target defined
>   physical mode when attempting to match the target's instruction
>   rtl definitions defined in terms of physical modes SI, DI, WILMA, etc.),
>   it would seem?

My point was that MI and target both have the same information
available for doing this conversion.  There's no advantage to trying
to force the MI to do it somehow.

> - yes, as would always need to be done if the language's new data-type
>   precision or representation is target dependant, as most language
>   data-types are?

There are more language types than hard types, for example floating
point types might be simulated in software.  The target shouldn't need
to know about new types if they fit (somehow) into existing modes.

> - fully agree, which is why the MI need only deal with *language* type/modes
>   and rely on the target to define their mapping to *target* types/modes.

But the target should *not* know about language types, aside from a
few key bits like "bits per int" etc.

> >> - sorry, I don't see; as the program code, and internal tree representation
> >>   of that code (as you've noted below), identifies all nodes as having one
> >>   of N  canonical types (bool, char, short, int, *, [], etc.) not an
> >>   arbitrary type,
> > 
> > *Except* when you consider __attribute__ which can modify *anything*
> > in gcc.  This is how we get vector variables, interrupt functions,
> > etc.
> 
> - unless I misunderstand, I suspect you're mixing a few orthogonal issues:
> 
>   - the last first, a vector is just another canonical type, no different
>     than bool, char, etc. and needs it's target specific attributes
>     described by the target, and correspondingly mapped to some target
>     defined named physical mode which the target's rtl is described using.

Vectors can be simulated in software for ports that don't support
them.  Ports that *do* support them need to map them to existing
(nonstandard so far) vector operations.

>   - interrupt, no-return, etc. attributes aren't types per-se, but rather
>     semantic modifiers which GCC has provided as a convenience to both
>     programmers, and the middle-end architecture to enable the incremental
>     specification of semantics in a canonical way by language front-ends,
>     to enable the development of a language neutral middle end?

Except that "pointer to function" and "pointer to interrupt function"
must be different types, because *calling* those two types of
functions often requires a different set of opcodes.  "pointer to int"
and "pointer to unaligned int" must be different so the right
addressing modes (or simulations) can be used.

Consider this C++:

void foo (void __attribute__((far)) *);
void foo (void __attribute__((near)) *);

int j;

foo(&j);

Which function is called?  How can it choose if it doesn't know how to
differentiate types that differ only in attributes?

>     be used to introduce "new" types, but only if that "new" type is defined
>     as being physically equivalent to an existing supported type/mode known
>     already to both the MI and the target portions of the compiler, as

No.  The MI shouldn't need to know about machine modes that are used
by the target only to support attribute-tagged types.  Essentially,
the target causes MI to create a new type (much like "typedef" does in
C) that the target happens to have insns for.

>     otherwise neither the MI or target portions of the compiler understands
>     it relationship/conversion to/from other types,

Why should the MI know how to convert between types in a given class?
That's what the extendMN2 insn patterns are for.

> > No, you're missing a lot of variability here.
> 
> - like? (observing that it may be easily augmented as desired/required)

near vs far comes to mind.  Vectors, complex, alignment, signalling,
volatile, etc.  Two variables of C type "char" might be assigned
different machine modes based on something MI doesn't know about.
What about CPUs with multiple [different] integer units?  You'd want
distinct scalar modes so the programmer can assign variables to
specific integer units, perhaps.  Example: a fast FPU vs an
IEEE-accurate FPU.

> >>   if (TYPE_MODE(...) == char_mode) ...
> > 
> > I'd rather see
> > 
> >     if (TYPE_MODE(...) == TYPE_MODE(...))
> > 
> > or
> >     if (BITS_PER_MODE (TYPE_MODE (...)) <= BITS_PER_BYTE)
> 
> - out of curiosity, why?

If the MI is calling a function and passing three arguments, why
should it care if the type is "char" ?  If it's offsetting a pointer
to access a structure field, who cares if the pointer is Pmode?  What
about DSPs that have 32 bit chars, but can access 8 bit bytes?

Either deal with language types, or deal with machine modes.  Don't
mix them unless you're lowering to RTL, and only when absolutely
needed.  That's where the lookup functions get used.  But there's a
big difference between "give me a mode that is this type" and "give me
a mode that I can put this variable in".  The first makes unneeded
assumptions, the second doesn't.
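
A minimal C sketch of the second kind of query, "give me a mode that I
can put this variable in".  The fit_mode struct and fit_mode_for_bits
function are hypothetical names for this example; GCC's actual lookup
functions are not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical machine mode: a name and a width in bits. */
struct fit_mode {
  const char *name;
  int bits;
};

/* Return the narrowest mode in modes[0..n), which must be sorted by
   increasing width, that can hold need_bits bits.  Returns NULL if no
   mode is wide enough.  Nothing here asks what language type the value
   has, only how many bits it needs. */
const struct fit_mode *fit_mode_for_bits(const struct fit_mode *modes,
                                         size_t n, int need_bits)
{
  assert(modes != NULL && need_bits > 0);
  for (size_t i = 0; i < n; i++)
    if (modes[i].bits >= need_bits)
      return &modes[i];
  return NULL;
}
```

On a DSP-like table containing only 32 and 64 bit modes, an 8 bit
value lands in the 32 bit mode without the caller ever comparing
against a "char mode".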
References:
- Re: Compiling GCC with g++: a report
  - From: Paul Schlie