Questions about code generation
Mon Nov 29 13:40:00 GMT 2010
On 29/11/2010 13:29, firstname.lastname@example.org wrote:
> Thanks, David. Some follow up questions:
>> It is typically possible, but it depends on the compile-time flags you
>> use and the exact formulation of the source code. Don't expect that
>> the compiler will be able to read your mind here - there are all sorts of
>> subtleties that come into play with automatic vectorisation. gcc will
>> err on the side of generating definitely correct code, rather than
>> generating fast code that might be wrong due to things like aliasing.
> Is the best solution to use variable types that are
> architecture-specific? For example, defining a series of variable types
> that reflect every possible interpretation of the x86 XMM register file?
> Or should we be more general about it and allow for arrays of type
> int/float and hope that the compiler will vectorise them? There are
> certain groups that are commonly used - for example, a matrix of 2x2
> floating point numbers, or an array of 3 or 4 pairs of floats. Should
> they have their own variable type?
I don't write much C code for the x86 - certainly not any where the
speed matters that much. So you are getting beyond my experience here.
But the best advice is to try it and see - make some examples, compile
with different options, and examine the generated assembly. Small
examples will be easier to follow, but on the other hand the compiler
has more scope for optimisation when given plenty of code.
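As a starting point for that kind of experiment, here is a minimal sketch of a loop that gcc may be able to vectorise. The function name add_arrays and the file name vec.c are made up for illustration. The restrict qualifiers are an assumption that the three arrays never overlap, which is one way of addressing the aliasing concern; whether the loop is actually vectorised depends on the gcc version, the target and the flags.

```c
/* vec.c - hypothetical file name, used only for this experiment.
 *
 * Try, for example:
 *   gcc -std=c99 -O2 -S vec.c -o vec-O2.s
 *   gcc -std=c99 -O3 -S vec.c -o vec-O3.s
 * and compare the two assembly files. -O3 turns on -ftree-vectorize;
 * on x86 with SSE enabled the vectorised loop may use packed
 * instructions such as addps.
 */
#include <stddef.h>

/* "restrict" promises the compiler that dst, a and b do not overlap,
 * so it does not have to assume a store to dst changes a or b. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

Removing the restrict qualifiers and recompiling is a cheap way to see how much the aliasing question changes the generated code.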
There are "intrinsic" functions in gcc for some processors that give you
direct access to SIMD instructions. There are also special types or
attributes for variables for such code. Again, I don't know if that
applies to the x86. But in general, if you are using "normal" types and
want to give the compiler its best chance, make sure you avoid "manual
optimisation". The compiler can do far more with an array of ints than
it can with a pointer-to-int, because it knows much more about it.
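One example of such a special attribute is gcc's generic vector extension, vector_size, which is not tied to a particular processor. The sketch below uses it for the 2x2 float matrix case from the question. The names v4sf, mat2x2 and mat2x2_add are assumptions chosen for illustration, and the union is only there so that individual elements can be read back without relying on vector subscripting, which older gcc versions do not accept in C.

```c
/* gcc extension: a 16-byte vector holding four floats. gcc maps it
 * onto a SIMD register where the target has one, and otherwise
 * falls back to ordinary scalar code. */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical 2x2 matrix type, stored row-major in one vector:
 * f[0] = m00, f[1] = m01, f[2] = m10, f[3] = m11. */
typedef union {
    v4sf  v;
    float f[4];
} mat2x2;

/* Element-wise addition; the "+" on the vector type adds all four
 * lanes in one expression. */
mat2x2 mat2x2_add(mat2x2 a, mat2x2 b)
{
    mat2x2 r;

    r.v = a.v + b.v;
    return r;
}
```

The design choice here is to let the compiler pick the instructions: the same source should build for any target gcc supports, at the cost of less control than the processor-specific intrinsics give.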
>> gcc does not knowingly produce buggy code - therefore it cannot
>> produce code that is "less buggy". But I believe it can produce alternate
>> pathways that are chosen at runtime according to the processor being
>> used - though I haven't worked with such code myself.
> Well, some code that is perfectly legal for both the 68000 and 68060
> chips will not work correctly on the 68060 due to errata. Certain
> instruction sequences are not executed correctly even though they should
> be - and one would only know if one reads the errata documents. I wonder
> whether gcc avoids such instruction sequences.
It will avoid such instruction sequences if you tell it to. You can
choose the target(s) that the code should run on, and you can also
choose the target for optimisation. For example, you can choose to
generate code that will run on all 68xxx processors, but is optimised
for the 68060 - it will run correctly on them all, but will use much
less absolute addressing (which is typically a fast choice on a 68000,
but slow on a 68060).
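A rough sketch of how that split between "runs on" and "optimised for" might look on the command line is given in the comment below. It assumes a cross compiler installed under the made-up name m68k-elf-gcc and a gcc version whose m68k port accepts -mcpu= and -mtune=; the "M680x0 Options" section of the manual for your version is the authority on the accepted values. The function sum_words is just a hypothetical test subject.

```c
/* sum.c - hypothetical file name.
 *
 * Code for the 68000, no particular tuning:
 *   m68k-elf-gcc -O2 -mcpu=68000 -S sum.c -o sum-68000.s
 *
 * Code that still only uses 68000 instructions, but tuned for
 * the 68060:
 *   m68k-elf-gcc -O2 -mcpu=68000 -mtune=68060 -S sum.c -o sum-tuned.s
 *
 * Comparing the two .s files shows what the tuning option changes.
 */
#include <stddef.h>

/* Sum an array of 16-bit words into a long. */
long sum_words(const short *data, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += data[i];
    return total;
}
```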
More information about the Gcc-help