SV: generic type support

Andrew Haley aph@redhat.com
Thu Feb 20 11:21:00 GMT 2003


Øyvind Harboe writes:
 > 
 > > > A list of weapons available to a JIT compiler:
 > > >
 > > > - Static compilation from bytecode at install time: the native code
 > > > can take advantage of CPU specific instructions. Move conditionals,
 > > > cache sizes, vector instructions, alignment rules, scheduling, etc.
 > >
 > >We can do that too.  It's nothing to do with JIT compilation.
 > 
 > Would you compile one executable per cpu configuration?
 > 
 > Not very convenient from a deployment point of view. One compilation
 > for each of the configurations below:
 > 
 > - 486
 > - 586
 > - MMX
 > - MMX + SSE
 > - 3D Now! + MMX
 > - 3D Now! + SSE
 > - MMX + SSE + SSE2

Yes, to do this right you'd have to compile the code at install time
or ship multiple versions.  Either would work, but neither is very
nice.  

When we have automatic vectorization for Java I might start to worry
about this!

 > > > - Profiling based compilation: spend compilation time where it does
 > > > good
 > >
 > >We can even do that for C.  We should do more.
 > 
 > The problem is that the profiling pattern is not known before the
 > application is deployed. The profiling pattern changes with data
 > processed.

In which case profile directed optimization is invalid, because the
profile might equally well be wrong for the next run of data.  Profile
directed optimization pretty much assumes that the pattern of usage is
consistent between runs.

 > > > - Speculative compilation: based upon profiling information, create
 > > > versions of functions based upon actual arguments/instantiations.
 > >
 > >That too.
 > 
 > How about:
 > 
 > E.g.
 > 
 > // this sort of optimisation is probably most useful in combination with
 > // vector instructions
 > void calcsome(int c, int d)
 > {
 >   for (int i = 0; i < c; i++) {
 >     foobar[i] /= d;
 >   }
 > }
 > 
 > A JIT profiler could discover that the following were common, based
 > upon actual data processed:
 > 
 > - calcsome(4, d) -> unrolled (single vector instruction on SSE2).
 > - calcsome(c, 1) -> no-op
 > - calcsome(c, 2525) -> multiply by inverse
 > - calcsome(3, 7) -> perhaps completely unrolled and inlined
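Spelled out by hand, the specializations above might look like the following Java sketch. (The class name `CalcSome` and the `foobar` array declaration are hypothetical; the quoted snippet leaves `foobar` undeclared. Whether these variants come from a JIT or from a profile-directed static compiler is exactly the question at issue.)

```java
public class CalcSome {
    // Hypothetical data array; the original snippet does not declare it.
    static int[] foobar = {100, 200, 300, 400, 500, 600};

    // General version, as in the quoted code.
    static void calcsome(int c, int d) {
        for (int i = 0; i < c; i++) {
            foobar[i] /= d;
        }
    }

    // Specialized for d == 1: x / 1 == x, so the whole loop disappears.
    static void calcsomeDiv1(int c) {
        // no-op
    }

    // Specialized for c == 4: fully unrolled. The four divisions are
    // independent, so a vectorizing backend could map them to one
    // SIMD operation on SSE2.
    static void calcsome4(int d) {
        foobar[0] /= d;
        foobar[1] /= d;
        foobar[2] /= d;
        foobar[3] /= d;
    }
}
```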

Indeed it would, but whatever does this have to do with a JIT?  Why
cannot a source compiler with profile directed optimization discover
the same thing?

With regard to getting the best performance from a specific chip,
there seems to be a choice here:

  a) Compile from bytecode during installation. 
  b) Compile from source during installation.

  ... and then perhaps recompile with profile directed optimization.
  Alternatively, a profile could be delivered with the application.

Why will a) necessarily be better than b)?

Andrew.
