This is the mail archive of the java-discuss@sourceware.cygnus.com mailing list for the Java project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: inlining and string concatenation


Bryce McKinlay <bryce@albatross.co.nz> writes:

> I think this could be made a lot more efficient. We allways know at
> compile time how many strings will need to be concatenated (although we
> don't usually know the length of all those strings). The compiler never
> calls insert() or reverse() on a stringbuffer. We never need to append()
> chars or other stringbuffers, only Strings.

We do need to append chars.

> Given these assumptions (which may not all be correct, I'm just
> guessing), why not make a native StringBuffer equivilant that the
> compiler calls with an argument of the total number of strings being
> concatenated. This "FastStringBuffer" would then allocate an array of
> pointers of the precice length required. When the compiler calls
> toString(), it would add up the length fields of all the strings
> appended, allocate a new string of the precise size required, and
> memcpy() all the data. It only ever has to do 2 (3?) allocations, only
> the exact number of memcpy()'s required, and doesn't waste any memory.

While the basic idea is appealing, there are problems.  It would be
nice to only allocate the exact amount of space needed.  That requires
a two-pass algorithm, which can be expensive.  Consider if one of the
operands is a double.  It is difficult to calculate the number of
characters needed to string-ify a double without doing the actual
conversion - but you don't want to do the conversion twice.

What you can do for numbers is stack-allocate a work buffer
guaranteed to be big enough.  I.e. the algorithm is:
(1) Stack-allocate an array with an element (a pointer plus one bit)
for each operand.
(2) Evaluate each operand in order.  If the operand is an object,
leave a pointer to the resulting String in the array of (1).  If the
operand is a primitive, stack-allocate a work buffer guaranteed to
be big enough, leaving a pointer to the buffer in (1).
(3) Sum the lengths from (2).
(4) Allocate a string using the result from (3).
(5) Copy the results of of (2) into the string from (4).
Finally, return (4).

The steps (3)-(5) can all be handled by a run-time routine that
is passed the address and length of the array (1).
-- 
	--Per Bothner
per@bothner.com   http://www.bothner.com/~per/

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]