This is the mail archive of the java@gcc.gnu.org mailing list for the Java project.


Re: gcj's IO performance vs blackdown JDK

From: Christopher Marshall <christopherlmarshall at yahoo dot com>
To: gnustuff at thisiscool dot com, Bryce McKinlay <bryce at mckinlay dot net dot nz>
Cc: Christopher Marshall <christopherlmarshall at yahoo dot com>, tromey at redhat dot com, per at bothner dot net, java at gcc dot gnu dot org
Date: Fri, 26 Dec 2003 11:28:45 -0800 (PST)
Subject: Re: gcj's IO performance vs blackdown JDK

--- Mohan Embar <gnustuff@thisiscool.com> wrote:
> Hi Chris,
> 
> >I can't believe all I had to do was complain to get you and everyone to spend this amount of time
> >fixing and discussing the IO performance.
> 
> >Thanks!
> 
> I can't speak for the others, but I think it was your nice test case
> more than the complaining. Plus the problem was pretty interesting....
> 
> So once we get this performance thing nailed, are you going to ditch the JDK? :)
> 
> -- Mohan
> http://www.thisiscool.com/
> http://www.animalsong.org/
> 

Definitely I'll ditch the JDK and start using gcj!  This performance issue is the only thing
holding me back.

I probably should have mentioned this before, when I was reading the thread, but someone referred
to my test case as "contrived," since the lines were 1000 characters long.  I take it the poster
meant that in the real world such files would be few and far between.

I suppose that is true for large parts of the real world, but I would also be surprised if such
files were not common in significant parts of it as well.

In my case, I have written a lot of Java programs that read very long text lines like this:

a=1,b=2,c=3,...,field_name=value,...

where there are a lot of fields, and the field names are very long.  A typical program of mine
would read a file with 800,000 lines in it, focus on five or six out of 30 or 40 such fields,
and construct a hashtable of statistics on what combinations of fields occur.  1000 characters per
line is not unusual, and I'd be surprised if there weren't a significant number of people writing
analysis software in this style (although I have no idea how many of those would think of using
Java to do it).
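
For illustration only, here is a minimal sketch of the style of program described above.  It is
not the actual code: the class name, the "|" separator in the hashtable key, and the handling of
missing fields are all assumptions, and it is written against a current JDK rather than the
libraries available in 2003.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: count how often each combination of selected field values occurs.
final class FieldComboStats {

    // Reads "name=value,name=value,..." lines and returns a map from the joined values
    // of the wanted fields (e.g. "1|3") to the number of lines with that combination.
    static Map<String, Integer> count(Reader in, String[] wanted) {
        Map<String, Integer> counts = new HashMap<>();
        BufferedReader reader = new BufferedReader(in);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                // Parse every name=value pair on the line; pairs without '=' are skipped.
                Map<String, String> fields = new HashMap<>();
                for (String pair : line.split(",")) {
                    int eq = pair.indexOf('=');
                    if (eq > 0) {
                        fields.put(pair.substring(0, eq), pair.substring(eq + 1));
                    }
                }
                // Build the key from the wanted fields only, in the order given.
                StringBuilder key = new StringBuilder();
                for (int i = 0; i < wanted.length; i++) {
                    if (i > 0) {
                        key.append('|');
                    }
                    key.append(fields.get(wanted[i])); // a missing field appends "null"
                }
                counts.merge(key.toString(), 1, Integer::sum);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        String data = "a=1,b=2,c=3\na=1,b=9,c=3\na=2,b=2,c=3\n";
        System.out.println(count(new StringReader(data), new String[] {"a", "c"}));
    }
}
```

With 1000-character lines, nearly all of the per-line work in a program of this shape is
readLine() plus the split, which is why line-reading speed matters more than the hashtable.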

When I first started experimenting with gcj I took one of my programs and ran it under gcj and
noticed a two-to-one penalty.  I first thought it must be the hashtable performance, which is why
I wrote my test-file generator the way I did.  (The test-file generator took, among its
arguments, the number of distinct text lines as well as the number of lines to write.  The number
of distinct lines would be the size of the hashtable an analysis program would construct in
reading the file.)  I had hardly played around with generating such files and analyzing them for
an hour before it became clear the hashtable performance was not the issue.
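
A generator along those lines could be sketched as follows.  This is a guess at the shape of
such a tool, not the original: the class and method names, the line layout, the fixed random
seed, and the argument order are all assumptions.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Random;

// Hypothetical sketch of a test-file generator: writes `total` lines drawn at random
// from a pool of `distinct` different lines, each exactly `lineLength` characters long.
final class TestFileGenerator {

    // Builds one line of the form "id=N,field_name_0=0,field_name_1=1,...",
    // cut off at lineLength characters. Lines differ only in the leading id.
    static String makeLine(int id, int lineLength) {
        StringBuilder sb = new StringBuilder();
        sb.append("id=").append(id);
        int field = 0;
        while (sb.length() < lineLength) {
            sb.append(",field_name_").append(field).append('=').append(field);
            field++;
        }
        sb.setLength(lineLength); // truncate; the last field may be cut short
        return sb.toString();
    }

    // `distinct` must be at least 1. The seed makes the output reproducible.
    static void generate(Writer out, int distinct, int total, int lineLength, long seed) {
        String[] pool = new String[distinct];
        for (int i = 0; i < distinct; i++) {
            pool[i] = makeLine(i, lineLength);
        }
        Random rnd = new Random(seed);
        try {
            for (int i = 0; i < total; i++) {
                out.write(pool[rnd.nextInt(distinct)]);
                out.write('\n');
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumed usage: java TestFileGenerator <distinct> <total> <output-file>
    public static void main(String[] args) throws IOException {
        int distinct = Integer.parseInt(args[0]);
        int total = Integer.parseInt(args[1]);
        try (Writer out = new BufferedWriter(new FileWriter(args[2]))) {
            generate(out, distinct, total, 1000, 42L);
        }
    }
}
```

Varying the number of distinct lines while holding the total fixed changes the hashtable size
without changing the amount of IO, which is one way to tell the two costs apart.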

Anyway, to make a long story short, the ability to read long lines of text quickly doesn't strike
me as all that contrived a requirement, given the uses I have been putting Java to.

Chris Marshall


Follow-Ups:
- Re: gcj's IO performance vs blackdown JDK
  - From: Mohan Embar

References:
- Re: gcj's IO performance vs blackdown JDK
  - From: Mohan Embar
