This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Patent on aligned memcpy


Alan Lehotsky wrote:
> 
> At 5:57 PM +1000 8/20/98, Geoff Keating wrote:
> 
> * Mike wrote about a patent, which appears to be US number 5706483, which
> * claims:
> *
> 
>         Similar code was used in VMS (circa 1977) and if I remember correctly,
> it's the canonical example of "Duff's Device" (the C language hack involving
> a switch statement without "break" between the cases)
1) It is not Duff's device. That was written to copy a block of memory to
a single I/O port. The core statement was
	*dst = *src++
Note there is no increment on dst. It wasn't an alignment problem, but a
loop unrolling problem: how to deal with the N % M cases left over. Read
about it from Tom Duff at http://www.lysator.liu.se/c/duffs-device.html.
(This confusion about what Duff's device is keeps coming up, and I'm
stamping on it now!)
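
For reference, a sketch of Duff's device as it is usually quoted, rewritten
here in ANSI C. The names send, to, from and count are illustrative, and the
code assumes count > 0. The destination is a single location (the port), so
only the source pointer advances.

```c
/* Copy count shorts to one output location, unrolled eight times.
   The switch jumps into the middle of the loop body to handle the
   count % 8 leftover copies first. Assumes count > 0. */
void send(volatile short *to, const short *from, int count)
{
    int n = (count + 7) / 8;   /* number of passes through the loop */

    switch (count % 8) {
    case 0: do { *to = *from++;
    case 7:      *to = *from++;
    case 6:      *to = *from++;
    case 5:      *to = *from++;
    case 4:      *to = *from++;
    case 3:      *to = *from++;
    case 2:      *to = *from++;
    case 1:      *to = *from++;
            } while (--n > 0);
    }
}
```

Nothing here looks at the alignment of either pointer; the trick is purely
about folding the remainder handling into the unrolled loop.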

2) Such memory alignment in memcpy might not be worth it. We did such a
memcpy implementation and found it to be slower or no faster than a simpler
version which aligned only the dst writes. The reason was to do with the
particular cache line size and how many outstanding reads & writes could
be dealt with. We'd reached the raw memory bandwidth limitation. I
wouldn't be at all surprised if other systems had the same behaviour.
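
A minimal sketch of what "aligning only the dst writes" can look like in
portable C. This is not the implementation referred to above; the name
copy_dst_aligned and the 32-bit word size are assumptions for illustration.
The inner fixed-size memcpy calls stand in for single word load and store
instructions, which is what compilers normally generate for them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Forward copy that aligns only the destination. Source reads in the
   main loop may be unaligned; the hardware is left to cope with them. */
void *copy_dst_aligned(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: byte copies until dst reaches a word boundary. */
    while (n > 0 && (uintptr_t)d % sizeof(uint32_t) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Body: dst is word aligned here, src may or may not be. */
    while (n >= sizeof(uint32_t)) {
        uint32_t w;
        memcpy(&w, s, sizeof w);   /* possibly unaligned word load */
        memcpy(d, &w, sizeof w);   /* aligned word store */
        d += sizeof w;
        s += sizeof w;
        n -= sizeof w;
    }

    /* Tail: whatever bytes are left. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

Whether this beats a version that also realigns the reads depends on the
memory system, which is the point being made above.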

3) Yes, we thought it 'a bloody obvious' thing to do.

4) The Inmos Transputer (T414), released as a product in 1985, contained this
algorithm. I've just had a word with David May, its chief architect, who
confirms its operation. For the T4's channel communication a memmove
type operation is needed (for memory resident channels). The
implementation had a setup stage which read until src is aligned and
wrote until dst aligned, a move stage which read and wrote aligned words
(with shift register to carry over the alignment difference), and
finally a post stage to do the end bits. It operated by examining the
bottom 2 bits of src, dst & length to determine which particular shift,
setup etc was required (looks like the 'state machine thing' in Msoft's
case). In addition to being used for the in and out instructions, it was
made available as a move instruction. The compiler writer's guide says
something like 'the minimum number of aligned accesses are used'. The T4
reference manual was published in 1985, from which you can infer the
implementation based on the instruction timing (you could also do so by
hooking a logic analyser up to the memory bus). David did not file a
patent on it. But as it has shipped in a product, the algorithm is not now
patentable. If earlier documentation and/or references are required,
they can be provided (but not just now, as they are at home).
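
The setup / move / post structure described above can be sketched in C as
follows. This is an illustration of the general technique, not the T414
microcode: the names shift_copy, load_word and store_word are hypothetical,
a 32-bit little-endian word is assumed, only a forward (non-overlapping)
copy is handled, and the word helpers are built from bytes so the sketch
stays portable. On real hardware each helper would be one aligned access.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for one aligned 32-bit little-endian load and store. */
static uint32_t load_word(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static void store_word(unsigned char *p, uint32_t w)
{
    p[0] = (unsigned char)w;
    p[1] = (unsigned char)(w >> 8);
    p[2] = (unsigned char)(w >> 16);
    p[3] = (unsigned char)(w >> 24);
}

void *shift_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Setup: write bytes until dst is word aligned. */
    while (n > 0 && (uintptr_t)d % 4 != 0) { *d++ = *s++; n--; }

    /* k is the alignment difference between src and the aligned dst. */
    unsigned k = (unsigned)((uintptr_t)s % 4);

    if (k == 0) {
        /* No difference: plain aligned word moves. */
        while (n >= 4) {
            store_word(d, load_word(s));
            d += 4; s += 4; n -= 4;
        }
    } else if (n >= 8 - k) {
        /* Setup: read bytes until src is aligned, priming the shift
           register as if the aligned word containing s had been read. */
        uint32_t carry = 0;
        for (unsigned i = 0; i < 4 - k; i++)
            carry |= (uint32_t)s[i] << (8 * (k + i));
        const unsigned char *sa = s + (4 - k);  /* next aligned src word */

        /* Move: one aligned read and one aligned write per word, with
           the shift register carrying the leftover source bytes. */
        while (n >= 8 - k) {
            uint32_t next = load_word(sa);
            store_word(d, (carry >> (8 * k)) | (next << (32 - 8 * k)));
            carry = next;
            sa += 4; s += 4; d += 4; n -= 4;
        }
    }

    /* Post: the end bytes. */
    while (n > 0) { *d++ = *s++; n--; }
    return dst;
}
```

The bottom two bits of dst and src select the setup work and the shift
amount, and the length decides how much is left for the post stage, which
matches the three-way examination described above.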

It seems to me that the Msoft patent is unenforceable.

nathan

-- 
Dr Nathan Sidwell :: Computer Science Department :: Bristol University
      You can up the bandwidth, but you can't up the speed of light      
nathan@acm.org  http://www.cs.bris.ac.uk/~nathan/  nathan@cs.bris.ac.uk

