Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site fortune.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxl!ihnp4!fortune!rpw3
From: rpw3@fortune.UUCP
Newsgroups: net.lang.c
Subject: Re: Re: Casting Pointers -- fast *portab - (nf)
Message-ID: <2482@fortune.UUCP>
Date: Wed, 8-Feb-84 07:03:53 EST
Article-I.D.: fortune.2482
Posted: Wed Feb 8 07:03:53 1984
Date-Received: Fri, 10-Feb-84 02:06:08 EST
Sender: notes@fortune.UUCP
Organization: Fortune Systems, Redwood City, CA
Lines: 32

#R:kobold:-27200:fortune:16200020:000:1178
fortune!rpw3 Feb 8 02:25:00 1984

And of course (?) everyone knows by now (?) that you can do even better
on a 68000 by using the move-multiple-long (register load/store)
instructions to eat and spew big gulps:

	1. Save a few regs
	2. While bunches left to do
		a. gulp into the regs
		b. spew out to memory
		c. adjust indices
	3. Copy the odd few words.

Now, that strategy by itself doesn't compare well against loop-unrolled
move-long, since the move-long takes care of the indices
(movl a1@+,a2@+) and the moveml doesn't. But the moveml's can be
loop-unrolled too! In that case, each successive load/store pair carries
a higher address offset word in the instruction ("moveml,a5(offset1)"),
and you fix up the indices for the whole loop with two adds at the end.

In the limiting case (which you can get close to attaining while doing
buffer-block moves), you only fetch 8 bytes of instructions for each
40 bytes of data copied (note that's 80 bytes touched: 40 read plus
40 written), or just over 10% overhead.

(See the code for "blt" that comes with the MIT "C" compiler.)

Rob Warnock

UUCP:	{sri-unix,amd70,hpda,harpo,ihnp4,allegra}!fortune!rpw3
DDD:	(415)595-8444
USPS:	Fortune Systems Corp, 101 Twin Dolphins Drive, Redwood City, CA 94065
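
The gulp/spew strategy in steps 1-3 above can be sketched in portable C.
This is only an illustration of the control structure, not the MIT "blt"
code: C cannot force the compiler to emit moveml, so the ten locals below
merely stand in for the registers. The names gulp_copy and GULP are made
up for this sketch, and the 10-long gulp is an assumption chosen to match
the 40-byte figure quoted in the article. The two buffers must not
overlap.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical gulp size: 10 longs = 40 bytes moved per load/store pair. */
#define GULP 10

/* Copy nlongs 32-bit words from src to dst (non-overlapping buffers). */
void gulp_copy(uint32_t *dst, const uint32_t *src, size_t nlongs)
{
    /* 1. "Save a few regs": in C the compiler handles this for the locals. */

    /* 2. While bunches left to do */
    while (nlongs >= GULP) {
        /* a. gulp into the regs (what a moveml load would do in one go) */
        uint32_t r0 = src[0], r1 = src[1], r2 = src[2], r3 = src[3],
                 r4 = src[4], r5 = src[5], r6 = src[6], r7 = src[7],
                 r8 = src[8], r9 = src[9];

        /* b. spew out to memory (what a moveml store would do in one go) */
        dst[0] = r0; dst[1] = r1; dst[2] = r2; dst[3] = r3; dst[4] = r4;
        dst[5] = r5; dst[6] = r6; dst[7] = r7; dst[8] = r8; dst[9] = r9;

        /* c. adjust indices: the "two adds", plus the remaining count */
        src += GULP;
        dst += GULP;
        nlongs -= GULP;
    }

    /* 3. copy the odd few words */
    while (nlongs > 0) {
        *dst++ = *src++;
        nlongs--;
    }
}
```

Unrolling the loop body several times with fixed offsets (src[10]..src[19]
and so on) and doing the two pointer adds once per pass would correspond to
the unrolled-moveml variant described in the article. Whether any given
compiler turns this into moveml instructions is not guaranteed.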