Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site fortune.UUCP
Path: utzoo!watmath!clyde!floyd!harpo!ihnp4!fortune!rpw3
From: rpw3@fortune.UUCP
Newsgroups: net.micro.68k
Subject: Re: Re: 68020 vs 16k - is the 020 worth - (nf)
Message-ID: <2459@fortune.UUCP>
Date: Tue, 7-Feb-84 04:05:09 EST
Article-I.D.: fortune.2459
Posted: Tue Feb  7 04:05:09 1984
Date-Received: Thu, 9-Feb-84 13:36:05 EST
Sender: notes@fortune.UUCP
Organization: Fortune Systems, Redwood City, CA
Lines: 83

#R:utzoo:-349300:fortune:6600011:000:4379
fortune!rpw3    Feb  6 23:15:00 1984

Please, please, please, folks... don't fall into the trap of comparing
CPU clock speeds across different machine architectures (such as a
20 MHz 68k vs. a 6 MHz 16k). "It ain't that simple!" [Murphy's Law #27]

The CPU clock has only to do with the internal fineness of the particular
state-machine/microcode-engine used to implement the chip. You have to look
at how many clocks it takes for a memory cycle, AND what access time is
demanded of the memory to achieve that cycle. Comparing CPU clocks is like
saying, "My car is faster than yours because my wheels have higher RPMs."
(What's the diameter of the wheels, Ollie?)

To get valid comparisons, one must normalize the CPU clock to the memory
access time; memory cycle times can then be calculated from the bus
sequence of the particular chip. Since processor clock speeds generally
evolve more quickly than memory access times (in the marketplace), one has
to look at how well the (expensive) memory is being used. In extreme
examples, equal-speed memories can result in one architecture being two or
more times faster than another, simply because the slower one leaves the
memory idle. This explains, for example, why the obscure 6809 can stomp
the familiar Z80, given equal-access-time memories, even though the Z80
may be running with a 2.5-times-faster CPU clock.
The 6809 uses one clock per memory cycle; the Z80 needs three (data) or
four (instruction fetch). The Z80 also leaves the RAMs idle for a longer
fraction of the cycle. (To get equivalent performance from the Z80, you
have to run the CPU clock at a MUCH higher rate to balance the duty cycle,
while adding back wait states to match the access time.)

One of the main reasons I happen to like the 68000/68010 is simply that
the bus access-to-cycle time ratio nicely matches the access-to-cycle
ratio of current (and near-future) dynamic RAMs. (For hardware hackers:
the chip leaves the memories idle for just about the "RAS precharge
time".) It makes good use of the memories. (Who knows about the 68020?)
But don't let Motorola hype you. With the RAM chips we are going to have
available over the next 1-2 years, you don't NEED a 20 MHz CPU; 12-16 MHz
will do just fine, thank you. (I have not done a careful study of the
16000, but from the few minutes I have looked at the bus timing diagrams,
it didn't look quite as memory-efficient. Be that as it may, ...)

To do a fair comparison, one needs to presume some RAM access time, add
bus driver/receiver and memory-system delays (to get a memory SYSTEM
access and cycle time), add MMU delays, and then compute the fastest CPU
clock speed (for each chip) that just makes that access time work. (If one
of the CPUs won't go fast enough to keep commercial memory chips busy,
you've got a real problem with that one.) From that clock and the number
of clocks per memory cycle, you can calculate the effective system memory
cycle time as driven by each processor. Divide the raw memory-system cycle
time by the CPU-cum-memory-system cycle time to get the percentage of
effective memory utilization. The result is a pretty good first-order
comparison of throughput between the CPU architectures.
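The procedure above is easy to sketch numerically. Here is a minimal
Python sketch of it; the RAM timings, delay figures, and clocks-per-cycle
numbers below are ILLUSTRATIVE ASSUMPTIONS picked to show the shape of the
calculation, not datasheet values for any real part.

```python
# First-order memory-utilization comparison, following the procedure in
# the text: presume a RAM access time, add system delays, find the fastest
# CPU clock that meets it, then see what fraction of the raw memory
# bandwidth the CPU-driven cycle actually uses.
# All numbers here are illustrative assumptions, not datasheet figures.

RAM_ACCESS_NS = 150.0   # assumed raw DRAM access time
RAM_CYCLE_NS  = 300.0   # assumed raw DRAM cycle time (access + RAS precharge)
DELAYS_NS     = 40.0    # assumed bus driver/receiver + MMU delays

def utilization(clocks_per_cycle, access_clocks, max_cpu_mhz=None):
    """Fraction of the raw memory bandwidth a CPU's bus sequence uses.

    clocks_per_cycle -- CPU clocks in one bus (memory) cycle
    access_clocks    -- how many of those clocks are usable as access time
    max_cpu_mhz      -- optional cap if the part won't clock fast enough
    """
    t_access = RAM_ACCESS_NS + DELAYS_NS        # memory SYSTEM access time
    f_mhz = access_clocks * 1000.0 / t_access   # fastest clock meeting t_access
    if max_cpu_mhz is not None:
        f_mhz = min(f_mhz, max_cpu_mhz)         # the "won't go fast enough" case
    cpu_cycle_ns = clocks_per_cycle * 1000.0 / f_mhz  # cycle as driven by CPU
    cpu_cycle_ns = max(cpu_cycle_ns, RAM_CYCLE_NS)    # RAM can't cycle faster
    return RAM_CYCLE_NS / cpu_cycle_ns

# Hypothetical access-to-cycle ratios: the tighter the ratio, the less
# the RAM sits idle, independent of the raw CPU clock number.
print("tight ratio (2.5 of 4 clocks):", round(utilization(4, 2.5), 2))
print("loose ratio (1.5 of 4 clocks):", round(utilization(4, 1.5), 2))
```

Note that the result depends only on the access-to-cycle *ratio* and the
memory timings, which is exactly the post's point: quoting the raw clock
frequency by itself tells you nothing.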
If you have reason to believe that one machine is GROSSLY more
instruction-stream efficient than the other (average bits/instruction),
then you can scale a little for that, but be careful. Such interpretations
are tricky (what is an "average instruction"?). The best way to do that is
to take some fairly large modules of frequently used code (say, pieces of
"libc") and hand-code them in assembler as tightly as possible.
(Comparisons of individual instructions are meaningless.) Look at the
total memory cycles required for the entire function (don't forget that a
byte often costs the same as a word), and scale by the memory utilization
calculated above. That gives you "functions per mem-access-time", a
measure that can be used across a fairly large evolution in CPU clock and
memory access times (which occurs as chips get better).

Whatever you do, don't try to compare CPU clock speeds alone. Even within
a chip family, it's bogus. (A 20 MHz 68000 is twice as fast as a 10 MHz
68000 ONLY with an infinitely fast memory system with no real-world
components.)

Rob Warnock

UUCP:	{sri-unix,amd70,hpda,harpo,ihnp4,allegra}!fortune!rpw3
DDD:	(415)595-8444
USPS:	Fortune Systems Corp, 101 Twin Dolphins Drive, Redwood City, CA 94065