Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!floyd!harpo!seismo!rochester!ritcv!tropix!rcm
From: rcm@tropix.UUCP (Robert C. Moore)
Newsgroups: net.micro.68k
Subject: Re: Re: 68020 vs 16k - is the 020 worth - (nf)
Message-ID: <181@tropix.UUCP>
Date: Thu, 9-Feb-84 19:17:06 EST
Article-I.D.: tropix.181
Posted: Thu Feb 9 19:17:06 1984
Date-Received: Sat, 11-Feb-84 08:15:11 EST
References: fortune.2459
Lines: 43

Rob's comments on the relative speed comparisons of micros are quite correct. It is important to note the effective instruction execution speed, including the effects of MMUs, bus arbitration, memory speeds, and so forth, unless there is an intervening cache. In that case the cache speed is most important, along with the cache size (and thus its hit rate).

For example, the 16k can get data from memory in only 3 clock cycles, but with its MMU the number jumps to 4 (assuming very fast memory). If the 68451 MMU is used alone with the 68000, getting only 2 wait states at 12.5 MHz is considered pretty good (i.e., 6 clock cycles). But if a translation cache is put around it, no-wait operation (4 clock cycles) is pretty easy with conventional dynamic RAM. With both a translation cache and a data cache, no-wait operation of the 68000 at 12.5 MHz is trivial, although the cache size will determine the degradation due to imperfect hit rate.

The 68020 provides a virtual cache inside the processor, neatly avoiding the delays in address translation and main memory cycle time. A hidden benefit is that the internal cycles are synchronous, avoiding the need to repeatedly sample the DTACK (actually DSACK) asynchronous handshake line to prevent metastable states from propagating into the chip. ("You are not expected to understand this.") In short, an average of one byte is consumed off the instruction stream on each clock cycle. (The shortest instructions require 2 cycles, and are two bytes long.)

Compare this to the 32032.
There the state machine is unchanged from the 16032. It contains no cache. The 16032 already underutilizes its bus (in fact the 16008 is almost as fast, as the 8 byte prefetch queue is almost always full). The 32032 will only go slightly faster than the 16032 in such circumstances. It will, however, leave enough bus time available that one could credibly run two processors on the same bus!

All this discussion assumes that, with register-rich instruction sets, most of the effect on system timing comes from the time needed to access text (programs), and that data cycles have very little effect. Does anyone have any hard numbers on 16k bus utilization or text/data access ratios for either of these chips?

bob moore
ihnp4!tropix!rcm
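
To put rough numbers on the cache argument above (the 68000 at 12.5 MHz with a 68451), here is a minimal sketch of how an imperfect hit rate degrades the average access time. The 4-clock and 6-clock figures are the ones quoted in the post. The hit rates are made-up values for illustration, not measurements, and charging a miss the same 6 clocks as the 68451-alone case is an assumption. The function name effective_cycles is hypothetical.

```python
def effective_cycles(hit_rate, hit_cycles, miss_cycles):
    """Average clocks per bus access, weighted by cache hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles


CLOCK_MHZ = 12.5       # 68000 clock rate quoted in the post
NO_WAIT_CYCLES = 4     # no-wait 68000 bus cycle (from the post)
MISS_CYCLES = 6        # ASSUMPTION: a miss costs the 68451-alone figure

# ASSUMPTION: illustrative hit rates only, not measured values.
for hit_rate in (0.80, 0.90, 0.95, 1.00):
    cycles = effective_cycles(hit_rate, NO_WAIT_CYCLES, MISS_CYCLES)
    nanoseconds = cycles * 1000.0 / CLOCK_MHZ
    print(f"hit rate {hit_rate:.2f}: {cycles:.2f} clocks, "
          f"{nanoseconds:.0f} ns per access")
```

Real miss penalties depend on the memory system and on how the cache is refilled, so the table this prints is only a way to see the shape of the trade-off, not a prediction for any particular board.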