Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!cmcl2!phri!roy
From: roy@phri.UUCP (Roy Smith)
Newsgroups: sci.math,sci.physics,sci.electronics,sci.bio
Subject: Re: Analog/Digital Distinction
Message-ID: <2489@phri.UUCP>
Date: Tue, 11-Nov-86 11:58:59 EST
Article-I.D.: phri.2489
Posted: Tue Nov 11 11:58:59 1986
Date-Received: Wed, 12-Nov-86 10:05:01 EST
References: <521@ptsfd.UUCP> <277@apple.UUCP> <680@randvax.UUCP>
Reply-To: roy@phri.UUCP (Roy Smith)
Distribution: net
Organization: Public Health Research Inst. (NY, NY)
Lines: 60
Summary: DNA is digital
Xref: mnetor sci.math:182 sci.physics:150 sci.electronics:63 sci.bio:25

In article <680@randvax.UUCP> edhall@rand-unix.UUCP (Ed Hall) writes:
> Nature chose digital code of three-digit base-four numbers to determine
> how you and I are put together. [...] There is a good engineering reason
> why this is so. You can say all you want about the discontinuous nature
> of digital representations as opposed to analog, but the fact remains
> that digital is exactly reproducible, while analog is not.

This is one of my favorite topics, so I'd like to expand on that a bit. I trust the real biologists out there will take into account the fact that this is a gross simplification of a complicated subject and not take me to task on details. I have been deliberately loose with nomenclature to highlight the information processing aspects at the cost of some biological accuracy. Readers interested in finding out more are encouraged to get a good book on molecular biology. Jim Watson's "Molecular Biology of the Gene" (3rd edition, 1976) is a good place to start.

The genetic code is indeed made of 3-digit, base-4 numbers. It's also an overloaded code -- the mapping from DNA to amino acids (AA's) is many-to-one rather than one-to-one. Some AA's are coded for by more than one codon (a 3-base DNA sequence).
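
For the digital types, the "3-digit, base-4" reading above can be sketched in a few lines of Python. This is only an illustration: the digit assignment (A=0, C=1, G=2, T=3) is an arbitrary choice made here, not anything the cell uses, and the table is a small excerpt of the standard genetic code (DNA coding-strand codons), just enough to show the many-to-one mapping.

```python
from collections import defaultdict

# Arbitrary digit assignment, assumed for illustration only.
DIGITS = {"A": 0, "C": 1, "G": 2, "T": 3}

def codon_to_int(codon):
    """Read a 3-base codon as a 3-digit base-4 number (0..63)."""
    value = 0
    for base in codon:
        value = value * 4 + DIGITS[base]
    return value

# Excerpt of the standard genetic code (not the full 64-entry table).
CODON_TO_AA = {
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "CTT": "Leu", "CTC": "Leu", "CTA": "Leu", "CTG": "Leu",
    "TTA": "Leu", "TTG": "Leu",
    "ATG": "Met",
    "TGG": "Trp",
}

def codons_per_aa(table):
    """Group codons by the amino acid they code for."""
    groups = defaultdict(list)
    for codon, aa in table.items():
        groups[aa].append(codon)
    return dict(groups)

if __name__ == "__main__":
    # Print each amino acid, how many codons map to it, and their numeric values.
    for aa, codons in codons_per_aa(CODON_TO_AA).items():
        print(aa, len(codons), [codon_to_int(c) for c in codons])
```

In this excerpt leucine is reachable from six different codons while methionine and tryptophan have only one each, which is the overloading described above.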
What's really interesting is that the copying of DNA does *not* have the perfect accuracy we have come to expect from digital processes. DNA exists in the cell most of the time as double-stranded DNA (dsDNA). This means that each base exists twice, once on one strand, and again on the other strand in its complementary form. After replication, you have a piece of dsDNA in which one of each base pair is from the original piece of DNA, and the other is a copy. You digital types will recognize this as a 2-symbol ECC, with one data symbol and one check symbol (there are 4 symbols, so you can't really say "bits").

OK, now that we've got our base pairs, what do we do with them? Well, a wonderful thing happens -- an enzyme (Pol1?) comes along and re-reads both strands of the new dsDNA. Every time it finds a place where a base pair is wrong, it corrects it. But, you ask, with only a single check symbol (a minimum Hamming distance of 2, so an error can be detected but not located), how do you know which one to trust? The answer is that you don't! You fix one of them at random and hope it's the right one. If it's not, no big deal. Either you've introduced a fatal mutation which will take care of itself, or you've made a "silent mutation" which doesn't make any difference (remember the many-to-one mapping of codons to AA's). Of course, you might have just lucked out and made a useful mutation, in which case you're off on the road to evolution.

If you really get into this, it's amazing how many computer science concepts were thought of by living cells first. The most obvious is that DNA is a program. Then you have ECC (described above), subroutines (different enzymes made from common subunits), regular expressions (restriction enzymes), compilers and assemblers (ribosomes and tRNA's), compile-time preprocessing using #ifdef's (introns), self-modifying code (transposons and integrating phages), portable programs (plasmids), P&V operations (numerous regulatory systems), etc.
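
To make the ECC analogy a little more concrete, here is a minimal Python sketch of the "detect a bad pair, then fix one side at random" story told above. It models only that simplified picture, not real repair biochemistry. The function names are made up for illustration, and the two strands are written position by position, ignoring the fact that real strands run in opposite directions.

```python
import random

# Watson-Crick pairing: each base determines its partner on the other strand.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def find_mismatches(strand, partner):
    """Return the positions where the two strands are not complementary."""
    return [i for i, (a, b) in enumerate(zip(strand, partner))
            if COMPLEMENT[a] != b]

def repair(strand, partner, rng):
    """Fix every mismatch by trusting one strand chosen at random.

    A mismatch says *that* something is wrong, not *which* base is wrong,
    so the choice is a coin flip. Returns both repaired strands.
    """
    s, p = list(strand), list(partner)
    for i in find_mismatches(strand, partner):
        if rng.random() < 0.5:
            p[i] = COMPLEMENT[s[i]]   # trust the first strand
        else:
            s[i] = COMPLEMENT[p[i]]   # trust the second strand
    return "".join(s), "".join(p)

if __name__ == "__main__":
    rng = random.Random(0)                   # seeded so runs are repeatable
    template = "ATGGCCTTA"
    copy = "TACCGGAAT"                       # correct complement
    bad_copy = copy[:4] + "T" + copy[5:]     # one copying error at index 4
    print(find_mismatches(template, bad_copy))
    print(repair(template, bad_copy, rng))
```

After repair() the two strands always pair correctly again, but whether the original sequence survived or a mutation got locked into both strands depends on the coin flip.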
You can even think of mRNA as a vector register, DNA as main memory, and chromosome-histone complexes as demand paging from a file system (or maybe as archival tape storage).
--
Roy Smith, {allegra,cmcl2,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016

"you can't spell unix without deoxyribonucleic!"