Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!cmcl2!phri!roy
From: roy@phri.UUCP (Roy Smith)
Newsgroups: sci.math,sci.physics,sci.electronics,sci.bio
Subject: Re: Analog/Digital Distinction
Message-ID: <2489@phri.UUCP>
Date: Tue, 11-Nov-86 11:58:59 EST
Article-I.D.: phri.2489
Posted: Tue Nov 11 11:58:59 1986
Date-Received: Wed, 12-Nov-86 10:05:01 EST
References: <521@ptsfd.UUCP> <277@apple.UUCP> <680@randvax.UUCP>
Reply-To: roy@phri.UUCP (Roy Smith)
Distribution: net
Organization: Public Health Research Inst. (NY, NY)
Lines: 60
Summary: DNA is digital
Xref: mnetor sci.math:182 sci.physics:150 sci.electronics:63 sci.bio:25

In article <680@randvax.UUCP> edhall@rand-unix.UUCP (Ed Hall) writes:
> Nature chose digital code of three-digit base-four numbers to determine
> how you and I are put together. [...] There is a good engineering reason
> why this is so.  You can say all you want about the discontinuous nature
> of digital representations as opposed to analog, but the fact remains
> that digital is exactly reproducible, while analog is not.

	This is one of my favorite topics, so I'd like to expand on that a
bit.  I trust the real biologists out there will take into account the fact
that this is a gross simplification of a complicated subject and not
take me to task on details.  I have been deliberately loose with
nomenclature to highlight the information processing aspects at the cost of
some biological accuracy.  Readers interested in finding out more are
encouraged to get a good book on molecular biology.  Jim Watson's
"Molecular Biology of the Gene, 3rd edition (1976)" is a good place to
start.

	The genetic code is indeed made up of 3-digit, base-4 numbers.  It's
also an overloaded code -- the mapping from DNA to amino acids (AA's) is not
one-to-one.  Some AA's are coded for by more than one codon (a 3-base DNA
sequence).  What's really interesting is that the copying of DNA does
*not* have the perfect accuracy we have come to expect from digital
processes.
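
	To make the "3-digit, base-4" and many-to-one points concrete, here
is a minimal sketch in Python.  The digit ordering in BASES and the helper
names are my own illustrative choices, and CODON_TABLE is only a small
excerpt of the standard code, not the full 64-entry table.

```python
# Sketch only: the ordering of BASES is arbitrary; any fixed assignment of
# the four bases to the digits 0..3 would do.
BASES = "TCAG"

def codon_to_int(codon):
    """Read a 3-base codon as a 3-digit base-4 number in the range 0..63."""
    value = 0
    for base in codon:
        value = value * 4 + BASES.index(base)
    return value

# A small excerpt of the standard genetic code (written as DNA codons).
CODON_TABLE = {
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "TTT": "Phe", "TTC": "Phe",
    "ATG": "Met",
    "TGG": "Trp",
}

def codons_for(amino_acid):
    """List every codon in the excerpt that maps to the given amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)

print(codon_to_int("TTT"), codon_to_int("GGG"))  # 0 63
print(codons_for("Ala"))  # ['GCA', 'GCC', 'GCG', 'GCT']
```

Three base-4 digits give 64 possible codons, which is more than enough to
name 20 amino acids, so several codons can share one meaning.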

	DNA exists in the cell most of the time in double-stranded form (dsDNA).
This means that each base exists twice, once on one strand, and again on
the other strand in its complementary form.  After replication, you have a
piece of dsDNA in which one of each base pair is from the original piece of
DNA, and the other is a copy.  You digital types will recognize this as a
2-symbol ECC, with one data symbol and one check symbol (each symbol takes
one of 4 values, so you can't really say "bits").
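
	The check-symbol idea can be sketched as follows.  This is a toy
model under loud assumptions: it ignores the fact that the two strands run
in opposite directions, and the names complement_strand and mismatches are
made up for the illustration.

```python
# Each base pairs with exactly one partner, so the partner acts as a
# one-symbol check on the base.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement_strand(strand):
    """Build the check strand: one complementary symbol per data symbol."""
    return "".join(COMPLEMENT[base] for base in strand)

def mismatches(strand, partner):
    """Return the positions where the two strands are not complementary."""
    return [
        i
        for i, (base, check) in enumerate(zip(strand, partner))
        if COMPLEMENT[base] != check
    ]

data = "GATTACA"
check = complement_strand(data)
print(check)                       # CTAATGT
print(mismatches(data, check))     # []

damaged = check[:3] + "G" + check[4:]   # corrupt one check symbol
print(mismatches(data, damaged))   # [3]
```

Note that the mismatch tells you *where* something went wrong, but not
which of the two strands holds the bad symbol.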

	OK, now that we've got our base pairs, what do we do with them?
Well, a wonderful thing happens -- an enzyme (Pol1?) comes along and
re-reads both strands of the new dsDNA.  Every time it finds a place where
a base-pair is wrong, it corrects it.  But, you ask, with only a single
check symbol (a minimum Hamming distance of 2, enough to detect one bad
symbol but not to say which one it is), how do you know which one to trust?
The answer is that you don't!  You fix one of them at random and hope it's
the right one.  If it's not, no big deal.  Either you've introduced a fatal
mutation which will take care of itself, or you've made a "silent mutation"
which doesn't make any difference (remember the many-to-one mapping of
codons to AA's).  Of course, you might have just lucked out and made a
useful mutation, in which case you're off on the road to evolution.
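
	Here is a rough simulation of the "fix one at random" rule exactly
as described above.  It models only this simplified story, not real repair
enzymes; repair_pair and the seed value are hypothetical names and choices
made for the sketch.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def repair_pair(base, partner, rng):
    """Resolve one base pair: if the two symbols disagree, trust one of
    them at random and rewrite the other to match it."""
    if COMPLEMENT[base] == partner:
        return base, partner                # nothing to fix
    if rng.random() < 0.5:
        return base, COMPLEMENT[base]       # trust the first strand
    return COMPLEMENT[partner], partner     # trust the second strand

rng = random.Random(1986)  # fixed seed so the run is repeatable

# The original pair was ("T", "A"); a copying error turned the "A" into "G".
outcomes = [repair_pair("T", "G", rng) for _ in range(10000)]
restored = sum(1 for pair in outcomes if pair == ("T", "A"))
mutated = sum(1 for pair in outcomes if pair == ("C", "G"))

print(restored + mutated == len(outcomes))  # True: only two outcomes occur
print(restored / len(outcomes))             # close to 0.5
```

About half the time the original pair comes back; the other half, the
error gets locked in on both strands and becomes a mutation.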

	If you really get into this, it's amazing how many computer science
concepts were thought of by living cells first.  The most obvious is that
DNA is a program.  Then you have ECC (described above), subroutines
(different enzymes made from common subunits), regular expressions
(restriction enzymes), compilers and assemblers (ribosomes and tRNA's),
compile-time preprocessing using #ifdef's (introns), self-modifying code
(transposons and integrating phages), portable programs (plasmids), P&V
operations (numerous regulatory systems), etc.  You can even think of mRNA
as a vector register, DNA as main memory, and chromosome-histone complexes
as demand paging from a file system (or maybe as archival tape storage).
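
	As one illustration of the regular expression analogy, the sketch
below writes two restriction enzyme recognition sequences as patterns.  It
only finds where a site occurs on one strand; it does not model the cut
itself, and find_sites and the sample sequence are invented for the example.

```python
import re

# Recognition sequences written as regular expressions.
ECORI = re.compile("GAATTC")          # EcoRI: one fixed 6-base site
HINCII = re.compile("GT[CT][AG]AC")   # HincII: GTYRAC, Y = C or T, R = A or G

def find_sites(pattern, dna):
    """Return the start position of every non-overlapping match in dna."""
    return [match.start() for match in pattern.finditer(dna)]

dna = "AAGAATTCGGGTCGACTTGTTAACCC"   # made-up sample sequence
print(find_sites(ECORI, dna))    # [2]
print(find_sites(HINCII, dna))   # [10, 18]
```

The second pattern is the more interesting one: a single enzyme accepts
several different sequences, just as a character class does.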
-- 
Roy Smith, {allegra,cmcl2,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016

"you can't spell unix without deoxyribonucleic!"