"Tijmen Tieleman" wrote in
message news:cf0ial$1n72$1{at}darwin.ediacara.org...
> Hi there specialists in the beautiful field of the studies of Life
>
> I have a question about the wonderful mechanism of DNA. If you could
> tell me something relevant, or give some relevant links, all at the
> level of one with a reasonable mind but unfortunately ignorant of most
> biology, that would be greatly appreciated.
>
> I have always been told that DNA describes exactly how to build a (say
> human) body. Now to describe such a big and complicated thing, I
> suppose one needs a powerful language. One that allows us to say
> ‘build this and this structure, and then repeat it N times.' Or ‘build
> the following 1000 structures: as the basis for each one of
> them, and then and to complete
> them.' So that one doesn't have to write the basis a 1000 times. And
> some other tricks that we also use in our natural languages.
>
> Do many codes in DNA occur many times, or does DNA have a way of
> preventing such seemingly needless repetition? How complicated is the
> language in which DNA describes bodies? What sort of a language is it?
>
> Also I have some side questions:
> 1: How much data does DNA of a human and that of a bacterium contain?
> 2: How much of that data is needed to describe the human brain? Its
> different sections, its individual neurons, the connections between
> the sections, and all that. One would expect this to be a lot. After
> all, in AI (my field) we are continually impressed by the power and
> supposed complexity of the brain, and no computer program so far, it
> seems, has gotten anywhere near.
>
> But my main question is about the language that DNA uses to describe.
Well, I am certainly not this group's best expert on development.
But I do know a little bit about languages and AI, so I will try to
sketch out enough so that you can formulate more specific questions
for the real experts.
First of all, claiming that "DNA describes exactly how to build a
(say human) body" is a little misleading. The language used for
this kind of building activity is not a procedural language, nor is
it a structure description language. You might prefer to think of it
as a production system language. That is, the instructions for
building the radius bone in your wrist do not say something like
"Place N bone cells end-to-end". Instead they say something like
"while not LONG_ENOUGH -> add more bone cells" and "when elbow
cannot 'smell' fingers -> set LONG_ENOUGH to TRUE". And there are
additional productions that tell the fingers how to 'stink' and the
elbow how to 'smell'. (Don't take this metaphor TOO seriously. I
don't really know how LONG_ENOUGH is defined.)
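
To make the production-system idea a little more concrete, here is a
toy sketch in Python. Everything in it is my own invention for
illustration: the state keys (bone_cells, long_enough, signal_range)
and the "settle when no rule fires" loop are assumptions, not a claim
about how development actually computes anything.

```python
# Toy production system: each rule is a (condition, action) pair over shared state.
# All names here (bone_cells, long_enough, signal_range) are hypothetical.

def run(state, rules, max_steps=1000):
    """Fire the first rule whose condition holds; stop when no rule applies."""
    for _ in range(max_steps):
        for condition, action in rules:
            if condition(state):
                action(state)
                break
        else:
            return state  # no rule matched, so the system has settled
    raise RuntimeError("rules did not settle")

def add_cell(state):
    state["bone_cells"] += 1

def stop_growing(state):
    state["long_enough"] = True

rules = [
    # "when elbow cannot 'smell' fingers -> set LONG_ENOUGH to TRUE"
    # (modelled, as an assumption, by the bone reaching the signal's range)
    (lambda s: not s["long_enough"] and s["bone_cells"] >= s["signal_range"],
     stop_growing),
    # "while not LONG_ENOUGH -> add more bone cells"
    (lambda s: not s["long_enough"], add_cell),
]

state = run({"bone_cells": 1, "long_enough": False, "signal_range": 12}, rules)
print(state["bone_cells"], state["long_enough"])
```

Note that nothing in the rules says "make 12 cells". The final length
falls out of the interaction between the growth rule and the stop rule,
which is the point of the metaphor.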
However, the 'language' of DNA is not a single monolithic language.
There is a collection of nested interpreters involved - each interpreter
being specified by a lower level interpreter and the fragment of the
entire DNA message that gets read by that interpreter. At the lowest
level, my claim that the language is not structure-descriptive is no
longer true. At this level, the sequence of bases in DNA directly
describes the sequence of amino acids in a protein to be constructed.
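
As a rough illustration of that lowest level, here is a small Python
sketch that reads bases three at a time and looks each triplet up in a
table. The table is deliberately partial (the standard code has 64
entries), and the function name translate and the example sequence are
my own choices for illustration.

```python
# Partial codon table for the DNA coding strand; the full standard table has 64 entries.
CODONS = {
    "ATG": "Met",
    "GCT": "Ala", "GCC": "Ala",
    "GGT": "Gly", "GGC": "Gly",
    "TGG": "Trp",
    "TGT": "Cys", "TGC": "Cys",
    "TAA": None, "TAG": None, "TGA": None,  # stop codons
}

def translate(dna):
    """Read bases three at a time and return the amino acid names, joined by '-'."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODONS[dna[i:i + 3]]  # KeyError for codons missing from this partial table
        if residue is None:
            break  # a stop codon ends the protein
        protein.append(residue)
    return "-".join(protein)

print(translate("GCTGGTTGGTGTGCCTAA"))  # Ala-Gly-Trp-Cys-Ala
```

This is the one layer where the "structure description" picture holds:
the order of triplets in the DNA is the order of amino acids in the
protein.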
It is mostly at this level that the possibility of repetition comes up.
Suppose that there are a hundred different proteins that all happen to
contain the amino acid subsequence Ala-Gly-Trp-Cys-Ala. One might imagine
that some compression could be achieved by encoding this sequence only
once and then using a "pointer" to this subsequence from within each of
the 100 proteins. Nature has chosen not to do this kind of compression.
The code for that sequence of five amino acids is repeated within the
genome a hundred times. Or rather two hundred times, since chromosomes
come in pairs.
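
To put a number on that repetition, here is the back-of-envelope
arithmetic as a few lines of Python. The only biological fact used is
that each amino acid is coded by three bases; the counts (5 amino
acids, 100 proteins, 2 chromosome copies) are just the hypothetical
figures from the paragraph above, and the function name is made up.

```python
BASES_PER_CODON = 3  # each amino acid is specified by three bases

def duplicated_bases(motif_length, proteins, copies_per_cell):
    """Bases spent spelling the same motif out in full in every protein."""
    bases_per_motif = motif_length * BASES_PER_CODON
    return bases_per_motif * proteins * copies_per_cell

# 5 amino acids, 100 proteins, paired chromosomes
print(duplicated_bases(5, 100, 2))  # 3000
```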
(There are other instances of massive duplication and repetition of
sequences, but I will not discuss them, as they are not yet well
understood.)
However, it is sometimes the case that the same protein is used in
several different places within the cell or within the body. So how is
this handled? Sometimes by duplicating the specification of the protein,
sometimes by putting the OR logic into the antecedent of the production
that says "Make this protein", sometimes by designing the protein with
'little velcro tabs' so that as it floats around, it will only stick
to the places where it is supposed to go. As languages go, the languages
of development are not elegant and mother nature uses a fairly
undisciplined coding style. The whole thing is something of a kludge,
though a very clever kludge.
I expect that someone else will answer your questions regarding genome
size - man vs bacteria and brain vs other. However, I will point out
something that you may not have known. The total sizes of the
human genome and the mouse genome are about the same. Although this is
not known yet for sure, it seems very likely that the amount of DNA
required to specify the human brain and the DNA to specify the mouse
brain are about the same. Human uniqueness, if it exists at all, does
not seem to arise from having more lines of code in the program.
It arises from having "better" code - presumably a few tweaks here and
there in the program.
---
* RgateImp.MoonDog.BBS at 8/8/04 9:57:34 PM
* Origin: MoonDog BBS, Brooklyn,NY, 718 692-2498, 1:278/230 (1:278/230)