Sunday 5 November 2017

genetics - What is the instructional language of DNA?



DNA carries most of the genetic instructions used in the development, functioning and reproduction of all known living organisms and many viruses (Wikipedia).



Is it already known how sequences of A, T, C and G create these instructions? Or, equivalently, what is the instructional language of DNA? I would like to know whether we already understand how such sequences could, for example, produce human epidermis with an excess of a given mineral, or change a cell's shape into a more fundamental geometrical one (triangle, rectangle, diamond, etc.).


Please, forgive me for this foolish question: I'm not a biologist.



Answer




This language is called the genetic code. But before talking about this specific code, it is important to talk about how the code is read. Please note that the answer below is a simplification of reality.


Mechanism by which the code is read


To make things easier (reality is a little more complicated), DNA is "formatted" into RNA, which is then "formatted" into proteins. When "formatting" occurs, the previous format is not lost; the information is only copied into a new format. The formatting from DNA to RNA is called transcription and the formatting from RNA to proteins is called translation.


Proteins are the active units that affect cell activities. The activity of a protein is a function of its sequence. As a consequence, the language/code we must understand is the relationship between DNA and proteins.


Numeral system of DNA


DNA, as you said, is written in quaternary (a numeral system in base 4). The four letters are A, T, C and G, which stand for Adenine, Thymine, Cytosine and Guanine (the 4 nucleotides).
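To make the base-4 idea concrete, here is a minimal sketch that reads a DNA string as a quaternary number. The digit assignment (A=0, C=1, G=2, T=3) and the function names are arbitrary assumptions made for illustration; biology does not assign numeric values to the nucleotides.

```python
# Hypothetical digit assignment: any one-to-one mapping of the four bases would do.
DIGITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"  # BASES[d] is the letter for digit d


def dna_to_int(seq: str) -> int:
    """Read a DNA string as a base-4 number, most significant digit first."""
    value = 0
    for base in seq:
        value = value * 4 + DIGITS[base]
    return value


def int_to_dna(value: int, length: int) -> str:
    """Inverse of dna_to_int; `length` restores leading 'A's (zeros)."""
    letters = []
    for _ in range(length):
        value, digit = divmod(value, 4)
        letters.append(BASES[digit])
    return "".join(reversed(letters))


print(dna_to_int("GATTACA"))   # 9156
print(int_to_dna(9156, 7))     # GATTACA
```

Each nucleotide therefore carries two bits of information, in the same way that each decimal digit carries a bit more than three.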


Numeral system of proteins


Proteins, on the other hand, are coded in a system in base 21: the 20 standard amino acids, plus a "stop" signal that the code must also be able to express.


Redundancy


Because $4^2=16<21$, we need (at least) 3 letters of DNA to code for 1 amino acid. Because $4^3=64>21$, several 3-nucleotide codes necessarily map to the same amino acid. This is called redundancy. In computer science, a byte is a series of 8 bits. In biology, a codon is a series of 3 nucleotides.
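The counting argument can be checked by brute force. The sketch below enumerates every possible word of a given length over the four nucleotides and looks for the shortest length that offers at least 21 distinct words. The names `words` and `SYMBOLS_NEEDED` are illustrative only.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
SYMBOLS_NEEDED = 21  # the figure used in the text above


def words(length: int) -> list[str]:
    """All nucleotide words of the given length."""
    return ["".join(p) for p in product(NUCLEOTIDES, repeat=length)]


# Find the shortest word length with enough distinct words.
codon_length = 1
while len(words(codon_length)) < SYMBOLS_NEEDED:
    codon_length += 1

print(len(words(2)), len(words(3)))  # 16 64
print(codon_length)                  # 3
```

With 64 words available for 21 meanings, at least some meanings must be spelled by more than one word, which is the redundancy described above.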



Code


The code is usually presented as a table:


[Image: table of the genetic code, giving the amino acid encoded by each codon]


Here Phe, Leu, Ser, etc. are abbreviations for specific amino acids. Phe, for example, is phenylalanine.


As you can see in the above table, most of the redundancy comes from changes in the third letter of the codon. More information on why this is the case can be found in this post.
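The third-letter pattern can be verified directly. The sketch below builds the standard genetic code from a compact 64-letter string (one-letter amino acid symbols, `*` for stop, codons ordered U, C, A, G in each position) and counts how many two-letter prefixes give the same amino acid whatever the third letter is. The variable names are illustrative, and only the standard code is covered; some organisms and organelles use slight variants.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter symbols, '*' = stop.
# Codons are ordered UUU, UUC, UUA, UUG, UCU, ... (third letter varies fastest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many codons spell each meaning (20 amino acids plus stop).
codons_per_meaning = Counter(CODON_TABLE.values())

# Two-letter prefixes for which the third letter does not matter at all.
prefixes = ["".join(p) for p in product(BASES, repeat=2)]
third_letter_free = [
    p for p in prefixes if len({CODON_TABLE[p + b] for b in BASES}) == 1
]

print(len(codons_per_meaning))         # 21
print(codons_per_meaning["L"])         # 6 (leucine)
print(len(third_letter_free), "of", len(prefixes))  # 8 of 16
```

For half of the possible two-letter prefixes, the third letter changes nothing, and in most of the remaining cases it only distinguishes between two meanings.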


There are 4 special codons.



  • AUG is the start codon. It indicates where translation should start (it also codes for the amino acid methionine).

  • UAG, UAA and UGA are stop codons. They indicate where translation should stop.
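The role of the start and stop codons can be sketched as a toy translator. It looks for the first AUG, then reads the mRNA three letters at a time until it meets a stop codon. This is a deliberate simplification: the function name `translate` and the example sequence are made up for illustration, and real translation initiation involves more than finding the first AUG.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter symbols, '*' = stop (codons ordered U, C, A, G).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon: nothing is translated
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up example: GG | AUG UUU CUU UCG UAA | GG
print(translate("GGAUGUUUCUUUCGUAAGG"))  # MFLS
```

The two letters before AUG and the two after UAA are never read, which is what "start" and "stop" mean in practice.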





Possible confusions


Central Dogma of Molecular Biology


The unidirectional relationship between DNA, RNA and proteins is called the Central Dogma of Molecular Biology. Of course, there is no place for dogma in science, and the term is poorly chosen. Interestingly enough, the Central Dogma as described here is partially wrong, because the relationship between DNA and RNA is not necessarily unidirectional: reverse transcriptase is an enzyme that synthesizes DNA from an RNA template.


Genetic code


If you google "genetic code", most hits will display the letter U instead of the letter T. The reason is that the code is often presented as the relationship between RNA and proteins, and the only difference between the RNA and DNA alphabets is that T (Thymine) is replaced by U (Uracil).
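At the level of letters, moving between the two alphabets is a single substitution. The sketch below assumes the DNA sequence is the coding (sense) strand, whose letters match the RNA apart from T and U; in the cell, the RNA is actually built as the complement of the opposite (template) strand. The function names are illustrative.

```python
def transcribe(coding_strand: str) -> str:
    """DNA coding strand -> RNA: same letters, with T written as U."""
    return coding_strand.replace("T", "U")


def reverse_transcribe(rna: str) -> str:
    """RNA -> DNA letters, the direction taken by reverse transcriptase."""
    return rna.replace("U", "T")


print(transcribe("ATGTTTTAA"))          # AUGUUUUAA
print(reverse_transcribe("AUGUUUUAA"))  # ATGTTTTAA
```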


mRNA vs RNA


Not all RNAs are meant to be translated into proteins. Only mRNAs (m stands for "messenger") are. For example, the process of translation is itself catalyzed by RNA (called rRNA, where "r" stands for "ribosomal").


Note also that DNA is first transcribed into pre-mRNA which is then modified into mRNA before being translated.


Is all of DNA transcribed?



No. In eukaryotes (which include pretty much all the living things you may think of, with the exception of bacteria and viruses), most of the DNA is not transcribed. The untranscribed DNA is used for regulatory purposes or might not be used at all. The regions (called loci; singular locus) of the DNA that are transcribed are called genes.

