Sunday 18 June 2017

molecular biology - Gene located on Which strand? Want reference.



Update: Initially my question (below) was appropriately marked as duplicate/overlapping with two other questions; This and this ; among them a specific part of an answer of the second duplicate/overlap ; appropriately answer my question.



The convention is that in indicating any sequence feature† in a protein-coding gene on double-stranded DNA, a single strand‡ is represented — the one from which the amino sequence could be read using the genetic code (conceptually, with T substituted for U). Like any other nucleic acid sequence§, it is always written in the 5ʹ to 3ʹ direction in the same manner as the mRNA transcribed from it, without this being explicitly stated.



But what it does not mention (but I'm still looking for); is whether there is a standard and universal documentation for this particular convention (IUPAC/ IUBMB/ ICBN/ ICZN/ anything such); because "conventions" are human-made rules, and without "declared, universally followed, and accessible" documentation; the conventions does not make a sense. It rather feels like a rumor or viral phenomenon. I visited the IUPAC and IUBMB website home pages but I could not locate the documentation from there. Also google keyword search did not found the helpful.


So I specify my question that I want to get the standard/ official source of this convention. I would be highly grateful if anyone kindly mention the source hyperlink.




My Initial Question



I'm stuck at a very basic convention on notation on reading Gene sequence. *Specially when a book (or question paper or anything) does not mention whether a nucleic acid (usually DNA), uses a single stranded DNA


The problem appears because similar region of the 2 opposite strands of a DNA can contain different gene, i.e.


enter image description here


(So it is possible that if we note a stretch from gene-1 with its complementary sequence, that complementary sequence could be confused with gene-2.)


In some cases, a sequence to a stretch of nucleic acid, indeed mean the sequence directly.


For example; In the wikipedia image about Reading Frame (Permalink) Wikipedia Image the 5'to 3'Left to Right indicates the sequence of the template strand directly whereas


But at the more and more "interior chapters" I've seen a rather complicated and indirect notation (although I could not found clear mention of a convention anywhere in books or web)... when they mean a sequence of a stretch of a gene (or a regulatory sequence), they writes the "nontemplate" or "sense-strand" sequence of the respective gene.


In the Wikipedia image on TATA Transcription Box (Permalink);


Wikipedia image TATABOX it must be a sequence from the nontemplate strand because if it was the template strand, the "Promoter" location on the "template" strand must be on 3' side otherwise RNA polymerase would not be able to synthesize the nontemplate strand mRNA strand from 5' to 3' time-order. But it would become impossible to infer for me if the given sequence lacked labelling for relative locations of promoters and structural genes.


Now my question is If an article provides me a single stranded "Gene Sequence"; How can I infer it is directly the sequence of the mentioned gene stretch or it is the complementary ("sense") strand of the gene? What is the convention about it?





No comments:

Post a Comment

evolution - Are there any multicellular forms of life which exist without consuming other forms of life in some manner?

The title is the question. If additional specificity is needed I will add clarification here. Are there any multicellular forms of life whic...