DNA: The Carrier of Genetic Information – Replication, Transcription, and Translation

DNA as Carrier of Genetic Information

Molecular biology studies the mechanisms responsible for transmitting and expressing genetic information, ultimately determining cellular functions and structures. A central question was identifying the molecule responsible for conserving, transmitting, and expressing this information—the molecular nature of the gene. Experiments by Griffith (1928), Avery, MacLeod, and McCarty (1944), and Hershey and Chase (1952) led to the acceptance of DNA as the genetic material.

The Gene: A DNA Sequence

According to Morgan’s chromosome theory of inheritance, genes occupy fixed positions arranged linearly along chromosomes. A gene is a linear DNA sequence containing the information necessary to synthesize a macromolecule (RNA or protein) with a specific cellular function. It represents the storage unit of genetic information and the unit of inheritance.

The Central Dogma of Molecular Biology

Cellular functions are carried out by RNAs and proteins, whose synthesis depends on the genes a cell carries and on which of them are expressed at a given time. In 1958, Francis Crick proposed the sequence hypothesis: the linear arrangement of nucleotides in DNA determines the linear arrangement of amino acids in polypeptides. Converting DNA sequence information into a protein is called gene expression and involves two steps, which, along with DNA replication, form the central dogma of molecular biology:

  1. Transcription of DNA: The first step, where DNA is transcribed into RNA. This is the first level of gene regulation.
  2. Translation of RNA information: The second step, where transcribed RNA information is translated into a specific protein. Regulatory mechanisms also exist here, though regulation commonly occurs by controlling protein activity after synthesis.

Transcription in Prokaryotes

Severo Ochoa (1955) discovered polynucleotide phosphorylase, an enzyme that can assemble ribonucleotides into RNA-like chains without a template and was at first thought to be responsible for RNA synthesis. In 1960, Jerard Hurwitz and others identified RNA polymerase in E. coli as the enzyme that actually synthesizes RNA in the cell. RNA polymerase catalyzes the 5′ to 3′ joining of ribonucleoside triphosphates to form RNA strands, using a DNA sequence as a template. In prokaryotes, a single RNA polymerase synthesizes all RNA types (mRNA, rRNA, and tRNA).

RNA Polymerase and Promoters

E. coli RNA polymerase is a holoenzyme with five subunits: 2α, β, β’, and σ. The first four form the core enzyme, which carries out RNA chain elongation. The σ subunit binds the core weakly and is needed for initiation: it directs the enzyme to the consensus sequences TTGACA and TATAAT within promoter regions, centered about 35 and 10 base pairs upstream of the transcription start site, respectively.
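
The promoter search described above can be illustrated with a small sketch that scans a DNA string for the two consensus hexamers. The function names, the mismatch tolerance, and the 16 to 18 bp spacer window between the two boxes are assumptions made for illustration; real promoters are recognized through protein-DNA contacts, not exact string matching.

```python
# Consensus hexamers named in the text.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq, max_mismatch=1, spacer_range=(16, 18)):
    """Return (pos_35, pos_10, spacer) tuples for candidate promoters.

    seq is the coding strand written 5'->3'. The spacer window and the
    mismatch tolerance are illustrative assumptions.
    """
    hits = []
    n = len(seq)
    for i in range(n - 5):
        if mismatches(seq[i:i + 6], MINUS_35) > max_mismatch:
            continue
        for spacer in range(spacer_range[0], spacer_range[1] + 1):
            j = i + 6 + spacer
            if j + 6 > n:
                break
            if mismatches(seq[j:j + 6], MINUS_10) <= max_mismatch:
                hits.append((i, j, spacer))
    return hits
```

Calling `find_promoters` on a sequence returns the zero-based positions of each pair of boxes together with the spacer length, which makes it easy to see why the spacing between the two boxes matters as much as the hexamers themselves.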

Stages of Prokaryotic Transcription

  1. Non-specific RNA polymerase binding to DNA: The enzyme searches for promoter consensus sequences.
  2. Specific σ subunit binding to the promoter: Forming the closed enzyme-promoter complex, with DNA still in a double helix.
  3. Open enzyme-promoter complex formation: RNA polymerase unwinds ~15 DNA base pairs around the start site, making the 3′ to 5′ DNA strand available as a transcription template.
  4. Transcription initiation: The first two ribonucleotide triphosphates are incorporated (always 5′ to 3′). After ~10 nucleotides, the σ subunit detaches.
  5. RNA chain elongation: Polymerase moves along the template, adding complementary ribonucleotides (ATP, GTP, CTP, or UTP) to the nascent chain’s 3′ OH end. It unwinds DNA ahead and rewinds transcribed DNA, maintaining ~15 unwound base pairs.
  6. Transcription termination: Occurs upon encountering a termination signal, often a GC-rich palindromic sequence followed by a run of adenines in the template (uracils in the transcript). The RNA, RNA polymerase, and DNA template then separate. Because prokaryotic ribosomes can bind the mRNA while it is still being synthesized, translation proceeds simultaneously with transcription.
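
The elongation step can be sketched as a base-by-base copy of the template strand. This is a minimal illustration, assuming the template is given as a string written 3′ to 5′, so that the complementary RNA comes out written 5′ to 3′; the function name is hypothetical and no promoter or terminator logic is modeled.

```python
# Base-pairing rules for transcription: DNA template base -> RNA base.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Return the RNA (written 5'->3') complementary to a DNA template
    strand that is written 3'->5'."""
    rna = ""
    for base in template_3_to_5:
        # Each new ribonucleotide is added to the 3' end of the chain.
        rna += PAIRING[base]
    return rna


template = "TACGGT"          # template strand, 3'->5'
print(transcribe(template))  # AUGCCA
```

The transcript has the same sequence as the non-template (coding) strand, with U in place of T.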

Operon Model

Jacob and Monod proposed the operon model to explain gene expression regulation in prokaryotes. An operon consists of:

  • Structural genes (cistrons): Transcribed to produce mRNA, forming specific proteins.
  • Operator: A DNA sequence that binds a regulatory protein.
  • Promoter: A DNA sequence that binds RNA polymerase to initiate transcription.
  • Regulator gene: Encodes a regulatory protein that binds the operator, blocking (negative control) or inducing (positive control) transcription.

In negative control, the regulatory protein is a repressor. These systems can be inducible (e.g., lactose operon, where an inducer inactivates the repressor) or repressible (e.g., tryptophan operon, where a corepressor activates the repressor). In positive control, the regulatory protein is an activator, essential for transcription.
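
The control logic above can be written as simple truth functions. This is a deliberately simplified sketch: the function names are hypothetical, each operon is reduced to a single on/off switch, and additional layers of regulation (for example, the effect of glucose on the lactose operon) are left out.

```python
def inducible_operon_on(inducer_present):
    """Negative inducible control (lactose operon style).

    The repressor is active by default; the inducer inactivates it.
    """
    repressor_active = not inducer_present
    return not repressor_active


def repressible_operon_on(corepressor_present):
    """Negative repressible control (tryptophan operon style).

    The repressor is inactive by default; the corepressor activates it.
    """
    repressor_active = corepressor_present
    return not repressor_active


def positively_controlled_operon_on(activator_bound):
    """Positive control: transcription requires the activator."""
    return activator_bound


for present in (False, True):
    print(present, inducible_operon_on(present), repressible_operon_on(present))
```

Written this way, the two negative systems differ only in the default state of the repressor, which is the point of the inducible versus repressible distinction.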

Transcription in Eukaryotes

Eukaryotic transcription differs from prokaryotic transcription:

  • Three RNA polymerases: RNA polymerase I (most rRNAs), RNA polymerase II (mRNA), and RNA polymerase III (tRNA, 5S rRNA, and other small RNAs).
  • Protein factors required: RNA polymerases interact with protein factors to form the transcription initiation complex.
  • TATA box: The common consensus sequence (TATA) is located ~25-30 base pairs upstream of the start site.
  • Termination sequence: Usually TTATTT on the template strand, which corresponds to the AAUAAA polyadenylation signal in the transcript.
  • Pre-RNA: All eukaryotic RNAs are synthesized as primary transcripts (pre-RNA) requiring maturation.

Regulation and Maturation

Eukaryotic transcription regulation is complex due to DNA packaging with histones. Regulation occurs at two levels: chromatin packing and RNA polymerase activity. Regulatory regions (CCAAT and GGGCGG boxes), enhancers, and silencers influence transcription. Pre-mRNA maturation involves:

  • 5′ capping: Adding 7-methylguanosine for ribosome binding.
  • 3′ polyadenylation: Adding a poly-A tail for mRNA stability.
  • Intron splicing: Removing non-coding sequences via the spliceosome.
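
The three maturation steps can be sketched as string operations on a primary transcript. This is an illustration under stated assumptions: intron positions are supplied by the caller as index pairs (the spliceosome's recognition of splice sites is not modeled), the cap is represented by the text label "m7G-", and the function name and default tail length are hypothetical.

```python
def mature_mrna(pre_mrna, introns, poly_a_length=20):
    """Return a toy mature mRNA built from a primary transcript.

    introns: list of (start, end) index pairs, end exclusive,
             non-overlapping, in pre_mrna coordinates.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before the intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # last exon

    spliced = "".join(exons)
    cap = "m7G-"                                # label standing in for the 5' cap
    tail = "A" * poly_a_length                  # 3' poly-A tail
    return cap + spliced + tail


# Exon "AUG", intron "GUAAGUCAG" (starts with GU, ends with AG), exon "GCC".
print(mature_mrna("AUGGUAAGUCAGGCC", [(3, 12)], poly_a_length=5))
# m7G-AUGGCCAAAAA
```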

Translation: The Genetic Code

Translation, the final stage of gene expression, synthesizes proteins using mRNA as a template. Ribosomes carry out this process, and tRNA molecules act as adaptors that match each mRNA codon to its amino acid. The genetic code defines the relationship between codons (three-nucleotide sequences) and amino acids. Its features include:

  • Triplet codons: Each codon specifies an amino acid.
  • Degeneracy: Multiple codons can encode the same amino acid.
  • Non-overlapping: A base belongs to only one codon.
  • No commas: Reading occurs in continuous three-base groups.
  • Start codon: AUG (methionine).
  • Stop codons: UAA, UAG, and UGA.
  • Near universality: Almost all organisms use the same code, with a few exceptions such as mitochondrial genomes.
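
Several of the features listed above can be checked directly against the standard codon table. The sketch below builds the table from a compact 64-letter string of one-letter amino acid codes ("*" marks a stop codon); the variable and function names are chosen for illustration only.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Stop codons: the triplets that do not encode an amino acid.
STOP_CODONS = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")

# Degeneracy: number of codons per amino acid.
DEGENERACY = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def split_codons(mrna):
    """Read a sequence in consecutive, non-overlapping triplets
    ("no commas"); trailing bases that do not fill a codon are dropped."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


print(STOP_CODONS)                       # ['UAA', 'UAG', 'UGA']
print(DEGENERACY["L"], DEGENERACY["M"])  # 6 1
print(split_codons("AUGUUUGG"))          # ['AUG', 'UUU']
```

With 61 sense codons for 20 amino acids, most amino acids must be encoded by more than one codon; only methionine and tryptophan have a single codon each.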

Mechanism of Translation

Translation involves tRNA, mRNA, rRNA, protein factors, and enzymes. tRNA has two key regions: the 3′ CCA end (where the amino acid is attached) and the anticodon (which pairs with mRNA codons). Ribosomes have three tRNA binding sites: A (aminoacyl), P (peptidyl), and E (exit). A single mRNA can be translated by several ribosomes at once (a polyribosome). The translated region lies between the 5′ and 3′ untranslated regions of the mRNA. The process has three stages:

  1. Initiation: Initiation factors, ribosome subunits, mRNA, and the first aminoacyl-tRNA (methionine or formylmethionine) form the initiation complex, which binds to the cap structure (eukaryotes) or Shine-Dalgarno sequence (prokaryotes) and locates the AUG start codon.
  2. Elongation: Aminoacyl-tRNAs enter the A site, peptide bonds form, the ribosome translocates, and the process repeats.
  3. Termination: A release factor binds to a stop codon, releasing the polypeptide chain.

After translation, proteins fold into their native conformation with the help of chaperone proteins.
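
The three stages can be mirrored in a short sketch: locate the start codon, read codon by codon, and stop at the first stop codon. It assumes that initiation happens at the first AUG in the string, which is a simplification of both cap-dependent scanning and Shine-Dalgarno positioning; the function name is hypothetical and amino acids are given as one-letter codes.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA (5'->3') from the first AUG to the first stop codon."""
    # Initiation: find the start codon (simplified to the first AUG).
    start = mrna.find("AUG")
    if start == -1:
        return ""

    peptide = []
    # Elongation: read consecutive codons in the frame set by the AUG.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        # Termination: a stop codon releases the chain.
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# 5' UTR "GG", then AUG UUU GGC, stop codon UAA, 3' UTR "GG".
print(translate("GGAUGUUUGGCUAAGG"))  # MFG
```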

DNA Replication

DNA replication, which in eukaryotes occurs during the S phase of the cell cycle, ensures faithful transmission of genetic information. It follows the semiconservative model (Meselson and Stahl, 1958), where each daughter DNA molecule has one parental and one new strand. Replication can be unidirectional (rare) or bidirectional (common). DNA polymerases catalyze DNA synthesis but cannot unwind DNA, require a primer, and synthesize DNA only in the 5′ to 3′ direction. Because the two template strands are antiparallel, replication is semidiscontinuous, with a leading strand (synthesized continuously) and a lagging strand (synthesized as Okazaki fragments).
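
The semiconservative model makes a simple quantitative prediction that can be sketched in code. The example assumes a single duplex whose two strands are both labeled heavy ("H") and that every newly synthesized strand is light ("L"), in the spirit of a density-labeling experiment; the function names are hypothetical.

```python
from collections import Counter


def replicate(duplexes):
    """Semiconservative round: each parental strand pairs with a new light strand."""
    daughters = []
    for strand_1, strand_2 in duplexes:
        daughters.append((strand_1, "L"))
        daughters.append((strand_2, "L"))
    return daughters


def density_classes(generations):
    """Count heavy (HH), hybrid (HL), and light (LL) duplexes
    after the given number of replication rounds."""
    duplexes = [("H", "H")]
    for _ in range(generations):
        duplexes = replicate(duplexes)
    return dict(Counter("".join(sorted(d)) for d in duplexes))


for n in range(4):
    print(n, density_classes(n))
```

After the first round every duplex is hybrid; from then on the two original strands persist in exactly two hybrid duplexes, so the hybrid fraction after n rounds is 2 / 2^n.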

Replication in Prokaryotes

E. coli replication involves:

  1. Replication fork formation: Initiator proteins bind to the origin of replication (oriC), helicases unwind DNA, SSB proteins stabilize single strands, and DNA gyrase relieves supercoiling. Primase synthesizes RNA primers.
  2. New DNA strand synthesis: DNA polymerase III synthesizes the leading strand continuously and the lagging strand discontinuously. DNA polymerase I removes primers and fills gaps. DNA ligase joins Okazaki fragments.
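
The difference between continuous and discontinuous synthesis can be illustrated with a toy model. It is a sketch under strong assumptions: strands are plain strings, each new strand is written position by position against its template (so strand polarity is implied, not modeled), primers are omitted, and the fragment length is a hypothetical parameter far shorter than real Okazaki fragments.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand):
    """Return the base-paired partner of each position in a DNA strand."""
    return strand.translate(DNA_COMPLEMENT)


def leading_strand(template):
    """Continuous synthesis: one uninterrupted product."""
    return complement(template)


def lagging_fragments(template, fragment_length):
    """Discontinuous synthesis: one short product per stretch of template."""
    return [
        complement(template[i:i + fragment_length])
        for i in range(0, len(template), fragment_length)
    ]


def ligate(fragments):
    """Join adjacent fragments into one strand (the role of DNA ligase)."""
    return "".join(fragments)


template = "ATGCCGTAAGCT"
fragments = lagging_fragments(template, 4)
print(fragments)                                      # ['TACG', 'GCAT', 'TCGA']
print(ligate(fragments) == leading_strand(template))  # True
```

Once the fragments are joined, the lagging-strand product is indistinguishable from a continuously synthesized strand.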

Replication in Eukaryotes

Eukaryotic replication differs from prokaryotic replication in several respects, including multiple origins of replication, different polymerases, and the use of RNA-DNA hybrid primers. Telomerase, a reverse transcriptase, extends chromosome ends (telomeres) in certain cells, preventing loss of information. In differentiated somatic cells, telomerase is largely inactive, so telomeres shorten with each division, eventually leading to senescence or cell death.

Mutations

Mutations are heritable alterations in genetic information. Germline mutations affect gametes and can be passed to offspring, while somatic mutations affect other cells and are passed only to the daughter cells of the mutated cell, not to offspring. Mutations can be classified as:

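  1. Gene mutations: Affect one or a few DNA nucleotides. These include base substitutions (transitions, which exchange a purine for a purine or a pyrimidine for a pyrimidine, and transversions, which exchange a purine for a pyrimidine or vice versa) and insertions or deletions. By their effect on the encoded protein, gene mutations can be silent, missense, nonsense, or frameshift.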
  2. Chromosomal mutations: Visible changes in chromosome structure, including inversions, deletions, duplications, and translocations.
  3. Karyotypic/genomic mutations: Affect chromosome number. These include euploidy (polyploidy and haploidy) and aneuploidy (nullisomy, monosomy, trisomy, and tetrasomy).
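
The categories of gene mutation can be made concrete with a small classifier. This is a sketch with hypothetical function names: it works on coding-strand DNA codons, uses the standard genetic code, and assumes the reference codon is a sense codon (changes that remove a stop codon are not handled).

```python
from itertools import product

BASES = "TCAG"
# One-letter amino acid codes for the 64 codons in TCAG order; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
PURINES = {"A", "G"}


def substitution_type(ref_base, alt_base):
    """Classify a single-base substitution as a transition or transversion."""
    if ref_base == alt_base:
        raise ValueError("no substitution: bases are identical")
    same_class = (ref_base in PURINES) == (alt_base in PURINES)
    return "transition" if same_class else "transversion"


def codon_effect(ref_codon, alt_codon):
    """Classify the effect of a codon change on the encoded protein."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


def indel_effect(length_change):
    """Insertions or deletions shift the reading frame unless their
    length is a multiple of three."""
    return "in-frame" if length_change % 3 == 0 else "frameshift"


print(substitution_type("A", "G"), codon_effect("GAA", "GAG"))  # transition silent
print(substitution_type("A", "T"), codon_effect("GAG", "GTG"))  # transversion missense
print(substitution_type("G", "A"), codon_effect("TGG", "TGA"))  # transition nonsense
print(indel_effect(1), indel_effect(-3))                        # frameshift in-frame
```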

Causes of Mutations

Mutations can be spontaneous (due to replication errors, DNA lesions, or transposable elements) or induced (by chemical mutagens or radiation).