Mutations and Genetic Variation

Mutations

Mutations alter DNA sequences and are the basis of alleles, the source of genetic variation in populations (fuel for evolutionary change), the cause of genetic diseases and disorders, and useful tools for understanding biological processes.

Mutations occur in somatic and germ cells, but only germline mutations are passed down to offspring.

Types of Mutations Based on Protein Function

  • Gain-of-Function (GOF) Mutations: Increase gene/protein activity or produce new activity. Often dominant. Example: Achondroplasia dwarfism caused by a constitutively active FGFR3 receptor protein.
  • Loss-of-Function (LOF) Mutations: Decrease or eliminate gene/protein activity. Often recessive. Example: Cystic fibrosis caused by an ion channel (CFTR) that doesn’t function properly.
  • Conditional Mutations: Under permissive conditions, individuals display a wild-type phenotype; under restrictive conditions, they display a mutant phenotype. Example: temperature-sensitive mutations such as Drosophila shibire(ts), which alters the dynamin gene. At the restrictive temperature, endocytosis is reversibly blocked, which prevents membrane recycling and depletes synaptic vesicles.

Types of Mutations Based on DNA Sequence Alterations

  • Nucleotide Substitutions: Transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine, or vice versa).
  • Missense Mutations: Change an amino acid.
  • Nonsense Mutations: Change an amino acid codon to a stop codon.
  • Silent Mutations: Do not alter instructions for making proteins. Example: In coding sequence, ATC and ATT both encode isoleucine.
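  • Mutations in Noncoding Regions of the Genome: Do not change a protein sequence but can affect gene expression.
  • Frameshift Mutations: Insertions or deletions of a number of nucleotides that is not a multiple of three, and some splice site mutations, alter the reading frame.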
  • Splice Site Mutations: Alter or remove a splice site.
  • Promoter (& Enhancer) Mutations: Alter the amount of transcription of a protein-encoding gene. Up-mutations increase promoter activity, down-mutations decrease promoter activity.
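
The sequence-based categories above can be checked mechanically for a single-base substitution inside a codon. The following is a minimal sketch, assuming the standard genetic code and an in-frame DNA codon that starts as a sense codon; the function names `substitution_type` and `classify_codon_change` are illustrative and do not come from these notes.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
PURINES = {"A", "G"}


def substitution_type(old_base, new_base):
    """Transition if both bases are purines or both are pyrimidines, else transversion."""
    if old_base == new_base:
        raise ValueError("bases are identical, so there is no substitution")
    same_class = (old_base in PURINES) == (new_base in PURINES)
    return "transition" if same_class else "transversion"


def classify_codon_change(old_codon, new_codon):
    """Classify a change within one sense codon as silent, missense, or nonsense."""
    old_aa = CODON_TABLE[old_codon]
    new_aa = CODON_TABLE[new_codon]
    if old_aa == new_aa:
        return "silent"      # same amino acid, e.g. ATC -> ATT (both isoleucine)
    if new_aa == "*":
        return "nonsense"    # amino acid codon becomes a stop codon
    return "missense"        # amino acid changes
```

For the example in the list, `classify_codon_change("ATC", "ATT")` returns `"silent"`, and `substitution_type("C", "T")` reports a transition.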

Identifying Functional Domains

The location of mutations can help identify important functional domains in proteins. For example, the CFTR structure includes two transmembrane domains and two nucleotide-binding domains (NBDs). The most common CF allele is a deletion that removes a single phenylalanine (ΔF508) in the NBD1 domain of CFTR.

Causes of Mutations

  • Spontaneous Mutations: Arise from altered base-pairing properties of nucleotides during DNA replication (if the error is not corrected, it is passed on to daughter cells), strand slippage during replication (more common in highly repetitive regions of DNA), or unequal crossing over (produces insertions and deletions). Depurinations produce an apurinic site (spontaneous). Cytosine residues are prone to deamination (spontaneous or chemically induced).
  • Chemical Mutagens: Alter base pairing and damage DNA. Mutagens are agents that significantly increase the mutation rate above the spontaneous rate. Examples: The base analog 5-bromouracil can be incorporated into DNA, where it mispairs. Many chemical mutagens modify base structure and so alter base pairing. Oxidative radicals alter guanine. Intercalating agents distort the double helix and cause insertions and deletions. Benzo(a)pyrene is a major carcinogen in cigarette smoke.
  • Radiation: UV and ionizing radiation can cause mutations. UV light introduces pyrimidine dimers that often block DNA replication and lead to programmed cell death. Cells have specific mechanisms that bypass the block and avoid cell death, at the cost of introducing mutations.

Ames Test

Potential chemical mutagens are screened using the Ames test, which makes use of reversion to test whether chemicals are mutagenic (can change DNA sequences). His- Salmonella bacteria require histidine to grow. His+ reversion mutations caused by mutagens allow bacteria to grow in the absence of histidine.

DNA Repair

Cells have mechanisms to repair DNA damage before it becomes a permanent mutation. Two strands are necessary; the undamaged strand serves as a template for repair. Repair systems are redundant; if an error escapes one repair system, it is likely to be repaired by another. Repair pathways are robust; fewer than 1 in 1,000 DNA lesions are estimated to become mutations.

General Steps of DNA Repair

  1. Recognition/detection.
  2. Excision: Removal of the damaged or mismatched nucleotide or nucleotides.
  3. Polymerization: DNA polymerase fills the gap, using the intact strand as a template.
  4. Ligation: DNA ligase seals the remaining nick in the DNA backbone.
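
The four steps can be illustrated with a toy string model. This is only a sketch under simplifying assumptions: the two strands are written so that position i of one pairs with position i of the other (5'/3' orientation is ignored), the recognition step is represented by the caller supplying the damaged interval, and the function name `excision_repair` is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def excision_repair(damaged, template, start, end):
    """Repair positions start..end-1 of `damaged` using `template`.

    Position i of `damaged` is assumed to pair with position i of `template`.
    The interval [start, end) stands in for the recognition/detection step.
    """
    if len(damaged) != len(template):
        raise ValueError("strands must be the same length")
    # Excision: remove the flagged nucleotides, leaving a gap.
    left, right = damaged[:start], damaged[end:]
    # Polymerization: fill the gap by copying the complement of the template.
    patch = "".join(COMPLEMENT[base] for base in template[start:end])
    # Ligation: join the new patch to the flanking backbone.
    return left + patch + right
```

For example, with the template `"TACGGA"` and a damaged strand `"ATGXCT"` (where `X` marks a lesion at index 3), `excision_repair("ATGXCT", "TACGGA", 3, 4)` restores `"ATGCCT"`. The model shows why an intact second strand is needed: the patch is read entirely from the template.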

Types of DNA Repair

  • Mismatch Repair: Corrects bases that are chemically normal but incorrectly paired.
  • Base-Excision Repair: Excises damaged and modified bases then replaces the entire nucleotide. It’s mediated by DNA glycosylases (followed by DNA polymerase and ligase) and is a common mechanism for repairing chemical changes to bases.
  • Direct Repair: Pyrimidine dimers in bacteria are direct-repaired by photoreactivation.
  • Nucleotide-Excision Repair: Removes bulky DNA lesions that distort the double helix (like pyrimidine dimers).
  • Double-Strand Break Repair: Eukaryotic cells can repair chromosome damage induced by radiation. Double-strand breaks can be repaired by homologous recombination or non-homologous end joining. Non-homologous end joining after ionizing radiation exposure can lead to chromosomal rearrangements such as translocations and inversions (seen, for example, in irradiated fibroblast cells). Homologous recombination uses the same mechanisms as crossing over (recombination) in meiosis to repair double-strand breaks.

Gene Regulation

Types of Genes

  • Structural Genes: Genes that encode proteins used in biosynthesis and metabolism.
  • Regulatory Genes: Genes whose products (RNA or protein) interact with other sequences and affect the transcription or translation of those sequences (e.g., DNA-binding proteins).
  • Constitutive Genes: Genes that are always being transcribed.

Regulatory Elements

Sequences that are not transcribed but play a role in regulating other nucleotide sequences to which they are physically linked (these are often binding sites for transcription factors encoded by regulatory genes).

Operons

An operon is a sequence of nucleotides that includes a regulatory region and more than one gene. Genes are regulated as a unit. Common in prokaryotes. Proteins encoded by operons often work together to:

  • Make Stuff: Synthesize a product from raw materials (anabolism).
  • Break Stuff: Break down a molecule into products that can be utilized by a cell (catabolism).

Regulation of Operons

Regulatory genes encode proteins that bind to regulatory sequences to control operons. Models of regulation include:

  • Positive Control: Turning genes on (processes that stimulate gene and/or protein expression).
  • Negative Control: Turning genes off (processes that inhibit gene and/or protein expression).

Operons can be inducible and repressible:

  • Negative Control Systems: Inducible (the substrate makes the repressor inactive) or repressible (the product makes the repressor active).
  • Positive Control Systems: Inducible (the substrate acts as an allosteric effector that makes the activator active) or repressible (the product acts as an allosteric inhibitor that makes the activator inactive).

Examples of Operon Regulation

  • Negative Inducible: Usually controls the transcription of proteins that carry out degradative processes (proteins are not needed unless the substrate to be broken down is present).
  • Negative Repressible: Usually controls the transcription of proteins involved in biosynthesis, such as the synthesis of amino acids (the process is turned off when adequate amounts of product are present in the cell).
  • Positive Inducible: The inducer is often a precursor (substrate) of the reaction controlled by the operon, so the enzymes necessary would only be synthesized when the substrate is present.
  • Positive Repressible: The product of the operon is a repressing substance; if sufficient product is present, the synthesis of operon proteins is not necessary.
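
The four combinations above reduce to a small truth table. The sketch below is one way to encode it, assuming a single regulator protein and a single small-molecule effector; the function name `operon_transcribed` and its string arguments are illustrative only.

```python
def operon_transcribed(control, response, effector_present):
    """Return True if the operon is transcribed.

    control:  "negative" (regulator is a repressor) or "positive" (regulator is an activator)
    response: "inducible" or "repressible"
    effector_present: the substrate/inducer for inducible systems,
                      the end product for repressible systems
    """
    if control not in ("negative", "positive"):
        raise ValueError("control must be 'negative' or 'positive'")
    if response not in ("inducible", "repressible"):
        raise ValueError("response must be 'inducible' or 'repressible'")

    if control == "negative":
        if response == "inducible":
            # Repressor is active until the substrate inactivates it.
            repressor_active = not effector_present
        else:
            # Repressor is inactive until the product activates it.
            repressor_active = effector_present
        return not repressor_active

    if response == "inducible":
        # Activator is inactive until the substrate activates it.
        activator_active = effector_present
    else:
        # Activator is active until the product inactivates it.
        activator_active = not effector_present
    return activator_active
```

In both control modes, inducible operons are on only when the effector is present and repressible operons are on only when it is absent; what differs is whether the regulator is a repressor or an activator.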

Lac Operon

The lac operon is a classic example of gene regulation in prokaryotes. It controls the metabolism of lactose in E. coli.

  • Lactose Unavailable (Glucose Available): Lac repressor protein binds to the operator (lacO) sequence and inhibits transcription.
  • Lactose Available (Glucose Unavailable): With the repressor protein inactivated by allolactose binding, RNA polymerase carries out transcription.
  • Lactose and Glucose Available: Basal transcription.

Components of the Lac Operon

  • LacI: Encodes the repressor protein, which contains two binding sites, one for the operator and one for allolactose, the inducer. I-: Repressor unable to bind to the operator. Is: Repressor unable to bind the inducer (allolactose).
  • LacZ: β-galactosidase, cleaves lactose into two monosaccharides (glucose and galactose). Z-: No functional β-galactosidase.
  • LacY: Permease, facilitates lactose transport across the cell membrane. Y-: No functional permease.
  • LacA: Transacetylase, protects against harmful by-products of lactose metabolism. A-: No transacetylase.
  • LacO: Operator, binds the repressor protein to block transcription of operon genes. Oc: Operator-constitutive; fails to bind the repressor protein.
  • LacP: Promoter, binds RNA polymerase. P-: Fails to bind RNA polymerase or does so weakly.
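
The regulatory mutations listed above can be combined with the lactose and glucose conditions into a simple predictor. This is a rough sketch for a haploid cell, assuming three qualitative output levels ("none", "basal", "high"), treating a repressed operon as fully off, and treating P- as a complete loss of promoter function; the function name `lac_expression` and its argument encoding are hypothetical.

```python
def lac_expression(lactose, glucose, lacI="+", lacO="+", lacP="+"):
    """Predict lac operon transcription as 'none', 'basal', or 'high'.

    lacI: "+" wild type, "-" cannot bind operator, "s" cannot bind inducer
    lacO: "+" wild type, "c" cannot be bound by the repressor
    lacP: "+" wild type, "-" cannot bind RNA polymerase (simplified to fully off)
    """
    if lacP == "-":
        return "none"  # RNA polymerase cannot bind the promoter

    if lacI == "-":
        repressor_can_bind = False        # repressor cannot bind the operator
    elif lacI == "s":
        repressor_can_bind = True         # repressor ignores allolactose
    else:
        repressor_can_bind = not lactose  # allolactose inactivates the repressor

    # An Oc operator is never bound, whatever the state of the repressor.
    repressor_bound = repressor_can_bind and lacO == "+"
    if repressor_bound:
        return "none"
    # Without repression, glucose keeps transcription at the basal level.
    return "basal" if glucose else "high"
```

For wild-type cells this reproduces the three conditions listed under Lac Operon, and it predicts constitutive expression for I- and Oc mutants and no induction for Is mutants.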

Tryptophan (trp) Operon

The tryptophan (trp) operon is a repressible operon that produces five polypeptides that participate in tryptophan synthesis. Trp operon transcription is inhibited by a feedback mechanism involving tryptophan as a corepressor. Trp operon gene expression is attenuated to maintain the cellular concentration of tryptophan at a steady state. Many amino acid operons are regulated by an attenuation mechanism.

  • trpL (Leader) Region: Contains an attenuator sequence with four regions (1–4) whose mRNA forms one of two alternative stem loops.
  • 2–3 (Antitermination) Stem Loop: Formed by the mRNA; permits transcription of the five trp operon structural genes as a polycistronic mRNA.
  • 3–4 (Termination) Stem Loop: Formed by the mRNA; terminates transcription before RNA polymerase reaches the structural genes of the operon.
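
The choice between the two stem loops can be written as a small decision function. The sketch below follows the standard textbook account of attenuation, which these notes do not spell out: a ribosome translating the leader stalls at tryptophan codons when tryptophan is scarce. It models attenuation only (the repressor is omitted), and the function name `trp_attenuation` is illustrative.

```python
def trp_attenuation(tryptophan_high):
    """Return which leader stem loop forms and whether structural genes are transcribed."""
    if tryptophan_high:
        # The ribosome moves quickly through the leader and covers region 2,
        # so regions 3 and 4 pair and form the termination stem loop.
        return {"stem_loop": "3-4", "structural_genes_transcribed": False}
    # The ribosome stalls at the tryptophan codons in region 1,
    # leaving region 2 free to pair with region 3 (antitermination).
    return {"stem_loop": "2-3", "structural_genes_transcribed": True}
```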

Eukaryotic Gene Regulation

Similarities Between Prokaryotes and Eukaryotes

  • Basal (or core) promoter binds transcriptional machinery (RNA polymerase and friends) to activate transcription.
  • Some genes are expressed constitutively (always), others conditionally.
  • Proteins (specific transcription factors) that exert positive and negative control bind to additional sites (enhancers, silencers, regulatory promoter).
  • Binding of specific transcription factors is conditional in some way (sometimes allosteric, more often regulated nuclear entry).

Differences Between Prokaryotes and Eukaryotes

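  • Not organized into operons (usually), although there are some polycistronic messages.
  • Chromatin/histone problem: DNA is packaged around histones, so the template must be made accessible before it can be transcribed.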
  • Nuclear membrane separates transcription and translation in time and space.
  • Additional regulatory steps are possible post-transcriptionally.

Regulation at the Level of mRNA

  • Alternative Splicing: SR proteins bind to exonic splice enhancers to promote splicing, regulated by the presence of splice machinery and enhancers.
  • mRNA Stability: 3′ UTR (untranslated region) binding proteins and 5′ UTR binding proteins (the amount of mRNA to be translated is a function of both mRNA synthesis and degradation).

Regulation of Translation

  • RNA-binding proteins that interfere with translation.
  • miRNAs and siRNAs.

RNA-Binding Proteins

  • Masking prevents translation by interfering with the PolyA tail’s interaction with the ribosome.

MicroRNAs (miRNAs)

  • Can bind to the 3′ UTR of mRNAs and silence gene expression.
  • miRNAs and RNAi pathways share components.
  • In general, imperfect base pairing between a small RNA and an mRNA blocks translation; perfect base pairing signals degradation.
  • Some microRNAs transcriptionally silence genes by binding DNA and attracting DNA (or histone) methylases.
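
The perfect versus imperfect pairing rule can be illustrated by comparing a small RNA with a candidate target site. This is a deliberately simplified sketch: both sequences are assumed to be the same length and written 5' to 3', G:U wobble pairs are counted as mismatches, and the `max_mismatches` cutoff separating "imperfect pairing" from "no pairing" is an arbitrary assumption, not a biological constant.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def silencing_outcome(small_rna, target_site, max_mismatches=3):
    """Predict the outcome of a small RNA pairing with an mRNA target site."""
    if len(small_rna) != len(target_site):
        raise ValueError("sequences must be the same length")
    # A perfectly paired site is the reverse complement of the small RNA.
    ideal_site = reverse_complement(small_rna)
    mismatches = sum(a != b for a, b in zip(ideal_site, target_site))
    if mismatches == 0:
        return "mRNA degradation"      # perfect base pairing
    if mismatches <= max_mismatches:
        return "translation blocked"   # imperfect base pairing
    return "no silencing"              # too few base pairs to bind (assumed cutoff)
```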

Eukaryotic Transcriptional Apparatus

  • Recruited by activator proteins bound to the regulatory promoter or to enhancers.
  • Mediator complex serves as a bridge between activators and general transcription factors.
  • Histone-modifying enzymes and chromatin remodeling complexes help make the DNA template accessible.
  • Insulators keep enhancers from inappropriately regulating nearby genes.
  • Activation (and repression) can be tissue-specific.

Silencers

  • Compete with activators for binding sites (man-to-man defense).
  • Bind nearby and prevent the activator from interacting with the basal transcriptional machinery (more of a zone defense).
  • Interfere with the assembly of the basal transcriptional machinery (goaltending).

Enhancers

  • Can contain many activator and silencer (repressor) binding sites that are often overlapping.

Chromatin Regulation of Gene Expression

  • DNase I Hypersensitive Sites: Areas of DNA “accessible to transcription factors.”
  • Histone-Modifying Enzymes and Histone Code: Methylation of H3K4 is an activating modification found around transcriptional start sites and is bound by the NURF nucleosome remodeling factor. Acetylation, e.g., of H4K16, is added by HATs (histone acetyltransferases) and generally destabilizes chromatin and promotes transcription. HDACs (histone deacetylases) remove acetyl groups and are usually negative regulators of transcription. Phosphorylation is a further modification. Together, the patterns of all modifications around a gene constitute the histone code.
  • Chromatin Remodeling Complexes: Reorganize histones without modifying them chemically.

Histone Acetylation and Deacetylation

  • Regulate DNA accessibility: HATs and HDACs regulate the acetylation status of chromatin, allowing the opening and closing of promoters.
  • Histone methyltransferases further open or close chromatin.

DNA Methylation

  • Associated with transcriptional silencing.
  • Common in CG dinucleotides.
  • Can recruit HDACs via interaction between proteins that bind methylated DNA and HDACs.
  • Fragile X mutations are expansions of CGG repeats near the promoter of the FMR1 gene. The expanded repeats are methylated, which silences FMR1.

Epigenetics

Epigenetics = alteration of the genome without changing the DNA sequence. Epigenetic inheritance can occur when genes or chromosomes are modified such that gene expression is altered without changing the nucleotide sequence. These modifications may be inherited but are reversible (i.e., not permanent over many generations).

Environmental Epigenetics

  • Modifications include DNA methylation as well as histone acetylation and methylation.
  • Example: Genomic Imprinting. The allele inherited from one parent is preferentially expressed; the other is silenced. Examples include Igf2 (insulin-like growth factor 2) and Prader-Willi and Angelman syndromes (chromosome 15 imprinting).

Example: tinman Gene

  • tinman is a Drosophila gene that encodes a homeodomain transcription factor required for heart development in Drosophila.
  • Drosophila with a tinman mutation lack a heart and die during embryogenesis.
  • The mammalian homolog of tinman is Nkx2.5.
  • Mice in which the Nkx2.5 gene is knocked out have many defects and die as embryos. These embryos initiate but do not complete heart development.
  • Humans heterozygous for mutations in Nkx2.5 are born with congenital heart defects.
  • The Nkx2.5 gene is regulated by multiple enhancers and silencers.

Quantitative Genetics

Types of Traits

  • Discontinuous or Discrete Traits: Traits that have only a few distinct phenotypes (for example, those produced by the genotypes AA, Aa, and aa). The relationship between genotype and phenotype is relatively simple: each phenotype corresponds to one or a few genotypes.
  • Continuous or Polygenic Traits: Traits that show an apparently continuous distribution of phenotypes. They are influenced by many genes acting in concert (polygenic) and by the environment. Examples: Height, skin color, eye color, blood pressure, litter size, susceptibility to disease.

Quantitative Traits

  • Result when more than one gene contributes to a phenotype, each with a small but additive effect, and/or environmental factors influence the phenotype.
  • Alleles are additive.
  • As more genes (loci) influence a polygenic trait, the 1-to-1 relationship between genotype and phenotype disappears. Genotype can only be inferred from phenotypes at extremes.
  • As the number of loci increases, the distribution looks increasingly like a Gaussian curve.

Phenotypic Variance

  • Can be described statistically.
  • For populations, Vp = Vg + Ve. Phenotypic variance = variance due to genetics + variance due to the environment. An individual does not have variance.

Heritability

  • The proportion of phenotypic variation in a population that is due to genetic factors rather than to the environment.
  • Broad-Sense Heritability: Takes into account all types of genetic variance. H^2 = Vg/Vp.
  • Narrow-Sense Heritability: Takes into account only genetic variance due to additive effects. h^2 = Va/Vp.
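
The two ratios can be computed directly from variance components. The sketch below assumes that genetic variance splits into an additive part (Va) and a non-additive part (dominance and gene interaction), which is the usual decomposition but is not stated in these notes; the function name `heritability` and the example numbers are hypothetical.

```python
def heritability(v_additive, v_nonadditive, v_environment):
    """Return (H2, h2) from variance components.

    Vg = Va + non-additive genetic variance, and Vp = Vg + Ve.
    H2 = Vg / Vp (broad sense); h2 = Va / Vp (narrow sense).
    """
    v_genetic = v_additive + v_nonadditive
    v_phenotypic = v_genetic + v_environment
    if v_phenotypic <= 0:
        raise ValueError("phenotypic variance must be positive")
    broad_sense = v_genetic / v_phenotypic
    narrow_sense = v_additive / v_phenotypic
    return broad_sense, narrow_sense
```

With made-up values Va = 4, non-additive variance = 2, and Ve = 4, Vp is 10, so H^2 = 0.6 and h^2 = 0.4. Narrow-sense heritability can never exceed broad-sense heritability, because Va is part of Vg.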

Estimating Heritability

  • Parent-Offspring Regression Analysis: When offspring values are regressed on the mean of both parents, the slope ~ heritability (e.g., h^2 = 0.70). For regression on one parent, twice the slope is ~ heritability (e.g., h^2 = 2m = 0.67, where m is the slope).
  • Response to Selection: h^2 = R/S. R = response to selection, the difference between the offspring mean and the mean of the original population. S = selection differential, the difference between the mean of the selected parents and the population mean.
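
Both estimators are short calculations. The following sketch computes a least-squares slope by hand and applies h^2 = R/S; the function names and all numbers in the usage note are made up for illustration and are not data from these notes.

```python
def regression_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def h2_from_midparent(midparent_values, offspring_values):
    """Offspring regressed on the mean of both parents: h2 is about the slope."""
    return regression_slope(midparent_values, offspring_values)


def h2_from_single_parent(parent_values, offspring_values):
    """Offspring regressed on one parent: h2 is about twice the slope."""
    return 2 * regression_slope(parent_values, offspring_values)


def h2_from_selection(population_mean, selected_parent_mean, offspring_mean):
    """Realized heritability h2 = R / S."""
    s = selected_parent_mean - population_mean  # selection differential
    r = offspring_mean - population_mean        # response to selection
    return r / s
```

For instance, if a population averages 10 units, the selected parents average 14, and their offspring average 12, then S = 4, R = 2, and h^2 = 0.5: the offspring recover half of the parents' deviation from the population mean.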

Interpreting Heritability

  • Heritability indicates to what extent genes are responsible for variation in a characteristic, not the extent to which genes determine the characteristic.
  • Heritability is a population statistic. An individual cannot have heritability because it is based on variation around the mean value for a group.
  • Heritability is specific for a given population in a given environment and not generalizable to all populations.

Population Genetics

  • Population genetics asks how a group’s genetic variation changes over time. The focus is often on individual genes and discrete phenotypes in populations. Tools are allele and genotype frequencies.
  • Quantitative genetics asks how variation in genotype and environment combine to contribute to variability in phenotype. The focus is not on individual genes, but how numerous genes and the environment combine to produce a wide range of phenotypic variability seen for a single trait in groups of individuals. Tools are means and variances (statistics).

Hardy-Weinberg Equilibrium (HWE)

  • If a population is not evolving with respect to a trait, allele frequencies are constant over generations.
  • When agents of evolution are at work, allele frequencies change.
  • Allele frequencies can be calculated from genotype data.
  • The frequency of the dominant allele is denoted “p”.
  • The frequency of the recessive allele is denoted “q”.
  • For a gene with two alleles, p + q = 1.
  • HWE: p^2 + 2pq + q^2 = 1.
  • The Hardy-Weinberg principle is a mathematical model that describes stable (non-evolving) traits in populations.
  • In an evolving population, allele frequencies for certain traits will change from generation to generation.
  • If a population is not evolving, allele frequencies will remain stable over generations. Such populations are said to be in HWE.
  • Genotype and allele frequencies can be in HWE (meaning no evolution of that gene/genotype is occurring) if the population is diploid and sexually reproducing, mating is random, the population is sufficiently large, and there is no mutation, no migration (in or out), and no selection.
  • When a gene is in HWE, we can calculate genotype frequencies from allele frequencies.
  • Geneticists studying real-life populations often use the Hardy-Weinberg equation to determine whether evolution is likely occurring with respect to a given trait.
  • By comparing observed genotype frequencies in populations to those predicted by the Hardy-Weinberg equation, geneticists can draw conclusions about whether evolutionary forces are at work.
  • If we study a trait (and its alleles and genotypes) in a population and find it to be IN HWE, then we can conclude that the population is likely not evolving with respect to that trait.
  • If we study a trait (and its alleles and genotypes) in a population and find it to be NOT IN HWE, then we can conclude that the population is likely evolving with respect to that trait.
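
The comparison of observed and expected genotype frequencies is usually done with a chi-square test. The sketch below assumes a gene with two alleles, genotype counts from a single sample, and the conventional 0.05 significance level (critical value 3.84 for one degree of freedom); the function name `hardy_weinberg_test` and the counts in the usage note are hypothetical.

```python
def hardy_weinberg_test(n_AA, n_Aa, n_aa):
    """Compare observed genotype counts with Hardy-Weinberg expectations."""
    n = n_AA + n_Aa + n_aa
    # Each individual carries two alleles, so count alleles out of 2n.
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1 - p
    observed = {"AA": n_AA, "Aa": n_Aa, "aa": n_aa}
    expected = {"AA": p * p * n, "Aa": 2 * p * q * n, "aa": q * q * n}
    # Chi-square statistic; classes with an expected count of zero are skipped.
    chi_square = sum(
        (observed[g] - expected[g]) ** 2 / expected[g]
        for g in expected
        if expected[g] > 0
    )
    # One degree of freedom: 3 classes - 1 - 1 estimated allele frequency.
    return {
        "p": p,
        "q": q,
        "expected": expected,
        "chi_square": chi_square,
        "consistent_with_hwe": chi_square < 3.84,
    }
```

With counts of 360 AA, 480 Aa, and 160 aa, the allele frequencies are p = 0.6 and q = 0.4, the expected counts match the observed ones, and the sample is consistent with HWE. A sample of 500 AA, 0 Aa, and 500 aa has the same p = q = 0.5 that would predict 500 heterozygotes, so it departs strongly from HWE.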

Natural Selection

  • Natural selection works on existing variation in populations.
  • Variation is genetically heritable: it is encoded by alleles of genes.
  • Variation in phenotype results in differential survival or reproduction.
  • Genotypes that encode for an advantageous phenotype (trait) increase in frequency in populations.
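
The increase in frequency of an advantageous allele can be traced generation by generation. The sketch below uses the standard one-locus, two-allele selection model with constant relative fitnesses, random mating, and an effectively infinite population; none of these details are given in the notes, and the fitness values a caller supplies are hypothetical.

```python
def next_allele_frequency(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of selection."""
    q = 1 - p
    # Mean fitness weights each genotype's fitness by its HWE frequency.
    mean_fitness = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    # A alleles come from AA individuals and from half of the Aa alleles.
    return (p * p * w_AA + p * q * w_Aa) / mean_fitness


def simulate_selection(p0, w_AA, w_Aa, w_aa, generations):
    """Return the list of allele frequencies from generation 0 onward."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_allele_frequency(freqs[-1], w_AA, w_Aa, w_aa))
    return freqs
```

If all three fitnesses are equal, the frequency does not change, which is the no-selection condition of HWE. If aa individuals have lower fitness than the other genotypes, the frequency of A rises each generation.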

Hardy-Weinberg Equation with Three Alleles

  • If we define the frequency of alleles as p, q, r: p + q + r = 1.
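
With three alleles, the expected genotype frequencies come from expanding (p + q + r)^2 = p^2 + q^2 + r^2 + 2pq + 2pr + 2qr = 1. The sketch below generalizes this to any number of alleles; the function name `genotype_frequencies` and the allele labels in the usage note are placeholders.

```python
from itertools import combinations_with_replacement


def genotype_frequencies(allele_freqs):
    """Expected HWE genotype frequencies from a dict of allele frequencies."""
    if abs(sum(allele_freqs.values()) - 1) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    result = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        product = allele_freqs[a] * allele_freqs[b]
        # Homozygotes occur at x^2; heterozygotes at 2xy (two parental orders).
        result[a + b] = product if a == b else 2 * product
    return result
```

For alleles labeled A, B, and C with frequencies 0.5, 0.3, and 0.2, this gives six genotypes (AA, AB, AC, BB, BC, CC) whose frequencies sum to 1.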

Agents of Evolutionary Change

  • Mutation.
  • Gene flow (migration).
  • Genetic drift (small population because of bottleneck, founder effect).
  • Non-random mating.
  • Selection.
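
Of these agents, genetic drift is the easiest to see in a simulation, because it changes allele frequencies by chance alone. The sketch below is a minimal Wright-Fisher-style model, assuming a constant-size diploid population, non-overlapping generations, and no mutation, migration, or selection; the function name `simulate_drift` and its parameters are illustrative.

```python
import random


def simulate_drift(p0, population_size, generations, seed=None):
    """Track one allele's frequency under random sampling alone."""
    rng = random.Random(seed)
    n_copies = 2 * population_size  # diploid: two allele copies per individual
    p = p0
    freqs = [p]
    for _ in range(generations):
        # Each allele copy in the next generation is drawn at random
        # from the current generation's allele pool.
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count / n_copies
        freqs.append(p)
    return freqs
```

Running this with a small `population_size` (as after a bottleneck or founder event) typically produces large swings and eventual loss or fixation of the allele, while a large population stays close to its starting frequency.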