Genomes and Their Applications
Genomes explain why everyone has different combinations of traits passed down from their ancestors.
Cell Theory
Cells are the basic unit of life for all living things, and cells only arise from other cells. All cells contain a plasma membrane and cytoplasm. Many eukaryotic cells are roughly 10 to 100 micrometers in diameter, too small to see without a microscope. The largest known single cells are ostrich eggs. Cells come in different sizes, shapes, and types.
Multicellular Organisms
Cells are organized into tissues. Tissues are organized into organs. The cells in one organism are derived from a single cell that divides, then those cells become specialized, so nearly all cells in an organism carry the same genome. Eukaryotic cells contain membrane-bound compartments called organelles. Animals, plants, fungi, and protists are eukaryotic. Organelles cannot sustain life independently.
Components of a Cell
- Nucleus: Organelle where DNA is stored.
- Nuclear Envelope: Membrane around the nucleus.
- Mitochondria: Organelles where energy from sugars is made available for use in the cell.
- Chloroplasts: Organelles in plants and green algae; make sugar from energy from the sun.
- Ribosome: Makes proteins.
- Plasma Membrane: Membrane around the outside of the cell.
- Cell Wall: Layer outside the cell membrane in some cells.
Molecules in Cells
- All types of cells share certain features that are important to metabolism and sustaining life.
- Four groups of molecules found in all cells: proteins, lipids, nucleic acids, and carbohydrates.
- Macromolecule polymers: large molecules made up of multiple subunits.
- Proteins, nucleic acids, and carbohydrates are macromolecule polymers.
- Proteins: Macromolecules composed of different combinations of 20 amino acid subunits.
  - Amino acids are joined together by peptide bonds.
  - The chemical natures of the amino acids determine the characteristics of a protein:
    - Protein configuration (3D structure)
    - Interactions with other proteins or other types of molecules
    - Biochemical capacity
  - Proteins form structures within the cell and catalyze chemical reactions.
  - Enzyme: A protein that can catalyze a chemical reaction.
- Lipids: Molecules composed of hydrocarbon chains that are insoluble in water.
  - Different groups of lipids form different types of structures.
- Carbohydrates: Macromolecules composed of sugars.
  - Important for cellular metabolism.
  - Cells break down carbohydrates to release energy that the body can use.
  - Plants make carbohydrates from CO2 using solar energy.
  - Sugars and starches store energy for cells.
  - Some types contribute to cell structures, like cell walls.
- Nucleic Acids: Polymers made up of four different types of nucleotide subunits.
  - Two basic types of nucleic acids in cells: RNA and DNA.
  - RNA and DNA differ slightly in nucleotide structure.
The Human Genome
Genome: The entire set of genetic information for a species.
Chromosome: An individual molecule of DNA.
A genome is organized into chromosomes. The X-shaped structures seen in images of dividing cells are long molecules of DNA wrapped around proteins and looped repeatedly. Each type of chromosome has a characteristic size and shape.
A genome:
- Contains all the genes needed for survival.
- Contains all the genes needed to make new cells.
- Contains all the genes needed to make offspring.
Offspring get a set of chromosomes from each parent.
Chromosomes Are Made of DNA
DNA is composed of subunits called nucleotides. Each nucleotide is made of a phosphate group, a sugar, and a nitrogenous base. The four types of nucleotides are distinguished by their nitrogenous base: adenine, guanine, cytosine, or thymine. The nitrogenous bases sit in the center of the DNA molecule.
Genes Are Made of DNA
Gene: A specific region of a chromosome that is read by enzymes to produce RNA. This RNA will be used by the cell for specific functions or to make proteins that have specific functions.
DNA → RNA → Proteins
The DNA of genes is read by enzymes to produce RNA in a process called transcription. Messenger RNA (mRNA) is read to build a protein out of amino acids in a process called translation.
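The two steps above can be sketched in a few lines of Python. This is an illustrative simplification, not how cells or analysis tools actually work: it assumes the input is the coding strand of the DNA (so transcription is just replacing T with U), that translation starts at the first AUG, and that the standard genetic code applies. The function names `transcribe` and `translate` are made up for this example.

```python
# Standard genetic code, listed in the conventional T-C-A-G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build the 64-entry codon table keyed by RNA codons (U instead of T).
CODON_TABLE = {
    (a + b + c).replace("T", "U"): AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA that matches a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated in this sketch
    protein = []
    for pos in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":  # "*" marks a stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTGGTAA")
print(mrna)             # AUGGCCUGGUAA
print(translate(mrna))  # MAW
```

Each letter in the output is the one-letter code for an amino acid (M = methionine, A = alanine, W = tryptophan).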
Not All DNA Is Part of a Gene
Some parts of the chromosome are important for survival but are not used to make RNA or proteins. These regions have different functions, such as: protecting the ends of the chromosomes from degrading; being a place for proteins to bind that move chromosomes into daughter cells during cell division; being a place for proteins to bind and start making RNA from DNA.
Alleles and Traits
Genomes also explain the differences between individuals of the same species. Each type of chromosome has the same sets of genes, but individual chromosomes might have different versions of those genes.
Alleles: Different versions of a gene.
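Genotype: The combination of alleles carried by one individual. An individual with two copies of a chromosome has two alleles of each gene on it: the genotype is homozygous if the two alleles are the same and heterozygous if they are different.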
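As a small illustration of these terms, the sketch below labels a genotype from its two alleles. The allele labels ("A", "a") and the function name are hypothetical and chosen only for the example.

```python
def classify_genotype(allele_1: str, allele_2: str) -> str:
    """Label a two-allele genotype as homozygous or heterozygous."""
    return "homozygous" if allele_1 == allele_2 else "heterozygous"


print(classify_genotype("A", "A"))  # homozygous
print(classify_genotype("A", "a"))  # heterozygous
```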
Genomes, Cell Division, and Mutations
The entire genome is duplicated before cell division so that each daughter cell gets a full set of chromosomes. Copying errors that occur during this process are called mutations. Some mutations are tolerable and contribute to differences between individuals.
Defining a Species
The concept of a species was developed before scientists had data about genomes. In some cases, genomic data changes our understanding of species vs. subspecies. In some cases, genomic data supported what we already knew.
The Human Genome
Humans have roughly 20,000 protein-coding genes within our genome (earlier estimates were about 25,000).
Locus: The expected location of a gene on a specific chromosome.
Alleles in a Population
Small numbers of mutations in the genome will stay in a population as genetic diversity as long as they are tolerated. If groups of individuals within a species accumulate enough genetic differences, they may form a new species.
Genomes, Cell Cycles, and Life Cycles
Genomes are essential to the ability of cells to function and reproduce. Genomes are replicated before cell division—each new cell gets a full set of chromosomes. Genome replication, cell division, and differentiation are regulated throughout the development of an individual organism.
- Genome replication: A second copy is made of the genetic material.
- Cell division: When a cell divides into two.
- Differentiation: When a cell matures to have specific functions.
Cell Cycles
Cell cycle: The pattern of cell division, cell growth, and differentiation for individual cells. Different cell types can experience different cell cycle patterns. Some cells do not divide. Some divide rapidly. Cell cycles coincide with the function of the cell and with the life cycle stages.
Life Cycles
Life cycle: The series of changes an entire organism experiences from birth through reproduction. Details of life cycles differ between species, with some commonalities, especially between related species. Asexual reproduction depends only on cell division, without any union of specialized cells from different parents. Some organisms use sexual reproduction, and some use both asexual and sexual reproduction. Unicellular organisms mainly reproduce asexually. Flowering plants have specialized reproductive structures that are very different from those of animals.
The Human Life Cycle
A sexual reproduction event involves the union of a sperm and egg cell. The combination of gametes creates a new type of cell called a zygote. The zygote undergoes many rounds of cell division to develop into a multicellular embryo. The cells in an embryo divide and differentiate to develop into a fetus.
Types of Cell Division
In all life cycles, cell division and continuity of the genome between generations are essential. Eukaryotes have two different types of cell division: mitosis and meiosis. DNA replication happens before both. Both have the same basic phases: prophase, metaphase, anaphase, and telophase.
Mitosis
- One round of cell division occurs.
- Chromosomes line up one by one in metaphase.
- The two copies of each chromosome separate in anaphase.
Meiosis
- Two rounds of cell division occur.
- The first round separates pairs of chromosomes.
  - Chromosomes line up two by two in metaphase I.
  - The pairs separate in anaphase I.
- The second round separates copies of chromosomes.
  - Chromosomes line up one by one in metaphase II.
  - The copies separate in anaphase II.
- Results in four daughter cells with half the DNA from the parent cell (one of each type of chromosome).
- Purpose: Making gametes with half the chromosomes for sexual reproduction.
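The difference in outcome between the two types of division can be shown with simple bookkeeping. The sketch below only counts chromosomes per daughter cell; it does not model the phases themselves. The figure of 46 chromosomes for a human body cell is added here as an example and is not stated in the notes above, and the function names are hypothetical.

```python
def mitosis(chromosomes_per_cell: int) -> list[int]:
    """One division: two daughter cells, each with the full chromosome count."""
    return [chromosomes_per_cell] * 2


def meiosis(chromosomes_per_cell: int) -> list[int]:
    """Two divisions: four daughter cells, each with half the chromosome count."""
    if chromosomes_per_cell % 2:
        raise ValueError("expected an even (paired) chromosome count")
    return [chromosomes_per_cell // 2] * 4


human_body_cell = 46  # assumption for the example: 23 pairs of chromosomes
print(mitosis(human_body_cell))  # [46, 46]
print(meiosis(human_body_cell))  # [23, 23, 23, 23]
```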
Historical Experiments
Inheritance has been studied for centuries, starting before we knew about genomes.
- 1800s: Early observations of inheritance by Mendel on pea plants showed traits are inherited by offspring in predictable patterns.
- Early 1900s: Microscope observations of chromosomes by Sutton introduced the idea of chromosomes as inherited molecules.
- Early 1900s: Biochemical analysis by many scientists showed chromosomes are made of DNA and proteins.
- 1944: Experiment by Avery, MacLeod, and McCarty provided definitive evidence that DNA is the macromolecule containing genetic information.
- 1953: Model of DNA double helix structure by Watson and Crick, based on their own work and the work of Wilkins and Franklin. Franklin’s X-ray diffraction images provided key evidence that DNA has a helical structure.
DNA sequence: The order of nitrogenous bases along a strand of DNA.
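On a computer, a DNA sequence is usually written as a string of the letters A, C, G, and T, one letter per nitrogenous base. The short sketch below checks that a string contains only those four letters and counts each base. The function name and the example sequence are made up for illustration.

```python
from collections import Counter


def base_counts(sequence: str) -> dict[str, int]:
    """Count each of the four DNA bases in a sequence string."""
    sequence = sequence.upper()
    unexpected = set(sequence) - set("ACGT")
    if unexpected:
        raise ValueError(f"not a DNA base: {sorted(unexpected)}")
    counts = Counter(sequence)
    return {base: counts[base] for base in "ACGT"}


print(base_counts("GATTACA"))  # {'A': 3, 'C': 1, 'G': 1, 'T': 2}
```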
The Invention of DNA Sequencing
Genome sequencing: Sequencing DNA for all chromosomes.
- 1970s: The Sanger dideoxy sequencing technique was developed.
- 1990s: Sequencing was automated; machines could complete larger-scale sequencing. The development of sequencing led to a new field called genomics.
Genomics
Genomics: The study of all the genetic information within an organism.
The Human Genome Project
The Human Genome Project ran from 1990 to 2003. It took 13 years, cost $2.7 billion, and used automated Sanger sequencing.
Sequencing Techniques
Over the years, different companies have come up with similar sequencing techniques. DNA sequencing technologies are referred to by their generation. First-generation sequencing refers to automated versions of Maxam-Gilbert sequencing and Sanger sequencing. Next-generation sequencing is used now.
Sanger Sequencing
- Requires many copies of the same DNA molecule.
- The DNA molecules were chemically labeled and separated through a gel using electricity (electrophoresis).
- The scientist would look at the gel to determine the order of nucleotides.
- Requires special components that are specific to what you are trying to sequence (primers).
- Often involves “walking” along the molecule to identify the sequences between the ones you already know.
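Reading a gel by eye can be mimicked with a short sketch. It assumes the classic four-lane layout, in which each lane holds the fragments that end in one particular base, and it assumes fragment lengths are already known as whole numbers counted from the end of the primer. Reading the bands from shortest to longest then gives the order of nucleotides in the newly made strand. The lane data and the function name `read_gel` are invented for this example.

```python
def read_gel(lanes: dict[str, list[int]]) -> str:
    """Read a sequence from gel lanes.

    `lanes` maps the terminating base of a lane to the fragment
    lengths (band positions) observed in that lane.
    """
    # Collect every band as (length, base) and order from shortest to longest.
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    lengths = [length for length, _ in bands]
    if len(set(lengths)) != len(lengths):
        raise ValueError("two bands share a length; the read is ambiguous")
    return "".join(base for _, base in bands)


# Hypothetical gel: band lengths seen in each of the four lanes.
lanes = {"A": [2, 5], "C": [4], "G": [1], "T": [3, 6]}
print(read_gel(lanes))  # GATCAT
```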
First-Gen Sequencing
- Same biochemical approach as manual Sanger and Maxam-Gilbert sequencing, but is automated.
- Each type of nucleotide is labeled so that they give off a different wavelength of light (different color).
- Still widely used for certain applications.
- The Human Genome Project used it.
Next-Gen Sequencing
- Second- through fourth-generation sequencing techniques are broadly called next-gen sequencing.
- These methods use different biochemical approaches than the earlier methods and are entirely automated.
- Complex populations of molecules can be sequenced at the same time.
- Entire genomes can be sequenced in one set of reactions rather than having to do thousands of separate reactions.
- Populations of molecules are tagged with special markers on each end so that they can be tracked.
- You don’t need to already know a portion of the sequence to use next-gen sequencing.
- In some techniques, each molecule is used as a template for DNA sequencing. Nucleotides being added to the new chain are detected and identified one at a time.
- In other techniques, individual molecules are fed through pores that use chemical and physical detectors. Each nucleotide is identified one at a time.
Sequence Analysis and Bioinformatics
Analyzing DNA sequences has required advances in data processing and storage. Scientists needed to organize and analyze large datasets. Bioinformatics emerged alongside advances in genomics. Cheaper, more reliable generation of genomic data increased the importance of computational resources and tools.
Bioinformatics: “The science of collecting and analyzing complex biological data such as genetic codes.”
Genome browsers: Databases that store genome sequences and provide a useful interface to look at them.
Other -omics
The suffix “-omics” means the large-scale study of the entire set of some molecule within a cell/cell type/population.
- Transcriptomics: Studying the sequence of RNA molecules from one collection of cells. RNA isolated from cells is copied into DNA, then that DNA is sequenced. Can study variations in RNA content between cell types and growth conditions to find out which genes are transcribed in which conditions.
- Proteomics: The scientific study of the entire protein composition of cells or tissues. Sequencing proteins is very different from sequencing DNA and RNA because they are very biochemically different from nucleic acids.
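The kind of comparison made in transcriptomics can be sketched as follows. The example assumes sequencing reads have already been counted per gene in each growth condition; the gene names, read counts, and the `min_reads` cutoff are all hypothetical. Real analyses use statistical models rather than a simple cutoff.

```python
def transcribed_genes(read_counts: dict[str, int], min_reads: int = 1) -> set[str]:
    """Return the genes with at least `min_reads` sequencing reads."""
    return {gene for gene, reads in read_counts.items() if reads >= min_reads}


# Hypothetical read counts per gene in two growth conditions.
normal_temperature = {"geneA": 120, "geneB": 0, "geneC": 35}
high_temperature = {"geneA": 110, "geneB": 48, "geneC": 0}

only_at_high = transcribed_genes(high_temperature) - transcribed_genes(normal_temperature)
only_at_normal = transcribed_genes(normal_temperature) - transcribed_genes(high_temperature)
print(only_at_high)    # {'geneB'}
print(only_at_normal)  # {'geneC'}
```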
Metagenomes
Microbes
Microbes are diverse and abundant microscopic organisms, most of them unicellular (bacteria, archaea, some fungi, and others). Viruses are often grouped with microbes even though they are not cells. Microbes are abundant in both numbers of species and numbers of individuals and are critical parts of ecosystems. They have diverse biochemistry and play roles in nutrient cycling and food chains.
The Study of Microbes
Microbes can only be observed under microscopes. Historically, studying microbes was dependent on isolating and culturing the microbe. Bacterial species that grow in the absence of oxygen or at very high temperatures are hard to culture in a lab. Anything pathogenic poses safety risks to researchers. Metagenomic approaches allow characterization of the number and identity of bacterial species just by isolating and sequencing DNA from a sample.
Metagenome: A collection of genomes from many individuals within an environment or sample. The environment can be a whole area or a microenvironment; all that is needed is a sample.
Metagenomics
Computational analysis of sequence data allows for the in silico assignment of each sequence to a different species found in the sample. In silico = done on a computer. Particular gene sequences are expected for certain species. It is like a forensic analysis of the natural world. The detection of a DNA sequence that is unique to a given species is considered evidence that a species is present in the environment the sample was taken from.
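The idea of detecting a species from a sequence that is unique to it can be sketched with exact string matching. The species names and marker sequences below are invented, and real metagenomic pipelines compare reads against large reference databases and allow for mismatches, so this is only a toy version of the in silico assignment described above.

```python
# Hypothetical marker sequences, each assumed to be unique to one species.
MARKERS = {
    "species_A": "ACGTTGCA",
    "species_B": "GGCATTCC",
}


def detect_species(reads: list[str], markers: dict[str, str]) -> set[str]:
    """Return the species whose marker sequence appears in at least one read."""
    detected = set()
    for read in reads:
        for species, marker in markers.items():
            if marker in read:
                detected.add(species)
    return detected


# Hypothetical sequencing reads from one environmental sample.
reads = ["TTACGTTGCAGG", "CCCCCCCC", "AAACGTTGCATT"]
print(sorted(detect_species(reads, MARKERS)))  # ['species_A']
```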
Microbiome
Microbiome: The microorganisms in a particular environment. Microbiomes can be found inside the tissues/organs of multicellular organisms, in soil, or in bodies of water. They can include bacteria, archaea, fungi, and viruses. Many species of microbes coexist with our own cells and are beneficial to our health.
The Human Microbiome
The range of microbial species in different tissues is analyzed to study human microbiomes. This is usually done through DNA sequencing (metagenomics). We now have general descriptions of the microbial compositions of gut, nose, mouth, and skin microbiomes. Microbiomes have different functions depending on location and species composition. They can help with digestion or response to substances in the environment. Environmental exposure and ingestion of certain foods or medications can alter microbiome composition. Changes in microbiome diversity can be a sign of certain illnesses or conditions. Having healthy populations of beneficial microbes helps prevent harmful ones from establishing populations.
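One common way to put a number on microbiome diversity is the Shannon diversity index, which is higher when a sample has more species and when those species are more evenly represented. The notes above do not name a specific measure, so the choice of this index is an assumption made for illustration, as are the taxon names and read counts.

```python
import math


def shannon_diversity(counts: dict[str, int]) -> float:
    """Shannon index: minus the sum of p * ln(p) over each taxon's proportion p."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("need at least one observed read")
    index = 0.0
    for n in counts.values():
        if n > 0:  # taxa with zero reads contribute nothing
            p = n / total
            index -= p * math.log(p)
    return index


# Hypothetical read counts per taxon in two samples.
balanced = {"taxon_1": 25, "taxon_2": 25, "taxon_3": 25, "taxon_4": 25}
skewed = {"taxon_1": 97, "taxon_2": 1, "taxon_3": 1, "taxon_4": 1}

print(shannon_diversity(balanced) > shannon_diversity(skewed))  # True
```

In this toy comparison, the sample dominated by one taxon scores lower, which is the kind of shift a researcher might look for.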
Soil Microbiomes
Both natural soil and agriculturally maintained soil contain microbes, which perform important functions within the soil, such as: decomposing other organisms; recycling nutrients; capturing nutrients from the environment and converting them to usable forms for other organisms. Soil inoculation: When microbes are added to soils to increase yield in crop plants.
Metagenomes in Soil
Modern genome sequencing techniques can be used on DNA isolated from soil samples. This allows for the cataloging of species present in the sample through metagenomics. DNA found in the soil is fragmented, copied, sequenced, and then analyzed. Looking at metagenomes in soil samples can help researchers understand complex ecosystems and understand how ecosystems change in response to environmental impacts and estimate soil health for agriculture.
Microbes in Extreme Environments
Extremophiles: Organisms that have adapted to survive in conditions far outside the range that most organisms can survive in. It was once assumed that biodiversity was low in these environments, but that is untrue. Extremophiles grow in harsh conditions that are hard to replicate, which makes them difficult to study in labs. Metagenomics has allowed for the discovery of extremophile biodiversity. Remote and unmanned sample collection is used with metagenomic sequencing to catalogue species in extreme environments.
Genomic Disciplines
Genomics—What Is It Used For?
Greater availability and affordability of genome sequencing have allowed for the creation of a large amount of data about genomes. Genomes can be studied across a wide variety of organisms. Genomes are used to study the genetic diversity that corresponds to biodiversity and speciation, the genetic basis of disease, and what organisms live in a specific environment.
Sub-fields of Genomics
Genomics is the study of the genome at any level. Scientists study genomes for a wide variety of reasons, so there are many sub-fields. Two of the broadest sub-fields are: structural genomics (the construction and analysis of genome maps, gene sequences, gene annotations, and the comparison of genome structures between organisms) and functional genomics (studies that analyze gene and genome functions, often including high-throughput analysis of gene expression).
Comparative Genomics
Comparison of genome similarities and differences between species to understand evolutionary relationships and the function of genetic elements and regulatory features.
Genomes and Evolution
The current scientific understanding of life on Earth is based on the principles of evolution. How this relates to genomes: the presence of conserved sequences and the presence of diverged sequences. Some organisms have not changed very much over time. Some organisms have evolved more recently to have newly developed functions. The most highly conserved sequences are likely similar to those in ancestral organisms. The principles of evolution were originally proposed to explain similarities and differences in physical traits before there was an understanding of genomics. Genomic data still supports these principles. Sometimes genomic evidence supports a different relationship between two species than previously thought. Genomic data cannot completely replace morphological data. The best models use multiple types of data.
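A simple way to express how conserved or diverged two sequences are is percent identity: the share of positions at which they carry the same base. The sketch below assumes the two sequences have already been aligned to the same length, which in practice is a separate step handled by alignment software. The sequences and the function name are made up for the example.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions that match between two aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical version of the same region in two species.
print(percent_identity("ATGGCCTGGA", "ATGGCTTGGA"))  # 90.0
```

A region with very high identity across distantly related species would count as highly conserved in the sense used above.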
Cancer Genomics
The study of genome sequences of tumor cells from cancer patients, which are often highly variable compared to all of the other cells in the patient.
Computational Genomics
The application of computational and statistical approaches to the study and analysis of genome sequence data.
Metagenomics
The analysis of biological communities through the use of genome sequencing. Multispecies communities of bacteria coexist in many environments, like the human intestinal tract, soil in natural and agricultural environments, and aquatic environments. Next-gen sequencing can be used to catalogue the species present within these environments from the DNA in a sample.
Knowledge about genomes is used to combat human illnesses and improve health. Investigations of the genomes of pathogens like bacteria and viruses help us understand diseases and make treatments. Many different types of human genome studies contribute to advances in medicine.
Viral Genomes
Viruses: Infectious particles consisting of a chromosome encapsulated in protein and/or a membrane. Viruses are not capable of independent survival or replication without infecting a host cell. Scientists disagree on whether viruses are alive or not. Viruses have very small genomes. We can sequence viral genomes from human tissue samples. Viruses are broadly grouped by whether their chromosome is made of DNA or RNA.
- Examples of DNA viruses: herpes, adenoviruses, hepatitis B
- Examples of RNA viruses: HIV, influenza, SARS-CoV-2 (COVID-19) and other coronaviruses (SARS, MERS, common cold)
When a virus infects a host cell, it attaches to the surface of the cell and its chromosome is delivered inside.
Viral Genomes: Retroviruses
Retroviruses: A group of viruses that encode special enzymes to make a DNA copy of their RNA genome. The DNA copy integrates into the host chromosome and becomes permanent. Once integrated into the host genome, retroviruses sometimes become latent (inactivated) due to mutation. In that state no new viruses are produced and the host has no symptoms, so they can only be detected by analyzing the host’s genome sequence.
Viral Genomes in Medicine
Viruses have host specificity: the ability to infect only specific types of cells. This ability is associated with certain types of genes on viral chromosomes. Data on viral genomes can inform basic research, clinical science, and clinical practice. Sequencing the genome of a new viral pathogen allows us to: understand its origin and pathogenicity (its ability to cause disease); predict optimal treatment plans; and design vaccines. Sequencing also allows us to: identify mutations that might affect drug resistance; and better understand host specificity, vaccination efficacy, and host immunity.
Microbiomes
A diverse and robust microbiome is a healthy component of many human tissues/organs. The intestinal tract and the surface of the skin have important microbiomes. Medications and other exposures that kill the microbiome can be detrimental to human health. Fewer beneficial microbes make room for more harmful ones, impacting the overall health and physiology of the host. Digestion of plant material and maintenance of a healthy immune response are two ways the human microbiome is important to our health.
GWAS Studies
Genome-wide association studies (GWAS): Comparisons of genome sequences between different individuals of the same species with distinct phenotypes. Statistical tests are run across the genome to figure out which genetic differences might be connected to phenotypes of interest. This is useful for identifying which genes are connected to predispositions for certain health conditions. GWAS has been used to find genetic associations with many different kinds of cancer. It can be applied to any type of population exhibiting any type of characteristic.
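One calculation of this kind is a chi-square test on a 2x2 table of allele counts, comparing how often a variant appears in people with a condition (cases) and without it (controls). The sketch below runs that test for a single position in the genome. The counts are invented, and the choice of this particular test is an assumption for illustration; an actual GWAS repeats a test like this across very many positions and uses much stricter significance cutoffs to account for that.

```python
def chi_square_2x2(
    case_variant: int,
    case_reference: int,
    control_variant: int,
    control_reference: int,
) -> float:
    """Chi-square statistic (1 degree of freedom) for a 2x2 allele-count table."""
    a, b, c, d = case_variant, case_reference, control_variant, control_reference
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical allele counts at one position in the genome.
statistic = chi_square_2x2(
    case_variant=60, case_reference=40,
    control_variant=40, control_reference=60,
)
print(statistic)  # 8.0
```

A larger statistic means the variant is distributed more unevenly between cases and controls than chance alone would suggest. For a single test with one degree of freedom, values above about 3.84 correspond to p < 0.05.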
Data Privacy and Ethics
Personal genome sequencing has become feasible and available to many people. Some put their genomes into public databases that are used for genealogical research. The possible implications of public access to sequence data must be considered. This data includes private health information, which can include someone’s predisposition to certain diseases. Personal genome sequence data has also been used by law enforcement to track down suspects of unsolved crimes. Intellectual property considerations: Companies may benefit financially from having access to human genome sequences if they develop therapies. Legal battles have occurred over whether human genomes can be subject to patent protections or if they should be excluded to benefit all mankind. Access to more data is exciting and interesting for scientists, but privacy concerns and ethics must be considered by scientists and the general public.
Genome Editing
Research is ongoing into the possible application of genome editing to cure certain conditions, called gene therapy. Editing systems such as zinc-finger nucleases, TALENs, or the CRISPR-Cas9 system, several of which are derived from bacterial proteins, can be engineered into living human cells to modify the DNA. Diseases caused by a single gene mutation or only a few mutations are good candidates for gene therapy cures.
- Example: A possible cure for type 1 diabetes. Genome editing on cells could create populations of cells that produce insulin but will not be attacked by the immune system. Patients would be able to control their blood sugar without needing artificial insulin.
- Example: Successful gene therapy for sickle cell disease.
The most controversial potential application of genome editing is to human germline cells (sperm and egg) or embryos. Scientists and ethicists have advocated for a moratorium on this type of research.
Genetic Counseling
Genetic counselors: Licensed professionals trained in science and counseling who help patients interpret genetic analysis results. Genome sequencing results might require patients to make decisions about their health.
Pharmacogenomics
Pharmacogenomics: The study of how genomic differences between individuals affect their responses to different drug treatments. One way this is used is by sequencing a patient’s tumor cells to decide which treatments might be effective.
Genomes and Food
Agriculture and Domestication
Agriculture: Growing plants and animals in human-managed areas for both food and non-food purposes. Agricultural practices include the selection, cultivation, and intentional breeding of specific organisms to maximize beneficial traits. Domestication: Intentional process of selection for agriculture resulting in new genetic strains. Specific genome regions connected to the traits that were selected for have been identified.
The Green Revolution
Growing large amounts of domesticated species causes nutrient depletion in soil, and animal husbandry changes soil quality and vegetation. In the mid-20th century, agricultural yields from crops like rice and wheat plateaued. Earlier increases had come from irrigation and fertilization, which also created pollution. To solve these issues, scientists created new strains of crops through plant breeding. Norman Borlaug was a plant geneticist who won the Nobel Peace Prize in 1970 for his work on this.
Domesticated Species
Comparing genome sequences can be used to trace the ancestry of domesticated organisms. Phenotypic similarities in different dog breeds have been associated with certain genes. Particular alleles of the RSPO2 gene are associated with certain dog traits.
GWAS Studies and Agriculture
GWAS studies have allowed researchers to identify the ancestors of important crops. This information could help bring back some of the genetic diversity lost through domestication and help maximize the potential of crops. This type of analysis has helped address nutrient limitation in soil, nutrient limitation in livestock feed, and pathogen susceptibility/resistance.
Review
Most Important Concepts
- Genome: The entire set of genetic information for a species.
- Genomes are made of DNA.
- DNA is made of nucleotide subunits.
- Nucleotides are characterized by their nitrogenous base: adenine, thymine, guanine, and cytosine.
- A DNA sequence is the nucleotides listed in order.
- A chromosome is one DNA molecule.
- Genomes are inside of cells; in eukaryotes, they are housed in the nucleus.
- Cells must divide to reproduce, through mitosis or meiosis.
- The genome has to be divided evenly between daughter cells.
- Mitosis daughter cells have a full genome; meiosis daughter cells have only half.
- Genomics can be used for medicine, agriculture, understanding evolution, and understanding microbiomes.