When scientists announced the full sequence of the human genome in 2003, they were overstating things a little.
In fact, nearly 20 years later, about 8 percent of the genome had never been fully sequenced, largely because it consists of highly repetitive stretches of DNA that are difficult to fit together with the rest.
But a consortium formed three years ago has now filled in the remaining DNA, giving scientists and doctors the first complete, gap-free genome sequence.
The newly completed genome, called T2T-CHM13, represents a major upgrade from the current reference genome, GRCh38.
Among other things, the new DNA sequence reveals previously unseen detail about the region around the centromere.
"Uncovering the complete sequence of these previously missing regions of the genome tells us a lot about how they are organized, which was completely unknown for many chromosomes," said Nicolas Altemose, a postdoctoral researcher at the University of California and co-author of four new papers on the complete genome.
Altemose is the first author of a paper describing the base-pair sequence around the centromere.
The sequencing and analysis were carried out by a team of more than 100 people known as the Telomere-to-Telomere Consortium, or T2T for short, named for the telomeres that cap the ends of all chromosomes.
Adam Phillippy, one of the leaders of T2T and a senior investigator at the NIH's National Human Genome Research Institute (NHGRI), said: "When their genome sequencing can be better used for their NIH genome sequencing
Evolutionary centromeres
New DNA sequences in and around the centromere make up 6.
"DNA is nothing without proteins," said Altemose, who in 2021 received his Ph.
After the T2T consortium sequenced the missing DNA, Altemose and his team used new techniques to find where, within the centromere, a large protein complex called the kinetochore grips the chromosome so that other machinery in the nucleus can separate chromosome pairs.
"When this process goes wrong, you end up with chromosomal mismatches, which can lead to all kinds of problems," he said.
What they found in and around the centromeres were layers of new sequences overlaying layers of older ones, as if, over the course of evolution, new centromeric regions had repeatedly been laid down to bind the kinetochore.
The older regions are characterized by more random mutations and deletions, indicating that cells no longer use them.
The newer sequences, where the kinetochore binds, are much less variable and also less methylated.
The addition of methyl groups is an epigenetic tag that tends to silence genes.
All the layers in and around the centromere are made up of repeated stretches of DNA built from units about 171 base pairs long, roughly the length of DNA that wraps around a set of proteins to form a nucleosome, which keeps DNA packaged and compact.
These 171-base-pair units form larger repeats that are themselves repeated many times in tandem, building up a large region of repetitive sequence around the centromere.
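The tandem organization described above can be made concrete with a toy example. The sketch below is purely illustrative and is not the consortium's analysis: it builds a synthetic array from a random 171-base unit copied head to tail with a few random point changes, then looks for the shift at which the sequence best matches itself. The function names, the 2 percent change rate and the search range are all assumptions made for this illustration.

```python
import random


def make_tandem_array(unit_len=171, copies=12, change_rate=0.02, seed=0):
    """Build a toy tandem array: one random unit copied head to tail,
    with each base occasionally redrawn at random (hypothetical parameters)."""
    rng = random.Random(seed)
    unit = [rng.choice("ACGT") for _ in range(unit_len)]
    bases = []
    for _ in range(copies):
        for base in unit:
            if rng.random() < change_rate:
                base = rng.choice("ACGT")  # may redraw the same base
            bases.append(base)
    return "".join(bases)


def match_fraction(seq, shift):
    """Fraction of positions where seq agrees with itself shifted by `shift`."""
    pairs = len(seq) - shift
    matches = sum(1 for i in range(pairs) if seq[i] == seq[i + shift])
    return matches / pairs


def estimate_period(seq, min_shift=50, max_shift=300):
    """Return the shift in [min_shift, max_shift] with the best self-match."""
    return max(range(min_shift, max_shift + 1),
               key=lambda shift: match_fraction(seq, shift))


toy = make_tandem_array()
print("estimated repeat unit length:", estimate_period(toy))
```

For this toy array the best-matching shift should come out at the 171-base unit length. Real centromeric arrays are far longer and messier than this, so the example only shows why a repeated unit leaves a strong periodic signal in the sequence.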
The T2T team focused on just one human genome, obtained from a noncancerous tumor called a hydatidiform mole.
A mole is essentially a human embryo that rejected the maternal DNA and duplicated its paternal DNA instead.
Such embryos die and transform into tumors.
But the fact that the mole had two identical copies of the paternal DNA, both carrying the father's X chromosome, rather than different DNA from the mother and the father, made sequencing easier.
The researchers also released this week the full sequence of a Y chromosome from a different source, which took nearly as long to assemble as the rest of the genome combined, Altemose said.
Analysis of this new Y chromosome sequence will appear in a future publication.
Altemose and his team also used the new reference genome as a scaffold to compare centromeric DNA from 1,600 individuals around the world, revealing major differences in the sequence and copy number of the repetitive DNA around the centromere.
Previous research has shown that when ancient humans migrated out of Africa to other parts of the world, they took with them only a small sample of genetic variation.
Altemose and his team confirmed that this pattern extends to the centromere.
"We found that in individuals with recent ancestry outside of the African continent, their centromeres, at least on the X chromosome, tended to fall into two large clusters, while most of the interesting variation occurred in individuals with recent African ancestry," Altemose said. "This is not entirely surprising given what we know about the rest of the genome. But it shows that we really need to focus on sequencing more African genomes, and on complete telomere-to-telomere sequence assembly, if we want to study interesting variants in these centromeric regions."
He points out that DNA sequences around the centromere can also be used to trace human lineages back to our common ape ancestor.
"As you move away from the active centromere, you get more and more degenerate sequences, and if you go to the farthest shores of this ocean of repetitive sequences, you start to see ancient centromeres, where maybe our distant primate ancestors once bound the kinetochore," Altemose said. "It's almost like layers of fossils."
Long reads are a game changer
The success of T2T owes much to improvements in technology for sequencing long stretches of DNA in one pass, which helps determine the order of highly repetitive stretches of DNA.
These include PacBio's HiFi sequencing, which can read stretches of more than 20,000 base pairs with high accuracy.
Technology developed by Oxford Nanopore Technologies Ltd, on the other hand, can read up to millions of base pairs in sequence, but with lower fidelity.
By contrast, so-called next-generation sequencing from Illumina Inc. is limited to reads of a few hundred base pairs.
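Why read length matters for repetitive DNA can be shown with a small toy calculation. The sketch below is an illustration only, not how the T2T assembly was done: it assumes error-free reads and exact matching, and it uses a made-up 30-base region (two short hypothetical flanks around five copies of a 4-base unit) rather than real read lengths. It asks what fraction of reads of a given length occur at exactly one place in the region.

```python
from collections import Counter


def unique_read_fraction(seq, read_len):
    """Fraction of error-free reads of length `read_len` (one per start
    position) whose sequence occurs exactly once in `seq`."""
    windows = [seq[i:i + read_len] for i in range(len(seq) - read_len + 1)]
    counts = Counter(windows)
    unique = sum(1 for w in windows if counts[w] == 1)
    return unique / len(windows)


# Hypothetical toy region: unique flanks around a 20-base tandem repeat.
LEFT_FLANK = "GGCCA"
REPEAT_UNIT = "ACGT"
RIGHT_FLANK = "TTGAC"
toy_region = LEFT_FLANK + REPEAT_UNIT * 5 + RIGHT_FLANK

for read_len in (4, 8, 22):
    fraction = unique_read_fraction(toy_region, read_len)
    print(f"read length {read_len}: {fraction:.2f} of reads are unique")
```

In this toy region, most 4-base reads fall entirely inside the repeat and match several positions, so they cannot be placed. Every 22-base read is longer than the 20-base repeat array, reaches a flank, and therefore occurs only once.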
"These new long-read DNA sequencing technologies are incredible; they are such game-changers, not only for this repetitive DNA world, but because they allow you to sequence individual long DNA molecules," Altemose said
.
"You can start asking questions at a resolution that wasn't possible before, not even with short-read sequencing methods."
Altemose plans to explore the centromeric region further, using an improved technique developed by his colleagues at Stanford University to pinpoint the sites on the chromosome where proteins bind, much as the kinetochore binds to the centromere.
This technique also uses long-read sequencing technology.
He and his team describe the technique, called Directed Methylation with Long-read sequencing (DiMeLo-seq), in a paper published this week in the journal Nature Methods.
At the same time, the T2T Consortium is working with the Human Pangenome Reference Consortium to develop a reference genome that represents all of humanity.
"We should have a reference that represents all people, not a single human individual or a single hydatidiform mole, which is not even an actual human individual," Altemose said. "There are various ideas on how to achieve this. But what we need first is an understanding of this variation, and we need a large number of high-quality individual genome sequences to make that happen."
References:
1. Complete genomic and epigenetic maps of human centromeres
2. DiMeLo-seq: a long-read, single-molecule method for mapping protein-DNA interactions genome-wide