As single-cell sequencing technology develops, single-cell research keeps deepening: studies are growing in scale and the systems under study are becoming more complex. Integrating single-cell sequencing data from different sources, removing batch effects, and mining the combined data have become a basic, core step in single-cell sequencing data analysis.
At present, the integration of single-cell sequencing data faces the following challenges:
1. Batch effects caused by differences in experimental samples, platforms, library construction methods, and even handling introduce non-biological noise into single-cell sequencing data, interfering with the extraction and analysis of biological differences between cells;
2. The scale of single-cell research continues to expand, and datasets of millions of cells place higher demands on the efficiency of integration algorithms;
3. The types of samples being sequenced are also increasing, and different single-cell sequencing datasets often contain highly heterogeneous cell subsets;
4. Finally, and most importantly, there is the question of how to fully reuse the knowledge contained in the large body of existing data when exploring and analyzing new data.
Most current single-cell sequencing data integration algorithms correct batch effects based on cell similarity between batches. This approach has three drawbacks: over-integration (especially when the datasets differ greatly in cell heterogeneity), poor scalability, and the inability to apply an existing model directly to new datasets.
On October 17, 2022, Professor Zhang Qiangfeng's research group at the School of Life Sciences/Beijing Biostructure Frontier Research Center of Tsinghua University published a research paper online in the journal Nature Communications entitled "Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space".
The research team developed SCALEX, an artificial intelligence algorithm built on the variational autoencoder deep learning framework, which can integrate single-cell sequencing data online.
SCALEX uses an asymmetric autoencoder structure, with a batch-independent encoder and a batch-specific decoder, and obtains a highly generalizable encoder through extensive training. By projecting high-dimensional single-cell sequencing data into a low-dimensional cell-embedding space, this encoder removes batch effects while preserving biological differences.
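The asymmetric structure described above can be illustrated with a toy forward pass. This is a minimal sketch and not the SCALEX implementation: the layer sizes, the random weights, the batch names, and the use of a per-batch scale and shift in the decoder are all assumptions made for illustration. The point it shows is that the encoder never sees a batch label, while the decoder does.

```python
import math
import random

random.seed(0)

N_GENES, N_LATENT = 6, 2            # toy sizes (assumption)
BATCHES = ["batch_a", "batch_b"]    # hypothetical batch names

def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]

# Encoder: one set of weights shared by every batch, no batch label as input.
W_MU = rand_matrix(N_LATENT, N_GENES)
W_LOGVAR = rand_matrix(N_LATENT, N_GENES)

# Decoder: shared weights plus a batch-specific scale and shift per gene
# (an assumed way of making the decoder batch-specific).
W_DEC = rand_matrix(N_GENES, N_LATENT)
BATCH_SCALE = {b: [1.0 + random.gauss(0.0, 0.1) for _ in range(N_GENES)] for b in BATCHES}
BATCH_SHIFT = {b: [random.gauss(0.0, 0.1) for _ in range(N_GENES)] for b in BATCHES}

def encode(x):
    """Batch-independent: returns mean and log-variance of q(z | x)."""
    return matvec(W_MU, x), matvec(W_LOGVAR, x)

def sample(mu, logvar):
    """Reparameterization trick: z = mu + sigma * eps."""
    return [m + math.exp(0.5 * lv) * random.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]

def decode(z, batch):
    """Batch-specific: the same z is reconstructed differently per batch."""
    h = matvec(W_DEC, z)
    scale, shift = BATCH_SCALE[batch], BATCH_SHIFT[batch]
    return [1.0 / (1.0 + math.exp(-(s * v + t))) for v, s, t in zip(h, scale, shift)]

def kl_divergence(mu, logvar):
    """KL(q(z | x) || N(0, I)), the regularizer in a standard VAE loss."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))

cell = [0.0, 1.0, 0.5, 0.0, 0.2, 0.9]   # one toy expression profile
mu, logvar = encode(cell)
z = sample(mu, logvar)
recon_a = decode(z, "batch_a")
recon_b = decode(z, "batch_b")
```

Because batch information enters only through `decode`, the embedding `mu` depends on the expression profile alone. In this sketch, that is what lets cells from different batches share one embedding space.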
SCALEX model framework
SCALEX has the following four main features:
1. Compared with existing single-cell sequencing data integration methods, SCALEX has a clear advantage in integration accuracy;
2. SCALEX remains computationally efficient on datasets of millions of cells, making it suitable for the integration and analysis of ultra-high-throughput single-cell sequencing data;
3. SCALEX effectively avoids overcorrection when integrating single-cell sequencing data, making it suitable for highly heterogeneous and complex samples;
4. SCALEX supports the integration of single-cell RNA-seq, single-cell ATAC-seq, and other multi-omics data.
These features make SCALEX well suited to building single-cell atlases. The developers integrated single-cell datasets from multiple studies and multiple tissues to construct three large-scale single-cell atlases: one for mice, one for humans, and one for COVID-19.
A particular advantage of SCALEX is its highly generalizable encoder. This encoder projects single-cell sequencing data into a batch-independent, unified low-dimensional cell-embedding space. For newly generated data, SCALEX can project the data into this unified space without retraining the encoder. This type of integration is called "online integration."
A major benefit of online integration is that new data can easily be compared with existing foundational data such as a single-cell atlas (which must itself have been generated by SCALEX integration). The foundational data then provides biological insight and guidance, and directly supports analytical tasks such as cell annotation and the validation of observed patterns.
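One common way to use a shared embedding for annotation is nearest-neighbor label transfer. The sketch below is an illustrative assumption, not a description of how SCALEX itself annotates cells: the function name `knn_annotate`, the 2-D coordinates, and the cell-type labels are all made up for the example. It assumes that reference and new cells have already been projected into the same space.

```python
import math
from collections import Counter

def knn_annotate(query, ref_embeddings, ref_labels, k=3):
    """Label a query cell by majority vote among its k nearest reference cells."""
    ranked = sorted(
        range(len(ref_embeddings)),
        key=lambda i: math.dist(query, ref_embeddings[i]),
    )
    votes = Counter(ref_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical atlas embeddings with known cell types.
ref_embeddings = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
ref_labels = ["T cell", "T cell", "T cell", "B cell", "B cell", "B cell"]

# Hypothetical new cells, already projected into the same space.
new_cells = [(0.1, 0.1), (5.1, 5.0)]
predicted = [knn_annotate(q, ref_embeddings, ref_labels) for q in new_cells]
print(predicted)  # ['T cell', 'B cell']
```

The vote is only meaningful if both sets of embeddings come from the same encoder, which is why the reference atlas has to be built with the same model.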
In addition, the cellular content of the original single-cell atlas is enriched and expanded as new data is continually added, enabling new biological discoveries.
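The idea of growing an atlas without retraining can be sketched as follows. This is a toy illustration under stated assumptions: `FROZEN_W` stands in for encoder weights learned once on the reference data, the encoder is reduced to a single linear map, and the data are random numbers. None of these names come from SCALEX.

```python
import random

random.seed(1)
N_GENES, N_LATENT = 5, 2   # toy sizes (assumption)

# Stand-in for encoder weights learned once on the reference data.
# Stored as tuples so they cannot be modified afterwards.
FROZEN_W = tuple(
    tuple(random.gauss(0.0, 0.5) for _ in range(N_GENES)) for _ in range(N_LATENT)
)

def project(cell):
    """Map one expression profile into the embedding with the fixed encoder."""
    return tuple(sum(w * x for w, x in zip(row, cell)) for row in FROZEN_W)

# Build the initial atlas from reference cells.
reference_cells = [[random.random() for _ in range(N_GENES)] for _ in range(4)]
atlas = [project(c) for c in reference_cells]
atlas_before = list(atlas)

# A new batch arrives later: project it with the same weights and append.
new_batch = [[random.random() for _ in range(N_GENES)] for _ in range(3)]
atlas.extend(project(c) for c in new_batch)

print(len(atlas_before), len(atlas))  # 4 7
```

The existing embeddings are untouched when the new batch is added, so earlier analyses on the atlas stay valid while the atlas itself grows.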
In summary, this study developed SCALEX, an artificial intelligence tool for analyzing single-cell sequencing data. It maps the gene expression profiles of cells from different batches into a batch-independent, unified low-dimensional cell-embedding space, removing batch effects while preserving the inherent biological differences between cells, and thereby integrates data from different batches effectively.
SCALEX is suitable for atlas-level integration of single-cell sequencing data and will provide foundational support for ongoing research initiatives such as ultra-large-scale single-cell atlas construction across the life sciences and biomedical fields.