Comprehensible machine learning model for multimodal data and its preliminary application in the study of neuronal cells in the brain

 Last Update: 2022-03-08

The brain is a complex biological system containing a large number of nerve cells with different functions. Interactions between these cells also contribute to a variety of brain states, including neurological diseases. Understanding the biological mechanisms behind these functions and states, such as gene expression and regulation, remains challenging. Integrating multiple sources of information about the brain's nerve cells, however, can give more reliable and accurate insight into their biological mechanisms and even into the development of disease.

For example, rapidly developing single-cell technologies such as Patch-seq can measure multiple characteristics of the same nerve cell, including its transcriptome, electrophysiological signals, and morphology, generating large volumes of single-cell multi-modal data. Compared with a single data type, multimodal data give a more comprehensive picture of brain neurons, which helps in analyzing cell functions and distinguishing cell phenotypes.

However, because the data types follow different distributions and may be related nonlinearly, integrating and analyzing multimodal data effectively to understand cellular mechanisms is a new challenge. Developing tools that integrate single-cell data across samples, experiments, and measurement modalities is also one of the grand challenges of single-cell data science. Integration is difficult because it must work across different data scales and provide a framework for measuring the uncertainty of each modality and quantifying it during analysis. In addition, integration methods need to scale efficiently with the growing number of cells, features, and data types produced by new technologies, allow comparisons between phenotypes, and account for the complex relationships between cellular features and phenotypes.

To this end, Daifeng Wang's laboratory at the University of Wisconsin-Madison published deepManReg, an interpretable machine learning model for multimodal data, in Nature Computational Science on January 31, 2022 [1]. The journal published an accompanying news commentary at the same time, which introduces this work and discusses the prospects for interpretable multimodal data learning [2].

deepManReg uses a common manifold model to represent the local nonlinear relationships between data features. The authors first learn common manifolds between the features of different modalities with multiple deep neural networks, then use them to embed all features into the same low-dimensional space (co-embedding). The distance between cross-modal features in this space quantifies their nonlinear relationship and yields a cross-modal feature network connecting features from different modalities.
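As a rough illustration of the last step, the sketch below links each feature of one modality to its nearest features of the other modality in a shared space. It assumes the co-embedding has already been computed; the function name, the feature names, the toy coordinates, and the nearest-neighbor rule are all hypothetical choices for illustration, not the authors' implementation.

```python
import math

def cross_modal_feature_network(emb_a, emb_b, k=1):
    """Link each modality-A feature to its k nearest modality-B features.

    emb_a, emb_b: dicts mapping feature name -> coordinates in the shared
    low-dimensional space (assumed output of a co-embedding step).
    Returns a list of (feature_a, feature_b, distance) edges.
    """
    edges = []
    for name_a, vec_a in emb_a.items():
        # Euclidean distance from this feature to every feature of the other modality
        dists = sorted(
            (math.dist(vec_a, vec_b), name_b) for name_b, vec_b in emb_b.items()
        )
        for d, name_b in dists[:k]:
            edges.append((name_a, name_b, d))
    return edges

# Toy 2-D co-embedding with made-up gene and electrophysiology features
genes = {"gene_1": (0.0, 0.0), "gene_2": (1.0, 1.0)}
ephys = {"ephys_1": (0.0, 0.1), "ephys_2": (1.0, 0.9), "ephys_3": (5.0, 5.0)}
network = cross_modal_feature_network(genes, ephys, k=1)
for edge in network:
    print(edge)
```

Features that land close together in the shared space become connected, while a distant feature such as `ephys_3` receives no edge.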

The authors then use this feature network to regularize a neural network classifier, improving phenotype classification (regularized classification).
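The article does not give the form of the regularizer. One common way to use a feature network for regularization, shown here purely as an assumption, is a graph penalty that pulls the input-layer weights of connected features toward each other. The names `network_penalty` and `regularized_loss`, the similarity values, and the weight `lam` are hypothetical.

```python
def network_penalty(weights, edges):
    """Sum of similarity * ||w_i - w_j||^2 over feature-network edges.

    weights: dict mapping feature name -> list of input-layer weights
    edges:   list of (feature_i, feature_j, similarity) tuples
    """
    total = 0.0
    for fi, fj, sim in edges:
        sq_dist = sum((a - b) ** 2 for a, b in zip(weights[fi], weights[fj]))
        total += sim * sq_dist
    return total

def regularized_loss(data_loss, weights, edges, lam=0.1):
    """Classification loss plus the weighted feature-network penalty."""
    return data_loss + lam * network_penalty(weights, edges)

# Made-up weights: gene_1 agrees with ephys_1 but not with ephys_2
weights = {
    "gene_1": [1.0, 0.0],
    "ephys_1": [1.0, 0.0],
    "ephys_2": [0.0, 1.0],
}
edges = [("gene_1", "ephys_1", 1.0), ("gene_1", "ephys_2", 0.5)]
penalty = network_penalty(weights, edges)
loss = regularized_loss(0.3, weights, edges, lam=0.1)
print(penalty, loss)
```

Under this assumed penalty, strongly connected features are encouraged to be treated similarly by the classifier, which is one way a feature network can guide training.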

For each phenotype, deepManReg applies integrated gradients to this regularized classifier to prioritize features and feature relationships. The resulting ranking shows which features most affect each phenotype and how the relationships between features contribute to classification.
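Integrated gradients attributes a prediction to input features by averaging the model's gradient along a straight path from a baseline input to the actual input. The sketch below applies the idea to a tiny logistic model rather than a deep classifier; the model, its weights, the all-zero baseline, and the step count are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model(x, w, b):
    """Toy logistic model standing in for a neural network classifier."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def model_grad(x, w, b):
    """Analytic gradient of the model output with respect to the input x."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]

def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint Riemann-sum approximation of integrated gradients."""
    n = len(x)
    avg_grad = [0.0] * n
    for s in range(steps):
        alpha = (s + 0.5) / steps
        # Point on the straight line from the baseline to the input
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        grad = model_grad(point, w, b)
        for i in range(n):
            avg_grad[i] += grad[i] / steps
    return [(xi - bi) * gi for xi, bi, gi in zip(x, baseline, avg_grad)]

w, b = [2.0, -1.0, 0.0], 0.0
x, baseline = [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline, w, b)
# Rank features by absolute attribution, largest first
ranking = sorted(range(len(x)), key=lambda i: -abs(attributions[i]))
print(attributions, ranking)
```

The attributions sum approximately to the difference between the model output at the input and at the baseline, which is the completeness property that makes the resulting ranking interpretable.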

deepManReg also combines several state-of-the-art computational techniques to achieve high technical performance. For example, it trains the neural networks with a Riemannian optimization procedure, which reduces the trade-off between nonparametric manifold alignment (which cannot generalize to new data without retraining) and parametric manifold alignment (which can produce inaccurate alignments). The authors also use gradient descent optimization, compute nonlinear projections on Stiefel manifolds for multiple datasets, and preserve the manifold constraints at the output layer of the neural networks.
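A point on a Stiefel manifold is a matrix whose columns are orthonormal. As a minimal sketch of what preserving such a constraint can mean, the code below maps an arbitrary full-rank matrix back to orthonormal columns with modified Gram-Schmidt, a QR-style retraction. This is a stand-in chosen for illustration and not the authors' Riemannian optimizer; the function names and the toy matrix are hypothetical.

```python
import math

def orthonormalize_columns(Y):
    """Return a matrix with orthonormal columns spanning the columns of Y.

    Y is an n x d matrix given as a list of rows and is assumed to have
    full column rank. Uses modified Gram-Schmidt.
    """
    n, d = len(Y), len(Y[0])
    columns = [[Y[r][c] for r in range(n)] for c in range(d)]
    basis = []
    for v in columns:
        v = list(v)
        for u in basis:
            # Remove the component of v along the already accepted direction u
            proj = sum(a * b for a, b in zip(u, v))
            v = [a - proj * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        basis.append([a / norm for a in v])
    return [[basis[c][r] for c in range(d)] for r in range(n)]

def gram(Q):
    """Compute Q^T Q, which is the identity when the columns are orthonormal."""
    d = len(Q[0])
    return [[sum(row[i] * row[j] for row in Q) for j in range(d)] for i in range(d)]

# Toy 3 x 2 output, e.g. after an unconstrained gradient step
Y = [[2.0, 1.0], [0.0, 3.0], [1.0, 1.0]]
Q = orthonormalize_columns(Y)
G = gram(Q)
print(G)
```

Applying such a step after each update is one simple way to keep outputs on the manifold; dedicated Riemannian optimizers handle the constraint more carefully.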

Furthermore, because deepManReg allows different phenotypes to be compared and reveals the changes and associations among the multimodal features that shape them, it can predict previously studied phenotypes (including continuous ones) for new data samples.

To demonstrate its performance, the authors apply deepManReg to different datasets, including (i) image data of handwritten digits with multiple feature types and (ii) gene expression and electrophysiological data from about 4,000 mouse brain neurons obtained with Patch-seq. In each case, deepManReg not only outperformed single-data-type and other current multimodal approaches in classification, but also identified the important features and their relationships. In case (ii) in particular, the authors found important gene networks and electrophysiological features of neurons in different cortical layers. These findings help reveal how gene networks relate to changes in electrophysiological function and offer new perspectives on how gene functions are coordinated across cortical layers.

Beyond interpretability, deepManReg's running time on CPU and GPU architectures, measured with different numbers of features, also compares favorably with other methods. The authors also suggest tuning strategies that allow deepManReg to scale to larger datasets and to more than two data types. Given the ever-increasing number of single-cell multimodal datasets and the advantages deepManReg offers for integrative analysis, it is an attractive option for studying complex diseases and phenotypes. The authors also discuss how multimodal data learning could be improved in the future, for example through hyperparameter optimization across multiple neural networks and more efficient use of the computational resources needed to analyze high-dimensional data, both for deepManReg and for machine learning models and tools more generally.

Paper information: [1] Nam D Nguyen, Jiawei Huang, Daifeng Wang, A deep manifold-regularized learning model for improving phenotype prediction from multi-modal data, Nature Computational Science, 2, 38–46, 2022. https:// com/articles/s43588-021-00185-x

News review: [2] Daniel Osorio, Interpretable multi-modal data integration, Nature Computational Science, 2, 8–9, 2022. https:// 021-00186-w

About the authors: The corresponding author, Dr. Daifeng Wang, is an assistant professor in the Department of Biostatistics and Medical Informatics and the Department of Computer Science at the University of Wisconsin-Madison, a principal investigator (PI) at the university's Waisman Center, and a recipient of the National Science Foundation (NSF) CAREER Award.

His laboratory has long been devoted to developing artificial intelligence and machine learning methods for understanding biological mechanisms and for precision medicine, applied mainly to functional genomics of the brain and brain diseases. The first author is Nam Nguyen, a 2021 computer science PhD graduate of the lab (now a Lane Fellow in Computational Biology, School of Computer Science, Carnegie Mellon University). The second author is Jiawei Huang, a 2021 graduate student of the lab in statistical data science (now a doctoral student at the University of Cincinnati School of Business).

For more information, please refer to the laboratory homepage: https://daifengwanglab.org/
