Source: HUAWEI CLOUD
Recently, Professor Gao Yiqin's research group, affiliated with the Peking University Biomedical Frontier Innovation Center (BIOPIC), the Peking University School of Chemistry and Molecular Engineering, and Shenzhen Bay Laboratory, jointly released a protein multiple sequence alignment (Protein MSA) data set with Huawei.
The open-source Protein MSA data set fully covers the protein sequences in the latest version of the UniRef50 database (released in February 2021).
More than 440 million protein sequences are known, but it is difficult to understand the relationships between proteins from databases of individual sequences alone.
To better serve researchers across fields, the Protein MSA data set will be organized into multiple data formats.
Professor Gao Yiqin said: “We encourage and look forward to a full exchange of ideas and close cooperation among experts from bioinformatics, data science and AI research, who can introduce, improve or design new AI models to fully explore what is hidden in the Protein MSA data set.”
From a scientific point of view, the quantity and quality of MSAs strongly affect the prediction speed and accuracy of the most advanced structure prediction models, and the non-parametric algorithms that generate MSAs remain one of the main rate-limiting steps in many protein prediction methods.
The database is released on the HUAWEI CLOUD AI Gallery platform, which ensures that users at home and abroad can access and download the data sets. The platform also provides data maintenance that can be continuously updated and expanded, along with support for downstream AI applications and deployment.
Attached:
Data set open source description:
https://gitee.
Data set download address:
https://marketplace.
References:
[1] AlQuraishi, Mohammed.
[2] Suzek, BE, Wang, Y.
[3] Mirdita M.