When deep learning models are deployed in the real world, perhaps to detect financial fraud in credit card activity or to identify cancer in medical images, they are often able to outperform humans.

But what exactly are these deep learning models learning? Does a model trained to spot skin cancer in clinical images, for example, actually learn the colors and textures of cancerous tissue, or is it picking up on some other feature or pattern?
These powerful machine learning models are often based on artificial neural networks, which can have millions of nodes that process data to make predictions. Because of their complexity, researchers often call these models "black boxes": even the scientists who build them don't fully understand what is going on behind the scenes.
Stefanie Jegelka isn't satisfied with the "black box" explanation. A newly tenured associate professor in MIT's Department of Electrical Engineering and Computer Science, Jegelka is digging into deep learning to understand what these models can learn, how they behave, and how specific prior information can be built into them.
"At the end of the day, what a deep learning model can learn depends on a lot of factors
.
But building a practice-related understanding will help us design better models and also help us understand what's going on inside them so we know when we can deploy models and when we can't
.
This is critical," said Jegelka, who is also a member of
the Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Institute for Data, Systems and Society (IDSS).
Jegelka is particularly interested in optimizing machine learning models when the input data are graphs. Graph data pose specific challenges: the information they carry includes both information about individual nodes and edges and the structure itself, that is, what is connected to what. In addition, graphs have mathematical symmetries that machine learning models need to respect; the same graph, for example, should always lead to the same prediction, no matter how its nodes happen to be ordered.
Building such symmetry into a machine learning model is usually not easy.
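To make that symmetry requirement concrete, here is a minimal sketch in plain NumPy, not drawn from Jegelka's own models, contrasting a naive readout that depends on the arbitrary ordering of a graph's nodes with a sum-pooled message-passing readout that does not:

```python
import numpy as np

# A graph is an adjacency matrix A and a matrix X of per-node features.

def naive_readout(A, X):
    # Flattening ties the output to the arbitrary node ordering.
    return np.concatenate([A.ravel(), X.ravel()])

def invariant_readout(A, X):
    # One message-passing step followed by sum pooling over nodes:
    # summing makes the result independent of how nodes are numbered.
    messages = A @ X  # each node aggregates its neighbors' features
    return (X + messages).sum(axis=0)

# A 3-node path graph with 2-dimensional node features.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

# Relabel the nodes with a permutation matrix P: same graph, new ordering.
P = np.array([[0, 0, 1],
              [1, 0, 0],
              [0, 1, 0]], dtype=float)
A_perm, X_perm = P @ A @ P.T, P @ X

print(np.allclose(invariant_readout(A, X), invariant_readout(A_perm, X_perm)))  # True
print(np.allclose(naive_readout(A, X), naive_readout(A_perm, X_perm)))          # False
```

Because relabeling the nodes permutes the rows of A and X in lockstep, the sum over nodes is unchanged, which is exactly the invariance described above.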
Take molecules, for example. A molecule can be represented as a graph, with vertices corresponding to atoms and edges corresponding to the chemical bonds between them. Pharmaceutical companies may want to use deep learning to quickly predict the properties of many candidate molecules, narrowing down the number of physical tests they must run in the lab. Jegelka studies methods for building mathematically sound machine learning models that can efficiently take graph data as input and output something else, in this case a prediction of a molecule's chemical properties. This is particularly challenging because a molecule's properties are determined not only by the atoms within it, but also by the connections between them.
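As a hypothetical illustration of this molecules-as-graphs setup, the toy model below encodes a molecule's heavy atoms as one-hot node features and its bonds as an adjacency matrix, then applies two message-passing layers and a sum-pooling readout to produce a single scalar. The randomly initialized weights stand in for parameters that would normally be learned from data; this is a sketch of the general recipe, not an actual property predictor:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy molecule: ethanol's heavy atoms (C-C-O), hydrogens omitted.
# One-hot node features over the element alphabet [C, O].
X = np.array([[1.0, 0.0],   # C
              [1.0, 0.0],   # C
              [0.0, 1.0]])  # O
A = np.array([[0, 1, 0],    # C-C and C-O bonds as undirected edges
              [1, 0, 1],
              [0, 1, 0]], dtype=float)

# Random weights stand in for trained parameters.
W1 = rng.normal(size=(2, 8))
W2 = rng.normal(size=(8, 8))
w_out = rng.normal(size=8)

def predict_property(A, X):
    A_hat = A + np.eye(len(A))           # let each atom keep its own features
    h = np.maximum(A_hat @ X @ W1, 0.0)  # message-passing layer 1 + ReLU
    h = np.maximum(A_hat @ h @ W2, 0.0)  # message-passing layer 2 + ReLU
    return h.sum(axis=0) @ w_out         # pool over atoms, predict one scalar

print(predict_property(A, X))  # a (meaningless, untrained) property score
```

Because every layer acts through the adjacency matrix, the prediction depends on the bonds as well as the atoms, and the sum-pooling readout keeps it independent of atom ordering.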
Designing these models is made even harder by the fact that the data used to train them often differ from the data the models see once deployed. Perhaps a model was trained on graphs of small molecules or on transportation networks, but once deployed, it sees larger or more complex graphs. In that case, what can researchers expect the model to learn, and will it still work in practice if the real-world data are different?
"It's impossible for your model to learn everything because of some difficult problems in computer science, but what you can and can't learn depends on how you set up the model
.
"
Jegelka solved this problem
by combining his passion for algorithms and discrete mathematics with his passion for machine learning.
From butterflies to bioinformatics
Jegelka grew up in a small town in Germany and became interested in science in high school; a supportive teacher encouraged her to enter an international science competition. She and her teammates from the United States and Hong Kong created a website about butterflies, in three languages, and won an award.
"In our project, we took images
of the wings with a scanning electron microscope at a local university of applied sciences.
I also had the opportunity to use Mercedes-Benz's high-speed cameras – which typically shoot internal combustion engines – which I used to capture slow-motion videos
of butterfly wings moving.
That was my first real exposure to science and exploration," she recalls
.
Interested in both biology and mathematics, Jegelka decided to study bioinformatics at the University of Tübingen and the University of Texas at Austin. She had several opportunities to conduct research as an undergraduate, including an internship in computational neuroscience at Georgetown University, but she wasn't sure what career to pursue.
When she returned for her final year of college, Jegelka moved in with two roommates who were working as research assistants at the Max Planck Institute in Tübingen.
"They're working on machine learning, and that sounds really cool
to me.
" I was going to write my bachelor's thesis, so I asked the institute if they had a project for me
.
I started working on machine learning at the Max Planck Institute, and I loved it
.
I learned a lot there, and it's a great place to research," she said
.
She stayed on at the Max Planck Institute to complete her master's thesis, and then pursued a PhD in machine learning at the Max Planck Institute and the Swiss Federal Institute of Technology. During her PhD, she explored how concepts from discrete mathematics can help improve machine learning techniques.
Learning how models learn
The more Jegelka learned about machine learning, the more intrigued she became by the challenges of understanding how models behave, and how to steer that behavior. "You can do so many things with machine learning, but only if you have the right model and data. It's not just a black-box thing where you throw it at the data and it works. You actually have to think about the model, its properties, and what you want it to learn and do," she says.
After completing a postdoc at the University of California at Berkeley, Jegelka was hooked on research and decided to pursue a career in academia. She joined MIT in 2015 as an assistant professor.
"From the beginning, I really liked MIT because the people here care so much about research and creativity
.
That's what
I admire most about MIT.
People here value originality and depth
of research.
”
That focus on creativity has enabled Jegelka to explore a broad range of topics. She collaborates with other MIT faculty members on applications of machine learning in biology, imaging, computer vision, and materials science.
But what really drives Jegelka is probing the fundamentals of machine learning and, most recently, the question of robustness. Often, a model performs well on training data, but its performance deteriorates when it is deployed on slightly different data. Building prior knowledge into a model can make it more reliable, but understanding what information the model needs to be successful, and how to build it in, is not so simple, she says.
She is also exploring ways to improve the performance of machine learning models for image classification. From facial recognition systems on mobile phones to tools that identify fake accounts on social media, image classification models are everywhere.
These models need huge amounts of data for training, but since manually labeling millions of images is expensive, researchers often use unlabeled datasets to pre-train models instead. The models then reuse the representations they have learned when they are fine-tuned later for a specific task. Ideally, researchers want the model to learn as much as it can during pre-training, so it can apply that knowledge to its downstream task. But in practice, these models often learn only a few simple correlations, such as that one image has sunshine and another has shadow, and use these "shortcuts" to classify images.
"We show that this is a problem in 'contrastive learning', which is a standard technique
for pre-training, both theoretically and empirically.
But we also explain that you can influence the type of information that the model will learn to represent by modifying the type of
data displayed to the model.
This is the first step in understanding what the model will actually do in practice," Jegelka said
.
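For readers unfamiliar with the technique, here is a minimal sketch of a common contrastive pre-training objective (an InfoNCE-style loss). It illustrates the general approach only, not the specific formulation analyzed in Jegelka's work, and the function and variable names are made up for this example:

```python
import numpy as np

rng = np.random.default_rng(0)

def info_nce_loss(z_a, z_b, temperature=0.1):
    # z_a[i] and z_b[i] are embeddings of two augmented views of image i.
    # Each embedding is pulled toward its partner (the positive pair) and
    # pushed away from every other image in the batch (the negatives).
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)  # unit-normalize so
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)  # dots are cosines
    logits = (z_a @ z_b.T) / temperature  # pairwise similarity matrix
    # Cross-entropy with the matching view (the diagonal) as the target.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy batch: 16-dimensional embeddings of two views of each of 4 images.
z_a = rng.normal(size=(4, 16))
z_b = z_a + 0.05 * rng.normal(size=(4, 16))  # views of the same image agree
print(info_nce_loss(z_a, z_b))
```

Because the loss only rewards agreement between views, whatever feature most easily distinguishes the images in a batch, including a "shortcut" like sunshine versus shadow, can come to dominate the learned representation, which is why the choice of training data and augmentations matters.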
Researchers still don't understand everything that happens inside a deep learning model, or the details of how training can influence what a model learns and how it behaves, but Jegelka looks forward to continuing to explore these topics.
"Usually in machine learning, we see something happen in practice, and we try to understand it
theoretically.
This is a huge challenge
.
You want to build an understanding that matches what you see in practice so you can do better
.
Our understanding of this is just beginning," she said
.
Outside the lab, Jegelka is a fan of music, art, travel, and cycling. But these days, she enjoys spending most of her free time with her preschoolers.