Recently, the team of Professor Zhu Songchun and Zhu Yixin at the Institute of Artificial Intelligence published the paper "Scene Reconstruction with Functional Objects for Robot Autonomy" in IJCV 2022. The paper proposes a new scene reconstruction problem and a scene graph representation that provide the information necessary for autonomous robot planning, and deliver an interactive virtual scene functionally similar to the real scene for simulation testing. The work also develops a complete machine vision system that solves the proposed scene reconstruction problem.
Experiments demonstrate the effectiveness of the proposed scene reconstruction method and the potential of the scene graph in autonomous robot planning.
Perceiving the three-dimensional environment and understanding the information contained therein is an important manifestation of human intelligence and a prerequisite for humans to interact with the environment at will.
In addition to the geometric characteristics of the environment and the semantic information of objects, we can also "perceive" potential interactions with the environment, which we call actionable information. For example, when we see the doorknob in Figure 1(a), the potential action of turning the knob and pulling the door open naturally comes to mind. In the scene in Figure 1(b), we can easily observe the constraint relations between the stacked teacups and dishes (each supporting another) and the effect of different actions on their state: directly pulling out the lower dishes will topple the upper dishes and teacups, whereas removing the top objects one by one allows us to safely reach the lower dishes.
Understanding the impact of potential actions on a scene forms the basis on which we perform tasks and interact with the environment.
Correspondingly, intelligent robots need similar perceptual capabilities to enable them to autonomously complete complex long-horizon planning in the environment.
Figure 1 (a) Doorknob, (b) Stacked teacups and dishes (image from the Internet, copyright belongs to the original author)
With the maturity of 3D scene reconstruction and semantic mapping technology, robots have been able to effectively create three-dimensional maps containing geometric and semantic information, such as panoptic maps including objects and room structures, as shown in Figure 2(b).
However, there remains a substantial gap between the scene representations produced by these traditional reconstruction methods and what autonomous robot planning requires.
This raises two questions: how can we construct a scene representation common to robot perception and planning that improves the robot's autonomous planning ability? And how can a robot use its own sensor inputs, such as an RGB-D camera, to build such a scene representation in a real-world environment?
In this paper [1], the researchers propose a completely new research problem: reconstructing functionally equivalent, interactive virtual scenes from real-world scenes, so that the actionable information of the original scene is preserved.
The reconstructed virtual scene can be used for simulation training and testing of autonomous robot planning.
To accomplish this reconstruction task, the researchers propose a scene graph built on supporting relations and proximal relations, as shown in Figure 2(a); each node represents an object in the scene or a room structure (wall/floor/ceiling).
This scene graph organizes the reconstructed scene and the physical constraints it contains, ensuring that the resulting virtual scene is physically plausible.
At the same time, it can be converted directly into a kinematic tree of the environment, which completely describes the environment's kinematic state and supports forward prediction of the effect of robot actions on the environment, and can therefore be used directly in robot planning tasks.
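As a rough illustration (not the authors' implementation; the class and field names below are hypothetical), the scene graph just described can be thought of as a set of nodes carrying geometry and semantics, connected by directed support edges and undirected proximal edges; the directed support edges alone already yield the kinematic tree used for forward prediction.

from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical node: one object or one room structure (wall/floor/ceiling)
@dataclass
class SceneNode:
    node_id: str
    semantic_label: str                     # e.g. "table", "wall"
    bbox_min: Tuple[float, float, float]    # 3D minimum bounding box
    bbox_max: Tuple[float, float, float]
    mesh_path: Optional[str] = None         # reconstructed mesh or replacing CAD model

@dataclass
class SceneGraph:
    nodes: Dict[str, SceneNode] = field(default_factory=dict)
    support_edges: List[Tuple[str, str]] = field(default_factory=list)   # (parent, child), directed
    proximal_edges: List[Tuple[str, str]] = field(default_factory=list)  # undirected

    def kinematic_tree(self) -> Dict[str, List[str]]:
        """Read the kinematic tree off the directed support edges (floor/walls act as roots)."""
        children: Dict[str, List[str]] = {n: [] for n in self.nodes}
        for parent, child in self.support_edges:
            children[parent].append(child)
        return children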
This paper also proposes a complete machine vision system to achieve this reconstruction task, and designs an output interface for the reconstructed scene, so that it can be seamlessly integrated into robot simulators (such as Gazebo) and VR environments.
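The paper's actual exporter is not reproduced here, but as a minimal sketch under the assumption that each replaced object ends up as a CAD model with an estimated pose, a reconstructed scene could be written out as a Gazebo SDF world in which every object becomes an <include> element at its pose; the model URIs and poses below are placeholders.

# Minimal sketch: emit a Gazebo SDF world from (model_uri, pose) pairs.
# The model names and poses are placeholders, not the paper's actual output.
def write_sdf_world(objects, path="reconstructed_scene.world"):
    includes = []
    for model_uri, (x, y, z, roll, pitch, yaw) in objects:
        includes.append(
            f"    <include>\n"
            f"      <uri>{model_uri}</uri>\n"
            f"      <pose>{x} {y} {z} {roll} {pitch} {yaw}</pose>\n"
            f"    </include>"
        )
    sdf = (
        '<?xml version="1.0"?>\n'
        '<sdf version="1.6">\n'
        '  <world name="reconstructed_scene">\n'
        + "\n".join(includes) + "\n"
        "  </world>\n"
        "</sdf>\n"
    )
    with open(path, "w") as f:
        f.write(sdf)

# Example usage with hypothetical models from a local model database:
write_sdf_world([("model://table", (1.0, 0.5, 0.0, 0, 0, 1.57)),
                 ("model://mug",   (1.0, 0.5, 0.75, 0, 0, 0))])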
Part of the preliminary work for this paper [2] was published at ICRA 2021.
Figure 2 (a) A scene graph built on supporting and proximal relations, (b) a volumetric panoptic map, and (c) an interactive virtual scene with the same function as the real scene, which can be used for simulation testing of autonomous robot planning
Reconstructing real-world scenes in a virtual environment to support robot simulation is not a simple problem.
There are three main difficulties. First, how to accurately reconstruct and segment the geometry of each object and structure in a cluttered real scene, and estimate the physical constraints between objects (such as support relations). Second, how to replace the reconstructed, incomplete geometry with complete, interactive objects (such as CAD models). Third, how to organically integrate all of this information into a general scene representation that supports both scene reconstruction and autonomous robot planning.
This work proposes to use a special scene graph as a bridge between scene reconstruction and robot interaction, which helps reconstruct virtual scenes consistent with physical common sense and provides the information necessary for autonomous robot planning.
On the one hand, this scene graph organizes the perceived objects, room structures, and the relationships between them in the scene, as shown in Figure 3(a).
Each node represents an object or room structure identified and reconstructed from the real scene, including its geometry (such as the reconstructed 3D mesh, the 3D minimum bounding box, and extracted planar features) and its semantic information (such as instance and semantic labels). Each edge represents either a support relation between nodes [the directed edges in Figure 3(a)] or a proximal relation [the undirected edges in Figure 3(a)], encoding physical constraint information.
For example, a support relation requires the parent node to contain a horizontal supporting surface that stably supports the child node, and a proximal relation requires that the 3D geometries of the two neighboring nodes do not overlap.
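To make these two constraints concrete, here is a minimal sketch (simplified to axis-aligned bounding boxes; the function names and tolerance are assumptions, not the paper's formulation) of the checks implied above.

import numpy as np

def supports(parent_top_z: float, child_bottom_z: float, tol: float = 0.02) -> bool:
    """Support constraint sketch: the child's bottom rests on the parent's
    horizontal top surface (heights agree within a tolerance, in meters)."""
    return abs(parent_top_z - child_bottom_z) < tol

def overlaps(bbox_a, bbox_b) -> bool:
    """Proximal constraint sketch: two axis-aligned bounding boxes,
    each given as (min_xyz, max_xyz), interpenetrate if they overlap on all axes."""
    (amin, amax), (bmin, bmax) = bbox_a, bbox_b
    return all(amin[i] < bmax[i] and bmin[i] < amax[i] for i in range(3))

# A proximal edge between two nodes is physically plausible only if
# overlaps(...) is False; a support edge additionally needs supports(...) True.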
On the other hand, based on semantic and geometric similarity and the constraints between nodes, the nodes in Figure 3(a) are replaced with geometrically complete, interactive CAD models (including articulated CAD models), producing a virtual scene that can be used for robot simulation and interaction, as shown in Figure 3(b). Within the limits of the system's perceptual ability, such a virtual scene preserves the functionality of the real scene, i.e., its actionable information, so that the results of interacting with objects in the real scene can be effectively simulated.
Correspondingly, the resulting scene graph contains a complete description of the environment's kinematics and constraint states. It can be used to predict the short-term quantitative effect of robot actions on the kinematic state, supporting robot motion planning, and to estimate the long-term qualitative effect of robot actions on the constraint relations, supporting robot task planning.
Figure 3 (a) Directly reconstructed scene graph, (b) Interactive scene graph after CAD model replacement
Figure 4 Flow chart of machine vision system for reconstruction tasks
To accomplish the reconstruction task above, the authors designed and implemented a multi-module machine vision system: a volumetric panoptic mapping module [Figure 4(A)] and a CAD model replacement and inference module based on physical knowledge and geometry [Figure 4(B)]. The former robustly recognizes, segments, and reconstructs the dense geometry of objects and room structures from an RGB-D camera in complex real-world environments, and estimates the constraint relations between them, producing the scene graph in Figure 3(a).
The latter selects the most suitable CAD model from a CAD model library according to the geometric features of each reconstructed object and the recognized constraints, and estimates its pose and scale to align it as accurately as possible with the original object, producing the interactive scene graph shown in Figure 3(b).
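The exact matching objective is defined in the paper; as a rough, hypothetical illustration, CAD model selection can be framed as ranking candidates of the same semantic label by a simple geometric-fit score (here, normalized bounding-box proportions) and keeping the best match before refining its pose and scale.

import numpy as np

def aspect_similarity(extent_obj, extent_cad):
    """Compare normalized bounding-box extents, a crude stand-in for the
    geometric similarity used when ranking CAD candidates."""
    a = np.asarray(extent_obj, dtype=float) / np.linalg.norm(extent_obj)
    b = np.asarray(extent_cad, dtype=float) / np.linalg.norm(extent_cad)
    return float(a @ b)  # 1.0 means identical proportions

def pick_cad_model(obj_label, obj_extent, cad_library):
    """cad_library: list of dicts with 'label' and 'extent' keys (hypothetical schema)."""
    candidates = [m for m in cad_library if m["label"] == obj_label]
    if not candidates:
        return None
    return max(candidates, key=lambda m: aspect_similarity(obj_extent, m["extent"]))

# Example: a reconstructed "chair" with bounding-box extents (x, y, z) in meters.
library = [{"label": "chair", "extent": (0.5, 0.5, 0.9)},
           {"label": "chair", "extent": (0.6, 0.6, 1.2)}]
best = pick_cad_model("chair", (0.48, 0.52, 0.95), library)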
Figure 5 shows the authors' reconstruction of a real office scene captured with a Kinect2 camera, including the volumetric panoptic reconstruction [Figure 5(a)], the corresponding interactive virtual scene [Figure 5(b)], and an example of robot interaction after importing the virtual scene into a robot simulator [Figure 5(c)].
We can see that even in complex, heavily occluded real scenes, the reconstruction system proposed in the paper builds interactive virtual scenes reasonably well.
Figure 5(d-f) shows some interesting examples from this experiment. In Figure 5(d), a single table is reconstructed as two shorter tables because a chair occludes part of it. The workstation in Figure 5(e) is reconstructed with relatively high quality, with all objects replaced by similar CAD models. The chair in Figure 5(f) is not recognized, and its occlusion of the table behind it produces a situation similar to Figure 5(d), while the refrigerator and microwave oven in the scene are reconstructed and replaced with articulated CAD models that support complex interaction.
Figure 5 Reconstructed results with a Kinect2 camera in a real-world environment
Figure 6 Robot task and action planning in a reconstructed virtual scene
In the reconstructed interactive virtual scene, with the help of the kinematic chains and constraint information captured by the scene graph, the robot can perform task and motion planning [3, 4]; the simulation results are shown in Figure 6.
In recent related work [5], based on the scene graph described above, the robot can directly perform complex task planning using a graph edit distance and efficiently generate motion plans.
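Reference [5] plans directly on the scene graph. As a hedged sketch of the general idea rather than that paper's algorithm, the distance between the current scene graph and a goal scene graph can be measured with a graph edit distance, for example via networkx, and used to gauge how far a candidate action moves the scene toward the goal.

import networkx as nx

def scene_graph_to_nx(nodes, support_edges):
    """Build a directed graph whose nodes carry semantic labels (hypothetical schema)."""
    g = nx.DiGraph()
    for node_id, label in nodes.items():
        g.add_node(node_id, label=label)
    g.add_edges_from(support_edges)
    return g

# Current scene: mug supported by the table; goal scene: mug supported by the shelf.
current = scene_graph_to_nx({"table": "table", "shelf": "shelf", "mug": "mug"},
                            [("table", "mug")])
goal = scene_graph_to_nx({"table": "table", "shelf": "shelf", "mug": "mug"},
                         [("shelf", "mug")])

# Graph edit distance as a task-level progress measure, matching nodes by label.
dist = nx.graph_edit_distance(
    current, goal,
    node_match=lambda a, b: a["label"] == b["label"])
print(dist)  # number of edit operations needed to transform one graph into the other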
This work proposes a new scene reconstruction problem and a scene graph representation, which provide the information necessary for autonomous robot planning and deliver an interactive virtual scene functionally similar to the real scene for simulation testing.
At the same time, this work also develops a complete machine vision system that solves the proposed scene reconstruction problem.
Experiments demonstrate the effectiveness of the proposed scene reconstruction method and the potential of the scene graph in autonomous robot planning.
In the future, we look forward to further extensions of this work: how to match rigid-body and articulated CAD models to the reconstructed geometry more robustly and accurately, how to incorporate more complex actionable information in the scene graph, and how to make better use of the scene graph for robot planning.
Scene graph reconstruction supports autonomous planning, and more intelligent robots may not be far away.
References:
[1] Han, Muzhi, et al. "Scene Reconstruction with Functional Objects for Robot Autonomy." International Journal of Computer Vision (IJCV), 2022.
[2] Han, Muzhi, et al. "Reconstructing Interactive 3D Scenes by Panoptic Mapping and CAD Model Alignments." 2021 IEEE International Conference on Robotics and Automation (ICRA), 2021, pp. 12199-12206.
[3] Jiao, Ziyuan, et al. "Consolidating Kinematic Models to Promote Coordinated Mobile Manipulations." 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021, doi:10.1109/IROS51168.2021.9636351.
[4] Jiao, Ziyuan, et al. "Efficient Task Planning for Mobile Manipulation: A Virtual Kinematic Chain Perspective." 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021, pp. 8288-8294.
[5] Jiao, Ziyuan, et al. "Sequential Manipulation Planning on Scene Graph." 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022.