
    The teams of Zhu Songchun and Zhu Yixin at the Institute for Artificial Intelligence have made important progress in robot scene reconstruction and in using actionable information to help robots plan autonomously

    • Last Update: 2022-10-14
    • Source: Internet
    • Author: User

    Recently, the teams of Professor Zhu Songchun and Zhu Yixin at the Institute for Artificial Intelligence published the paper "Scene Reconstruction with Functional Objects for Robot Autonomy" in IJCV 2022. The work proposes a new scene reconstruction problem and a scene graph representation that provide the information a robot needs for autonomous planning, and produces interactive virtual scenes that are functionally equivalent to the real scene for simulation testing. The work also develops a complete machine vision system that solves the proposed reconstruction problem. Experiments demonstrate the effectiveness of the proposed reconstruction method and the potential of the scene graph representation for autonomous robot planning.

    Perceiving the three-dimensional environment and understanding the information it contains is an important embodiment of human intelligence, and it is also a prerequisite for interacting with the environment. Beyond the geometric features of the environment and the semantic information of objects, we also "perceive" potential interactions with the environment, which we call actionable information. For example, when we see the doorknob in Figure 1(a), the potential action of turning the knob and pulling the door open naturally comes to mind; in the scene of Figure 1(b), we readily observe the constraint relationships between the stacked teacups and dishes (they support one another) as well as the effects of different actions on their state: pulling out a dish at the bottom directly will topple the dishes and cups above it, whereas removing the items on top one by one lets us take the dish below safely.

    Understanding the impact of potential actions on a scene is the basis on which we perform tasks and interact within it. Correspondingly, intelligent robots need similar perceptual capabilities to autonomously complete complex long-horizon planning in their environment.

    Figure 1 (a) Doorknob, (b) Stacked teacups and dishes (Image from the Internet, copyright belongs to the original author)

    As 3D scene reconstruction and semantic mapping techniques mature, robots can now effectively build three-dimensional maps containing geometric and semantic information, such as the panoptic map of objects and room structure shown in Figure 2(b). However, a substantial gap remains between the scene representations produced by these traditional reconstruction pipelines and what autonomous robot planning requires. The questions, then, are: how can we construct a scene representation that serves both robot perception and planning and improves the robot's autonomy, and how can a robot build such a representation of a real-world scene from its own sensor inputs, such as an RGB-D camera?

    In this paper [1], the researchers pose a completely new research question: reconstruct functionally equivalent, interactive virtual scenes that preserve the potential action information of the original real scene. The reconstructed virtual scene can then be used for simulation training and testing of autonomous robot planning. To accomplish this reconstruction task, the researchers propose a scene graph built on supporting relations and proximal relations, as shown in Figure 2(a); each node represents an object in the scene or a room structure (wall/floor/ceiling). This scene graph organizes the reconstructed scene and the physical constraints it contains, ensuring that the resulting virtual scene is consistent with physical common sense. At the same time, it can be converted directly into a kinematic tree of the environment, which fully describes the environment's kinematic state and supports forward prediction of the effects of robot actions on the environment, so it can be used directly in robot planning tasks.
    The paper also presents a complete machine vision system that solves this reconstruction task, and designs an output interface so that the reconstructed scene can be seamlessly imported into robot simulators (e.g., Gazebo) and VR environments. Part of the preliminary work of this paper [2] was published in ICRA 2021.
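    To make the scene graph and its kinematic-tree reading concrete, the following is a minimal sketch (hypothetical class and field names, not the authors' published code): nodes carry geometry and semantics, directed edges record supporting relations, undirected edges record proximal relations, and the supporting relations can be read out directly as a kinematic tree.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str                        # e.g. "table_01", "floor"
    semantic_label: str              # e.g. "table", "cup", "wall"
    bbox: tuple                      # 3D minimum bounding box: (min_xyz, max_xyz)
    mesh_path: str | None = None     # reconstructed mesh file, if any

@dataclass
class SceneGraph:
    nodes: dict = field(default_factory=dict)      # name -> Node
    support: dict = field(default_factory=dict)    # child name -> parent name (directed edges)
    proximal: set = field(default_factory=set)     # unordered name pairs (undirected edges)

    def add_node(self, node: Node) -> None:
        self.nodes[node.name] = node

    def add_support(self, parent: str, child: str) -> None:
        self.support[child] = parent

    def add_proximal(self, a: str, b: str) -> None:
        self.proximal.add(frozenset((a, b)))

    def kinematic_tree(self) -> dict:
        """Read the supporting relations as a kinematic tree: moving a parent
        node moves all of its (transitive) children with it."""
        children = {name: [] for name in self.nodes}
        for child, parent in self.support.items():
            children.setdefault(parent, []).append(child)
        return children
```

    For instance, a graph in which the floor supports a table and the table supports a cup yields a kinematic tree in which moving the table also moves the cup; later sketches in this article reuse this SceneGraph class.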

    Figure 2 (a) A scene graph based on supporting and proximal relations; (b) a volumetric semantic panoptic map; (c) an interactive virtual scene functionally equivalent to the real scene, which can be used for simulation testing of autonomous robot planning

    Reconstructing real-world scenes in a virtual environment to support robot simulation is not a simple problem. There are three main difficulties. First, how can the geometry of each object and structure in a cluttered real scene be accurately reconstructed and segmented, and how can the physical constraints between objects (such as support relations) be estimated? Second, how can the reconstructed, incomplete geometry be replaced with complete, interactive objects (such as CAD models)? Third, how can all of this information be integrated into a common scene representation that serves both scene reconstruction and autonomous robot planning?

    This work proposes a particular scene graph as the bridge between scene reconstruction and robot interaction: it helps reconstruct a virtual scene that is consistent with physical knowledge, while providing the information the robot needs for autonomous planning. On the one hand, the scene graph organizes the perceived objects in the scene, the room structure, and the relationships between them, as shown in Figure 3(a). Each node represents an object or room structure recognized and reconstructed from the real scene, including its geometry (such as the reconstructed 3D mesh, the 3D minimum bounding box, and extracted planar features) and its semantic information (such as instance and semantic labels). Each edge represents either a supporting relation between nodes [the directed edges in Figure 3(a)] or a proximal relation [the undirected edges in Figure 3(a)], encoding physical constraint information. For the supporting relation, for example, the parent node must contain a horizontal supporting surface that stably supports the child node; for the proximal relation, the 3D geometries of two neighboring nodes must not overlap. On the other hand, by replacing the nodes in Figure 3(a) with geometrically complete, interactive CAD models (including articulated CAD models) according to semantic and geometric similarity and the constraints between nodes, an interactive virtual scene can be generated for robot simulation, as shown in Figure 3(b).
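    The two constraints can be illustrated roughly as follows (a simplification of the paper's formulation: axis-aligned bounding boxes stand in for reconstructed meshes, and the function names are made up for this sketch).

```python
import numpy as np

def supports(parent_box: np.ndarray, child_box: np.ndarray, tol: float = 0.02) -> bool:
    """Support constraint: the parent offers a horizontal surface at the
    child's bottom height, and the child's footprint lies within the parent's."""
    (pmin, pmax), (cmin, cmax) = parent_box, child_box
    top_meets_bottom = abs(pmax[2] - cmin[2]) < tol                 # z axis points up
    footprint_inside = (cmin[:2] >= pmin[:2] - tol).all() and \
                       (cmax[:2] <= pmax[:2] + tol).all()
    return top_meets_bottom and footprint_inside

def non_overlapping(box_a: np.ndarray, box_b: np.ndarray) -> bool:
    """Proximal constraint: neighbouring objects may touch but must not
    interpenetrate (i.e. no overlap along all three axes at once)."""
    (amin, amax), (bmin, bmax) = box_a, box_b
    return not ((amin < bmax).all() and (bmin < amax).all())

# A cup resting on a table satisfies both constraints.
table = np.array([[0.0, 0.0, 0.00], [1.2, 0.8, 0.75]])   # (min_xyz, max_xyz)
cup   = np.array([[0.5, 0.3, 0.75], [0.6, 0.4, 0.85]])
assert supports(table, cup) and non_overlapping(table, cup)
```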

    Within the limits of the perception system, such a virtual scene preserves the function of the real scene, that is, its potential action information, as far as possible, and can effectively simulate the interactions one would have with objects in the real scene. Correspondingly, the resulting scene graph contains a complete description of the environment's kinematics and constraint states. It can be used to predict the short-term, quantitative effect of robot actions on kinematic states, supporting robot motion planning, and to estimate the long-term, qualitative effect of robot actions on constraint relations, supporting robot task planning.
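    As a toy example of the qualitative, task-level prediction just described (hypothetical code reusing the SceneGraph sketch above, not the authors' planner), the graph makes it trivial to check, before picking an object, whether anything is still resting on it: exactly the stacked teacups and dishes situation of Figure 1(b).

```python
def pick_effect(graph: SceneGraph, target: str) -> dict:
    """Qualitative effect of removing `target`: which objects would lose their support."""
    unsupported = [child for child, parent in graph.support.items() if parent == target]
    return {"safe_to_pick": not unsupported, "objects_losing_support": unsupported}

# With a plate on the table and a cup on the plate, picking the plate is
# flagged as unsafe until the cup has been removed first.
g = SceneGraph()
g.add_support("table", "plate")
g.add_support("plate", "cup")
assert pick_effect(g, "plate") == {"safe_to_pick": False, "objects_losing_support": ["cup"]}
assert pick_effect(g, "cup")["safe_to_pick"]
```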

    Figure 3 (a) The directly reconstructed scene graph, and (b) the interactive scene graph after CAD model replacement

    Figure 4 Flowchart of the machine vision system used for the reconstruction task

    To accomplish the reconstruction task above, the authors designed and implemented a multi-module machine vision system: a volumetric semantic panoptic mapping module [Figure 4(A)] and a physics- and geometry-based CAD model replacement inference module [Figure 4(B)]. The former robustly recognizes, segments, and densely reconstructs the geometry of objects and room structures from an RGB-D camera in complex real-world environments, and estimates the constraint relations between them, yielding the scene graph shown in Figure 3(a). The latter selects the most appropriate CAD model from a model library based on the geometric features of each reconstructed object and the recognized constraints, and estimates its pose and scale for the most accurate alignment with the original object, yielding the interactive scene graph shown in Figure 3(b).
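    As an illustration only, a simplified CAD retrieval step might look like the following (the ranking criterion, names, and data layout are assumptions for this sketch, not the paper's actual method; a real system would also check support surfaces and other constraints).

```python
import numpy as np

def rank_cad_candidates(recon_label: str, recon_extent: np.ndarray, library: list) -> list:
    """Rank library models for one reconstructed object.
    Each library entry: {"id": str, "label": str, "extent": np.ndarray of (dx, dy, dz)}."""
    scored = []
    for model in library:
        if model["label"] != recon_label:        # semantic label must match
            continue
        # Compare bounding-box proportions independently of absolute size,
        # since the chosen model is rescaled to fit the reconstruction anyway.
        r = recon_extent / np.linalg.norm(recon_extent)
        m = model["extent"] / np.linalg.norm(model["extent"])
        scored.append((float(np.linalg.norm(r - m)), model["id"]))
    return sorted(scored)                        # smallest shape mismatch first
```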
    Figure 5 shows the authors' reconstruction of a real office scene captured with a Kinect2 camera, including the volumetric panoptic reconstruction [Figure 5(a)], the resulting interactive virtual scene [Figure 5(b)], and an example of robot interaction after importing the virtual scene into a robot simulator [Figure 5(c)].

    Even in complex, heavily occluded real scenes, the reconstruction system proposed in the paper can build usable interactive virtual scenes. Figure 5(d-f) shows some interesting cases from this experiment: in Figure 5(d), one table is reconstructed as two shorter tables because chairs occlude it; the workstations in Figure 5(e) are reconstructed with relatively high quality, with every object replaced by a CAD model of similar appearance; in Figure 5(f), a chair fails to be recognized, and its occlusion of the table behind it causes a situation similar to Figure 5(d), while the refrigerator and microwave oven in the scene are reconstructed and replaced with articulated CAD models that support complex interaction.

    Figure 5 Reconstruction results with a Kinect2 camera in a real-world environment

    Figure 6 Robot tasks and action planning in a reconstructed virtual scene

    In the reconstructed interactive virtual scene, the robot can perform task and motion planning using the kinematic chains and constraint information captured in the scene graph [3, 4]; the simulated results are shown in Figure 6. In recent related work [5], building on the scene graph representation described above, the robot plans complex tasks directly via graph edit distance and generates actions efficiently.
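    As a toy illustration of that idea (not the planner from [5]), one can measure how far the current scene graph is from a goal arrangement using a graph edit distance, for example with networkx, and prefer actions that reduce it.

```python
import networkx as nx

def to_nx(support: dict) -> nx.DiGraph:
    """Build a directed graph (parent -> child) from support relations."""
    g = nx.DiGraph()
    for child, parent in support.items():
        g.add_node(parent, name=parent)
        g.add_node(child, name=child)
        g.add_edge(parent, child)
    return g

current = to_nx({"plate": "table", "cup": "plate"})   # cup stacked on the plate
goal    = to_nx({"plate": "table", "cup": "table"})   # cup placed beside the plate

# Distance 2.0: one support edge deleted and one inserted, i.e. a single
# "move the cup from the plate onto the table" manipulation.
dist = nx.graph_edit_distance(current, goal,
                              node_match=lambda a, b: a["name"] == b["name"])
print(dist)
```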

    In summary, this work proposes a new scene reconstruction problem and a scene graph representation that provide the information needed for autonomous robot planning, and produces interactive virtual scenes functionally equivalent to the real scene for simulation testing. The work also develops a complete machine vision system that solves the proposed reconstruction problem. Experiments demonstrate the effectiveness of the proposed reconstruction method and the potential of the scene graph representation in autonomous robot planning.

    Looking ahead, the authors anticipate further extensions of this work: aligning rigid and articulated CAD models with reconstructed geometry more robustly and accurately, integrating richer potential action information into the scene graph, and making better use of the scene graph for robot planning. With scene graph reconstruction supporting autonomous planning, more capable intelligent robots are within reach.

    References:

    [1] Han, Muzhi, et al. "Scene Reconstruction with Functional Objects for Robot Autonomy." International Journal of Computer Vision (IJCV), 2022.

    [2] Han, Muzhi, et al. "Reconstructing Interactive 3D Scenes by Panoptic Mapping and CAD Model Alignments." 2021 IEEE International Conference on Robotics and Automation (ICRA), 2021, pp. 12199–12206.

    [3] Jiao, Ziyuan, et al. "Consolidating Kinematic Models to Promote Coordinated Mobile Manipulations." 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021, doi:10.1109/iros51168.2021.9636351.

    [4] Jiao, Ziyuan, et al. "Efficient Task Planning for Mobile Manipulation: A Virtual Kinematic Chain Perspective." 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021, pp. 8288–8294.

    [5] Jiao, Ziyuan, et al. "Sequential Manipulation Planning on Scene Graph." 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022.
