Scene-level Pose Estimation for Multiple Instances of Densely Packed Objects

TitleScene-level Pose Estimation for Multiple Instances of Densely Packed Objects
Publication TypeConference Paper
Year of Publication2019
AuthorsMitash, C, Wen, B, Bekris, KE, Boularias, A
Conference NameConference on Robot Learning (CoRL)
Date Published10/2019
Conference LocationOsaka, Japan

This paper introduces key machine learning operations that allow the
realization of robust, joint 6D pose estimation of multiple instances
of objects either densely packed or in unstructured piles from RGB-D
data. The first objective is to learn semantic and instance-boundary
detectors without manual labeling. An adversarial training framework
in conjunction with physics-based simulation is used to achieve
detectors that behave similarly in synthetic and real data. Given the
stochastic output of such detectors, candidates for object poses are
sampled. The second objective is to automatically learn a single score
for each pose candidate that represents its quality in terms of
explaining the entire scene via a gradient boosted tree. The proposed
method uses features derived from surface and boundary alignment
between the observed scene and the object model placed at hypothesized
poses. Scene-level, multi-instance pose estimation is then achieved by
an integer linear programming process that selects hypotheses that
maximize the sum of the learned individual scores, while respecting
constraints, such as avoiding collisions. To evaluate this
method, a dataset of densely packed objects with
challenging setups for state-of-the-art approaches is
collected. Experiments on this dataset and a public one show that
the method significantly outperforms alternatives in terms of 6D
pose accuracy while trained only with synthetic datasets.