Skip to yearly menu bar Skip to main content


Poster

MixSup: Mixed-grained Supervision for Label-efficient LiDAR-based 3D Object Detection

Yuxue Yang · Lue Fan · Zhaoxiang Zhang

Halle B

Abstract:

Label-efficient LiDAR-based 3D object detection is currently dominated by weak/semi-supervised methods. Instead of exclusively following one of them, we propose MixSup, a more practical paradigm simultaneously utilizing massive cheap coarse labels and a limited number of accurate labels for Mixed-grained Supervision. We start by observing that point clouds are usually textureless, making it hard to learn semantics. However, point clouds are geometrically rich and scale-invariant to the distances from sensors, making it relatively easy to learn the geometry of objects, such as poses and shapes. Thus, MixSup leverages massive coarse cluster-level labels to learn semantics and a few expensive box-level labels to learn accurate poses and shapes. We redesign the label assignment in mainstream detectors, which allows them seamlessly integrated into MixSup, enabling practicality and universality. We validate its effectiveness in nuScenes, Waymo Open Dataset, and KITTI, employing various detectors. MixSup achieves up to 97.31% of fully supervised performance, using cheap cluster annotations and only 10% box annotations. Furthermore, we utilize the emerging Segment Anything Model (SAM) to automatically generate massive coarse labels, further reducing the annotation burden. The code will be made publicly available.

Chat is not available.