Talk in Workshop: Scene Representations for Autonomous Driving
Optimizing Internal Network Representations for Geometric and Semantic Perception
Christos Sakaridis
Dense visual perception of the geometry and semantics of the surrounding scene differs from basic global tasks such as classification in that it requires distinguishing fine spatial details in addition to aggregating context across the full extent of the input. These two conflicting goals are typically pursued with encoder-decoder architectures, which involve an information bottleneck at the interface between the encoder and the decoder. Motivated by the fact that this internal bottleneck marks the boundary between context aggregation and fine-grained parsing, we operate on the respective internal representations and optimize them jointly with the output representations by introducing dedicated modules and losses in various geometric and semantic perception settings. The talk includes a thorough review of one geometric and two semantic instances of this internal representation optimization paradigm. The geometric instance is our internal discretization method for dense regression, which applies broadly to geometric tasks such as supervised monocular depth estimation and surface normal estimation. The semantic instances are our condition-invariant methods for unsupervised domain adaptation of semantic segmentation models, which introduce feature-invariance and cross-domain contrastive losses on the internal network representations.
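To make the idea of optimizing internal representations concrete, the sketch below shows a toy encoder-decoder for dense prediction in which an auxiliary loss is attached to the bottleneck features in addition to the usual output-level supervision. It is a minimal illustration under assumed names and a simple MSE-based invariance term, not the speaker's actual method or losses; the architecture, the `feature_invariance_loss` helper, and the loss weight are all hypothetical.

```python
# Minimal sketch (not the speaker's implementation): an encoder-decoder for dense
# prediction with an auxiliary loss on the internal bottleneck representation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyEncoderDecoder(nn.Module):
    def __init__(self, in_ch=3, feat_ch=64, num_classes=19):
        super().__init__()
        # Encoder: aggregates context while downsampling to the bottleneck.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Decoder: parses the bottleneck back into a dense per-pixel output.
        self.decoder = nn.Sequential(
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, num_classes, 1),
        )

    def forward(self, x):
        bottleneck = self.encoder(x)  # internal representation at the bottleneck
        logits = self.decoder(bottleneck)
        logits = F.interpolate(logits, size=x.shape[-2:], mode="bilinear",
                               align_corners=False)
        return logits, bottleneck


def feature_invariance_loss(feat_a, feat_b):
    """Illustrative invariance term: penalize the distance between internal
    features computed on two domains/conditions (hypothetical stand-in for the
    feature-invariance and contrastive losses mentioned in the abstract)."""
    return F.mse_loss(feat_a, feat_b)


if __name__ == "__main__":
    model = TinyEncoderDecoder()
    x_src = torch.randn(2, 3, 128, 256)            # e.g., source-domain images
    x_trg = torch.randn(2, 3, 128, 256)            # e.g., target-domain images
    labels = torch.randint(0, 19, (2, 128, 256))   # source segmentation labels

    logits_src, feat_src = model(x_src)
    _, feat_trg = model(x_trg)

    seg_loss = F.cross_entropy(logits_src, labels)          # output-level supervision
    inv_loss = feature_invariance_loss(feat_src, feat_trg)  # internal-representation loss
    total = seg_loss + 0.1 * inv_loss                       # 0.1 is an arbitrary weight
    total.backward()
```

The point of the sketch is only the structure of the objective: one loss on the final dense output and one on the internal bottleneck features, optimized jointly.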