

Spotlight

A Benchmark on Robust Semi-Supervised Learning in Open Environments

Lin-Han Jia · Lan-Zhe Guo · Zhi Zhou · Yu-Feng Li


Abstract:

Semi-supervised learning (SSL) has emerged as a promising paradigm for reducing the dependency on abundant labeled data by exploiting unlabeled data. Although many SSL algorithms have been proposed, their performance in practical applications is not robust, because the assumption that labeled and unlabeled data are consistent does not hold. In open environments, the sources of labeled and unlabeled data may differ, leading to inconsistent data distributions and even inconsistent data spaces. This paper points out that previous research on robust SSL has approached the problem from a static perspective, thereby achieving only local robustness rather than global robustness. We reshape the research framework of robust SSL by introducing the Robustness Analysis Curve (RAC) and a set of associated metrics defined on it. Based on these metrics, we build a benchmark that encompasses three types of open environments (inconsistent data distributions, inconsistent label spaces, and inconsistent feature spaces) to assess the performance of widely used statistical and deep SSL algorithms on tabular, image, and text datasets. We also conduct a detailed analysis, based on experimental results and theory, of how to make SSL algorithms more robust in open environments.
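The abstract does not define the RAC or its metrics, so the following is only an illustrative sketch of the general idea of evaluating an algorithm across a range of inconsistency levels rather than at a single fixed setting. It assumes the curve is a set of (inconsistency level, accuracy) points; the function name `rac_summary` and the three summary statistics (trapezoidal area under the curve, range-normalized mean, and worst case) are hypothetical choices for illustration, not the paper's definitions.

```python
from typing import Dict, Sequence


def rac_summary(levels: Sequence[float], accuracies: Sequence[float]) -> Dict[str, float]:
    """Summarize an accuracy-vs-inconsistency curve (hypothetical helper).

    levels     -- inconsistency levels at which the model was evaluated
                  (assumed here to be numbers such as 0.0 .. 1.0)
    accuracies -- accuracy measured at each corresponding level
    """
    if len(levels) != len(accuracies) or len(levels) < 2:
        raise ValueError("need at least two (level, accuracy) pairs of equal length")

    # Sort the points by inconsistency level so the curve is traversed left to right.
    points = sorted(zip(levels, accuracies))
    span = points[-1][0] - points[0][0]
    if span <= 0:
        raise ValueError("levels must cover a non-empty range")

    # Trapezoidal rule: area under the piecewise-linear curve through the points.
    area = 0.0
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        area += (t1 - t0) * (a0 + a1) / 2.0

    return {
        "area_under_curve": area,          # overall performance across all levels
        "mean_over_range": area / span,    # area normalized by the range of levels
        "worst_case": min(accuracies),     # lowest accuracy observed at any level
    }


if __name__ == "__main__":
    # Made-up numbers purely for demonstration; not results from the paper.
    summary = rac_summary([0.0, 0.5, 1.0], [0.9, 0.8, 0.6])
    print(summary)
```

A summary computed over the whole curve reflects behavior at every inconsistency level, which is the contrast the abstract draws between global robustness and the local robustness obtained from a single static setting.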
