Poster in Workshop: Setting up ML Evaluation Standards to Accelerate Progress
Rethinking Streaming Machine Learning Evaluation
Shreya Shankar · Bernease Herman · Aditya Parameswaran
Abstract:
While most work on evaluating machine learning (ML) models focuses on batches of data, computing the same metrics in a streaming setting (i.e., over unbounded, timestamp-ordered datasets) fails to accurately identify when models are performing unexpectedly. In this position paper, we discuss how the sliding windows over which ML metrics are evaluated can be negatively affected by real-world phenomena (e.g., delayed arrival of labels), and we propose additional metrics to assess streaming ML performance.
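To make the failure mode concrete, the sketch below (not from the paper; class and method names such as `SlidingWindowAccuracy`, `log_prediction`, and `log_label` are hypothetical) computes accuracy over a fixed-duration sliding window. Because only predictions whose labels have already arrived contribute, a burst of label delay silently shrinks the sample behind the metric, which is one way a sliding-window metric can mislead.

```python
from collections import deque
from datetime import datetime, timedelta
from typing import Optional

class SlidingWindowAccuracy:
    """Illustrative sketch: accuracy over a fixed-duration sliding window.

    Predictions whose labels have not yet arrived are excluded, so delayed
    labels quietly reduce the number of examples the metric is computed over.
    """

    def __init__(self, window: timedelta):
        self.window = window
        # Each record is [prediction_timestamp, prediction, label-or-None].
        self.records = deque()

    def log_prediction(self, ts: datetime, pred: int) -> None:
        self.records.append([ts, pred, None])

    def log_label(self, ts: datetime, label: int) -> None:
        # Attach the label to the oldest still-unlabeled prediction made at ts.
        for rec in self.records:
            if rec[0] == ts and rec[2] is None:
                rec[2] = label
                break

    def accuracy(self, now: datetime) -> Optional[float]:
        # Evict records that have slid out of the window.
        while self.records and self.records[0][0] < now - self.window:
            self.records.popleft()
        labeled = [(p, y) for _, p, y in self.records if y is not None]
        if not labeled:
            return None  # no labeled examples in the window: metric undefined
        return sum(p == y for p, y in labeled) / len(labeled)
```

For example, if labels for the last hour have not yet arrived, `accuracy(now)` is computed only on older, already-labeled predictions, so a recent drop in model quality can go unnoticed until the delayed labels land.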