

Poster

Bayesian Coreset Optimization for Personalized Federated Learning

Prateek Chanda · Shrey Modi · Ganesh Ramakrishnan

Halle B
Wed 8 May 7:30 a.m. PDT — 9:30 a.m. PDT

Abstract: In a distributed machine learning setting such as Federated Learning, where multiple clients send their individual weight updates to a single central server, training on each client's entire dataset often becomes cumbersome. To address this issue we propose CORESET-PFEDBAYES: a personalized, coreset-weighted federated learning setup in which each client's training updates are forwarded to the central server based only on representative data points from that client's coreset instead of the entire client data. Through theoretical analysis we show that the average generalization error is minimax optimal up to a logarithmic factor, $\mathcal{O}(n_k^{-\frac{2 \beta}{2 \beta+d}} \log ^{2 \delta^{\prime}}(n_k))$, where $n_k$ denotes the coreset size, and that the approximation error on the data likelihood differs from that of a vanilla Federated Learning setup as a function $G(\boldsymbol{w})$ of the coreset weights $\boldsymbol{w}$. Our experiments on benchmark datasets, built on a variety of recent personalized federated learning architectures, show significant gains in model accuracy (+4.87\% on MNIST, +8.61\% on FashionMNIST, +9.71\% on CIFAR) compared to random sampling of the training data followed by federated learning, indicating that intelligently selecting training samples can improve performance. Additionally, in experiments on medical datasets our proposed method shows gains (e.g. +9.74\% on the COVID-19 dataset) compared to other submodular-optimization-based approaches for subset selection on client data.
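To make the idea of a coreset-weighted data likelihood concrete, here is a minimal Python sketch. It is not the authors' implementation: the Gaussian model, the function names, and the weighting scheme are all assumptions chosen for illustration. It uses the random-sampling baseline mentioned in the abstract (k of n points, each weighted n/k) and measures how far the weighted log-likelihood is from the full-data log-likelihood; an optimized coreset would instead choose the weights to shrink this kind of gap.

```python
import math
import random


def gaussian_log_lik(x, mu, sigma=1.0):
    """Log-density of x under N(mu, sigma^2) (toy model, assumed for illustration)."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def full_log_lik(data, mu):
    """Log-likelihood of the client's entire dataset."""
    return sum(gaussian_log_lik(x, mu) for x in data)


def coreset_log_lik(data, weights, mu):
    """Weighted log-likelihood; a weight of 0 drops the point from the coreset."""
    return sum(w * gaussian_log_lik(x, mu) for x, w in zip(data, weights) if w != 0.0)


def uniform_subsample_weights(n, k, rng):
    """Random-sampling baseline: pick k of n points, each weighted n / k."""
    chosen = set(rng.sample(range(n), k))
    return [n / k if i in chosen else 0.0 for i in range(n)]


rng = random.Random(0)
data = [rng.gauss(2.0, 1.0) for _ in range(200)]  # synthetic client data
weights = uniform_subsample_weights(len(data), 20, rng)
mu = 1.5  # an arbitrary candidate parameter value

# Absolute gap between the full and the coreset-weighted log-likelihood.
gap = abs(full_log_lik(data, mu) - coreset_log_lik(data, weights, mu))
print(f"coreset size: 20 of {len(data)}, log-likelihood gap: {gap:.3f}")
```

Setting every weight to 1 recovers the full-data log-likelihood exactly, which is a convenient sanity check when experimenting with other weight choices.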
