Personalized Federated Learning
Consider a language task in which a company aims to train a voice assistant that interacts with users in English. One straightforward way to improve the model is to use edge data generated by users' voice input. The statistics of this data have commonalities across the world's population, since all users interact in English. However, each user belongs to a community that speaks with a slightly different accent, and even within a community each individual has a unique way of speaking English; in this context, the diversity can be viewed as statistical heterogeneity of the edge data. Alternatively, consider a personalized health monitoring system in which sensors continuously measure critical indicators (e.g., blood pressure, heart rate, oxygenation levels, insulin, etc.) and the system builds a learning model that offers predictions and suggestions for action. The statistics of the data have commonalities across a population but also distinct individual characteristics; the differences could depend on race, gender, health history, and other characteristics. Heterogeneity in these two examples implies the need to build personalized models for each individual or community. However, an individual edge device does not hold enough data to train such a model on its own; moreover, we would like to exploit the commonalities while building personalized models. A natural question, therefore, is whether and how we can leverage large-scale local data collection in this emerging ecosystem to collaboratively build (distributed) personalized learning systems in a communication-efficient way that respects the privacy of users.
We developed QuPeD, which uses a knowledge-distillation-based formulation to enable collaboration among clients with heterogeneous models (architectures). We analyzed its convergence properties, which gave insight into how convergence is affected by the diversity of the models (sizes and architectures) and by the heterogeneity of their data. In fact, resource-rich clients (with larger networks) can aid the learning of resource-limited clients (with smaller networks) through this collaboration, promoting equity. More recently, we introduced a statistical framework that unifies several different algorithms and provides a deeper understanding of personalized federated learning (FL) formulations. This framework allowed us to understand the relationship between the generative model (and heterogeneity) of edge data and the resulting optimal approach to personalization. For instance, we could identify the type of generative (population) model of the data that results in knowledge-distillation-based regularization (such as in QuPeD). Our work on personalization in federated learning explores how to design effective personalization algorithms from statistical and information-theoretic viewpoints, and how personalization interacts with notions such as heterogeneity, communication efficiency, and privacy.
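To make the idea of knowledge-distillation-based regularization more concrete, the following is a minimal sketch and not the QuPeD algorithm itself. It assumes a classification task and combines a client's own cross-entropy loss with a KL-divergence term that pulls the personalized model's predictions toward those of a shared global model. The function names, the weight `lam`, the `temperature` parameter, and the direction of the KL term are all illustrative assumptions. Because only output distributions are compared, the two models may have different architectures and sizes.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_regularized_loss(local_logits, global_logits, labels,
                        lam=0.5, temperature=1.0):
    """Hypothetical personalized objective: local loss + lam * distillation term.

    local_logits  : (n, classes) outputs of the client's personalized model
    global_logits : (n, classes) outputs of the shared global model
    labels        : (n,) integer class labels from the client's own data
    Returns (total, cross_entropy, kl).
    """
    eps = 1e-12
    n = local_logits.shape[0]

    # Cross-entropy of the personalized model on the client's own labels.
    p_local = softmax(local_logits)
    ce = -np.mean(np.log(p_local[np.arange(n), labels] + eps))

    # Distillation term: KL(global || local) on temperature-softened outputs.
    # The direction and the temperature are assumptions made for illustration.
    q_global = softmax(global_logits, temperature)
    q_local = softmax(local_logits, temperature)
    kl = np.mean(np.sum(
        q_global * (np.log(q_global + eps) - np.log(q_local + eps)), axis=-1))

    return ce + lam * kl, ce, kl
```

In this sketch, setting `lam` to zero recovers purely local training, while a larger `lam` ties each client more closely to the shared model; the appropriate trade-off would depend on how heterogeneous the clients' data are.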