Encoded distributed optimization

Modern methods for large-scale optimization distribute the problem over multiple nodes; however, when some nodes fail or are significantly delayed (the "straggler" effect), the resulting slowdown can obviate the gains of distribution. We established a new framework for encoded distributed optimization that mitigates such effects.

The idea is to linearly (redundantly) encode the data variables in the optimization problem, place the encoded data on the nodes, and let the nodes operate as if they were solving the original problem. This is in contrast to careful cloning or replication of processes, which requires maintaining the same state across copies. In particular, we propose to embed the redundancy in the formulation of the optimization problem, altering the optimization problem itself. This has several advantages.
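To make the encoding step concrete, here is a minimal sketch (not the exact scheme from the paper) for a least-squares problem: a random encoding matrix `S` with a redundancy factor `beta` maps the data `(X, y)` to encoded data `(S @ X, S @ y)`, and solving the encoded problem recovers the original solution. The names `S`, `beta`, and the Gaussian choice of encoding are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch: encode min ||X w - y||^2 into min ||(S X) w - (S y)||^2.
# S, beta, and the Gaussian encoding are assumptions for this example only.
rng = np.random.default_rng(0)
n, d = 100, 5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true                      # noiseless targets for a clean check

beta = 2                            # redundancy factor (e.g., 2, as in the text)
# Random encoding matrix; scaling keeps S^T S ~ I in expectation.
S = rng.standard_normal((beta * n, n)) / np.sqrt(beta * n)

X_enc, y_enc = S @ X, S @ y         # encoded data, split row-wise across nodes

# A solver run on the encoded data, oblivious to the code, still recovers w.
w_enc, *_ = np.linalg.lstsq(X_enc, y_enc, rcond=None)
assert np.allclose(w_enc, w_true, atol=1e-6)
```

The key point is that the solver on the last line is an ordinary least-squares routine: it never sees `S` and does not know the data were encoded.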

First, we can show that this allows the nodes to operate obliviously to the encoding, enabling the use of existing distributed computing infrastructure and software with the proposed architecture, without additional control/coordination messaging. Second, with proper encoding we theoretically prove that the iterates converge deterministically to an approximate solution of the original problem despite stragglers, even with small redundancy (e.g., a factor of 2). Since in large-scale machine learning and data analysis tasks one is typically not interested in the exact optimum, but rather in a "sufficiently good" solution that achieves a good generalization error, such an approximation is acceptable in many scenarios. Third, this approach is not limited to a specific computational operation, but applies directly to a large class of practically relevant optimization problems; in particular, any problem that can be formulated as a generalized linear model, including linear regression, support vector machines, compressed sensing, projection, etc. This was shown using extensions of both gradient and coordinate descent. This work was selected as a spotlight oral presentation at NeurIPS 2017.
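The straggler-tolerance claim can be illustrated with a toy simulation, again a hedged sketch rather than the paper's scheme: encoded rows are partitioned across `m` simulated nodes, and at every gradient step the two slowest nodes are dropped entirely, yet gradient descent on the encoded problem still converges to the original solution. All names (`m`, `parts`, the drop count of 2) are assumptions for this example.

```python
import numpy as np

# Toy simulation: gradient descent on encoded data while ignoring stragglers.
# The partitioning, node count m, and "2 stragglers per round" are assumptions.
rng = np.random.default_rng(1)
n, d, beta, m = 120, 4, 2, 6        # beta: redundancy factor, m: nodes
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true

S = rng.standard_normal((beta * n, n)) / np.sqrt(beta * n)
Xe, ye = S @ X, S @ y
parts = np.array_split(np.arange(beta * n), m)   # encoded rows per node

w = np.zeros(d)
lr = 0.5
for _ in range(300):
    # Each round, a different pair of nodes straggles; their gradients are
    # simply dropped -- no re-sends, no coordination messages.
    active = rng.choice(m, size=m - 2, replace=False)
    g = sum(Xe[p].T @ (Xe[p] @ w - ye[p]) for p in (parts[i] for i in active))
    w -= lr * g / (beta * n)

# Despite missing nodes every round, the iterates converge.
assert np.allclose(w, w_true, atol=1e-4)
```

Because the redundancy is baked into the encoded data, the rows held by any large-enough subset of nodes still determine the solution, which is why dropping stragglers does not bias the result here.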