🔥 Quickstart! (Ubuntu) 🔥
Flame is a platform that enables developers to compose and deploy federated learning (FL) training workloads easily. The system consists of a service (control plane) and a Python library (data plane). The service manages machine learning workloads, while the Python library facilitates the composition of ML workloads and is also responsible for executing FL workloads. Because the library is extensible, Flame can support a variety of experiments and use cases.
We have improved Flame with a redesigned control plane and data plane (called LIFL) for efficient FL aggregation at scale. LIFL leverages shared-memory processing to achieve high-performance communication for hierarchical aggregation. We also introduce locality-aware placement in LIFL to maximize the benefits of shared-memory processing. LIFL precisely scales and carefully reuses resources for hierarchical aggregation to achieve the highest degree of parallelism while minimizing aggregation time and resource consumption.
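As a rough illustration of what hierarchical aggregation means here, the sketch below combines client updates in two levels: each leaf aggregator produces a partial result (a sample-weighted sum plus a sample count), and a top aggregator merges the partials. This is not LIFL's implementation; the names `leaf_aggregate`, `merge_partials`, and `finalize` are hypothetical, and shared-memory communication and locality-aware placement are not modeled.

```python
from typing import List, Sequence, Tuple

# A partial aggregate: (sample-weighted sum of model weights, total sample count).
Partial = Tuple[List[float], int]


def leaf_aggregate(updates: Sequence[Sequence[float]],
                   num_samples: Sequence[int]) -> Partial:
    """Hypothetical leaf aggregator: sum client weights scaled by sample count."""
    dim = len(updates[0])
    weighted_sum = [0.0] * dim
    for weights, n in zip(updates, num_samples):
        for i, w in enumerate(weights):
            weighted_sum[i] += w * n
    return weighted_sum, sum(num_samples)


def merge_partials(partials: Sequence[Partial]) -> Partial:
    """Hypothetical upper-level aggregator: add partial sums and counts."""
    dim = len(partials[0][0])
    merged = [0.0] * dim
    count = 0
    for weighted_sum, n in partials:
        for i, s in enumerate(weighted_sum):
            merged[i] += s
        count += n
    return merged, count


def finalize(partial: Partial) -> List[float]:
    """Turn the merged partial into a sample-weighted average."""
    weighted_sum, count = partial
    return [s / count for s in weighted_sum]


# Two leaf aggregators, each serving its own group of clients.
group_a = leaf_aggregate([[1.0, 2.0], [3.0, 4.0]], [1, 3])
group_b = leaf_aggregate([[5.0, 6.0]], [4])
global_model = finalize(merge_partials([group_a, group_b]))
```

Because each partial carries its sample count, merging the partials gives the same weighted average as aggregating all clients in one place, which is what lets the leaf aggregators run in parallel.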
🔥 Quickstart with LIFL 🔥
The target runtime environment is Linux, although development has mainly been done on macOS. Set up a development environment first; for more details, refer to here.
This repo has the following directory structure:
flame
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── Makefile -> build/Makefile
├── README.md
├── api (specification of REST API for flame apiserver)
├── build (configuration files for building flame binaries and container image)
├── cmd (source files for flame control plane)
├── docs (document folder)
├── examples (example folder)
├── fiab (dev/test env in a single box)
├── go.mod
├── go.sum
├── lib (python library for core flame data plane)
├── lint.sh
├── pkg (go packages for cmd)
└── scripts (utility scripts)
Method | Note |
---|---|
FedAvg | https://arxiv.org/pdf/1602.05629.pdf |
FedYogi | https://arxiv.org/pdf/2003.00295.pdf |
FedAdam | https://arxiv.org/pdf/2003.00295.pdf |
FedAdaGrad | https://arxiv.org/pdf/2003.00295.pdf |
FedProx | https://arxiv.org/pdf/1812.06127.pdf |
FedBuff | Asynchronous FL (https://arxiv.org/pdf/2106.06639.pdf and https://arxiv.org/pdf/2111.04877.pdf); secure aggregation is excluded |
FedDyn | https://arxiv.org/pdf/2111.04263.pdf |
OORT | https://arxiv.org/pdf/2010.06081.pdf; client selection algorithm / mechanism; experimental release |
Hierarchical FL | https://arxiv.org/pdf/1905.06641.pdf; a simplified version where k2 = 1; supports both synchronous and asynchronous FL |
Hybrid FL | A hybrid approach that combines federated learning with ring-reduce; topology motivated by https://openreview.net/pdf?id=H0oaWl6THa |
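To make the first row of the table concrete, here is a minimal sketch of the FedAvg aggregation step: the server averages client model weights, with each client weighted by its number of training samples. The function name `fedavg` and the flat-list model representation are assumptions made for illustration; this is not the Flame library's API.

```python
from typing import List, Sequence


def fedavg(updates: Sequence[Sequence[float]],
           num_samples: Sequence[int]) -> List[float]:
    """Sample-weighted average of client weight vectors (illustrative only)."""
    if not updates or len(updates) != len(num_samples):
        raise ValueError("need exactly one sample count per client update")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(updates[0])
    aggregated = [0.0] * dim
    for weights, n in zip(updates, num_samples):
        if len(weights) != dim:
            raise ValueError("all client updates must have the same length")
        # Each client contributes in proportion to its share of the data.
        for i, w in enumerate(weights):
            aggregated[i] += w * n / total
    return aggregated


# Two clients: the second holds three times as much data as the first.
new_global = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a real workload the weight vectors would be tensors, and the local training that produces them is omitted here.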
The full documentation can be found here. It is updated on a regular basis.
We welcome feedback, questions, and issue reports.
- Maintainers' email address: flame-github-owners@cisco.com
- GitHub Issues
@inproceedings{flame2023,
author = {Harshit Daga and Jaemin Shin and Dhruv Garg and Ada Gavrilovska and Myungjin Lee and Ramana Rao Kompella},
title = {Flame: Simplifying Topology Extension in Federated Learning},
year = {2023},
booktitle = {Proceedings of the 2023 ACM Symposium on Cloud Computing},
keywords = {Federated Learning, Distributed Machine Learning},
series = {SoCC '23}
}
@inproceedings{lifl-mlsys24,
author = {Qi, Shixiong and Ramakrishnan, K. K. and Lee, Myungjin},
title = {LIFL: A Lightweight, Event-Driven Serverless Platform for Federated Learning},
year = {2024},
booktitle = {Proceedings of Machine Learning and Systems},
}