|
1 | 1 | # LIFL Instructions
|
2 | 2 |
|
3 | 3 | This document provides instructions on how to use LIFL in Flame.
|
| 4 | + |
| 5 | +## Prerequisites |
| 6 | +LIFL runs on Linux **only** and requires Linux kernel version >= 5.15. We have tested LIFL on Ubuntu 20. |
| 7 | + |
| 8 | +## Environment Setup |
| 9 | + |
| 10 | +### 1. Upgrade kernel |
| 11 | +*Note: if your kernel version is already >= 5.15, skip this step.* |
| 12 | + |
| 13 | +```bash |
| 14 | +# Execute the kernel upgrade script |
| 15 | +cd third_party/spright_utility/scripts |
| 16 | +./upgrade_kernel.sh |
| 17 | +``` |
| 18 | + |
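To decide whether the upgrade step is needed, you can compare the running kernel release against 5.15. The snippet below is a minimal sketch, not part of LIFL: the helper name `kernel_ok` is hypothetical, and it assumes `uname -r` prints a release that starts with `major.minor`.

```shell
# Hypothetical helper (not part of LIFL): succeed if a kernel release
# string such as "5.15.0-91-generic" is at least 5.15.
kernel_ok() {
    release="$1"
    major="${release%%.*}"        # text before the first dot
    rest="${release#*.}"          # text after the first dot
    minor="${rest%%[!0-9]*}"      # leading digits of the remainder
    [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -ge 15 ]; }
}

if kernel_ok "$(uname -r)"; then
    echo "kernel $(uname -r) is new enough; skip the upgrade step"
else
    echo "kernel $(uname -r) is older than 5.15; run upgrade_kernel.sh"
fi
```

Only the major and minor numbers are compared, which matches the `>= 5.15` requirement stated above.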
| 19 | +### 2. Install libbpf |
| 20 | + |
| 21 | +```bash |
| 22 | +# Install deps for libbpf |
| 23 | +sudo apt update && sudo apt install -y flex bison build-essential dwarves libssl-dev \ |
| 24 | + libelf-dev pkg-config libconfig-dev clang gcc-multilib |
| 25 | + |
| 26 | +# Execute the libbpf installation script |
| 27 | +cd third_party/spright_utility/scripts |
| 28 | +./libbpf.sh |
| 29 | +``` |
| 30 | + |
| 31 | +## Shared Memory Backend in LIFL |
| 32 | + |
| 33 | +The [shared memory backend](../../lib/python/flame/backend/shm.py) in LIFL uses eBPF's sockmap and SK_MSG to pass buffer references between aggregators. A "[sockmap_manager](../../third_party/spright_utility/src/sockmap_manager.c)" on each node registers each aggregator's socket with the in-kernel sockmap. You must start `sockmap_manager` before using the shared memory backend. |
| 34 | + |
| 35 | +```bash |
| 36 | +# Execute the sockmap_manager |
| 37 | +cd third_party/spright_utility/ |
| 38 | + |
| 39 | +sudo ./bin/sockmap_manager |
| 40 | +``` |
| 41 | + |
| 42 | +To enable the shared memory backend for a channel, add an `shm` entry to the `brokers` field in the config: |
| 43 | + |
| 44 | +```yaml |
| 45 | + "brokers": [ |
| 46 | + { |
| 47 | + "host": "localhost", |
| 48 | + "sort": "mqtt" |
| 49 | + }, |
| 50 | + { |
| 51 | + "host": "localhost:10104", |
| 52 | + "sort": "p2p" |
| 53 | + }, |
| 54 | + { |
| 55 | + "host": "localhost:10105", |
| 56 | + "sort": "shm" |
| 57 | + } |
| 58 | + ], |
| 59 | +``` |
| 60 | + |
| 61 | +You also need to set the channel's `backend` to `shm` so that the channel selects the shared memory backend during initialization. |
| 62 | + |
| 63 | +```yaml |
| 64 | + "channels": [ |
| 65 | + { |
| 66 | + "name": "top-agg-coord-channel", |
| 67 | + ... |
| 68 | + }, |
| 69 | + { |
| 70 | + "name": "global-channel", |
| 71 | + ... |
| 72 | + "backend": "shm", |
| 73 | + ... |
| 74 | + } |
| 75 | + ], |
| 76 | +``` |
| 77 | + |
| 78 | +We offer sample configs in the [coord_3_hier_syncfl_mnist](../../lib/python/examples/coord_3_hier_syncfl_mnist/) and [coord_hier_syncfl_mnist](../../lib/python/examples/coord_hier_syncfl_mnist/) examples. |
| 79 | + |
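Both settings described above (the `shm` broker entry and the channel's `shm` backend) have to be present. As a quick sanity check, a rough textual scan of the config can confirm this. This is a sketch under assumptions: the helper names `has_pair` and `shm_enabled` and the default path `config.json` are hypothetical, and the check uses `grep` rather than a real JSON parser, so it does not verify which channel the backend belongs to.

```shell
# Rough textual check (not a JSON parser): does file $1 contain "$2": "$3"?
has_pair() {
    grep -Eq "\"$2\"[[:space:]]*:[[:space:]]*\"$3\"" "$1"
}

# Succeed only if the file lists an shm broker AND sets a backend to shm.
shm_enabled() {
    has_pair "$1" sort shm && has_pair "$1" backend shm
}

cfg="${CONFIG:-config.json}"   # hypothetical path; set CONFIG to your own file
if [ -f "$cfg" ]; then
    if shm_enabled "$cfg"; then
        echo "$cfg: shm broker and shm channel backend are both set"
    else
        echo "$cfg: missing the shm broker entry or the shm channel backend"
    fi
fi
```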
| 80 | +## Hierarchical Aggregation in LIFL |
| 81 | + |
| 82 | +Flame originally supported hierarchical aggregation with two levels: top level and leaf level. An example of two-level hierarchical aggregation is at [coord_hier_syncfl_mnist](../../lib/python/examples/coord_hier_syncfl_mnist/). LIFL extends hierarchical aggregation in Flame to three levels: top level, middle level, and leaf level. An example of three-level hierarchical aggregation is at [coord_3_hier_syncfl_mnist](../../lib/python/examples/coord_3_hier_syncfl_mnist/). |
| 83 | + |
| 84 | +## Eager Aggregation in LIFL |
| 85 | + |
| 86 | +Flame originally supported lazy aggregation only. LIFL adds support for eager aggregation in Flame, which allows more flexible timing of the aggregation process. An example of eager aggregation is available at [eager_hier_mnist](../../lib/python/examples/eager_hier_mnist/). The implementation of eager aggregation is available at [eager_syncfl](../../lib/python/flame/mode/horizontal/eager_syncfl/). |
| 87 | + |
| 88 | +## Problems when running LIFL |
| 89 | +1. When you run `sudo ./bin/sockmap_manager`, you may see the following error: |
| 90 | +``` |
| 91 | +./bin/sockmap_manager: error while loading shared libraries: libbpf.so.0: cannot open shared object file: No such file or directory |
| 92 | +``` |
| 93 | + |
| 94 | +Solution: This can happen on Ubuntu 22, which ships with libbpf 0.5.0 preinstalled. Re-link `/lib/x86_64-linux-gnu/libbpf.so.0` to `libbpf.so.0.6.0`: |
| 95 | +```bash |
| 96 | +# Assume you have executed the libbpf installation script |
| 97 | +cd third_party/spright_utility/scripts/libbpf/src |
| 98 | + |
| 99 | +# Copy libbpf.so.0.6.0 to /lib/x86_64-linux-gnu/ |
| 100 | +sudo cp libbpf.so.0.6.0 /lib/x86_64-linux-gnu/ |
| 101 | + |
| 102 | +# Re-link libbpf.so.0 |
| 103 | +sudo ln -sf /lib/x86_64-linux-gnu/libbpf.so.0.6.0 /lib/x86_64-linux-gnu/libbpf.so.0 |
| 104 | +``` |
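After re-linking, it can help to confirm where the symlink actually resolves before starting `sockmap_manager` again. The following is a minimal sketch; the helper name `link_points_to` is hypothetical, and it assumes a `readlink` that supports `-f` (as GNU coreutils does).

```shell
# Hypothetical helper: succeed if symlink $1 resolves to a file named $2.
link_points_to() {
    target="$(readlink -f "$1")" || return 1
    case "$target" in
        */"$2") return 0 ;;
        *) return 1 ;;
    esac
}

if link_points_to /lib/x86_64-linux-gnu/libbpf.so.0 libbpf.so.0.6.0; then
    echo "libbpf.so.0 resolves to libbpf.so.0.6.0"
else
    echo "libbpf.so.0 does not resolve to libbpf.so.0.6.0; repeat the re-link step"
fi
```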