[WIP] scx_rusty: add AMD IBS performance monitoring #1724
Example trace:
<idle>-0 [003] dnZ3. 10777.212041: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [016] d.Z2. 10777.212044: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [024] d.Z3. 10777.212047: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [017] d.Z2. 10777.212048: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [027] d.Z2. 10777.212051: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [020] d.Z3. 10777.212053: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [003] dnZ2. 10777.212053: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [020] d.Z2. 10777.212059: bpf_trace_printk: LOAD (0x1,0x1,0x0) [104, 0]
<idle>-0 [016] d.Z3. 10777.212061: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [017] d.Z2. 10777.212064: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [003] dnZ2. 10777.212066: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [006] d.Z2. 10777.212069: bpf_trace_printk: LOAD (0x1,0x1,0x0) [110, 0]
<idle>-0 [002] d.Z3. 10777.212070: bpf_trace_printk: STORE (0x1,0x1,0x0) [348, 0]
<idle>-0 [020] d.Z2. 10777.212075: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [024] d.Z3. 10777.213024: bpf_trace_printk: LOAD (0x1,0x1,0x0) [390, 0]
<idle>-0 [006] d.Z3. 10777.213026: bpf_trace_printk: LOAD (0x1,0x1,0x0) [fddb05e68, ffffadac0044ce68]
<idle>-0 [017] d.Z3. 10777.213028: bpf_trace_printk: LOAD (0x1,0x1,0x0) [fde085e60, ffffadac00690e60]
<idle>-0 [018] d.Z3. 10777.213030: bpf_trace_printk: STORE (0x1,0x1,0x0) [874a3c710, ffffffffa6a3c710]
<idle>-0 [027] d.Z2. 10777.213032: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [003] dNZ2. 10777.213035: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [017] d.Z2. 10777.213038: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [016] d.Z2. 10777.213044: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [027] d.Z2. 10777.213044: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [008] d.Z3. 10777.213045: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [020] d.Z3. 10777.213046: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [024] dnZ2. 10777.213047: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [017] d.Z2. 10777.213051: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [003] dNZ3. 10777.213052: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [027] d.Z2. 10777.213058: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [020] d.Z3. 10777.213061: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [016] d.Z2. 10777.213061: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [018] d.Z2. 10777.213061: bpf_trace_printk: LOAD (0x1,0x1,0x0) [104, 0]
<idle>-0 [022] d.Z3. 10777.213062: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [020] d.Z2. 10777.213066: bpf_trace_printk: LOAD (0x1,0x1,0x0) [2, 0]
<idle>-0 [017] d.Z2. 10777.213068: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [003] dNZ2. 10777.213068: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [016] d.Z3. 10777.214025: bpf_trace_printk: STORE (0x2,0x1,0x0) [101f5dfec, ffff8a88c1f5dfec]
<idle>-0 [020] d.Z3. 10777.214027: bpf_trace_printk: STORE (0x1,0x1,0x0) [874a419e0, ffffffffa6a419e0]
<idle>-0 [017] d.Z3. 10777.214030: bpf_trace_printk: STORE (0x1,0x1,0x0) [fde08ffb8, fffffe24ebccefb8]
<idle>-0 [008] d.Z2. 10777.214032: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [020] d.Z3. 10777.214033: bpf_trace_printk: LOAD (0x1,0x1,0x0) [2, 0]
<idle>-0 [027] d.Z3. 10777.214038: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [016] d.Z3. 10777.214039: bpf_trace_printk: STORE (0x1,0x1,0x0) [20c, 0]
<idle>-0 [002] d.Z3. 10777.214041: bpf_trace_printk: LOAD (0x1,0x1,0x0) [f2, 0]
<idle>-0 [008] d.Z3. 10777.214041: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [003] dNZ2. 10777.214042: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [020] d.Z2. 10777.214044: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [017] d.Z2. 10777.214045: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [027] d.Z2. 10777.214048: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [016] d.Z3. 10777.214050: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [022] d.Z2. 10777.214051: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [008] d.Z2. 10777.214053: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [003] dNZ2. 10777.214059: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [016] d.Z2. 10777.214059: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [008] d.Z2. 10777.214059: bpf_trace_printk: LOAD (0x1,0x1,0x0) [fddc0fec8, fffffe4a3f98bec8]
<idle>-0 [027] d.Z2. 10777.214060: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [020] d.Z2. 10777.214060: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [022] d.Z2. 10777.214061: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [017] d.Z2. 10777.214061: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [016] d.Z2. 10777.214062: bpf_trace_printk: LOAD (0x1,0x1,0x0) [110, 0]
<idle>-0 [008] d.Z3. 10777.215024: bpf_trace_printk: STORE (0x1,0x1,0x0) [fddc05e68, ffffadac004b4e68]
<idle>-0 [002] d.Z3. 10777.215027: bpf_trace_printk: LOAD (0x1,0x1,0x0) [fdd905e30, ffffadac0037ce30]
<idle>-0 [018] d.Z3. 10777.215030: bpf_trace_printk: STORE (0x1,0x1,0x0) [fde10ffc0, fffffe47649b7fc0]
<idle>-0 [027] d.Z3. 10777.215032: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [024] d.Z3. 10777.215033: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [003] dnZ3. 10777.215034: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [017] d.Z2. 10777.215036: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [027] d.Z3. 10777.215039: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [020] d.Z2. 10777.215043: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [003] dnZ2. 10777.215047: bpf_trace_printk: LOAD (0x1,0x1,0x0) [11a, 0]
<idle>-0 [017] d.Z3. 10777.215052: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [020] d.Z2. 10777.215055: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [003] dnZ3. 10777.215057: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
<idle>-0 [017] d.Z2. 10777.215060: bpf_trace_printk: STORE (0x1,0x1,0x0) [c1, 0]
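For anyone who wants to post-process output like the trace above, here is a minimal sketch of a line parser in Rust. It only relies on the visible layout (CPU in the first brackets, `LOAD`/`STORE`, a parenthesised triple, a bracketed pair). The names `TraceSample`, `parse_trace_line`, `tuple`, and `pair` are hypothetical and not part of this PR, and since the thread does not say what the numeric fields mean, they are kept as opaque hex values.

```rust
/// One decoded trace line. Field meanings are NOT documented in the
/// thread, so the numeric groups are kept opaque (assumption).
#[derive(Debug, PartialEq, Eq)]
pub struct TraceSample {
    pub cpu: u32,
    pub is_store: bool,
    pub tuple: [u64; 3], // the "(0x1,0x1,0x0)" group
    pub pair: [u64; 2],  // the "[c1, 0]" group
}

/// Parse exactly N comma-separated hex values, with or without "0x".
fn parse_hex_list<const N: usize>(s: &str) -> Option<[u64; N]> {
    let mut out = [0u64; N];
    let mut parts = s.split(',');
    for slot in out.iter_mut() {
        let p = parts.next()?.trim();
        let p = p.strip_prefix("0x").unwrap_or(p);
        *slot = u64::from_str_radix(p, 16).ok()?;
    }
    // Reject lines that carry more values than expected.
    if parts.next().is_some() {
        return None;
    }
    Some(out)
}

/// Parse one ftrace line emitted via bpf_trace_printk in the format
/// shown above. Returns None for lines that do not match.
pub fn parse_trace_line(line: &str) -> Option<TraceSample> {
    // The CPU number sits in the first [...] group.
    let cpu_start = line.find('[')? + 1;
    let cpu_end = cpu_start + line[cpu_start..].find(']')?;
    let cpu = line[cpu_start..cpu_end].trim().parse::<u32>().ok()?;

    // Everything after the marker is the printk message itself.
    let marker = "bpf_trace_printk:";
    let msg = line[line.find(marker)? + marker.len()..].trim();

    let (op, rest) = msg.split_once(' ')?;
    let is_store = match op {
        "STORE" => true,
        "LOAD" => false,
        _ => return None,
    };

    let rest = rest.trim().strip_prefix('(')?;
    let (tuple, rest) = rest.split_once(')')?;
    let pair = rest.trim().strip_prefix('[')?.strip_suffix(']')?;

    Some(TraceSample {
        cpu,
        is_store,
        tuple: parse_hex_list::<3>(tuple)?,
        pair: parse_hex_list::<2>(pair)?,
    })
}
```

Feeding each line of the trace through `parse_trace_line` and tallying by `cpu` and `is_store` would give a rough per-CPU load/store breakdown without assuming anything about the remaining fields.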
@etsal -- The idea of leveraging the perf counter in making scheduling decisions is super cool! I wonder how much overhead it imposes.
One possible issue wrt overhead is that right now we trigger on every single perf event, which is not really necessary: we need fine granularity for samples but do not really care about receiving them immediately. If that turns out to be a problem, we should either find a way to batch the delivery of these events or write one :)
I think you can change the sampling rate of AMD IBS (like Intel PEBS). If so, you can introduce logic to autotune the sampling rate to control the overhead (e.g., 1000 samples/second).
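As a rough illustration of the autotuning idea, here is a minimal Rust sketch of a controller that nudges the sampling period toward a target sample rate. `PeriodTuner` and all of its fields are hypothetical names, not code from this PR. It assumes the observed sample rate is roughly inversely proportional to the sampling period, and it leaves out how the new period would actually be applied to the perf event.

```rust
/// Hypothetical controller: adjusts a sampling period so the observed
/// sample rate converges on a target (e.g. 1000 samples/second).
#[derive(Debug, Clone)]
pub struct PeriodTuner {
    target_rate: u64, // desired samples per second
    period: u64,      // current sampling period (larger = fewer samples)
    min_period: u64,
    max_period: u64,
}

impl PeriodTuner {
    pub fn new(target_rate: u64, initial_period: u64, min_period: u64, max_period: u64) -> Self {
        assert!(target_rate > 0 && min_period > 0 && min_period <= max_period);
        Self {
            target_rate,
            period: initial_period.clamp(min_period, max_period),
            min_period,
            max_period,
        }
    }

    pub fn period(&self) -> u64 {
        self.period
    }

    /// Feed in the number of samples seen during the last `interval_ms`
    /// milliseconds; returns the period to use for the next interval.
    pub fn update(&mut self, samples: u64, interval_ms: u64) -> u64 {
        if interval_ms == 0 {
            return self.period;
        }
        let observed = samples.saturating_mul(1000) / interval_ms;
        let next = if observed == 0 {
            // Nothing was sampled: halve the period to sample more often.
            self.period / 2
        } else {
            // Assumption: rate ~ 1/period, so scale the period by
            // observed/target to land near the target rate.
            let scaled = self.period as u128 * observed as u128 / self.target_rate as u128;
            scaled.min(u64::MAX as u128) as u64
        };
        self.period = next.clamp(self.min_period, self.max_period);
        self.period
    }
}
```

For example, with a target of 1000 samples/second and a period of 10,000, observing 4000 samples in one second would raise the period to 40,000. The clamp keeps a burst or an idle interval from pushing the period to an extreme value.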
It is, but ideally we would still get a very large number of samples in batches instead of one at a time, so that the BPF callback runs less often. I'm not sure that is currently possible.
Sure, batching will be essential to keep IBS sample processing off the critical path.
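To make the batching idea concrete, here is a small userspace model in Rust: the producer path only appends a sample to a fixed-size buffer, and the expensive processing runs once per full batch instead of once per sample. This is a sketch of the pattern only. `Sample`, `Op`, and `SampleBatcher` are hypothetical, and the real mechanism would have to live on the kernel/BPF side, which this does not attempt to model.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Load,
    Store,
}

/// Hypothetical sample record; the real IBS sample layout is not shown here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Sample {
    pub cpu: u32,
    pub op: Op,
    pub addr: u64,
}

/// Accumulates samples and hands them out in batches.
pub struct SampleBatcher {
    buf: Vec<Sample>,
    capacity: usize,
    dropped: u64,
}

impl SampleBatcher {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0);
        Self {
            buf: Vec::with_capacity(capacity),
            capacity,
            dropped: 0,
        }
    }

    /// Cheap producer path: store the sample and do nothing else.
    /// Returns true once the batch is full and should be drained.
    pub fn push(&mut self, s: Sample) -> bool {
        if self.buf.len() == self.capacity {
            // Buffer already full: count the loss rather than block.
            self.dropped += 1;
            return true;
        }
        self.buf.push(s);
        self.buf.len() == self.capacity
    }

    /// Consumer path: process every buffered sample in one go.
    /// Returns the number of samples processed.
    pub fn drain_with<F: FnMut(&Sample)>(&mut self, mut f: F) -> usize {
        let n = self.buf.len();
        for s in self.buf.drain(..) {
            f(&s);
        }
        n
    }

    pub fn dropped(&self) -> u64 {
        self.dropped
    }

    pub fn len(&self) -> usize {
        self.buf.len()
    }
}
```

Dropping on overflow is one possible policy. It keeps the producer path bounded at the cost of losing samples when the consumer falls behind, which seems acceptable if samples only feed statistics.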
scx_rusty currently uses execution time as the main metric by which it load balances tasks between domains. However, there are other forms of resource contention that we can avoid through load balancing, e.g., L3 footprint and memory bandwidth. We can track these metrics accurately and per-process using hardware monitoring extensions like AMD IBS. Add initial support for reading data from these extensions into the scheduler.
STATUS: