-
Notifications
You must be signed in to change notification settings - Fork 33
/
Copy pathTODO
402 lines (290 loc) · 15.5 KB
/
TODO
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
To-do list for nbdkit
======================================================================
General ideas for improvements
------------------------------
* Listen on specific interfaces or protocols.
* Performance - measure and improve it. Chart it over various buffer
sizes and threads, as that should make it easier to identify
systematic issues.
* For parallel plugins, only create threads on demand from parallel
client requests, rather than pre-creating all threads at connection
time, up to the thread pool size limit. Of course, once created, a
thread is reused as possible until the connection closes.
* A new threading model, SERIALIZE_RETIREMENT, which lets the client
queue up multiple requests and processes them in parallel in the
plugin, but where the responses sent back to the client are in the
same order as the client's request rather than the plugin's
completion. This is stricter than fully parallel, but looser than
SERIALIZE_REQUESTS (in particular, a client can get
non-deterministic behavior if batched requests touch the same area
of the export).
* Async callbacks. The current parallel support requires one thread
per pending message; a solution with fewer threads would split
low-level code between request and response, where the callback has
to inform nbdkit when the response is ready:
https://www.redhat.com/archives/libguestfs/2018-January/msg00149.html
* More NBD protocol features. The currently missing features are
structured replies for sparse reads, and online resize.
* Test that zero-length read/write/extents requests behave sanely
(NBD protocol says they are unspecified).
* If a client negotiates structured replies, and issues a read/extents
call that exceeds EOF (qemu 3.1 is one such client, when nbdkit
serves non-sector-aligned images), return the valid answer for the
subset of the request in range and then NBD_REPLY_TYPE_ERROR_OFFSET
for the tail, rather than erroring the entire request.
* Audit the code base to get rid of strerror() usage (the function is
not thread-safe); however, using geterror_r() can be tricky as it
has a different signature in glibc than in POSIX.
* Teach nbdkit_error() to have smart newline appending (for existing
inconsistent clients), while fixing internal uses to omit the
newline. Commit ef4f72ef has some ideas on smart newlines, but that
should probably be factored into a utility function.
* Add a mode of operation where nbdkit is handed a pre-opened fd to be
used immediately in transmission phase (skipping handshake). There
are already third-party clients of the kernel's /dev/nbdX which rely
on their own protocol instead of NBD handshake, before calling
ioctl(NBD_SET_SOCK); this mode would let the third-party client
continue to keep their non-standard handshake while utilizing nbdkit
to prototype new behaviors in serving the kernel.
* "nbdkit.so": nbdkit as a loadable shared library. The aim of nbdkit
is to make it reusable from other programs (see nbdkit-captive(1)).
If it was a loadable shared library it would be even more reusable.
API would allow you to create an nbdkit instance, configure it (same
as the current command line), start it serving on a socket, etc.
However perhaps the current ability to work well as a subprocess is
good enough? Also allowing multiple instances of nbdkit to be
loaded in the same process is probably impossible.
* Examine other fuzzers: https://gitlab.com/akihe/radamsa
* common/utils/vector.h could be extended and used in other places:
- there are some more possible places in the server (anywhere using
realloc is suspect)
- add more iterators, map function, etc, as required.
* password=- to mean read a password interactively from /dev/tty (not
stdin).
Suggestions for plugins
-----------------------
Note: qemu supports other formats such as libnfs, iscsi, gluster and
ceph/rbd, and while similar plugins could be written for nbdkit there
is no compelling reason unless the result is better than qemu-nbd.
For the majority of users it would be better if they were directed to
qemu-nbd for these use cases.
* XVA files
https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg02971.html
is a partial solution but it needs cleaning up.
nbdkit-floppy-plugin:
* Add boot sector support. In theory this is easy (eg. using
SYSLINUX), but the practical reality of making a fully bootable
floppy is rather more complex.
* Add multiple dir merging.
nbdkit-linuxdisk-plugin:
* Add multiple dir merging (in e2fsprogs mke2fs).
VDDK:
* Map export name to file parameter, allowing clients to access
multiple disks from a single VM.
* For testing VDDK, try to come up with delay filter + sparse plugin
settings which behave closely (in terms of performance and API
latency) to the real thing. This would allow us to tune some
performance tools without needing VMware all the time.
nbdkit-torrent-plugin:
* There are lots of settings we could map into parameters:
https://www.libtorrent.org/reference-Settings.html#settings_pack
* The C++ could be a lot more natural. At the moment it's a kind of
“C with C++ extensions”.
nbdkit-ondemand-plugin:
* Implement more callbacks, eg. .zero
* Allow client to select size up to a limit, eg. by sending export
names like ‘export:4G’.
nbdkit-data-plugin:
* Allow expressions to evaluate to numbers, offsets, etc so that this
works:
nbdkit data '(0x55 0xAA)*n' n=1000
* Allow inclusion of files where the file is not binary but is written
in the data format.
* Like $VAR but the variable is either binary or base64.
nbdkit-cdi-plugin:
* Look at using skopeo instead of podman pull
(https://github.com/containers/skopeo)
nbdkit-blkio-plugin:
* Use event-driven mode instead of blocking mode. This involves
restructuring the plugin so that there is one or more background
threads to handle the events, and nbdkit threads issue requests to
these threads. (See how it is done in the VDDK plugin.)
Suggestions for language plugins
--------------------------------
Python:
* Get the __docstring__ from the module and print it in --help output.
This requires changes to the core API so that config_help is a
function rather than a variable (see V3 suggestions below).
Suggestions for filters
-----------------------
* Add shared filter. Take advantage of filter context APIs to open a
single context into the backend shared among multiple client
connections. This may even allow a filter to offer a more parallel
threading model than the underlying plugin.
* CBT filter to track dirty blocks. See these links for inspiration:
https://www.cloudandheat.com/block-level-data-tracking-using-davice-mappers-dm-era/
https://github.com/qemu/qemu/blob/master/docs/interop/bitmaps.rst
* masking plugin features for testing clients (see 'nozero' and 'fua'
filters for examples)
* "bandwidth quota" filter which would close a connection after it
exceeded a certain amount of bandwidth up or down.
* "forward-only" filter. This would turn random access requests from
the client into serial requests in the plugin, meaning that the
plugin could be written to assume that requests only happen from
beginning to end. This is would be useful for plugins that have to
deal with non-seekable compressed data. Note the filter would have
to work by caching already-read data in a temporary file.
* nbdkit-cache-filter should handle ENOSPC errors automatically by
reclaiming blocks from the cache
* nbdkit-cache-filter could use a background thread for reclaiming.
* zstd filter was requested as a way to do what we currently do with
xz but saving many hours on compression (at the cost of hundreds of
MBs of extra data)
https://github.com/facebook/zstd/issues/395#issuecomment-535875379
* nbdkit-exitlast-filter could probably use a configurable timeout so
that there is a grace period in case another connection comes along.
* nbdkit-pause-filter would probably be more useful if the filter
could inject a flush after pausing. However this requires that
filter background threads have access to the plugin (see above).
nbdkit-luks-filter:
* This filter should also support LUKSv2 (and so should qemu).
* There are some missing features: ESSIV, more ciphers.
* Implement trim and zero if possible.
nbdkit-readahead-filter:
* The filter should open a new connection to the plugin per background
thread so it is able to work with plugins that use the
serialize_requests thread model (like curl). At the moment it makes
requests on the same connection, so it requires plugins to use the
parallel thread model.
* It should combine (or avoid) overlapping cache requests.
nbdkit-rate-filter:
* allow other kinds of traffic shaping such as VBR
* limit traffic per client (ie. per IP address)
* split large requests to avoid long, lumpy sleeps when request size
is much larger than rate limit
nbdkit-retry-filter:
* allow user to specify which errors cause a retry and which ones are
passed through; for example there's probably no point retrying on
ENOMEM
* there are all kinds of extra complications possible here,
eg. specifying a pattern of retrying and reopening:
retry-method=RRRORRRRRORRRRR meaning to retry the data command 3
times, reopen, retry 5 times, etc.
* subsecond times
nbdkit-ip-filter:
* permit hostnames and hostname wildcards to be used in the
allow and deny lists
* the allow and deny lists should be updatable while nbdkit is
running, for example by storing them in a database file
nbdkit-extentlist-filter:
* read the extents generated by qemu-img map, allowing extents to be
ported from a qemu block device
* make non-read-only access safe by updating the extent list when the
filter sees writes and trims
nbdkit-exportname-filter:
* find a way to call the plugin's .list_exports during .open, so that
we can enforce exportname-strict=true without command line redundancy
* add a mode for passing in a file containing exportnames in the same
manner accepted by the sh/eval plugins, rather than one name (and no
description) per config parameter
Filters for security
--------------------
Things like filtering by IP address can be done using external
wrappers (TCP wrappers, systemd), or nbdkit-ip-filter.
However it might be nice to have a configurable filter for preventing
valid but not sensible requests. The server already filters invalid
requests. This would be like seccomp, and could be implemented using
an eBPF-based parser. Unfortunately actual eBPF is difficult to use
for userspace processes. The "standard" isn't solidly defined - the
Linux kernel implementation is the standard - and Linux has by far the
best implementation, particularly around bytecode verification and
JITting. There is a userspace VM (ubpf) but it has very limited
capabilities compared to Linux.
Composing nbdkit
----------------
Filters allow certain types of composition, but others would not be
possible, for example RAIDing over multiple nbd sources. Because the
plugin API limits us to loading a single plugin to the server, the
best way to do this (and the most robust) is to compose multiple
nbdkit processes. Perhaps libnbd will prove useful for this purpose.
Build-related
-------------
* Figure out how to get 'make distcheck' working. VPATH builds are
working, but various pkg-config results that try to stick
bash-completion and ocaml add-ons into their system-wide home do
not play nicely with --prefix builds for a non-root user.
* Right now, 'make check' builds keys with an expiration of 1 year
only if they don't exist, and we leave the keys around except under
'make distclean'. This leads to testsuite failures when
(re-)building in an incremental tree first started more than a year
ago. Better would be having make unconditionally run the generator
scripts, but tweak the scripts themselves to be a no-op unless the
keys don't exist or have expired.
Windows port
------------
Currently many features are missing, including:
* Daemonization. This is not really applicable for Windows where you
would instead want to run nbdkit as a service using something like
SRVANY. You must use the -f option or one of the other options that
implies -f.
* These options are all unimplemented:
--exit-with-parent, --group, --run, --selinux-label, --single, --swap,
--user, --vsock
* Many other plugins and filters.
* errno_is_preserved should use GetLastError and/or WSAGetLastError
but currently does neither so errors from plugins are probably wrong
in many cases.
* Most tests are skipped because of the missing features above.
V3 plugin protocol
------------------
From time to time we may update the plugin protocol. This section
collects ideas for things which might be fixed in the next version of
the protocol.
Note that we keep the old protocol(s) around so that source
compatibility is retained. Plugins must opt in to the new protocol
using ‘#define NBDKIT_API_VERSION <version>’.
* All methods taking a ‘count’ field should be uint64_t (instead of
uint32_t). Although the NBD protocol does not support 64 bit
lengths, it might do in future.
* v2 .can_zero is tri-state for filters, but bool for plugins; in v3,
even plugins should get a say on whether to emulate
* v2 .can_extents is bool, and cannot affect our response to
NBD_OPT_SET_META_CONTEXT (that is, we are blindly emulating extent
support regardless of the export name). In v3, .can_extents should
be tri-state, with an explicit way to disable that context on a
per-export basis.
* pread could be changed to allow it to support Structured Replies
(SRs). This could mean allowing it to return partial data, holes,
zeroes, etc. For a client that negotiates SR coupled with a plugin
that supports .extents, the v2 protocol would allow us to at least
synthesize NBD_REPLY_TYPE_OFFSET_HOLE for less network traffic, even
though the plugin will still have to fully populate the .pread
buffer; the v3 protocol should make sparse reads more direct.
* Parameters should be systematized so that they aren't just (key,
value) strings. nbdkit should know the possible keys for the plugin
and filters, and the type of the values, and both check and parse
them for the plugin.
* Modify open() API so it takes an export name and tls parameter.
* Modify get_ready() API to pass final selected thread model. Filters
already get this information, but to date, we have not yet seen a
reason to add nbdkit_get_thread_model for use by v2 plugins.
* Change config_help from a variable to a function.
* Consider extra parameters to .block_size(). First so that we can
separately control maximum data request (eg. pread) vs maximum
virtual request (eg. zero). Note that the NBD protocol does not yet
support this distinction. We may also need a 'bool strict'
parameter to specify whether the client requested block size
information or not. qemu-nbd can distinguish these two cases.
* Renumber thread models. Redefine existing thread models like this:
model existing value new value
SERIALIZE_CONNECTIONS 0 1000
SERIALIZE_ALL_REQUESTS 1 2000
SERIALIZE_REQUESTS 2 3000
PARALLEL 3 4000
This allows new thread models to be inserted before and between the
existing ones. In particular SERIALIZE_RETIREMENT has been
suggested above (inserted between SERIALIZE_REQUESTS and PARALLEL).
We could also imagine a thread model more like the one VDDK really
wants which calls open and close from the main thread.
For backwards compatibility the old numbers 0-3 can be transparently
mapped to the new values.