Coyote transformations: add a new module to run JS transformations.#648
Coyote transformations: add a new module to run JS transformations.#648
Conversation
|
||||||||||||||||||
|
Test Coverage: 81.79% |
6b48587 to
d46daed
Compare
svix-jbrown
left a comment
There was a problem hiding this comment.
TBH this whole thing feels like scope creep; I definitely would not want my database to be able to run arbitrary user-submitted javascripts.
Particularly, since transformations are entirely stateless, it makes no sense to me that anyone would run them in a database, because the way you scale them isn't related to the way you scale the stateful database.
| fn apply_sandboxing() -> anyhow::Result<()> { | ||
| use landlock::{ | ||
| ABI, Access, AccessFs, AccessNet, CompatLevel, Compatible, RestrictionStatus, Ruleset, | ||
| RulesetAttr, RulesetStatus, | ||
| }; | ||
| let abi = ABI::V4; | ||
| let status: RestrictionStatus = Ruleset::default() | ||
| .set_compatibility(CompatLevel::BestEffort) | ||
| .handle_access(AccessFs::from_all(abi))? | ||
| .handle_access(AccessNet::from_all(abi))? | ||
| .create()? | ||
| .restrict_self()?; | ||
| if status.ruleset != RulesetStatus::FullyEnforced { | ||
| tracing::warn!( | ||
| ?status, | ||
| "Landlock sandbox only partially enforced; some restrictions may be unsupported on this kernel" | ||
| ); | ||
| } | ||
|
|
||
| Ok(()) | ||
| } |
There was a problem hiding this comment.
thius should definitely be behind a config flag, since a large number of environments don't support landlock. I also don't even know that landlock is a good approach compared to BPF-based pledge implementations
There was a problem hiding this comment.
.set_compatibility(CompatLevel::BestEffort)
It's on a best effort basis (so won't fail if not supported in the kernel).
I was going to do BPF, but that felt more annoying with the compilation step, and platform specific.
There was a problem hiding this comment.
it will still trigger a bunch of audit alarms, though, and fail if someone is running under a seccomp-bpf sandbox that blocks the landlock_* system calls. Again, I'd put it behind a config flag.
There was a problem hiding this comment.
I really want to avoid configuration bloat, though I agree with your concerns.
Any other suggestions? Maybe switch to BPF?
There was a problem hiding this comment.
I wrote a version that uses seccomp BPF. Though ngl, it feels very fragile. E.g. my first version didn't allow libc::SYS_openat because I didn't want to give file access, but that seems to be needed by tokio. 🤷🏻
| } | ||
|
|
||
| impl WorkerProcess { | ||
| fn spawn(exe_path: &PathBuf) -> anyhow::Result<Arc<Self>> { |
There was a problem hiding this comment.
every time this runs, it's going to nuke performance by invalidating every PTE for this process (which, if we have gigabytes of RAM used, is a lot).
The typical approach for such servers is to fork a single zygote very early in the process lifetime, and then whenever you need a new worker, send the zygote a message asking it to fork (typically over a unix domain socket, so that it can send you back the FD to the pipe as out-of-band data)
There was a problem hiding this comment.
It only spawns once on first use, and then keeps the process alive. Unless I missed what you're saying?
|
|
||
| fn get_worker() -> &'static Worker { | ||
| WORKER.get_or_init(|| { | ||
| let max_workers = available_parallelism().map(|n| n.get()).unwrap_or(4); |
There was a problem hiding this comment.
this should be a config variable; I wouldn't think most people would want to spent 100% of the threads on their system running javascript
There was a problem hiding this comment.
Yeah, I wasn't sure about this one (and what value makes sense). I at some point started playing around with setting a high niceness value (but you can't really do it without unsafe rust, it seems).
I don't like the idea of configs, because the whole idea is that this should be fast/simple to run. Any other ideas?
d46daed to
1e189d9
Compare
| @@ -0,0 +1,417 @@ | |||
| use std::time::{Duration, Instant}; | |||
There was a problem hiding this comment.
@svix-jbrown said (putting it here so that we have a thread):
TBH this whole thing feels like scope creep; I definitely would not want my database to be able to run arbitrary user-submitted javascripts.
Particularly, since transformations are entirely stateless, it makes no sense to me that anyone would run them in a database, because the way you scale them isn't related to the way you scale the stateful database.
Thoughts:
- Coyote is not a database. It's a backend multi-tool, so this is potentially one of the tools that it supports.
- You can run a separate coyote / set of coyotes for the JS transformations, it doesn't need to be in the deployment if you want to separate it. Though for simplicity you can run it in the same process.
- We do the fork-exec and landlock to isolate it so that it's not in the same process as the rest of the data and effectively a different daemon.
- We want to use it ourselves - let Coyote users (so our users, not their users) transform data when using
msgs-> HTTP/kafka/etc.
There was a problem hiding this comment.
I'm with @svix-jbrown here, this feels like a feature we should not take on lightly. Some thoughts:
- I don't remember reading about the
msgsfeature you allude to in 4., were they part of an RFC? - Does this feature benefit from coyote's distributed architecture in any way? I guess not (this is what JB was talking about in the second paragraph right?)
- Every time the JS runtime we use here has some vulnerability we will have to upgrade, publish a security statement ourselves, do a new release (even if just to maintain a good reputation; debating people about underlying vulnerabilities not being exploitable usually does not end well) - this will almost certainly annoy users of coyote that never touch this feature
- I think we've got a solid set of initial features, and if we want to have a reputation of quality and trustworthiness (set ourselves apart from vibe-coded projects as discussed yesterday), we really shouldn't add another significant feature so close to public launch
There was a problem hiding this comment.
I'm with @svix-jbrown here, this feels like a feature we should not take on lightly. Some thoughts:
* I don't remember reading about the `msgs` feature you allude to in 4., were they part of an RFC?
I think only in my internal notes, I don't think I converted them to the RFCs. Though it's been in my notes since before I shared Coyote with anyone. It's essential for bridge/send directly to HTTP/kafka/etc kind of use-cases.
* Does this feature benefit from coyote's distributed architecture in any way? I guess not (this is what JB was talking about in the second paragraph right?)
It doesn't, it's just per node. Though I don't see why this should be a decision factor.
* Every time the JS runtime we use here has some vulnerability we will have to upgrade, publish a security statement ourselves, do a new release (even if just to maintain a good reputation; debating people about underlying vulnerabilities not being exploitable usually does not end well) - this will almost certainly annoy users of coyote that never touch this feature
This is a very strong point. I don't think quickjs ever had a vulnerability, but it doesn't mean it won't. And actually, I think the rust crate depends on quickjs-ng which did have a vulnerability once!
I think though while a strong point, maybe slightly overstated, because a lot of pieces of software have issues in areas that are unused, and not upgrading by knowing they are unused is a common pattern.
* I think we've got a solid set of initial features, and if we want to have a reputation of quality and trustworthiness (set ourselves apart from vibe-coded projects as discussed yesterday), we really shouldn't add another significant feature so close to public launch
I think we need another marquee feature, and transformations felt it could be that. It's very useful and annoying to build right. This can also be a "alpha" feature, so an addendum to main set of features.
f58ccb2 to
34f93c7
Compare
The way it works is that people can pass an input (at the moment has to be JSON string, but we can change that) and a script. The script is then ran, and we return the JSON that it returns.
… there. This change is intended to make the transformation module more safe in case there's a JS sandbox escape or something like that. It forks-execs the process, scrubs env variables, and communicates with the main process over a pipe. It also blocks filesystem and network access when the kernel supports it.
This was suggested in review as a more widely supported solution.
34f93c7 to
90657a0
Compare
The way it works is that people can pass an input (at the moment has to
be JSON string, but we can change that) and a script. The script is then
ran, and we return the JSON that it returns.
To make it more secure against a JS sandbox escape / RCE this module
forks-execs the process, scrubs env variables, and communicates with
the main process over a pipe.
It also blocks filesystem and network access when the kernel supports
it.