diagnostics_channel: add iterator support to tracing channel #61686
rochdev wants to merge 4 commits into nodejs:main
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #61686 +/- ##
==========================================
- Coverage 89.75% 89.65% -0.10%
==========================================
Files 674 676 +2
Lines 204394 206626 +2232
Branches 39278 39560 +282
==========================================
+ Hits 183448 185248 +1800
- Misses 13238 13492 +254
- Partials 7708 7886 +178
Qard left a comment
Thanks for addressing this! I've left some feedback. I think we may want to consider a deeper refactor of TracingChannel. It's still not marked stable, so I think it may be a good idea to steer it in a more flexible direction where we can build a wider variety of specializations more flexibly. 🤔
    return start.runStores(context, () => {
      try {
        const result = ReflectApply(fn, thisArg, args);
        // TODO: Should tracePromise just always do this?
There was a problem hiding this comment.
No, because of thenables. That's why it currently does the PromiseResolve(...), though that isn't quite right either. What we should be doing is calling promise.then(resolve, reject) directly rather than PromisePrototypeThen(result, resolve, reject), as the latter assumes a native promise; hence the PromiseResolve(...) to convert the value into one if it is not one already.
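To make the thenable concern concrete, here is a small sketch in plain JavaScript. It uses the public Promise.resolve in place of the internal PromiseResolve primordial, and the thenable object is a made-up example, not anything from the PR.

```javascript
// A made-up userland thenable: it has a then() method but is not a native Promise.
const thenable = {
  then(onFulfilled) {
    onFulfilled(42);
  },
};

// Promise.resolve() wraps a non-native thenable in a new native Promise,
// so a caller would no longer receive the object the function returned.
const wrapped = Promise.resolve(thenable);
const identityChanged = wrapped !== thenable;
const isNativePromise = wrapped instanceof Promise;

// Calling then() on the value itself observes the outcome without replacing it.
let observed;
thenable.then((value) => { observed = value; });
```

The trade-off in the thread is visible here: converting guarantees a native promise to work with, while calling then() on the value directly keeps the original return value intact.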
There was a problem hiding this comment.
What about functions that can return both, though? Sometimes the caller handles that, and changing the return type may be unexpected. In some cases we end up having to manually do a traceSync and handle the promise. Relying on duck-typing might be more flexible, no? It just feels weird to have this TracingChannel in Node when in at least half of the cases we need to do everything manually because it's too rigid. I guess an alternative would be to keep the individual specialized ones and add a traceAny or similar that handles all cases automatically in a less strict way. It could even have an optional position parameter so that it can also do callbacks (there are so many cases where functions can be all 3 at the same time).
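As a rough illustration of the duck-typing idea, the sketch below layers a helper on top of the existing traceSync. The name traceMaybeThenable and the channel name example:maybe are assumptions for this example only; this is not the PR's #traceMaybePromise and not the proposed traceAny.

```javascript
const dc = require('node:diagnostics_channel');

// Hypothetical helper, not part of Node.js or of this PR.
function traceMaybeThenable(channels, fn, context = {}, thisArg, ...args) {
  // traceSync publishes start/end (and error on throw) around the call itself.
  const result = channels.traceSync(fn, context, thisArg, ...args);

  // Plain values are returned untouched.
  if (typeof result?.then !== 'function') return result;

  // Duck-typed thenable: observe settlement, but hand back the original object.
  result.then(
    (value) => {
      context.result = value;
      channels.asyncStart.publish(context);
      channels.asyncEnd.publish(context);
    },
    (error) => {
      context.error = error;
      channels.error.publish(context);
      channels.asyncStart.publish(context);
      channels.asyncEnd.publish(context);
    },
  );
  return result;
}

const channels = dc.tracingChannel('example:maybe');
let settled = 0;
dc.subscribe('tracing:example:maybe:asyncEnd', () => { settled++; });

const plain = traceMaybeThenable(channels, () => 7);
const thenable = { then(onFulfilled) { onFulfilled('done'); } };
const returned = traceMaybeThenable(channels, () => thenable);
```

The return type is preserved in both cases. One caveat a real implementation would have to consider: attaching a rejection handler to a native promise marks it as handled, which can hide an unhandled rejection from the caller.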
There was a problem hiding this comment.
It was originally built to just be the minimum viable implementation to get moving with the pattern. It was never intended to be universal. Now that we've learned more about the weaknesses I think it's better to just make a bunch of specialization classes rather than trying to squeeze a complicated fractal of possibilities into a single implementation. That'd just kill the performance. My original thinking was we'd specialize with more methods, but it's turning out that we really need different channel sets for different situations too.
    const nextChannel = this.#nextChannel ||= tracingChannel({
      start: channel(start.name.slice(0, -6) + ':next:start'),
      end: channel(end.name.slice(0, -4) + ':next:end'),
      asyncStart: channel(asyncStart.name.slice(0, -11) + ':next:asyncStart'),
      asyncEnd: channel(asyncEnd.name.slice(0, -9) + ':next:asyncEnd'),
      error: channel(error.name.slice(0, -6) + ':next:error'),
    });
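The slice offsets in the hunk above correspond to the lengths of the event suffixes in the documented tracing:${name}:${event} naming scheme. The sketch below reproduces the derivation on plain strings for a made-up operation name my-op, computing each offset from the suffix instead of hard-coding it.

```javascript
// Channel names that tracingChannel('my-op') uses under the documented scheme.
const names = {
  start: 'tracing:my-op:start',
  end: 'tracing:my-op:end',
  asyncStart: 'tracing:my-op:asyncStart',
  asyncEnd: 'tracing:my-op:asyncEnd',
  error: 'tracing:my-op:error',
};

// Strip the ':<event>' suffix and append ':next:<event>' to get the per-iteration name.
const derived = {};
const offsets = {};
for (const [event, name] of Object.entries(names)) {
  const suffixLength = `:${event}`.length; // 6, 4, 11, 9 and 6 respectively
  offsets[event] = -suffixLength;
  derived[event] = `${name.slice(0, -suffixLength)}:next:${event}`;
}
```

For example, derived.start is 'tracing:my-op:next:start'. The derivation only holds as long as the main channels follow that naming scheme, which is part of what makes it feel implicit.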
There was a problem hiding this comment.
Not a huge fan of magically deriving these from the main channels. Could we perhaps implement this traceIterator functionality as an entirely other class which we could simply construct with two separate TracingChannel instances for the overall execution and the per-yield execution?
I've been thinking we might want to split up TracingChannel into specializations anyway, something like: SyncTracingChannel, CallbackTracingChannel, PromiseTracingChannel, and a new IteratorTracingChannel?
There was a problem hiding this comment.
> Not a huge fan of magically deriving these from the main channels. Could we perhaps implement this traceIterator functionality as an entirely other class which we could simply construct with two separate TracingChannel instances for the overall execution and the per-yield execution?
I agree and I'm also not a fan, but it felt like the simplest approach.
Are you suggesting something like this instead?
    const ctx = {}
    const iter = traceSync(fn, ctx)
    return traceIterator(iter, ctx)
I thought about that, but I wasn't sure, since it would be the first time that TracingChannel instruments anything that is not a function (it still patches the functions on the iterator, but that's more magic than usual). It also makes it impossible to reimplement differently later, although in practice I'm not sure that would happen. It would also potentially be inconsistent across implementors: one could use :yield, another could use :next, etc. TracingChannel would then lose the consistency it was built for, and subscribers could not be built generically.
Worth noting: I'm adding the same functionality to Orchestrion-JS right now and it's unclear whether the FunctionType should be Iterator or not. If this is split, then I guess it would probably be a separate IteratorType?
> I've been thinking we might want to split up TracingChannel into specializations anyway, something like: SyncTracingChannel, CallbackTracingChannel, PromiseTracingChannel, and a new IteratorTracingChannel?
I prefer the one class because we often have to mix functionalities or use the underlying channels directly, for example when functions accept a callback but can also return a promise or a sync value.
There was a problem hiding this comment.
No, I did not mean to run the sync trace and then wrap another thing around the iterator separately. I meant to have something like new IteratorTracingChannel(execChannels, yieldChannels) which would take two sets of channels for the two separate parts of the execution but then would have a single trace(...) method which would do all of that internally.
As the interfaces are currently though, they don't really compose well. That's why I was thinking we make full separate classes for this. In a case like mixed promise/callback functions we could simply have another type which expresses that pattern.
The current TracingChannel has a bunch of issues I'd like to address before we consider it stable. The composability is one issue, but there's also the lack of wrap without execution so you can build the wrapper closures once rather than on every execution, probably using some context builder function to take the inputs and do something with them to produce the context object.
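One possible reading of the two-channel-set idea, sketched for the simplest case only (a sync function returning a sync iterator). IteratorTracingChannel is hypothetical and does not exist in Node.js; the channel names and the reuse of a single context object are assumptions for the example.

```javascript
const dc = require('node:diagnostics_channel');

// Hypothetical class, not part of Node.js: sync function, sync iterator only.
class IteratorTracingChannel {
  constructor(execChannels, yieldChannels) {
    this.execChannels = execChannels;   // traces the call that creates the iterator
    this.yieldChannels = yieldChannels; // traces every next() call
  }

  trace(fn, context = {}, thisArg, ...args) {
    const iterator = this.execChannels.traceSync(fn, context, thisArg, ...args);
    const yieldChannels = this.yieldChannels;
    const next = iterator.next;
    // Shadow next() on the returned iterator so each step is traced too.
    iterator.next = function(...nextArgs) {
      return yieldChannels.traceSync(next, context, this, ...nextArgs);
    };
    return iterator;
  }
}

const tracer = new IteratorTracingChannel(
  dc.tracingChannel('example:exec'),
  dc.tracingChannel('example:next'),
);

let nextCalls = 0;
dc.subscribe('tracing:example:next:start', () => { nextCalls++; });

function* numbers() { yield 1; yield 2; }
const values = [...tracer.trace(numbers)];
```

Both channel sets are supplied explicitly at construction time, so nothing is derived from names and nothing is created lazily inside trace().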
There was a problem hiding this comment.
A few questions:
- Would execChannels and yieldChannels be themselves specialized? So do you have to decide upfront if the function is sync or async and if the returned iterator is sync or async? And if so, how do you deal with cases where they are mixed? Or would there be additional looser specialized types for these cases like MaybePromiseTracingChannel?
- Is this a blocker to move forward with iterator support?
- What would this look like in Orchestrion-JS? It's also not composable in its current state as there is a single FunctionKind value.
There was a problem hiding this comment.
- I was literally just thinking those would be the same nameOrChannels input that TracingChannel receives currently. For differentiating between sync or async iterators I think we should have separate sync and async *IteratorTracingChannel types. For mixed cases--that wouldn't really be a thing, would it? It would not be a standard Iterator or AsyncIterator if it was not consistent.
- The full design I'm describing should probably not be a complete blocker, but I do think we should at least get that dynamically generated nextChannel out of the hot-path. I would prefer if we split out this particular case to start the momentum of converting to specializations.
- That's specifically why I think we need specialization types. It currently can't cleanly map many patterns like mixed callback/promise, so we should have specializations which cover that case and then just have Orchestrion match the kinds to the appropriate specialization for the scenario.
There was a problem hiding this comment.
> I was literally just thinking those would be the same nameOrChannels input that TracingChannel receives currently. For differentiating between sync or async iterators I think we should have separate sync and async *IteratorTracingChannel types.
The thing is you can have a sync function returning an async iterator, an async function returning an async iterator, a sync function returning a sync iterator and an async function returning a sync iterator. Technically you could even have callbacks involved in all of this. Even if this was specialized, it needs to be specialized on multiple fronts. That's why I thought the proposal was something like IteratorTracingChannel(PromiseChannel, SyncChannel) because otherwise if the sub-channels are passed directly there is no way to know the combination unless they are all handled automatically or it ends up being something like AsyncIteratorButSyncFunctionTracingChannel, or some different combinations of methods (like if AsyncIteratorTracingChannel would have a tracePromise and traceSync function).
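The four combinations listed above can be written down directly in JavaScript. The function names and the classify helper below are illustrative only; the helper duck-types the immediate return value the way an instrumentation layer might have to.

```javascript
function* syncFnSyncIter() { yield 1; }                          // sync call, sync iterator
async function* syncFnAsyncIter() { yield 1; }                   // sync call, async iterator
async function asyncFnSyncIter() { return syncFnSyncIter(); }    // promise of a sync iterator
async function asyncFnAsyncIter() { return syncFnAsyncIter(); }  // promise of an async iterator

// Illustrative helper: classify what the call returned, before anything is awaited.
function classify(value) {
  if (typeof value?.then === 'function') return 'thenable';
  if (typeof value?.[Symbol.asyncIterator] === 'function') return 'async iterable';
  if (typeof value?.[Symbol.iterator] === 'function') return 'sync iterable';
  return 'other';
}

const kinds = [syncFnSyncIter, syncFnAsyncIter, asyncFnSyncIter, asyncFnAsyncIter]
  .map((fn) => classify(fn()));
```

The last two cases are indistinguishable at call time: the kind of iterator is only known once the promise settles, which is the sense in which the specialization has two axes.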
> For mixed cases--that wouldn't really be a thing, would it? It would not be a standard Iterator or AsyncIterator if it was not consistent.
Unfortunately it tends to be a case in enough places that it's a nuisance even today before any iterator support. People do all sorts of weird stuff in the wild. I guess the argument could be made to just do the sub-channels manually at that point, but outside of APM vendors not many people are familiar enough to do it right.
> The full design I'm describing should probably not be a complete blocker, but I do think we should at least get that dynamically generated nextChannel out of the hot-path.
It's out of the hot path in the sense that it's only created on the first call. So in that sense it's not any more overhead than creating the main channel itself in the first place. Yes it's done lazily so it happens in the hot path once, but that shouldn't be enough to have any sort of real life impact.
> I would prefer if we split out this particular case to start the momentum of converting to specializations.
Happy to do that, just not sure what the API would be, I guess it will require a bit more thinking/discussing. I think I would actually call figuring that out a blocker though, because I don't want to add something that will instantly be deprecated.
> That's specifically why I think we need specialization types. It currently can't cleanly map many patterns like mixed callback/promise, so we should have specializations which cover that case and then just have Orchestrion match the kinds to the appropriate specialization for the scenario.
Do you have an example of what this could look like? Right now there is only FunctionKind: "Sync" | "Async" | "Callback". Would you change that completely or add to it? What would a new format look like?
There was a problem hiding this comment.
It's created once, but it's checked every single time.
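Both sides of this point can be seen in a toy comparison. The two classes below are made up for illustration and use a plain object as a stand-in for the derived channel set; the counters exist only to make the behaviour observable.

```javascript
// Lazy variant: mirrors the `this.#nextChannel ||= ...` pattern from the hunk.
class LazyChannels {
  #next;
  created = 0;
  checks = 0;

  get() {
    this.checks++; // the `||=` test runs on every call
    return this.#next ||= this.#create();
  }

  #create() {
    this.created++; // runs only on the first call
    return {};
  }
}

// Eager variant: the object is built in the constructor, so get() has no branch.
class EagerChannels {
  #next = {};
  get() { return this.#next; }
}

const lazy = new LazyChannels();
const first = lazy.get();
lazy.get();
const third = lazy.get();
const eager = new EagerChannels();
const eagerSame = eager.get() === eager.get();
```

After three calls, created is 1 and checks is 3: the allocation happens once, while the check stays on the call path. Whether that check matters in practice is the open disagreement in the thread.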
For the iterator consistency, I meant within the iterator itself, not how it is returned. Between yields they're all going to be promises or not. For the weird cases we may want to do some nesting thing like you suggest, but for the common cases we should try to have single specializations which handle the full behaviour with a single thing.
One possible thing we could try is having a composable builder thing where you can separately specify how to instrument the call and how to instrument the return value, if relevant. Something like:
channel
.context(context)
.wrapSync(fn)
.wrapIterator()
.call(thisArg, ...args)
Most tracing channel use cases can already be handled by a combination of traceSync, tracePromise and traceCallback. However, functions returning an iterator need the iterator to also be instrumented. This is because, for the operation executed by the function to actually be done, the iterator needs to reach its done state, and to collect the result, the chunks from each iteration must be available.

This PR adds support for a new traceIterator function that wraps a function returning an iterator so that both the function and every iteration of the iterator are traced.

Open questions:
- Should traceIterator just wrap iterators and not functions? This would deviate from other traceX methods but would remove the need to handle sync and async with the same function.
- Should tracePromise also handle synchronous return values similar to the new #traceMaybePromise? Right now it converts to a promise, which may incorrectly change the return type.