2024-02-15 kernel meeting notes #117
zachschuermann started this conversation in General
summary
merged robert's PR!! discussion on nick's PR, and some concern about visitor complexity.
action items
PRs in flight:
attendees
@roeap @zachschuermann @nicklan @ryan-johnson-databricks @hntd187 @vkorukanti
notes stream
PRs in flight (listed above).
need to create filter data engine side
after nick + robert PR then we unblock:
for nick's PR:
Box<dyn List<item>>
ryan: concern. data visitor pattern feeling very complex. might be going in a bad direction.
robert: should we just ask for a copy every time? do we want ownership in the first place?
main concern is StringType. anything that implements Copy is easy. Memory backing the thing is owned by rust. For FFI we would have memory backed by non-rust.
nick: less about data format (always have to convert between engine/kernel format). more about allocations/frees across the boundary. kernel can do all allocations for the things it's building and rust will track lifetimes of that. a solution with the engine making copies may lead to a new kind of complexity in which we have to move lifetime tracking from the engine over to rust. main goal of the visitor model was making the lifetimes much more sane :) - a core part of FFI
ryan: should we keep trivial cast warning enabled? nope
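A minimal sketch of the visitor idea from the discussion of nick's PR above, to make the lifetime argument concrete. The trait and type names (DataVisitor, CollectPaths, engine_visit) are hypothetical and are not the kernel's actual API. The point it tries to show: the engine only lends values for the duration of a call, and the kernel copies what it needs into memory that Rust allocates and frees.

```rust
/// Hypothetical visitor trait: the engine calls these once per value.
/// Borrowed arguments are only valid for the duration of the call.
pub trait DataVisitor {
    fn visit_long(&mut self, row: usize, value: i64);
    fn visit_str(&mut self, row: usize, value: &str);
}

/// Hypothetical kernel-side visitor that keeps only Rust-owned data.
#[derive(Default)]
pub struct CollectPaths {
    pub paths: Vec<String>,
    pub sizes: Vec<i64>,
}

impl DataVisitor for CollectPaths {
    fn visit_long(&mut self, _row: usize, value: i64) {
        // Copy types are the easy case: the value is stored directly.
        self.sizes.push(value);
    }

    fn visit_str(&mut self, _row: usize, value: &str) {
        // The bytes behind `value` may live in engine (non-Rust) memory,
        // so copy them into a String that Rust allocates and later frees.
        self.paths.push(value.to_owned());
    }
}

/// Stand-in for an engine: it owns its data and drives the visitor over it.
pub fn engine_visit(rows: &[(&str, i64)], visitor: &mut dyn DataVisitor) {
    for (row, (path, size)) in rows.iter().enumerate() {
        visitor.visit_str(row, path);
        visitor.visit_long(row, *size);
    }
}
```

With this shape nothing the engine allocated outlives the visit call, so no engine-side lifetime has to be tracked in Rust. The cost is one copy per string.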
From last time
nick: should we just parse all stats at once? we should have a clear picture of when we would use this? add microbenchmark
return result in C: (1) slot in parameters, return code or (2) null means error, not-null means success
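The two C return conventions above could look roughly like this on the Rust side. This is only a sketch: Snapshot, load_snapshot, and the function names are hypothetical, and the export attribute (no_mangle) is omitted so the snippet stays edition-independent.

```rust
use std::os::raw::c_int;

/// Hypothetical opaque handle handed across the FFI boundary.
pub struct Snapshot {
    pub version: u64,
}

/// Hypothetical fallible operation that both conventions wrap.
fn load_snapshot(version: i64) -> Result<Snapshot, String> {
    if version < 0 {
        Err("negative version".to_string())
    } else {
        Ok(Snapshot { version: version as u64 })
    }
}

/// Option (1): result goes into an out-parameter slot, status is the return code.
/// Returns 0 on success, nonzero on error; on error the slot is set to null.
pub unsafe extern "C" fn snapshot_open_rc(version: i64, out: *mut *mut Snapshot) -> c_int {
    if out.is_null() {
        return -1;
    }
    match load_snapshot(version) {
        Ok(s) => {
            unsafe { *out = Box::into_raw(Box::new(s)) };
            0
        }
        Err(_) => {
            unsafe { *out = std::ptr::null_mut() };
            1
        }
    }
}

/// Option (2): null means error, non-null means success.
pub extern "C" fn snapshot_open_ptr(version: i64) -> *mut Snapshot {
    match load_snapshot(version) {
        Ok(s) => Box::into_raw(Box::new(s)),
        Err(_) => std::ptr::null_mut(),
    }
}

/// Hands ownership back to Rust so the allocation is freed on the Rust side.
pub unsafe extern "C" fn snapshot_free(ptr: *mut Snapshot) {
    if !ptr.is_null() {
        drop(unsafe { Box::from_raw(ptr) });
    }
}
```

Option (1) leaves room for distinct error codes. Option (2) is shorter to call but can only signal that something failed, not what.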
module naming: rename defaultclient/simpleclient. good to have non-async client.
from last time: nick's scan result. doing selection vector now? yep! has a large boolean vector with it. Allocating with lots of false is fast. True takes a while.
from last time: How can we flatten roaring bitmap? Maybe build the dumb thing now (for loop) and take this on as an optimization later? Roaring bitmap stores indices but we would want a sparse array.
Aside (zach): does allocating for a selection vector cause memory management issues?
Allocate big vector, then set indices in the treemap.
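The "dumb thing" (for loop) could be sketched as below. A BTreeSet<u64> stands in for the roaring treemap so the snippet has no external dependencies, and flatten_indices is a hypothetical name. Whether a true entry means "selected" or "deleted" is left to the caller, since the notes do not pin that down.

```rust
use std::collections::BTreeSet;

/// Flatten a set of row indices into a dense boolean vector of length `len`.
/// Starts from an all-false vector, then sets `true` at every stored index.
/// Indices at or beyond `len` are ignored.
pub fn flatten_indices(indices: &BTreeSet<u64>, len: usize) -> Vec<bool> {
    // Allocate the big vector once, all false.
    let mut dense = vec![false; len];
    for &idx in indices {
        // Convert u64 -> usize without truncating, then bounds-check.
        if let Some(slot) = usize::try_from(idx).ok().and_then(|i| dense.get_mut(i)) {
            *slot = true;
        }
    }
    dense
}
```

This does one pass over the stored indices after a single allocation, which is the baseline any later optimization would be measured against.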
Ryan: 3 things
FFI version of EngineInterface
From nick/robert: unsafe in kernel used in one place to do Box::into_raw
can we work around this? Fixed: we can implement a method that takes the box rather than the type itself. self is a box of the thing, then into_any, then downcast. yay! Box<trait> -> Box<any> -> Box<concrete>
then turn into record batch. No transmute.
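A minimal sketch of the Box<trait> -> Box<any> -> Box<concrete> chain described above, using only std::any::Any. The trait and struct names here (EngineData, ArrowLikeBatch, OtherData, into_batch) are hypothetical stand-ins, not the kernel's actual types.

```rust
use std::any::Any;

/// Hypothetical trait for engine-provided data.
pub trait EngineData {
    /// Takes the box itself (not `&self`), so the conversion needs no
    /// raw pointers and no unsafe code.
    fn into_any(self: Box<Self>) -> Box<dyn Any>;
}

/// Hypothetical concrete type standing in for a record batch wrapper.
pub struct ArrowLikeBatch {
    pub rows: Vec<i64>,
}

impl EngineData for ArrowLikeBatch {
    fn into_any(self: Box<Self>) -> Box<dyn Any> {
        self
    }
}

/// A second implementor, used to show that a mismatched downcast fails cleanly.
pub struct OtherData;

impl EngineData for OtherData {
    fn into_any(self: Box<Self>) -> Box<dyn Any> {
        self
    }
}

/// Box<dyn EngineData> -> Box<dyn Any> -> Box<ArrowLikeBatch>, all in safe code.
/// Returns None if the data is not actually an ArrowLikeBatch.
pub fn into_batch(data: Box<dyn EngineData>) -> Option<Box<ArrowLikeBatch>> {
    data.into_any().downcast::<ArrowLikeBatch>().ok()
}
```

Because a wrong type yields None instead of undefined behaviour, the caller can turn it into an ordinary error.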