A Node.js binding for the RocksDB library.
- Supports optimistic and pessimistic transactions
- Hybrid sync/async data retrieval
- Range queries return an iterable with array-like methods and lazy evaluation
- Transaction log system for recording transaction-related data
- Custom stores provide the ability to override default database interactions
- Efficient binary key and value encoding
- Designed for Node.js and Bun on Linux, macOS, and Windows
import { RocksDatabase, type Transaction } from '@harperdb/rocksdb-js';

const db = RocksDatabase.open('/path/to/db');
for (const key of ['a', 'b', 'c', 'd', 'e']) {
await db.put(key, `value ${key}`);
}
console.log(await db.get('b')); // `value b`
for (const { key, value } of db.getRange({ start: 'b', end: 'd' })) {
console.log(`${key} = ${value}`);
}
await db.transaction(async (txn: Transaction) => {
await txn.put('f', 'value f');
await txn.remove('c');
});

Creates a new database instance.
- `path: string` - The path to write the database files to. This path does not need to exist, but the parent directories do.
- `options: object` (optional)
  - `disableWAL: boolean` - Whether to disable the RocksDB write ahead log.
  - `name: string` - The column family name. Defaults to `"default"`.
  - `noBlockCache: boolean` - When `true`, disables the block cache. Block caching is enabled by default and the cache is shared across all database instances.
  - `parallelismThreads: number` - The number of background threads to use for flush and compaction. Defaults to `1`.
  - `pessimistic: boolean` - When `true`, throws conflict errors when they occur instead of waiting until commit. Defaults to `false`.
  - `store: Store` - A custom store that handles all interaction between the `RocksDatabase` or `Transaction` instances and the native database interface. See Custom Store for more information.
  - `transactionLogMaxAgeThreshold: number` - The threshold for the transaction log file's last modified time to be older than the retention period before it is rotated to the next sequence number. Value must be between `0.0` and `1.0`. A threshold of `0.0` disables the age check. Defaults to `0.75`.
  - `transactionLogMaxSize: number` - The maximum size of a transaction log file. If a log file is empty, the first log entry is always added, even if it's larger than the max size. If a log file is not empty and the entry is larger than the space available, the log file is rotated to the next sequence number. Defaults to 16 MB.
  - `transactionLogRetention: string | number` - The number of minutes to retain transaction logs before purging. Defaults to `'3d'` (3 days).
  - `transactionLogsPath: string` - The path to store transaction logs. Defaults to `"${db.path}/transaction_logs"`.
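For example, a minimal sketch that combines several of these options (the values shown are illustrative, not recommendations):

import { RocksDatabase } from '@harperdb/rocksdb-js';

const db = new RocksDatabase('/path/to/db', {
  name: 'users',                 // open the "users" column family instead of "default"
  parallelismThreads: 4,         // extra background flush/compaction threads
  pessimistic: true,             // surface conflicts as they occur, not at commit
  transactionLogRetention: '1d'  // purge transaction logs after one day
});
db.open();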
Closes a database. This function can be called multiple times and will only close an opened database. A database instance can be reopened once it's closed.
const db = RocksDatabase.open('foo');
db.close();

Sets global database settings.
- `options: object`
  - `blockCacheSize: number` - The amount of memory in bytes to use to cache uncompressed blocks. Defaults to 32MB. Setting it to `0` (zero) disables the block cache for databases opened in the future; the existing block cache for any opened databases is resized immediately. Negative values throw an error.
RocksDatabase.config({
blockCacheSize: 100 * 1024 * 1024 // 100MB
});

Opens the database at the given path. This must be called before performing any data operations.
import { RocksDatabase } from '@harperdb/rocksdb-js';
const db = new RocksDatabase('path/to/db');
db.open();

There's also a static open() method for convenience that does the same thing:

const db = RocksDatabase.open('path/to/db');

Asynchronously removes all data in the current database.
- `options: object`
  - `batchSize?: number` - The number of records to remove at once. Defaults to `10000`.
Returns the number of entries that were removed.
Note: This does not remove data from other column families within the same database path.
for (let i = 0; i < 10; i++) {
db.putSync(`key${i}`, `value${i}`);
}
const entriesRemoved = await db.clear();
console.log(entriesRemoved); // 10

Synchronous version of db.clear().
- `options: object`
  - `batchSize?: number` - The number of records to remove at once. Defaults to `10000`.
for (let i = 0; i < 10; i++) {
db.putSync(`key${i}`, `value${i}`);
}
const entriesRemoved = db.clearSync();
console.log(entriesRemoved); // 10

Retrieves the value for a given key. If the key does not exist, it will resolve undefined.
const result = await db.get('foo');
assert.equal(result, 'foo');

If the value is in the memtable or block cache, get() will immediately return
the value synchronously instead of returning a promise.
const result = db.get('foo');
const value = result instanceof Promise ? (await result) : result;
assert.equal(value, 'foo');

Note that all errors are returned as rejected promises.
Synchronous version of get().
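For example, assuming that, like get(), a missing key yields undefined:

db.putSync('foo', 'bar');
console.log(db.getSync('foo'));     // `bar`
console.log(db.getSync('missing')); // `undefined`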
Retrieves all keys within a range.
for (const key of db.getKeys()) {
console.log(key);
}

Retrieves the number of keys within a range.
const total = db.getKeysCount();
const range = db.getKeysCount({ start: 'a', end: 'z' });

Returns the current time as a monotonically increasing timestamp in milliseconds, represented as a decimal number.
const ts = db.getMonotonicTimestamp();
console.log(ts); // 1764307857213.739

Returns a number representing a Unix timestamp of the oldest unreleased snapshot.
Snapshots are only created during transactions. When the database is opened in optimistic mode (the default), the snapshot will be created on the first read. When the database is opened in pessimistic mode, the snapshot will be created on the first read or write.
import { setTimeout } from 'node:timers/promises';

console.log(db.getOldestSnapshotTimestamp()); // returns `0`, no snapshots
const promise = db.transaction(async (txn) => {
// perform a read to create a snapshot (optimistic mode snapshots on first read)
await txn.get('foo');
await setTimeout(100);
});
console.log(db.getOldestSnapshotTimestamp()); // returns `1752102248558`
await promise;
// transaction completes, snapshot released
console.log(db.getOldestSnapshotTimestamp()); // returns `0`, no snapshots

Retrieves a range of keys and their values. Supports both synchronous and asynchronous iteration.
// sync
for (const { key, value } of db.getRange()) {
console.log({ key, value });
}
// async
for await (const { key, value } of db.getRange()) {
console.log({ key, value });
}
// key range
for (const { key, value } of db.getRange({ start: 'a', end: 'z' })) {
console.log({ key, value });
}

Creates a new buffer with the contents of defaultBuffer that can be accessed
across threads. This is useful for storing data such as flags, counters, or
any ArrayBuffer-based data.
- `options?: object`
  - `callback?: () => void` - An optional callback that is called when `notify()` on the returned buffer is called.
Returns a new ArrayBuffer with two additional methods:
- `notify()` - Invokes the `options.callback`, if specified.
- `cancel()` - Removes the callback; future `notify()` calls do nothing.
Note: If a shared buffer already exists for the given key, the returned ArrayBuffer will reference this existing shared buffer. Once all ArrayBuffer instances have gone out of scope and been garbage collected, the underlying memory and notify callback will be freed.
const done = new Uint8Array(
  db.getUserSharedBuffer('isDone', new ArrayBuffer(1))
);
done[0] = 0;
if (done[0] !== 1) {
  done[0] = 1;
}

const incrementer = new BigInt64Array(
db.getUserSharedBuffer('next-id', new BigInt64Array(1).buffer)
);
incrementer[0] = 1n;
function getNextId() {
return Atomics.add(incrementer, 0, 1n);
}

Stores a value for a given key.
await db.put('foo', 'bar');

Synchronous version of put().
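For example, the write is fully applied by the time putSync() returns:

db.putSync('foo', 'bar');
console.log(db.getSync('foo')); // `bar`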
Removes the value for a given key.
await db.remove('foo');

Synchronous version of remove().
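For example:

db.putSync('foo', 'bar');
db.removeSync('foo');
console.log(db.getSync('foo')); // `undefined`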
Executes all database operations within the specified callback within a single transaction. If the callback completes without error, the database operations are automatically committed. However, if an error is thrown during the callback, all database operations will be rolled back.
import type { Transaction } from '@harperdb/rocksdb-js';
await db.transaction(async (txn: Transaction) => {
await txn.put('foo', 'baz');
});

Additionally, you may pass the transaction into any database data method:
await db.transaction(async (transaction: Transaction) => {
await db.put('foo', 'baz', { transaction });
});

Note that db.transaction() returns whatever value the transaction callback
returns:
const isBar = await db.transaction(async (txn: Transaction) => {
const foo = await txn.get('foo');
return foo === 'bar';
});
console.log(isBar ? 'Foo is bar' : 'Foo is not bar');

Executes a transaction callback and commits synchronously. Once the transaction callback returns, the commit is executed synchronously and blocks the current thread until finished.
Inside a synchronous transaction, use getSync(), putSync(), and
removeSync().
import type { Transaction } from '@harperdb/rocksdb-js';
db.transactionSync((txn: Transaction) => {
txn.putSync('foo', 'baz');
});

The transaction callback is passed a Transaction instance, which contains all of the same data operation methods as the RocksDatabase instance plus:
- `txn.abort()`
- `txn.commit()`
- `txn.commitSync()`
- `txn.getTimestamp()`
- `txn.id`
- `txn.setTimestamp(ts)`
Rolls back and closes the transaction. This method is automatically called after the transaction callback returns, so you shouldn't need to call it, but it's ok to do so. Once called, no further transaction operations are permitted.
Commits and closes the transaction. This is a non-blocking operation and runs on a background thread. Once called, no further transaction operations are permitted.
Synchronously commits and closes the transaction. This is a blocking operation on the main thread. Once called, no further transaction operations are permitted.
Retrieves the transaction start timestamp in seconds as a decimal. It defaults to the time at which the transaction was created.
Type: number
The transaction ID represented as a 32-bit unsigned integer. Transaction IDs are unique to the RocksDB database path, regardless of the database name/column family.
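For example, a quick sketch that inspects both the ID and the start timestamp inside a transaction:

await db.transaction(async (txn) => {
  console.log(txn.id);             // e.g. `7`, a 32-bit unsigned integer
  console.log(txn.getTimestamp()); // e.g. `1752102248.558`, seconds as a decimal
});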
Overrides the transaction start timestamp. If called without a timestamp, it will set the timestamp to the current time. The value must be in seconds, with sub-second precision after the decimal point.
await db.transaction(async (txn) => {
txn.setTimestamp(Date.now() / 1000);
});

The 'aftercommit' event is emitted after a transaction has been committed and the transaction has fully completed, including waiting for the async worker thread to finish.
- `result: object`
  - `next: null`
  - `last: null`
  - `txnId: number` - The id of the transaction that was just committed.
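As a sketch, assuming these transaction events are delivered through the same listener API described in the Events section below (verify the exact subscription mechanism for your version):

db.addListener('aftercommit', (result) => {
  console.log(`transaction ${result.txnId} fully committed`);
});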
The 'beforecommit' event is emitted before a transaction is about to be
committed.
The 'begin-transaction' event is emitted right before the transaction function
is executed.
The 'committed' event is emitted after the transaction has been written. When
this event is emitted, the transaction is still cleaning up. If you need to know
when the transaction is fully complete, use the 'aftercommit' event.
rocksdb-js provides an EventEmitter-like API that lets you asynchronously
notify events to one or more synchronous listener callbacks. Events are scoped
by database path.
Unlike EventEmitter, events are emitted asynchronously, but in the same order
that the listeners were added.
const callback = (name) => console.log(`Hi from ${name}`);
db.addListener('foo', callback);
db.notify('foo');
db.notify('foo', 'bar');
db.removeListener('foo', callback);

Adds a listener callback for the specified key.
db.addListener('foo', () => {
// this callback will be executed asynchronously
});
db.addListener(1234, (...args) => {
console.log(args);
});

Gets the number of listeners for the given key.
db.listeners('foo'); // 0
db.addListener('foo', () => {});
db.listeners('foo'); // 1

Alias for addListener().
Adds a one-time listener, then automatically removes it.
db.once('foo', () => {
console.log('This will only ever be called once');
});

Removes an event listener. You must specify the exact same callback that was
used in addListener().
const callback = () => {};
db.addListener('foo', callback);
db.removeListener('foo', callback); // returns `true`
db.removeListener('foo', callback); // returns `false`, callback not found

Alias for removeListener().
Calls all listeners for the given key. Returns true if any callbacks were
found, otherwise false.
Unlike EventEmitter, events are emitted asynchronously, but in the same order
that the listeners were added.
You can optionally emit one or more arguments. Note that the arguments must be
serializable. In other words, undefined, null, strings, booleans, numbers,
arrays, and objects are supported.
db.notify('foo');
db.notify(1234);
db.notify({ key: 'bar' }, { value: 'baz' });

rocksdb-js includes a handful of functions for thread-safe, mutually exclusive execution.
Returns true if the database has a lock for the given key, otherwise false.
db.hasLock('foo'); // false
db.tryLock('foo'); // true
db.hasLock('foo'); // true

Attempts to acquire a lock for a given key. If the lock is available, the
function returns true and the optional onUnlocked callback is never called.
If the lock is not available, the function returns false and the onUnlocked
callback is queued until the lock is released.
When a database is closed, all locks associated with it will be unlocked.
db.tryLock('foo', () => {
console.log('never fired');
}); // true, callback ignored
db.tryLock('foo', () => {
console.log('hello world');
}); // false, already locked, callback queued
db.unlock('foo'); // fires second lock callback

The onUnlocked callback function can be used as a signal to retry acquiring the lock:
function doSomethingExclusively() {
// if lock is unavailable, queue up callback to recursively retry
if (db.tryLock('foo', () => doSomethingExclusively())) {
// lock acquired, do something exclusive
db.unlock('foo');
}
}

Releases the lock on the given key and calls any queued onUnlocked callback
handlers. Returns true if the lock was released or false if the lock did
not exist.
db.tryLock('foo');
db.unlock('foo'); // true
db.unlock('foo'); // false, already unlocked

Runs a function with guaranteed exclusive access across all threads.
await db.withLock('key', async () => {
// do something exclusive
console.log(db.hasLock('key')); // true
});

If there is more than one simultaneous lock request, withLock() blocks each request until the lock becomes available.
await Promise.all([
db.withLock('key', () => {
console.log('first lock blocking for 100ms');
return new Promise(resolve => setTimeout(resolve, 100));
}),
db.withLock('key', () => {
console.log('second lock blocking for 100ms');
return new Promise(resolve => setTimeout(resolve, 100));
}),
db.withLock('key', () => {
console.log('third lock acquired');
})
]);

Note: If the callback throws an error, Node.js suppresses the error. Node.js 18.3.0 introduced a --force-node-api-uncaught-exceptions-policy flag which will cause errors to emit the 'uncaughtException' event. Future Node.js releases will enable this flag by default.
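Until that flag is the default, one defensive pattern is to capture errors inside the callback yourself and rethrow after the lock is released. A sketch, where doWork() is a hypothetical stand-in for your exclusive logic:

let failure;
await db.withLock('key', async () => {
  try {
    await doWork(); // hypothetical exclusive work
  } catch (err) {
    failure = err; // capture instead of letting Node.js suppress it
  }
});
if (failure) throw failure;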
A user-controlled API for logging transactions. This API is designed to be generic so that you can log gets, puts, and deletes, but also arbitrary entries.
Returns an array of log store names.
const names = db.listLogs();

Deletes transaction log files older than the transactionLogRetention (defaults to 3 days).
- `options: object`
  - `destroy?: boolean` - When `true`, deletes transaction log stores including all log sequence files on disk.
  - `name?: string` - The name of a store to limit the purging to.
Returns an array with the full path of each log file deleted.
const removed = db.purgeLogs();
console.log(`Removed ${removed.length} log files`);

Gets or creates a TransactionLog instance. Internally, the TransactionLog
interfaces with a shared transaction log store that is used by all threads.
Multiple worker threads can use the same log at the same time.
- `name: string | number` - The name of the log. Numeric log names are converted to a string.
const log1 = db.useLog('foo');
const log2 = db.useLog('foo'); // gets existing instance (e.g. log1 === log2)
const log3 = db.useLog(123);

Transaction instances also provide a useLog() method that binds the returned
transaction log to the transaction so you don't need to pass in the transaction
id every time you add an entry.
await db.transaction(async (txn) => {
const log = txn.useLog('foo');
log.addEntry(Buffer.from('hello'));
});

A TransactionLog lets you add arbitrary data bound to a transaction that is automatically written to disk right before the transaction is committed. You may add multiple entries per transaction. The underlying architecture is thread-safe.
- `log.addEntry()`
- `log.query()`
Adds an entry to the transaction log.
- `data: Buffer | Uint8Array` - The entry data to store. There is no inherent limit beyond what Node.js can handle.
- `transactionId: number` - A related transaction used to batch entries on commit.
const log = db.useLog('foo');
await db.transaction(async (txn) => {
log.addEntry(Buffer.from('hello'), txn.id);
});

If using txn.useLog() (instead of db.useLog()), you can omit the transaction id from addEntry() calls.
await db.transaction(async (txn) => {
const log = txn.useLog('foo');
log.addEntry(Buffer.from('hello'));
});

Returns an iterator that streams all log entries for the given filter.
- `options: object`
  - `start?: number` - The transaction start timestamp.
  - `end?: number` - The transaction end timestamp.
The iterator produces an object with the log entry timestamp and data.
- `data: Buffer` - The entry data.
- `timestamp: number` - The entry timestamp used to collate entries by transaction.
const log = db.useLog('foo');
const iter = log.query();
for (const entry of iter) {
console.log(entry);
}
const lastHour = Date.now() - (60 * 60 * 1000);
const rangeIter = log.query({ start: lastHour, end: Date.now() });
for (const entry of rangeIter) {
console.log(entry.timestamp, entry.data);
}

In general, you should use log.query() to query the transaction log. However,
if you need to load an entire transaction log into memory, you can use the
parseTransactionLog() utility function.
import { parseTransactionLog } from '@harperdb/rocksdb-js';

const everything = parseTransactionLog('/path/to/file.txnlog');
console.log(everything);

Returns an object containing all of the information in the log file.
- `size: number` - The size of the file.
- `version: number` - The log file format version.
- `entries: LogEntry[]` - An array of transaction log entries. Each entry contains:
  - `data: Buffer` - The entry data.
  - `flags: number` - Transaction related flags.
  - `length: number` - The size of the entry data.
  - `timestamp: number` - The entry timestamp.
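For example, a sketch that walks the parsed result (assuming parseTransactionLog is exported from @harperdb/rocksdb-js):

import { parseTransactionLog } from '@harperdb/rocksdb-js';

const { size, version, entries } = parseTransactionLog('/path/to/file.txnlog');
console.log(`v${version}, ${size} bytes, ${entries.length} entries`);
for (const { timestamp, flags, length, data } of entries) {
  console.log(timestamp, flags, length, data.toString());
}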
The store is a class that sits between the RocksDatabase or Transaction
instance and the native RocksDB interface. It owns the native RocksDB instance
along with various settings including encoding and the db name. It handles all
interactions with the native RocksDB instance.
The default Store contains the following methods which can be overridden:
- `constructor(path, options?)`
- `close()`
- `decodeKey(key)`
- `decodeValue(value)`
- `encodeKey(key)`
- `encodeValue(value)`
- `get(context, key, resolve, reject, txnId?)`
- `getCount(context, options?, txnId?)`
- `getRange(context, options?)`
- `getSync(context, key, options?)`
- `getUserSharedBuffer(key, defaultBuffer?)`
- `hasLock(key)`
- `isOpen()`
- `listLogs()`
- `open()`
- `putSync(context, key, value, options?)`
- `removeSync(context, key, options?)`
- `tryLock(key, onUnlocked?)`
- `unlock(key)`
- `useLog(context, name)`
- `withLock(key, callback?)`
To use it, extend the default Store and pass in an instance of your store
into the RocksDatabase constructor.
import { RocksDatabase, Store } from '@harperdb/rocksdb-js';
class MyStore extends Store {
get(context, key, resolve, reject, txnId) {
console.log('Getting:', key);
return super.get(context, key, resolve, reject, txnId);
}
putSync(context, key, value, options) {
console.log('Putting:', key);
return super.putSync(context, key, value, options);
}
}
const myStore = new MyStore('path/to/db');
const db = RocksDatabase.open(myStore);
await db.put('foo', 'bar');
console.log(await db.get('foo'));

Important
If your custom store overrides putSync() without calling super.putSync()
and it performs its own this.encodeKey(key), then you MUST encode the VALUE
before you encode the KEY.
Keys are encoded into a shared buffer. If the database is opened with the sharedStructuresKey option, encoding the value will load and save the structures, which encodes the sharedStructuresKey and overwrites the encoded key in the shared key buffer, so it's ultra important that you encode the value first!
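A minimal sketch of an override that respects this ordering; the actual write is elided since it depends on your store:

import { Store } from '@harperdb/rocksdb-js';

class OrderedStore extends Store {
  putSync(context, key, value, options) {
    // Encode the VALUE first: with sharedStructuresKey enabled, value
    // encoding may load/save structures and overwrite the shared key buffer.
    const encodedValue = this.encodeValue(value);
    // Only encode the KEY once the value encoding is finished.
    const encodedKey = this.encodeKey(key);
    // ...hand encodedKey/encodedValue to your own storage logic here...
  }
}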
- `options: object`
  - `adaptiveReadahead: boolean` - When `true`, RocksDB will do some enhancements for prefetching the data. Defaults to `true`. Note that RocksDB defaults this to `false`.
  - `asyncIO: boolean` - When `true`, RocksDB will prefetch some data asynchronously when reads are sequential and its internal automatic prefetching kicks in. Defaults to `true`. Note that RocksDB defaults this to `false`.
  - `autoReadaheadSize: boolean` - When `true`, RocksDB will internally auto-tune the readahead size during scans based on the block cache data when block caching is enabled, an end key (e.g. upper bound) is set, and the prefix is the same as the start key. Defaults to `true`.
  - `backgroundPurgeOnIteratorCleanup: boolean` - When `true`, after the iterator is closed, a background job is scheduled to flush the job queue and delete obsolete files. Defaults to `true`. Note that RocksDB defaults this to `false`.
  - `fillCache: boolean` - When `true`, the iterator will fill the block cache. Filling the block cache is not desirable for bulk scans and could impact eviction order. Defaults to `false`. Note that RocksDB defaults this to `true`.
  - `readaheadSize: number` - The RocksDB readahead size. RocksDB does auto-readahead for iterators when there are more than two reads for a table file. The readahead starts at 8KB and doubles on every additional read, up to 256KB. This option can help if most range scans are large and a larger readahead than auto-readahead provides is needed. Using a large readahead size (> 2MB) can typically improve the performance of forward iteration on spinning disks. Defaults to `0`.
  - `tailing: boolean` - When `true`, creates a "tailing iterator", a special iterator that has a view of the complete database, including newly added data, and is optimized for sequential reads. It will return records that were inserted into the database after the creation of the iterator. Defaults to `false`.
Extends RocksDBOptions.
- `options: object`
  - `end: Key | Uint8Array` - The range end key, otherwise known as the "upper bound". Defaults to the last key in the database.
  - `exclusiveStart: boolean` - When `true`, the iterator will exclude the first key if it matches the start key. Defaults to `false`.
  - `inclusiveEnd: boolean` - When `true`, the iterator will include the last key if it matches the end key. Defaults to `false`.
  - `start: Key | Uint8Array` - The range start key, otherwise known as the "lower bound". Defaults to the first key in the database.
Extends RangeOptions.
- `options: object`
  - `reverse: boolean` - When `true`, the iterator will iterate in reverse order. Defaults to `false`.
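For example, a reverse bulk scan that skips the block cache (assuming getRange() accepts the full iterator option set, since these options extend RangeOptions and RocksDBOptions):

for (const { key, value } of db.getRange({
  start: 'a',
  end: 'z',
  reverse: true,    // iterate from 'z' down to 'a'
  fillCache: false  // don't disturb the block cache during a bulk scan
})) {
  console.log(`${key} = ${value}`);
}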
This package requires Node.js 18 or higher, pnpm, and a C++ compiler.
Tip
Enable pnpm log streaming to see full build output:
pnpm config set stream true
There are two things being built: the native binding and the TypeScript code. Each of those can be built to be debug friendly.
| Description | Command |
|---|---|
| Production build (minified + native binding) | pnpm build |
| TypeScript only (minified) | pnpm build:bundle |
| TypeScript only (unminified) | pnpm build:debug |
| Native binding only (prod) | pnpm rebuild |
| Native binding only (with debug logging) | pnpm rebuild:debug |
| Debug build everything | pnpm build:debug && pnpm rebuild:debug |
When building the native binding, it will download the appropriate prebuilt
RocksDB library for your platform and architecture from the
rocksdb-prebuilds GitHub
repository. It defaults to the pinned version in the package.json file. You
can override this by setting the ROCKSDB_VERSION environment variable. For
example:
ROCKSDB_VERSION=9.10.0 pnpm build

You may also specify latest to use the latest prebuilt version.
ROCKSDB_VERSION=latest pnpm build

Optionally, you may also create a .env file in the root of the project to specify various settings. For example:
echo "ROCKSDB_VERSION=9.10.0" >> .envTo build RocksDB from source, simply set the ROCKSDB_PATH environment
variable to the path of the local rocksdb repo:
git clone https://github.com/facebook/rocksdb.git /path/to/rocksdb
echo "ROCKSDB_PATH=/path/to/rocksdb" >> .env
pnpm rebuild

It is often helpful to do a debug build and see the internal debug logging of the native binding. You can do a debug build by running:
pnpm rebuild:debug

Each debug log message is prefixed with the thread id. Most debug log messages include the instance address, making it easier to trace through the log output.
In the event Node.js crashes, re-run Node.js in lldb:
lldb node
# Then in lldb:
# (lldb) run your-program.js
# When the crash occurs, print the stack trace:
# (lldb) bt

To run the tests, run:
pnpm coverage

To run the tests without code coverage, run:
pnpm test

To run a specific test suite, for example "ranges", run:
pnpm test ranges
# or
pnpm test test/ranges

To run a specific unit test, for example all tests that mention
"column family", run:
pnpm test -t "column family"Vitest's terminal renderer will often overwrite the debug log output, so it's
highly recommended to specify the CI=1 environment variable to prevent Vitest
from erasing log output:
CI=1 pnpm test

By default, the test runner deletes all test databases after the tests finish.
To keep the temp databases for closer inspection, set the KEEP_FILES=1
environment variable:
CI=1 KEEP_FILES=1 pnpm test