2026-05-04 slatedb 0.12 uniffi Migration

  • Status: implemented
  • Date: 2026-05-04

Summary

This engineering note documents the migration of shardyfusion's writer and reader paths from the legacy synchronous slatedb top-level API (slatedb.SlateDB, slatedb.SlateDBReader) to the async-only uniffi-generated bindings under slatedb.uniffi shipped in slatedb>=0.12,<0.13. It covers the sync→async bridge design, the removal of read-side checkpoint pinning, the switch to opaque shardyfusion-generated UUID checkpoint_id values, the seal()-vs-checkpoint() Protocol change, the typed SlateDbSettings configuration model, the new iterator_chunk_size knob, and the perf microbenchmark scaffolding introduced to guard against bridge-overhead regressions.

1. What problem is being solved or functionality being added by the changes?

slatedb 0.12 deleted the synchronous top-level Python API that shardyfusion was built on. Every public method on Db, DbReader, WriteBatch, and the iterator types is now async def, and the read-side checkpoint_id argument no longer exists. The migration needed to:

  1. Replumb the hot path onto an async-only library while preserving shardyfusion's synchronous DbAdapter and ShardReader Protocols, because Spark/Dask executors and the Python writer's multiprocessing workers are sync code paths. Pushing async upward would have rippled into every framework integration.
  2. Replace the read-side checkpoint pinning model that no longer exists in slatedb. The previous design relied on DbReaderBuilder accepting a checkpoint_id to pin a specific on-disk snapshot.
  3. Keep the public shardyfusion API stable so that downstream Spark/Dask/Ray writers and existing snapshots don't churn.
  4. Avoid catastrophic per-row scan overhead introduced by the sync→async bridge cost (~15–40 µs per round-trip; ~30× slowdown for naive per-row iteration).
  5. Make missing-symbol failures actionable instead of producing AttributeError stack traces deep inside writer/reader code when slatedb's surface drifts again.

2. What design decisions were considered, with their pros, cons, and trade-offs?

Decision 1: How to bridge sync shardyfusion Protocols to async uniffi

Option A: Generate a sync wrapper class (SyncDb) around uniffi

Pros: Looks idiomatic at call sites (db.write(batch) instead of run_coro(db.write(batch))).

Cons:

  • Doubles the surface to maintain: every uniffi method needs a sync mirror.
  • Obscures where async actually happens, making it easy to accidentally call from inside an event loop and deadlock.
  • Adds a wrapper-object lifetime to track on top of the underlying uniffi object.

Option B: Process-global daemon-thread asyncio loop with run_coro helper (chosen)

Pros:

  • One file (shardyfusion/_slatedb_runtime.py) owns the bridge.
  • Honest at call sites: run_coro(reader.get(k)) makes the async hop visible.
  • Daemon thread gives us process-lifetime semantics without shutdown coordination.
  • Cannot be hijacked by a request-scoped loop (the loop is private).

Cons:

  • Per-call cost (~15–40 µs) is non-trivial for hot paths.
  • Tests that need a fake event loop must monkey-patch run_coro.

Option C: Make the loop user-pluggable

Pros: Lets advanced callers reuse their own loop.

Cons:

  • Users will accidentally pin it to a request-scoped loop and deadlock on shutdown.
  • The Protocol contract becomes "sync, but configurably so", which is a worse abstraction.

We chose Option B. The bridge is shardyfusion-owned and non-customizable; cost is amortized by the iterator chunking decision below.
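
A minimal sketch of the bridge shape, using only the names the note already fixes (run_coro, shardyfusion/_slatedb_runtime.py); the shipped module's locking and error handling may differ:

```python
# shardyfusion/_slatedb_runtime.py -- sketch of Option B, not the shipped code.
import asyncio
import threading
from typing import Any, Coroutine, TypeVar

T = TypeVar("T")

_loop: asyncio.AbstractEventLoop | None = None
_lock = threading.Lock()


def _get_loop() -> asyncio.AbstractEventLoop:
    """Lazily start one process-global loop on a private daemon thread."""
    global _loop
    with _lock:
        if _loop is None:
            _loop = asyncio.new_event_loop()
            threading.Thread(
                target=_loop.run_forever, name="slatedb-bridge", daemon=True
            ).start()
        return _loop


def run_coro(coro: Coroutine[Any, Any, T]) -> T:
    """Run an async uniffi coroutine from sync code; blocks the calling thread."""
    return asyncio.run_coroutine_threadsafe(coro, _get_loop()).result()
```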

Decision 2: Iterator chunking knob — where it lives

Option A: Push chunking into DbAdapter / writer side

Cons: Writes already batch via WriteBatch; adding a second knob invites confusion about which path it controls.

Option B: iterator_chunk_size only on SlateDbReaderFactory (chosen)

Pros:

  • Single place to tune, single place to document.
  • Default 1024 amortizes the bridge cost across rows of typical size; the failure mode is "uses more memory per chunk", not a correctness bug.

Cons: Callers with very large values must override the default explicitly.

We chose Option B.

Decision 3: How to identify shards now that slatedb has no read-side checkpoint API

Option A: Hash the materialized SlateDB / SQLite file (legacy SQLite/SQLiteVec behaviour)

Pros: Content-addressable; identical bytes → identical id.

Cons:

  • Forces a read-back pass on the writer hot path on every shard close.
  • Not actually needed for correctness given shardyfusion's invariants (single writer per SlateDB; manifest published only after writers finish; no post-publish updates).

Option B: Opaque uuid.uuid4().hex stamped by the writer (chosen)

Pros:

  • Zero I/O on the writer hot path.
  • Uniqueness guaranteed without reading any bytes.
  • Centralized in shardyfusion._checkpoint_id.generate_checkpoint_id().
  • Cache identity for the SQLite/SQLiteVec/LanceDB factories is preserved.

Cons: Two writes that produce identical bytes get different ids, but this never happens under our invariants.

Option C: Re-hash from the manifest after publish

Cons:

  • The manifest doesn't see the bytes; we'd need a separate pass.
  • Adds an ordering dependency between shard finalize and manifest build.

We chose Option B. SlateDbReaderFactory accepts checkpoint_id for Protocol symmetry but ignores it — cache identity for SlateDB shards collapses to db_url only.
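
The writer-side stamp is deliberately trivial. A sketch of the centralized helper named above:

```python
# shardyfusion/_checkpoint_id.py -- module and function names are the note's;
# the body is the obvious one.
import uuid


def generate_checkpoint_id() -> str:
    """Opaque per-shard id: unique without reading back a single byte."""
    return uuid.uuid4().hex
```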

Decision 4: Adapter Protocol — seal() vs checkpoint()

The old Protocol had checkpoint() -> str | None, which told the adapter "flush, finalize, and tell me your checkpoint id". With Decision 3 the writer now stamps the id itself; adapters only need to flush + finalize.

Option A: Keep checkpoint() -> str | None and ignore the return

Cons: Custom adapters returning a hash silently get their value dropped, with no way to notice that the contract changed.

Option B: Rename to seal() -> None (chosen)

Pros:

  • Compile-time / Protocol-check failure for any adapter still on the old contract.
  • The method name accurately describes the new responsibility.

Cons: Source-incompatible for custom adapters, but the alternative (a silent behavior change) is worse.

We chose Option B.
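
A sketch of the Protocol delta, assuming the contract is expressed as a typing.Protocol (the note names only the methods):

```python
from typing import Protocol


class DbAdapter(Protocol):
    # Pre-migration contract, for contrast:
    #     def checkpoint(self) -> str | None: ...
    # New contract: flush + finalize only; the writer stamps the id itself.
    def seal(self) -> None: ...
```

An adapter still defining checkpoint() now fails the Protocol check loudly instead of having its return value silently dropped.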

Decision 5: Symbol resolution layer

Option A: import slatedb.uniffi directly at every call site

Cons:

  • import shardyfusion blows up on machines without slatedb.
  • Missing symbols become AttributeError deep in writer code.
  • Tests must patch every site individually.

Option B: Single choke point in _slatedb_symbols.py (chosen)

Pros:

  • Lazy import isolates the optional-dependency check to one try/except.
  • Missing symbols become a single DbAdapterError("slatedb.uniffi.X is unavailable").
  • Tests monkey-patch sys.modules["slatedb"] and sys.modules["slatedb.uniffi"] once, and every shardyfusion call site picks up the fake.

We chose Option B.
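
A sketch of the choke point; the DbAdapterError message is quoted from the note, while get_symbol and the exact wiring are illustrative:

```python
# shardyfusion/_slatedb_symbols.py -- sketch of the single lazy-import site.


class DbAdapterError(RuntimeError):
    """Stand-in for shardyfusion's real error type (name from the note)."""


def _import_uniffi():
    try:
        from slatedb import uniffi
    except ImportError as exc:  # slatedb missing, or <0.12 (no uniffi submodule)
        raise DbAdapterError("slatedb>=0.12,<0.13 is required but not installed") from exc
    return uniffi


def get_symbol(name: str):  # illustrative helper name
    try:
        return getattr(_import_uniffi(), name)
    except AttributeError:
        raise DbAdapterError(f"slatedb.uniffi.{name} is unavailable") from None
```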

Decision 6: Configuration model — SlateDbSettings

Option A: Keep accepting the JSON-ish dict from the legacy API

Cons: Typos are silently dropped, and nothing validates the dict against the typed Settings class introduced in slatedb 0.12.

Option B: Typed SlateDbSettings dataclass with a raw_overrides: dict[str, Any] escape hatch plus a legacy-dict adapter that emits DeprecationWarning (chosen)

Pros:

  • Typos caught at construction time.
  • Escape hatch for fields shardyfusion hasn't modeled.
  • One-release deprecation cycle keeps the migration non-breaking for downstream callers.

We chose Option B.
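
A sketch of the settings surface; raw_overrides and the warning-emitting legacy adapter are from the note, and the concrete modeled field below is a placeholder:

```python
import warnings
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class SlateDbSettings:
    flush_interval_ms: int | None = None  # placeholder for a modeled field
    raw_overrides: dict[str, Any] = field(default_factory=dict)  # escape hatch


def settings_from_legacy_dict(d: dict[str, Any]) -> SlateDbSettings:
    """One-release shim: accept the old JSON-ish dict, warn, and wrap it."""
    warnings.warn(
        "dict-based slatedb settings are deprecated; pass SlateDbSettings",
        DeprecationWarning,
        stacklevel=2,
    )
    return SlateDbSettings(raw_overrides=dict(d))
```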

Decision 7: env_file + db_url resolution

slatedb 0.12's ObjectStore.resolve(db_url) reads env at resolve time. We need env present at that moment, scoped to the resolution call.

Option A: Pull in python-dotenv

Cons:

  • Ships a CLI and global config-file behavior we don't want.
  • Adds a third-party dependency for ~30 lines of logic.

Option B: In-house apply_env_file() context manager (chosen)

Pros:

  • Scopes env mutations to the resolve call; restores on exit.
  • No new dependency.

We chose Option B.
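
A sketch of the context manager under the stated requirements (mutations scoped to the block, restored on exit); quoting and multi-line values are ignored here:

```python
import contextlib
import os


@contextlib.contextmanager
def apply_env_file(path: str):
    """Apply KEY=VALUE lines from an env file for the duration of the block."""
    saved = dict(os.environ)
    try:
        with open(path) as fh:
            for line in fh:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, value = line.split("=", 1)
                    os.environ[key.strip()] = value.strip()
        yield
    finally:
        os.environ.clear()
        os.environ.update(saved)
```

Typical use is with apply_env_file(env_file): store = ObjectStore.resolve(db_url), so credentials are visible exactly when resolve reads them.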

Decision 8: KeyRange over kwargs in scan_iter

A late perf-test failure caught that uniffi's DbReader.scan takes a single KeyRange positional argument, not start=/end= kwargs. We added get_key_range_class() and now build a KeyRange(start=..., start_inclusive=True, end=..., end_inclusive=False) once per call, mirroring Python's half-open [start, end) semantics. We also moved iterator construction outside the per-chunk drain loop: the previous code re-opened the iterator on every chunk, which would have silently re-served the first N rows forever once a shard exceeded iterator_chunk_size.
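
Putting Decisions 2 and 8 together, a sketch of the scan_iter shape; run_coro and get_key_range_class are the note's names, while the import path for get_key_range_class and the next_batch fast path from §5 are assumptions:

```python
from typing import Iterator

from shardyfusion._slatedb_runtime import run_coro  # bridge from Decision 1
from shardyfusion._slatedb_symbols import get_key_range_class  # assumed location


def scan_iter(reader, start: bytes, end: bytes, chunk_size: int = 1024) -> Iterator:
    KeyRange = get_key_range_class()
    key_range = KeyRange(start=start, start_inclusive=True,
                         end=end, end_inclusive=False)  # half-open [start, end)
    it = run_coro(reader.scan(key_range))  # built once, outside the drain loop

    async def _drain(n: int) -> list:
        try:
            return await it.next_batch(n) or []  # hypothetical: absent in 0.12.1
        except AttributeError:
            pass
        rows = []
        for _ in range(n):
            kv = await it.next()
            if kv is None:  # iterator exhausted
                break
            rows.append(kv)
        return rows

    while True:
        chunk = run_coro(_drain(chunk_size))  # one bridge hop per chunk, not per row
        if not chunk:
            return
        yield from chunk
```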

Decision 9: Performance microbenchmarks — gating

The bridge overhead is the single largest correctness-adjacent risk in this migration.

Option A: Run perf benchmarks on every CI run

Cons:

  • Adds wall-clock time and flakiness to every PR.
  • Hard to tune budgets that survive shared-runner variance.

Option B: Marker-gated, on-demand just perf recipe (chosen)

Pros:

  • Deliberate, focused check when investigating bridge regressions.
  • addopts = "-ra -m 'not perf'" in pyproject.toml keeps default pytest runs clean.
  • Loose budgets (~5× measured) catch order-of-magnitude regressions without flaking.

We chose Option B.
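
A sketch of what a marker-gated budget test can look like; the helper import is the note's module, and the concrete budget is illustrative rather than the committed one:

```python
import time

import pytest

from shardyfusion._slatedb_runtime import run_coro


@pytest.mark.perf  # excluded by default via addopts = "-ra -m 'not perf'"
def test_bridge_roundtrip_budget():
    async def _noop():
        return None

    n = 10_000
    t0 = time.perf_counter()
    for _ in range(n):
        run_coro(_noop())
    per_call_us = (time.perf_counter() - t0) / n * 1e6
    # ~5x the measured 15-40 us upper bound: catches order-of-magnitude
    # regressions without flaking on shared runners.
    assert per_call_us < 200
```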

Decision 10: Async migration scope

Option A: Migrate shardyfusion to async end-to-end

Cons:

  • Spark/Dask/Ray worker contracts are sync.
  • This is a separate, much larger project.

Option B: Keep shardyfusion's Protocols sync; bridge per-call (chosen)

We chose Option B — see Decision 1.

3. What is the impact of these changes (covering testability, performance, and complexity)?

Testability

  • _slatedb_symbols.py gives tests one place to patch (sys.modules["slatedb"] + sys.modules["slatedb.uniffi"]) instead of N call sites.
  • New helpers shardyfusion.testing.open_slatedb_db() and open_slatedb_reader() wrap the canonical "build → resolve → run_coro" pattern so test seed code mirrors production exactly; a sketch follows this list.
  • The integration-test file URL remap pattern (map_s3_db_url_to_file_url(db_url, object_store_root) followed by SlateDbReaderFactory() delegation) is now the standard for any test that materializes data on local disk under an s3:// URL.
  • Perf suite (tests/integration/perf/, @pytest.mark.perf, just perf) provides a deliberate guardrail for bridge-overhead regressions without polluting default CI.
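
A sketch of the canonical open pattern those helpers wrap; the body is illustrative (per §4, ObjectStore.resolve is sync and only build() crosses the bridge):

```python
from shardyfusion._slatedb_runtime import run_coro


def open_slatedb_db(db_url: str, path: str = ""):
    """Illustrative body of shardyfusion.testing.open_slatedb_db()."""
    from slatedb.uniffi import DbBuilder, ObjectStore

    store = ObjectStore.resolve(db_url)  # sync; reads env at resolve time
    return run_coro(DbBuilder(path, store).build())  # open via the bridge
```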

Performance

  • Per-call bridge cost: ~15–40 µs measured. Acceptable for get and write (already batched via WriteBatch).
  • Per-row iteration without chunking: ~30× slowdown vs. an in-process iterator. iterator_chunk_size=1024 default recovers most of it; perf budgets catch regressions.
  • Writer hot path lost the read-back hashing pass that the legacy SHA-256 checkpoint_id required for SQLite/SQLiteVec — a small but real win.

Complexity

  • Net +2 modules (_slatedb_runtime.py and _checkpoint_id.py); slight expansion of _slatedb_symbols.py.
  • Net −1 read-side concept (checkpoint pinning) that was never needed under our invariants.
  • Bridge call sites are visible (run_coro(...)) — readers can trace where async actually happens.
  • Configuration surface narrowed: typed SlateDbSettings with one raw_overrides escape hatch instead of an open-ended dict.

4. API delta: slatedb 0.11.x → 0.12.1

| Concern | 0.11.x (legacy) | 0.12.1 (uniffi) |
| --- | --- | --- |
| Module surface | slatedb.SlateDB, slatedb.SlateDBReader | slatedb.uniffi.{Db, DbBuilder, DbReader, DbReaderBuilder, WriteBatch, ObjectStore, Settings, FlushOptions, FlushType, KeyRange, DbIterator, KeyValue} |
| Sync/async | Sync methods | All methods async def |
| Open writer | SlateDB(local_dir, url=..., **opts) | await DbBuilder(path, store).build() where store = ObjectStore.resolve(url) |
| Open reader | SlateDBReader(local_dir, url=..., checkpoint_id=...) | await DbReaderBuilder(path, store).build() (no checkpoint_id arg) |
| Write batch | db.write(batch) (sync) | await db.write(WriteBatch()) |
| Flush WAL | db.flush() / db.flush_with_options("wal") | await db.flush_with_options(FlushOptions(flush_type=FlushType.WAL)) |
| Range scan | reader.scan(start=..., end=...) returning a sync iterator | await reader.scan(KeyRange(start=..., start_inclusive=True, end=..., end_inclusive=False)) returning an async DbIterator |
| Iterate | for kv in scan_result (sync) | await iterator.next() returning Optional[KeyValue] (per-row only; no next_batch in 0.12.1) |
| Checkpoint create | db.create_checkpoint() returning a hash str | Removed; no read-side checkpoint pinning API exists |
| Close | Implicit GC | await db.shutdown() for Db; DbReader has no explicit close |
| Settings | dict/kwargs to constructor | Typed Settings(...) object passed via builder |
| Object store creds | Env vars + URL string | ObjectStore.resolve(url) reads env at resolve time |
| Python version floor | 3.9+ | 3.11+ (uniffi-generated bindings) |

Symbol availability check

```python
from importlib.metadata import version
print(version("slatedb"))                    # '0.12.1'

from slatedb.uniffi import (
    Db, DbBuilder, DbReader, DbReaderBuilder,
    WriteBatch, ObjectStore, Settings,
    FlushOptions, FlushType, KeyRange,
)
```

slatedb<0.12 lacks the uniffi submodule entirely; slatedb==0.11.1 satisfied a naive slatedb<0.13 constraint and triggered the migration's first false start. The pin is now slatedb>=0.12,<0.13 in both the pyproject.toml extras and the main deps.

5. Observed-but-deliberate gotchas

  • DbReader.scan requires KeyRange — not kwargs. Documented in the _SlateDbReaderHandle docstring and AGENTS.md Gotchas.
  • DbIterator exposes only next in 0.12.1 — no next_batch. scan_iter keeps a next_batch fast path behind try/except AttributeError for forward compatibility with later slatedb releases that may add it.
  • Per-call bridge cost is real (~15–40 µs); any future hot-path call must batch through run_coro once, not per item.
  • SlateDbReaderFactory.checkpoint_id is accepted but ignored. Document this in any new factory subclass or wrapper to avoid misleading callers.
  • Test patching must hit both sys.modules["slatedb"] and sys.modules["slatedb.uniffi"]; patching only the top-level module doesn't intercept _slatedb_symbols._import_uniffi().

6. What was explicitly not done, and why

  • Did not introduce a sync wrapper class around uniffi. See Decision 1, Option A.
  • Did not move shardyfusion to async end-to-end. See Decision 10.
  • Did not pin slatedb to an exact version. >=0.12,<0.13 lets patch releases through; the symbol-resolution layer makes any newly renamed class fail at one obvious site.
  • Did not preserve content-addressed checkpoint IDs. See Decision 3 — the single-writer + serial-publish invariant makes content addressing unnecessary, and removing it deleted a write-time hashing pass we were paying for on every shard close.

7. Post-merge audit

After the migration landed, two concerns were raised in review and investigated quantitatively before closing the work.

7.1 Bridge-loop contention

Concern. All synchronous slatedb operations (writer + sync reader) funnel through one process-global asyncio loop running on a single daemon thread. Spark, Dask, Ray, and Python writers can run many worker threads inside one Python process — does the shared loop serialise them?

Topology. Cluster writers (mapPartitionsWithIndex for Spark, the analogous hooks for Dask/Ray) produce one shard per Python worker process. Inside one process the partition writer is itself sequential, so there is at most one in-flight write_batch per process anyway; the loop is not contended. Multi-shard-per-process only occurs in the Python writer's parallel=True mode, which uses multiprocessing spawn; each subprocess gets its own loop.

The interesting case is a single process serving many concurrent sync get/scan calls — e.g. a FastAPI app wrapping ConcurrentShardedReader with a thread pool.

Measurement. A microbenchmark on the local file:// backend (see commit log; not committed as a perf test because the absolute numbers are FS-dependent):

| Topology | per-op (write_batch ×100 rows) |
| --- | --- |
| 1 thread, shared bridge loop | 101.10 ms |
| 2 threads, shared bridge loop, separate DBs | 101.09 ms |
| 4 threads, shared bridge loop, separate DBs | 101.12 ms |
| 8 threads, shared bridge loop, separate DBs | 101.18 ms |
| 8 threads, separate loops, separate DBs | 101.13 ms |

Pure bridge cost (no slatedb work) under the same loop:

| Topology | per-call latency | aggregate ops/s |
| --- | --- | --- |
| 1 thread | 18.42 µs | 54,300 |
| 8 threads (shared loop) | 16.05 µs | 62,300 |
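
The pure-bridge rows can be reproduced with a short ad-hoc script; a sketch, assuming run_coro from Decision 1 (absolute numbers are machine-dependent, which is why this was never committed as a perf test):

```python
import time
from concurrent.futures import ThreadPoolExecutor

from shardyfusion._slatedb_runtime import run_coro


async def _noop():
    return None  # no slatedb work: measures the bridge alone


def _worker(calls: int) -> None:
    for _ in range(calls):
        run_coro(_noop())


def bridge_ops_per_sec(threads: int, calls_per_thread: int = 10_000) -> float:
    """Aggregate no-op throughput through the shared bridge loop."""
    t0 = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        list(pool.map(_worker, [calls_per_thread] * threads))
    return threads * calls_per_thread / (time.perf_counter() - t0)
```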

Findings.

  1. The bridge contributes ~18 µs per call. A typical write_batch spends ~101 ms in slatedb, so the bridge is <0.02 % of write latency. For point gets it's a larger fraction but still well below the 1 ms budget set in the perf tests.
  2. Adding threads with separate loops (the separate-loops row in the first table) gives no throughput improvement either, which means slatedb's internal Tokio runtime is already serialising per-DB writes; the bridge is not the bottleneck even in principle.
  3. Aggregate bridge throughput actually increases slightly with threads (62k > 54k ops/s) because the loop amortises the wakeup cost across multiple inflight submissions.

Verdict. Single shared bridge loop is correct. No mitigation needed. The cost of switching to per-thread loops would be losing uniffi's set_event_loop registration invariant (uniffi binds Rust async tasks to the loop that created the resource) for no measurable benefit.

7.2 What did the migration give up?

A line-by-line audit of pre-0.12 capabilities versus the current adapter surface:

| Pre-0.12 capability | Post-0.12 status | Operational impact |
| --- | --- | --- |
| db.create_checkpoint(scope="durable") returning a slatedb-managed checkpoint id | Removed. uniffi 0.12 has no checkpoint create/pin API. Writer stamps an opaque uuid4().hex. Manifests still record a per-shard id, so cleanup and winner-selection are unaffected. | Lost: ability to pin-read a specific historical checkpoint of a SlateDB shard via the engine. Safe under the single-writer + serial-publish + S3-strong-consistency invariant: shards at db_url are immutable after publish, so readers never need a checkpoint pin. |
| Content-addressed checkpoint_id (SHA-256 of materialized DB) | Replaced by uuid4().hex. _winner_sort_key is (attempt, task_attempt_id, db_url); checkpoint_id was never consulted for tiebreaks. | No regression. External consumers that persisted SHA-256 fingerprints must switch to comparing db_bytes() or computing their own digest from the downloaded shard. |
| Reader-side checkpoint_id pinning (with_checkpoint_id) | No equivalent in uniffi 0.12. SlateDbReaderFactory.checkpoint_id is accepted for Protocol symmetry and ignored at runtime. | Safe under the immutability invariant; if a future workflow needs reader pinning, it requires a SlateDB upstream change first. |
| flush_with_options("wal") short string form | Replaced with typed FlushOptions(flush_type=FlushType.WAL). | Cosmetic; same semantics. |
| Free-form JsonObject settings forwarded as JSON to slatedb | Now typed SlateDbSettings with a raw_overrides escape hatch. | Net positive: typed surface, IDE help. Legacy dict shape is rejected at the type level (library not yet released; no migration cycle needed). |
| SlateDB(path, url=..., env_file=..., settings=...) synchronous constructor | DbBuilder("", ObjectStore.resolve(db_url)) + Settings.set(...) per option, opened via await builder.build() through the bridge. | More verbose, more explicit. local_dir is now unused for the SlateDB backend (data goes straight to the object store) but is still threaded through factories for symmetry with the SQLite/SQLiteVec/LanceDB adapters. |
| Db.close() | Renamed Db.shutdown(); awaitable. Adapter close() calls run_coro(self._db.shutdown()). | Same lifecycle semantics. |
| DbIterator.next_batch(n) | Not present in 0.12.1; only next(). | scan_iter retains a next_batch try/except fast path for forward compatibility; chunking happens in Python via iterator_chunk_size (default 1024 on SlateDbReaderFactory). |
| Sync I/O directly from worker thread | Now goes through the bridge loop. | ~18 µs per call; mitigated by batching. See §7.1. |
| Pre-migration DbAdapter.checkpoint() -> str \| None | Renamed to seal() -> None; checkpoint id stamped by the writer. | Custom adapters must rename. Documented in CHANGELOG, AGENTS.md Gotchas, and the adapter-authoring guide. |

Net assessment. All losses are documented, and either (a) unused under the existing invariants (checkpoint pinning, content addressing) or (b) cosmetic (constructor shape, flush options). The migration removed a write-time SHA-256 pass and added ~18 µs of bridge overhead per call — net positive on the hot path.