Extras Matrix

This page maps every user-facing use case to the pip install / uv sync --extra target you need. It is auto-generated by scripts/generate_extras_matrix.py; run that script after adding or renaming an extra.

Common install commands

# Reader only (SlateDB backend)
uv sync --extra read-slatedb

# Async reader (SlateDB backend)
uv sync --extra read-slatedb-async

# Spark writer (SlateDB backend; requires Java 17)
uv sync --extra writer-spark-slatedb

# Kitchen-sink CLI (all read backends + vector)
uv sync --extra cli

# Everything (all runtime extras)
uv sync --extra all

Visual map

Arrows point from an extra to the extras it pulls in, so following the arrows from any node leads down to its base building blocks.

%%{init: {'flowchart': {'ranksep': 30, 'nodesep': 4}}}%%
flowchart LR
  classDef default font-size:16px

  subgraph backend_building_blocks[Backend building blocks]
    direction TB
    n_sqlite_range("sqlite-range\nSQLite range-read VFS")
    n_sqlite("sqlite\nSQLite storage engine")
    n_slatedb("slatedb\nSlateDB storage engine")
  end

  subgraph kv_storage_read[KV Storage — Read]
    direction TB
    n_sqlite_adaptive("sqlite-adaptive\nAdaptive SQLite reader deps")
    n_sqlite_async("sqlite-async\nAsync SQLite readers")
    n_sqlite_adaptive_async("sqlite-adaptive-async\nAsync adaptive SQLite reader")
    n_read_slatedb_async("read-slatedb-async\nAsync reader (SlateDB)")
    n_read_sqlite("read-sqlite\nSync SQLite reader (download)")
    n_read_sqlite_range("read-sqlite-range\nSync SQLite reader (range-read)")
    n_read_sqlite_adaptive("read-sqlite-adaptive\nSync adaptive SQLite reader")
    n_read_slatedb("read-slatedb\nSync reader (SlateDB)")
  end

  subgraph kv_storage_write[KV Storage — Write]
    direction TB
    n_writer_dask("writer-dask\nDask framework (no backend)")
    n_writer_dask_sqlite("writer-dask-sqlite\nDask writer (SQLite)")
    n_writer_dask_slatedb("writer-dask-slatedb\nDask writer (SlateDB)")
    n_writer_python_sqlite("writer-python-sqlite\nPython writer (SQLite)")
    n_writer_python_slatedb("writer-python-slatedb\nPython writer (SlateDB)")
    n_writer_ray("writer-ray\nRay framework (no backend)")
    n_writer_ray_sqlite("writer-ray-sqlite\nRay writer (SQLite)")
    n_writer_ray_slatedb("writer-ray-slatedb\nRay writer (SlateDB)")
    n_writer_spark("writer-spark\nSpark framework (no backend)")
    n_writer_spark_sqlite("writer-spark-sqlite\nSpark writer (SQLite)")
    n_writer_spark_slatedb("writer-spark-slatedb\nSpark writer (SlateDB)")
  end

  subgraph vector_search_distributed_write[Vector Search — Distributed Write]
    direction TB
    n_writer_dask_vector_lancedb("writer-dask-vector-lancedb\nDask vector writer (LanceDB)")
    n_writer_dask_vector_sqlite("writer-dask-vector-sqlite\nDask vector writer (sqlite-vec)")
    n_writer_ray_vector_lancedb("writer-ray-vector-lancedb\nRay vector writer (LanceDB)")
    n_writer_ray_vector_sqlite("writer-ray-vector-sqlite\nRay vector writer (sqlite-vec)")
    n_writer_spark_vector_lancedb("writer-spark-vector-lancedb\nSpark vector writer (LanceDB)")
    n_writer_spark_vector_sqlite("writer-spark-vector-sqlite\nSpark vector writer (sqlite-vec)")
  end

  subgraph vector_search[Vector Search]
    direction TB
    n_vector_lancedb("vector-lancedb\nVector search (LanceDB)")
    n_vector_sqlite("vector-sqlite\nVector search (sqlite-vec)")
  end

  subgraph kv_plus_vector[KV + Vector]
    direction TB
    n_unified_slatedb_lancedb("unified-slatedb-lancedb\nUnified KV+vector (SlateDB + LanceDB)")
    n_unified_sqlite_vec("unified-sqlite-vec\nUnified KV+vector (sqlite-vec)")
  end

  subgraph operations_and_observability[Operations & Observability]
    direction TB
    n_cli_minimal("cli-minimal\nCLI binary (no backend)")
    n_cli("cli\nKitchen-sink CLI")
    n_metrics_otel("metrics-otel\nOpenTelemetry metrics")
    n_metrics_prometheus("metrics-prometheus\nPrometheus metrics")
  end

  subgraph routing[Routing]
    direction TB
    n_cel("cel\nCEL expression routing")
  end

  subgraph development[Development]
    direction TB
    n_all("all\nEverything (runtime)")
  end

  n_cli --> n_cli_minimal
  n_cli --> n_read_slatedb
  n_cli --> n_read_slatedb_async
  n_cli --> n_read_sqlite
  n_cli --> n_read_sqlite_adaptive
  n_cli --> n_read_sqlite_range
  n_cli --> n_sqlite_adaptive_async
  n_cli --> n_sqlite_async
  n_cli --> n_unified_slatedb_lancedb
  n_cli --> n_unified_sqlite_vec
  n_read_slatedb --> n_slatedb
  n_read_slatedb_async --> n_read_slatedb
  n_read_sqlite --> n_sqlite
  n_read_sqlite_adaptive --> n_sqlite_adaptive
  n_read_sqlite_range --> n_sqlite_range
  n_sqlite_adaptive --> n_sqlite
  n_sqlite_adaptive --> n_sqlite_range
  n_sqlite_adaptive_async --> n_sqlite_adaptive
  n_sqlite_async --> n_read_sqlite
  n_unified_slatedb_lancedb --> n_cel
  n_unified_slatedb_lancedb --> n_vector_lancedb
  n_unified_sqlite_vec --> n_cel
  n_unified_sqlite_vec --> n_vector_sqlite
  n_writer_dask_slatedb --> n_slatedb
  n_writer_dask_slatedb --> n_writer_dask
  n_writer_dask_sqlite --> n_sqlite
  n_writer_dask_sqlite --> n_writer_dask
  n_writer_dask_vector_lancedb --> n_vector_lancedb
  n_writer_dask_vector_lancedb --> n_writer_dask
  n_writer_dask_vector_sqlite --> n_vector_sqlite
  n_writer_dask_vector_sqlite --> n_writer_dask
  n_writer_python_slatedb --> n_slatedb
  n_writer_python_sqlite --> n_sqlite
  n_writer_ray_slatedb --> n_slatedb
  n_writer_ray_slatedb --> n_writer_ray
  n_writer_ray_sqlite --> n_sqlite
  n_writer_ray_sqlite --> n_writer_ray
  n_writer_ray_vector_lancedb --> n_vector_lancedb
  n_writer_ray_vector_lancedb --> n_writer_ray
  n_writer_ray_vector_sqlite --> n_vector_sqlite
  n_writer_ray_vector_sqlite --> n_writer_ray
  n_writer_spark_slatedb --> n_slatedb
  n_writer_spark_slatedb --> n_writer_spark
  n_writer_spark_sqlite --> n_sqlite
  n_writer_spark_sqlite --> n_writer_spark
  n_writer_spark_vector_lancedb --> n_vector_lancedb
  n_writer_spark_vector_lancedb --> n_writer_spark
  n_writer_spark_vector_sqlite --> n_vector_sqlite
  n_writer_spark_vector_sqlite --> n_writer_spark
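
Reading each `A --> B` edge in the diagram as "installing A also pulls in B", the full set of extras behind a single install target is whatever is reachable from it. The sketch below walks a small subset of the edges copied from the diagram; the `EDGES` mapping and `expand` helper are illustrative names, and the interpretation of an edge as dependency inclusion is an assumption based on the table notes.

```python
# A subset of the diagram's edges: each extra maps to the extras it pulls in.
EDGES = {
    "read-sqlite-adaptive": ["sqlite-adaptive"],
    "sqlite-adaptive": ["sqlite", "sqlite-range"],
    "sqlite-adaptive-async": ["sqlite-adaptive"],
    "writer-spark-slatedb": ["slatedb", "writer-spark"],
}


def expand(extra: str, edges: dict[str, list[str]]) -> set[str]:
    """Return `extra` plus every extra reachable by following the arrows."""
    seen: set[str] = set()
    stack = [extra]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited via another path
        seen.add(current)
        stack.extend(edges.get(current, []))
    return seen


print(sorted(expand("read-sqlite-adaptive", EDGES)))
# ['read-sqlite-adaptive', 'sqlite', 'sqlite-adaptive', 'sqlite-range']
```

Extras with no outgoing arrows, such as `slatedb`, expand to just themselves.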

Full table

Backend building blocks

| Extra | Task | Notes |
| --- | --- | --- |
| sqlite-range | SQLite range-read VFS | APSW + obstore for S3 range-reads without full download. |
| sqlite | SQLite storage engine | Base dep for SQLite-backed readers and writers. |
| slatedb | SlateDB storage engine | Base dep for SlateDB-backed readers and writers. |

KV Storage — Read

| Extra | Task | Notes |
| --- | --- | --- |
| sqlite-adaptive | Adaptive SQLite reader deps | Composes sqlite + sqlite-range so AdaptiveSqliteReaderFactory can pick per snapshot. |
| sqlite-async | Async SQLite readers | Async wrappers for both download and range-read SQLite. |
| sqlite-adaptive-async | Async adaptive SQLite reader | Async adaptive policy + aiobotocore. |
| read-slatedb-async | Async reader (SlateDB) | Async S3 manifest store + SlateDB shards via aiobotocore. |
| read-sqlite | Sync SQLite reader (download) | Downloads full DB locally before opening. |
| read-sqlite-range | Sync SQLite reader (range-read) | Uses APSW VFS to read S3 pages on demand. |
| read-sqlite-adaptive | Sync adaptive SQLite reader | Alias for sqlite-adaptive. Auto-picks download vs range per snapshot. |
| read-slatedb | Sync reader (SlateDB) | Sync SlateDB reader. Pulls in slatedb. |

KV Storage — Write

| Extra | Task | Notes |
| --- | --- | --- |
| writer-dask | Dask framework (no backend) | dask[dataframe]. Combine with slatedb, sqlite, vector-lancedb, or vector-sqlite. |
| writer-dask-sqlite | Dask writer (SQLite) | Dask DataFrame input writing SQLite shards. |
| writer-dask-slatedb | Dask writer (SlateDB) | Dask DataFrame input. |
| writer-python-sqlite | Python writer (SQLite) | Pure Python writing SQLite shards. |
| writer-python-slatedb | Python writer (SlateDB) | Pure Python, single-process or multiprocessing. |
| writer-ray | Ray framework (no backend) | ray[data]. Combine with slatedb, sqlite, vector-lancedb, or vector-sqlite. |
| writer-ray-sqlite | Ray writer (SQLite) | Ray Dataset input writing SQLite shards. |
| writer-ray-slatedb | Ray writer (SlateDB) | Ray Dataset input. |
| writer-spark | Spark framework (no backend) | PySpark + pandas + pyarrow. Combine with slatedb, sqlite, vector-lancedb, or vector-sqlite. |
| writer-spark-sqlite | Spark writer (SQLite) | Requires Java 17. PySpark ≥3.3. |
| writer-spark-slatedb | Spark writer (SlateDB) | Requires Java 17. PySpark ≥3.3. |
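
The framework-only extras (`writer-dask`, `writer-ray`, `writer-spark`) are meant to be combined with a backend extra. Since `uv sync` accepts `--extra` more than once, a combination can be spelled out as repeated flags. The helper below is a hypothetical convenience for building such a command line, not something the project ships.

```python
import shlex


def uv_sync_command(*extras: str) -> str:
    """Build a `uv sync` command line that requests each extra once, in order."""
    args = ["uv", "sync"]
    for extra in dict.fromkeys(extras):  # drop duplicates, keep first-seen order
        args += ["--extra", extra]
    return shlex.join(args)


print(uv_sync_command("writer-dask", "slatedb"))
# uv sync --extra writer-dask --extra slatedb
```

Going by the visual map, this pairing covers the same two building blocks that `writer-dask-slatedb` points to, so the prebuilt extra is usually the shorter spelling when one exists.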

Vector Search — Distributed Write

| Extra | Task | Notes |
| --- | --- | --- |
| writer-dask-vector-lancedb | Dask vector writer (LanceDB) | Dask DataFrame → sharded LanceDB index. |
| writer-dask-vector-sqlite | Dask vector writer (sqlite-vec) | Dask DataFrame → sharded sqlite-vec index. |
| writer-ray-vector-lancedb | Ray vector writer (LanceDB) | Ray Dataset → sharded LanceDB index. |
| writer-ray-vector-sqlite | Ray vector writer (sqlite-vec) | Ray Dataset → sharded sqlite-vec index. |
| writer-spark-vector-lancedb | Spark vector writer (LanceDB) | PySpark DataFrame → sharded LanceDB index. Requires Java 17. |
| writer-spark-vector-sqlite | Spark vector writer (sqlite-vec) | PySpark DataFrame → sharded sqlite-vec index. Requires Java 17. |

Vector Search

| Extra | Task | Notes |
| --- | --- | --- |
| vector-lancedb | Vector search (LanceDB) | HNSW index via LanceDB. |
| vector-sqlite | Vector search (sqlite-vec) | sqlite-vec unified KV+vector in single DB. |

KV + Vector

| Extra | Task | Notes |
| --- | --- | --- |
| unified-slatedb-lancedb | Unified KV+vector (SlateDB + LanceDB) | Composite SlateDB + LanceDB sidecar. Enables UnifiedShardedReader. |
| unified-sqlite-vec | Unified KV+vector (sqlite-vec) | Single-file sqlite-vec backend. Enables UnifiedShardedReader. |

Operations & Observability

| Extra | Task | Notes |
| --- | --- | --- |
| cli-minimal | CLI binary (no backend) | shardy command with click only. Combine with a reader extra. |
| cli | Kitchen-sink CLI | All reader backends + vector + CEL bundled. |
| metrics-otel | OpenTelemetry metrics | OtelCollector for writer/reader events. |
| metrics-prometheus | Prometheus metrics | PrometheusCollector for writer/reader events. |

Routing

| Extra | Task | Notes |
| --- | --- | --- |
| cel | CEL expression routing | Custom sharding rules via cel-expr-python. |

Development

| Extra | Task | Notes |
| --- | --- | --- |
| docs | Documentation dependencies | MkDocs + plugins. |
| all | Everything (runtime) | Convenience bundle of all runtime extras. Excludes dev/test/quality/docs. |
| quality | Lint & type-check dependencies | ruff, pyright. |
| test | Test dependencies | pytest, hypothesis, moto, etc. |

Generated by scripts/generate_extras_matrix.py. Do not edit this file by hand — it will be overwritten on the next run.