Optional imports

shardyfusion is one package with many feature dimensions: four writer engines (Python, Spark, Dask, Ray), three KV backends (SlateDB, SQLite-download, SQLite-range-read), two vector backends (LanceDB, sqlite-vec), two metrics backends (Prometheus, OTel), two manifest stores (S3, Postgres), and a CLI. Forcing every install to pull all of these would be untenable.

The optional-imports pattern keeps the package importable with no extras installed, and gates feature availability on per-extra dependency groups.

The pattern

  1. Define an extra in pyproject.toml under [project.optional-dependencies].
  2. The module that depends on it is imported lazily — never at the top level of shardyfusion/__init__.py.
  3. The lazy import is wrapped in a helper that raises an ImportError with a clear "install with pip install shardyfusion[<extra>]" message if the dependency is missing.
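
The three steps can be sketched as one generic helper. This is a minimal illustration, not shardyfusion's actual code: require_extra and make_prometheus_counter are hypothetical names, and the error wording is an assumption. Only the extra name metrics-prometheus and the prometheus_client package come from the extras index.

```python
import importlib
from types import ModuleType


def require_extra(module_name: str, extra: str) -> ModuleType:
    """Import module_name on demand, or explain which extra provides it.

    Hypothetical helper: the name and message wording are assumptions.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        raise ImportError(
            f"{module_name!r} is required for this feature; "
            f"install with `pip install shardyfusion[{extra}]`"
        ) from e


def make_prometheus_counter(name: str, doc: str):
    # Deferred: prometheus_client is resolved only when a counter is requested,
    # so importing the module that defines this function stays dependency-free.
    prometheus_client = require_extra("prometheus_client", "metrics-prometheus")
    return prometheus_client.Counter(name, doc)
```

Chaining with "from e" keeps the original ImportError visible in the traceback, which helps when the dependency is installed but fails to import for another reason.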

Example: CEL

# shardyfusion/_writer_core.py:113
def _get_cel_imports() -> tuple[Any, ...]:
    # Local import: shardyfusion.cel is not loaded until this helper is called.
    from shardyfusion.cel import compile_cel, route_cel_batch
    return compile_cel, route_cel_batch

shardyfusion.cel itself does:

# shardyfusion/cel.py
def _import_cel() -> Any:
    try:
        from cel_expr_python.cel import NewEnv, Type
    except ImportError as e:
        raise ImportError("CEL routing requires `pip install shardyfusion[cel]`") from e
    return NewEnv, Type

The CEL package is cel-expr-python — a fast Rust-backed CEL implementation, not the older pure-Python celpy.

Example: vector adapters

shardyfusion/vector/adapters/lancedb_adapter.py imports lancedb only inside the constructor of the LanceDB factory. If lancedb is missing, the user gets ImportError: ... pip install shardyfusion[vector-lancedb] — but the rest of shardyfusion (KV writers, readers) is untouched.
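
A constructor-level import might look roughly like the sketch below. LanceDbFactorySketch and vector_backend_available are hypothetical names, and the message text is assumed; the real adapter's class name and wording may differ.

```python
from typing import Any


class LanceDbFactorySketch:
    """Hypothetical stand-in for the real LanceDB factory."""

    def __init__(self, uri: str) -> None:
        try:
            import lancedb  # deferred: runs only when a factory is constructed
        except ImportError as e:
            raise ImportError(
                "LanceDB vector backend requires "
                "`pip install shardyfusion[vector-lancedb]`"
            ) from e
        self._lancedb: Any = lancedb  # keep the module handle for later calls
        self.uri = uri


def vector_backend_available() -> bool:
    """Report whether the factory can be constructed in this environment."""
    try:
        LanceDbFactorySketch("probe")
    except ImportError:
        return False
    return True
```

Because the import sits in __init__, defining or importing the class never touches lancedb; only constructing it does.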

Example: UnifiedShardedReader

shardyfusion/__init__.py exposes UnifiedShardedReader via __getattr__:

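def __getattr__(name: str):
    # Module-level hook (PEP 562): called only for names not found normally.
    if name == "UnifiedShardedReader":
        from shardyfusion.reader.unified_reader import UnifiedShardedReader
        return UnifiedShardedReader
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")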

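As a result, importing shardyfusion does not import unified_reader, which would otherwise pull in the vector dependencies. The reader module is loaded only when UnifiedShardedReader is first accessed.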
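
The deferral can be observed with a small self-contained demo. Everything here is illustrative: demo_pkg is a synthetic module built in memory, and _load_heavy_reader stands in for the real import of shardyfusion.reader.unified_reader.

```python
import types

loaded: list[str] = []  # records which lazy names were actually resolved


def _load_heavy_reader() -> type:
    # Stand-in for `from shardyfusion.reader.unified_reader import ...`.
    loaded.append("UnifiedShardedReader")
    return type("UnifiedShardedReader", (), {})


def _module_getattr(name: str):
    if name == "UnifiedShardedReader":
        return _load_heavy_reader()
    raise AttributeError(f"module 'demo_pkg' has no attribute {name!r}")


demo_pkg = types.ModuleType("demo_pkg")
# The PEP 562 hook is looked up in the module's own namespace.
demo_pkg.__getattr__ = _module_getattr

assert loaded == []  # creating the module resolves nothing
reader_cls = demo_pkg.UnifiedShardedReader  # first access runs the hook
assert loaded == ["UnifiedShardedReader"]
```

In the real hook the work is an import statement, so Python caches the loaded module in sys.modules and repeated attribute access stays cheap.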

Extras index

| Extra | What it enables | Notes |
|---|---|---|
| slatedb | SlateDB driver | Base building block; pulled in by reader/writer extras. |
| sqlite | SQLite driver | Base building block. |
| sqlite-range | APSW + range-read VFS | Base building block for range-read SQLite. |
| read-slatedb | SlateDB sync reader | Sync reader. |
| read-slatedb-async | SlateDB async reader (aiobotocore) | Async reader. |
| read-sqlite | SQLite download-and-cache reader | Sync. |
| read-sqlite-range | SQLite range-read reader (APSW) | Sync. |
| sqlite-adaptive | Composes sqlite + sqlite-range | Required by AdaptiveSqliteReaderFactory (default reader mode). |
| read-sqlite-adaptive | Alias for sqlite-adaptive | Sync adaptive reader. |
| sqlite-async | Async SQLite readers (download + range) | Async. |
| sqlite-adaptive-async | Async adaptive SQLite reader (aiobotocore) | Required by AsyncAdaptiveSqliteReaderFactory. |
| writer-spark-slatedb | Spark writer (SlateDB) | Requires Java. |
| writer-spark-sqlite | Spark writer (SQLite) | Requires Java. |
| writer-python-slatedb | Python writer (SlateDB) | Pure Python. |
| writer-python-sqlite | Python writer (SQLite) | Pure Python. |
| writer-dask-slatedb | Dask writer (SlateDB) | |
| writer-dask-sqlite | Dask writer (SQLite) | |
| writer-ray-slatedb | Ray writer (SlateDB) | |
| writer-ray-sqlite | Ray writer (SQLite) | |
| cli-minimal | click>=8.0 only | CLI binary without any backend. Combine with a reader extra (e.g. [cli-minimal,read-sqlite]). PyYAML is a base dep, not part of this extra. |
| cli | Kitchen-sink CLI | Includes cli-minimal plus all read backends (slatedb sync/async, sqlite download/range/adaptive sync/async, unified-slatedb-lancedb + unified-sqlite-vec). |
| cel | cel-expr-python | CEL routing. |
| metrics-prometheus | prometheus_client | Prometheus metrics backend. |
| metrics-otel | opentelemetry SDK | OTel metrics backend. |
| vector-lancedb | LanceDB vector backend | |
| vector-sqlite | sqlite-vec unified KV+vector | |
| unified-slatedb-lancedb | Composite KV+vector wiring (SlateDB + LanceDB) | For UnifiedShardedReader. |
| unified-sqlite-vec | Composite KV+vector wiring (sqlite-vec) | |
| all | Convenience runtime bundle | Includes readers, writers, CLI, metrics, CEL, and vector extras; excludes dev/test/quality/docs extras. |
| test | Test runner deps | Dev. |
| quality | Lint / typecheck deps | Dev. |
| docs | MkDocs + plugins | Dev. |

For the canonical list, see pyproject.toml. The validate skill (.opencode/skills/validate/SKILL.md) cross-checks every extra documented in docs against the canonical list.

Why not "just pip install everything"

  • Conflicting transitive deps: LanceDB and pyspark have incompatible Arrow versions in some combinations.
  • Container image size: minimal reader installs are ~50 MB; adding the vector and Spark extras would push the image past 1 GB.
  • Python version coverage: shardyfusion targets Python 3.11–3.13 (requires-python = ">=3.11,<3.14"). Some optional dependencies have narrower support windows; gating them keeps the base install broadly compatible.

Contributor rules

When adding a new optional dependency:

  1. Add it to [project.optional-dependencies] under a meaningful extra.
  2. Import it lazily — never at module top-level for any module imported by shardyfusion/__init__.py.
  3. Wrap the import in a helper that raises ImportError with a pip install shardyfusion[<extra>] message.
  4. Add it to the extras index in this page.
  5. Add a use-case page that exercises the new extra.
  6. Run validate-docs; it will refuse to pass if extras are out of sync.
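
Rule 2 can also be guarded by a test that imports the package in a fresh interpreter and checks that no optional dependency was loaded. The helper below is a hypothetical sketch (leaked_modules is not part of shardyfusion); it relies only on the standard library.

```python
import json
import subprocess
import sys


def leaked_modules(package: str, forbidden: list[str]) -> list[str]:
    """Import package in a fresh interpreter; return forbidden modules it loaded."""
    # The probe itself imports importlib, json, and sys, so those names are
    # not meaningful entries for the forbidden list.
    probe = (
        "import importlib, json, sys\n"
        f"importlib.import_module({package!r})\n"
        f"print(json.dumps([m for m in {forbidden!r} if m in sys.modules]))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True,
        text=True,
        check=True,  # a failing import of the package surfaces as an error
    )
    # The last stdout line is the JSON list printed by the probe.
    return json.loads(result.stdout.strip().splitlines()[-1])
```

In a test suite, this might be used as leaked_modules("shardyfusion", ["lancedb", "pyspark", "cel_expr_python"]) == [], assuming those are the import names of the optional dependencies being gated. A subprocess is used so that modules already imported by the test runner do not mask a leak.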

See also