Optional imports¶

shardyfusion is one package with many feature dimensions: four writer engines (Python, Spark, Dask, Ray), three KV backends (SlateDB, SQLite-download, SQLite-range-read), two vector backends (LanceDB, sqlite-vec), two metrics backends (Prometheus, OTel), two manifest stores (S3, Postgres), and a CLI. Forcing every install to pull all of these would be untenable.

The optional-imports pattern keeps the package importable with no extras installed, and gates feature availability on per-extra dependency groups.

The pattern¶

1. Define an extra in pyproject.toml under [project.optional-dependencies].
2. Import the module that depends on it lazily, never at the top level of shardyfusion/__init__.py.
3. Wrap the lazy import in a helper that raises a clear "install with pip install shardyfusion[<extra>]" message if the dependency is missing.
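
The steps above can be sketched as one small helper. This is a minimal illustration rather than shardyfusion's actual code: `require_optional` and the `demo-extra` extra name are hypothetical, and the error wording only follows the message format described above.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency on first use (hypothetical helper)."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Chain the original error so the real cause stays in the traceback.
        raise ImportError(
            f"{module_name!r} is required for this feature; "
            f"install with `pip install shardyfusion[{extra}]`"
        ) from exc


# A dependency that is present imports normally.
json_mod = require_optional("json", extra="demo-extra")

# A missing dependency produces the actionable message instead.
try:
    require_optional("surely_not_installed_pkg", extra="demo-extra")
except ImportError as exc:
    message = str(exc)
```

Because the helper is only called from inside the feature's code path, nothing is imported (and nothing can fail) until that feature is actually used.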

Example: CEL¶

# shardyfusion/_writer_core.py:113
from typing import Any

def _get_cel_imports() -> tuple[Any, ...]:
    # Deferred import: shardyfusion.cel loads only when this helper is called.
    from shardyfusion.cel import compile_cel, route_cel_batch
    return compile_cel, route_cel_batch

shardyfusion.cel itself does:

# shardyfusion/cel.py
from typing import Any

def _import_cel() -> Any:
    try:
        from cel_expr_python.cel import NewEnv, Type
    except ImportError as e:
        # Re-raise with an actionable hint, keeping the original error chained.
        raise ImportError("CEL routing requires `pip install shardyfusion[cel]`") from e
    return NewEnv, Type

The CEL package is cel-expr-python — a fast Rust-backed CEL implementation, not the older pure-Python celpy.

Example: vector adapters¶

shardyfusion/vector/adapters/lancedb_adapter.py imports lancedb only inside the constructor of the LanceDB factory. If lancedb is missing, the user gets ImportError: ... pip install shardyfusion[vector-lancedb] — but the rest of shardyfusion (KV writers, readers) is untouched.
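
A constructor-level import of this kind might look like the sketch below. The class names and the `hypothetical_vector_lib` module are invented for illustration and are not the real adapter; only the extra name in the message comes from the text above.

```python
class HypotheticalVectorFactory:
    """Stand-in for a factory whose backend is an optional dependency."""

    def __init__(self) -> None:
        try:
            # Deferred import: runs only when a factory is constructed.
            import hypothetical_vector_lib
        except ImportError as exc:
            raise ImportError(
                "vector search requires "
                "`pip install shardyfusion[vector-lancedb]`"
            ) from exc
        self._lib = hypothetical_vector_lib


class PlainKVWriter:
    """Stand-in for a feature with no optional dependencies."""

    def put(self, store: dict[str, bytes], key: str, value: bytes) -> None:
        store[key] = value


# The unrelated feature works even though the vector library is absent.
store: dict[str, bytes] = {}
PlainKVWriter().put(store, "k", b"v")

# Only constructing the vector factory surfaces the missing dependency.
try:
    HypotheticalVectorFactory()
    construction_error = None
except ImportError as exc:
    construction_error = str(exc)
```

Defining both classes succeeds without the optional library; the failure is confined to the one call that needs it.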

Example: `UnifiedShardedReader`¶

shardyfusion/__init__.py exposes UnifiedShardedReader via __getattr__:

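def __getattr__(name: str) -> object:
    if name == "UnifiedShardedReader":
        # Imported on first attribute access (PEP 562), not at package import time.
        from shardyfusion.reader.unified_reader import UnifiedShardedReader
        return UnifiedShardedReader
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")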

As a result, importing shardyfusion does not import unified_reader, which would in turn pull in the vector dependencies; that import happens only when UnifiedShardedReader is first accessed.
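
The deferred behaviour of a module-level `__getattr__` can be checked with a throwaway in-memory package. In this sketch `demo_pkg`, `demo_pkg.heavy`, and `HeavyReader` are hypothetical stand-ins for shardyfusion, unified_reader, and UnifiedShardedReader.

```python
import sys
import types

# Build a fake package in memory instead of touching the filesystem.
pkg = types.ModuleType("demo_pkg")


def _load_heavy() -> types.ModuleType:
    """Simulate importing an expensive submodule."""
    heavy = types.ModuleType("demo_pkg.heavy")
    heavy.HeavyReader = type("HeavyReader", (), {})
    sys.modules["demo_pkg.heavy"] = heavy
    return heavy


def _pkg_getattr(name: str) -> object:
    # Called only for attributes not already present on the module (PEP 562).
    if name == "HeavyReader":
        return _load_heavy().HeavyReader
    raise AttributeError(f"module 'demo_pkg' has no attribute {name!r}")


pkg.__getattr__ = _pkg_getattr
sys.modules["demo_pkg"] = pkg

import demo_pkg  # resolved from sys.modules

loaded_before = "demo_pkg.heavy" in sys.modules  # heavy module not loaded yet
reader_cls = demo_pkg.HeavyReader                # triggers the lazy load
loaded_after = "demo_pkg.heavy" in sys.modules
```

Importing the package leaves the heavy submodule unloaded; it appears in `sys.modules` only after the attribute is accessed.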

Extras index¶

| Extra | What it enables | Notes |
| --- | --- | --- |
| `slatedb` | SlateDB driver | Base building block; pulled in by reader/writer extras. |
| `sqlite` | SQLite driver | Base building block. |
| `sqlite-range` | APSW + range-read VFS | Base building block for range-read SQLite. |
| `read-slatedb` | SlateDB sync reader | Sync reader. |
| `read-slatedb-async` | SlateDB async reader (aiobotocore) | Async reader. |
| `read-sqlite` | SQLite download-and-cache reader | Sync. |
| `read-sqlite-range` | SQLite range-read reader (APSW) | Sync. |
| `sqlite-adaptive` | Composes `sqlite` + `sqlite-range` | Required by `AdaptiveSqliteReaderFactory` (default reader mode). |
| `read-sqlite-adaptive` | Alias for `sqlite-adaptive` | Sync adaptive reader. |
| `sqlite-async` | Async SQLite readers (download + range) | Async. |
| `sqlite-adaptive-async` | Async adaptive SQLite reader (`aiobotocore`) | Required by `AsyncAdaptiveSqliteReaderFactory`. |
| `writer-spark-slatedb` | Spark writer (SlateDB) | Requires Java. |
| `writer-spark-sqlite` | Spark writer (SQLite) | Requires Java. |
| `writer-python-slatedb` | Python writer (SlateDB) | Pure Python. |
| `writer-python-sqlite` | Python writer (SQLite) | Pure Python. |
| `writer-dask-slatedb` | Dask writer (SlateDB) | |
| `writer-dask-sqlite` | Dask writer (SQLite) | |
| `writer-ray-slatedb` | Ray writer (SlateDB) | |
| `writer-ray-sqlite` | Ray writer (SQLite) | |
| `cli-minimal` | `click>=8.0` only | CLI binary without any backend. Combine with a reader extra (e.g. `[cli-minimal,read-sqlite]`). PyYAML is a base dep, not part of this extra. |
| `cli` | Kitchen-sink CLI | Includes `cli-minimal` plus all read backends (slatedb sync/async, sqlite download/range/adaptive sync/async, unified-slatedb-lancedb + unified-sqlite-vec). |
| `cel` | `cel-expr-python` | CEL routing. |
| `metrics-prometheus` | `prometheus_client` | Prometheus metrics backend. |
| `metrics-otel` | `opentelemetry` SDK | OTel metrics backend. |
| `vector-lancedb` | LanceDB vector backend | |
| `vector-sqlite` | sqlite-vec unified KV+vector | |
| `unified-slatedb-lancedb` | Composite KV+vector wiring (SlateDB + LanceDB) | For `UnifiedShardedReader`. |
| `unified-sqlite-vec` | Composite KV+vector wiring (sqlite-vec) | |
| `all` | Convenience runtime bundle | Includes readers, writers, CLI, metrics, CEL, and vector extras; excludes dev/test/quality/docs extras. |
| `test` | Test runner deps | Dev. |
| `quality` | Lint / typecheck deps | Dev. |
| `docs` | MkDocs + plugins | Dev. |

For the canonical list, see pyproject.toml. The validate skill (.opencode/skills/validate/SKILL.md) cross-checks every extra documented in docs against the canonical list.

Why not "just pip install everything"¶

- Conflicting transitive deps: LanceDB and pyspark have incompatible Arrow versions in some combinations.
- Container image size: minimal reader installs are ~50 MB; pulling vector + Spark would push past 1 GB.
- Python version coverage: shardyfusion targets Python 3.11–3.13 (requires-python = ">=3.11,<3.14"). Some optional dependencies have narrower support windows; gating them keeps the base install broadly compatible.

Contributor rules¶

When adding a new optional dependency:

1. Add it to [project.optional-dependencies] under a meaningful extra.
2. Import it lazily, never at module top level for any module imported by shardyfusion/__init__.py.
3. Wrap the import in a helper that raises ImportError with a pip install shardyfusion[<extra>] message.
4. Add it to the extras index in this page.
5. Add a use-case page that exercises the new extra.
6. Run validate-docs; it will refuse to pass if extras are out of sync.
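
The lazy-import rule can also be guarded by a test that imports the package in a fresh interpreter and checks which modules were loaded. The following is a sketch under stated assumptions: `eagerly_imported` is a hypothetical helper, and the stdlib modules `json` and `sqlite3` stand in for the package and its optional dependency.

```python
import subprocess
import sys


def eagerly_imported(package: str, forbidden: list[str]) -> list[str]:
    """Return the forbidden modules that `import package` loads eagerly."""
    probe = (
        "import sys, importlib\n"
        f"importlib.import_module({package!r})\n"
        f"print(','.join(m for m in {forbidden!r} if m in sys.modules))"
    )
    # A fresh interpreter avoids modules already loaded by the test runner.
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    out = result.stdout.strip()
    return out.split(",") if out else []


# Stand-in check: importing json should not drag in sqlite3 ...
leaked = eagerly_imported("json", ["sqlite3"])
# ... but it does load its own decoder submodule at import time.
expected_hit = eagerly_imported("json", ["json.decoder"])
```

In a real test suite the same helper would be called with `"shardyfusion"` and the import names of the optional dependencies (for example `lancedb`), asserting that the returned list is empty.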