Read a vector snapshot synchronously¶
Use ShardedVectorReader for approximate nearest-neighbor (ANN) search across a sharded vector snapshot.
When to use¶
- Pure vector workload — no KV lookups needed.
- Synchronous code.
When NOT to use¶
- You also need KV lookups — use KV+Vector sync reader (
UnifiedShardedReader). - You're in async code — use async vector reader (
AsyncShardedVectorReader).
Install¶
# LanceDB backend
uv add 'shardyfusion[vector-lancedb]'
# sqlite-vec backend
uv add 'shardyfusion[vector-sqlite]'
Minimal example¶
from shardyfusion.vector.reader import ShardedVectorReader
import numpy as np
reader = ShardedVectorReader(
s3_prefix="s3://my-bucket/snapshots/embeddings",
local_root="/tmp/vectors",
)
query = np.random.randn(384).astype(np.float32)
response = reader.search(query, top_k=10)
print("search latency (ms):", round(response.latency_ms, 2))
for res in response.results:
print(res.id, res.score, res.payload)
reader.close()
Configuration¶
ShardedVectorReader (shardyfusion/vector/reader.py):
| Param | Default | Purpose |
|---|---|---|
s3_prefix |
required | Snapshot root. |
local_root |
required | Local cache directory. |
manifest_store |
auto | Custom async or sync store. |
max_workers |
None |
Thread-pool size for multi-shard fan-out. |
max_fallback_attempts |
3 |
Fallback to previous manifests. |
rate_limiter |
None |
Token-bucket rate limit on search. |
Reader API¶
# ANN search
results = reader.search(
query_vector,
top_k=10,
shard_ids=None, # restrict to specific shards
num_probes=None, # CLUSTER routing: how many centroid shards to query
routing_context=None, # CEL routing context
)
# Routing introspection (sync)
db_id = reader.route_vector(query_vector) # for CLUSTER/LSH strategies
# Snapshot inspection
info = reader.snapshot_info()
shards = reader.shard_details()
health = reader.health()
# Refresh / lifecycle
changed = reader.refresh()
reader.close()
Query flow¶
flowchart TD
Q[Query vector] --> R{Routing strategy}
R -->|CLUSTER| C["Find nearest centroids<br/>num_probes shards"]
R -->|LSH| L["Hash to buckets<br/>matching shards"]
R -->|CEL / EXPLICIT| E["Target shard_ids"]
R -->|none| A["All shards"]
C --> F["Fan out search<br/>thread pool"]
L --> F
E --> F
A --> F
F --> M["Merge local top-k<br/>global sort by distance"]
M --> O[Global top-k results]
The merge logic is in shardyfusion/vector/_merge.py and is shared across all vector reader variants.
Functional properties¶
- Lazy shard loading: indices are downloaded on first search.
- LRU eviction for shard indices.
searchfans out across target shards using a thread pool.
Guarantees¶
- Reads pinned to manifest at open / last
refresh(). - Routing matches writer (same centroids, hyperplanes, or CEL expression).
- Same fallback behavior as KV readers: up to
max_fallback_attemptsprevious manifests.
Weaknesses¶
ShardedVectorReaderis not re-exported at top level — import fromshardyfusion.vector.CLUSTERsharding requires sampling pass over data at write time.- No built-in query filtering (filter must happen post-search in application code).
Failure modes & recovery¶
| Failure | Surface | Recovery |
|---|---|---|
Missing _CURRENT |
ReaderStateError |
Verify writer published; check s3_prefix. |
| Malformed manifest | ManifestParseError; fallback to previous manifests |
Investigate writer. |
| Shard index not found | DbAdapterError |
Check S3 connectivity; refresh(). |
| Dim mismatch | ConfigValidationError |
Fix query vector dimension. |
See also¶
- Vector Overview — routing strategies, scatter-gather flow
- Build → LanceDB
- Build → sqlite-vec
- Async vector reader