CLI (shardy)¶
The shardy command-line tool wraps ConcurrentShardedReader for interactive
lookups, batch script execution, and manifest inspection — no Python code needed.
Installation¶
# CLI only (default SlateDB reader stack, no Spark/Java required)
uv sync --extra cli
# Or as part of the full dev environment
uv sync --all-extras --dev
Running Locally¶
During development, run via uv run so the tool picks up the local package
without a separate install step:
uv run shardy --help
After uv sync, the entry point is also available directly if the virtualenv
is activated:
shardy --help
Quick Start¶
Every invocation needs a _CURRENT pointer URL. Supply it via --current-url,
the SHARDY_CURRENT environment variable, or a reader.toml config file
(see Configuration below).
# Single key lookup
uv run shardy --current-url s3://bucket/prefix/_CURRENT get 42
# Multiple keys
uv run shardy --current-url s3://bucket/prefix/_CURRENT multiget 1 2 3
# Multiple keys from stdin
echo -e "1\n2\n3" | uv run shardy --current-url s3://bucket/prefix/_CURRENT multiget -
# Manifest info
uv run shardy --current-url s3://bucket/prefix/_CURRENT info
# Per-shard details
uv run shardy --current-url s3://bucket/prefix/_CURRENT shards
# Check which shard a key routes to
uv run shardy --current-url s3://bucket/prefix/_CURRENT route 42
# Reload manifest
uv run shardy --current-url s3://bucket/prefix/_CURRENT refresh
# Reader health (exit 0/1/2 for healthy/degraded/unhealthy)
uv run shardy --current-url s3://bucket/prefix/_CURRENT health
# Interactive REPL (no subcommand)
uv run shardy --current-url s3://bucket/prefix/_CURRENT
Using an Environment Variable¶
To avoid repeating the URL on every call:
export SHARDY_CURRENT=s3://bucket/prefix/_CURRENT
uv run shardy get 42
uv run shardy multiget 1 2 3
uv run shardy info
uv run shardy shards
uv run shardy route 42
Global Options¶
| Option | Description |
|---|---|
--current-url URL |
S3 URL to the _CURRENT pointer (overrides env / config) |
--config PATH |
Path to reader.toml (overrides SHARDY_CONFIG env) |
--credentials PATH |
Path to credentials.toml (overrides SHARDY_CREDENTIALS env) |
--s3-option KEY=VALUE |
Override S3 connection option (repeatable) |
--ref REF |
Load a specific manifest by reference (non-mutating; mutually exclusive with --offset) |
--offset N |
Load a manifest at position N in history — 0 is latest, 1 is previous, etc. (non-mutating; mutually exclusive with --ref) |
--output-format FORMAT |
Output format: json, jsonl (default), table, text |
--version |
Show version and exit |
Subcommands¶
| Subcommand | Arguments | Description |
|---|---|---|
get |
[--strict] KEY |
Look up a single key; --strict exits 1 when the key is not found |
multiget |
KEY [KEY ...] or - |
Look up multiple keys; pass - to read from stdin |
info |
— | Show manifest metadata (run_id, num_dbs, sharding, key_encoding, row_count) |
shards |
— | Show per-shard details (db_id, row_count, min/max key, URL) |
route |
KEY |
Show which shard a key routes to (without performing a lookup) |
refresh |
— | Reload _CURRENT and manifest |
history |
[--limit N] |
List recent published manifests |
rollback |
--ref REF \| --run-id RUN_ID \| --offset N |
Roll back the current pointer to a previous manifest |
schema |
[--type manifest\|current-pointer] |
Print the JSON Schema for manifest or current-pointer formats |
health |
[--staleness-threshold SECONDS] |
Report reader health; exit 0=healthy, 1=degraded, 2=unhealthy |
exec |
--script FILE [--output FILE] |
Execute a YAML batch script |
Key Coercion¶
CLI keys are always strings. When the manifest uses an integer key encoding
(u64be or u32be), keys are automatically coerced to int before lookup.
For other encodings (e.g. utf8), keys are passed as-is.
The active key encoding is visible via info (the key_encoding field) and
affects get, multiget, and route commands.
Configuration¶
CURRENT URL Resolution¶
The URL is resolved in priority order:
--current-urlCLI optionSHARDY_CURRENTenvironment variablecurrent_urlin the[reader]section ofreader.toml
reader.toml¶
Searched in order: ./reader.toml, ~/.config/shardy/reader.toml,
or via SHARDY_CONFIG env / --config flag.
[reader]
current_url = "s3://bucket/prefix/_CURRENT"
local_root = "/tmp/shardy"
thread_safety = "lock" # "lock" or "pool"
max_workers = 4
slate_env_file = "/path/to/env"
credentials_profile = "default"
[output]
format = "jsonl" # json | jsonl | table | text
value_encoding = "base64" # base64 | hex | utf8
null_repr = "null"
credentials.toml¶
Searched in order: ./credentials.toml, ~/.config/shardy/credentials.toml,
or via SHARDY_CREDENTIALS env / --credentials flag.
The tool warns if file permissions are wider than 0600.
[default]
endpoint_url = "http://localhost:9000"
region = "us-east-1"
access_key_id = "..."
secret_access_key = "..."
addressing_style = "path"
verify_ssl = true
connect_timeout = 10
read_timeout = 30
max_attempts = 3
Inline S3 Option Overrides¶
Override individual S3 settings without editing config files:
uv run shardy --current-url s3://bucket/prefix/_CURRENT \
--s3-option addressing_style=path \
--s3-option verify_ssl=false \
get 42
Interactive REPL¶
When no subcommand is given, the CLI enters a cmd.Cmd REPL:
$ uv run shardy --current-url s3://bucket/prefix/_CURRENT
Loaded manifest run_id=2024-01-15T12:00:00Z (4 shards, hash sharding)
shardy> get 42
{
"op": "get",
"key": "42",
"found": true,
"value": "aGVsbG8="
}
shardy> info
{
"op": "info",
"run_id": "2024-01-15T12:00:00Z",
"num_dbs": 4,
"sharding": "hash",
"key_encoding": "u64be",
"row_count": 1000000,
...
}
shardy> route 42
{
"op": "route",
"key": "42",
"db_id": 2
}
shardy> shards
{
"op": "shards",
"shards": [...]
}
shardy> refresh
{
"op": "refresh",
"changed": false
}
shardy> quit
Interactive mode defaults to json (pretty-printed with indentation) output instead of jsonl.
REPL commands: get KEY, multiget KEY [KEY ...], info, shards, route KEY,
refresh, history [LIMIT], health [STALENESS_THRESHOLD],
use (--offset N | --ref REF | --latest),
quit/exit/Ctrl-D.
REPL-Only Commands¶
history [LIMIT]¶
List recent manifests within the REPL:
shardy> history
shardy> history 5
use¶
Switch to a different manifest without restarting the REPL:
shardy> use --offset 1 # switch to the previous manifest
shardy> use --ref s3://... # switch to a specific manifest ref
shardy> use --latest # switch back to the latest manifest
After use, all subsequent lookups operate against the selected manifest until use --latest or refresh is called. use is session-local and non-mutating — it does not update the _CURRENT pointer. Calling refresh clears any session pin and reloads from _CURRENT.
Manifest History¶
List recent published manifests to see the history of snapshot builds:
uv run shardy --current-url s3://bucket/prefix/_CURRENT history
uv run shardy --current-url s3://bucket/prefix/_CURRENT history --limit 5
Output shows each manifest's reference, run ID, and publication timestamp in reverse chronological order.
Reader Health¶
Check reader health — whether the manifest is loaded, how stale it is, and how many shards are open:
uv run shardy --current-url s3://bucket/prefix/_CURRENT health
{
"op": "health",
"status": "healthy",
"manifest_ref": "s3://bucket/prefix/manifests/.../manifest",
"manifest_age_seconds": 42.3,
"num_shards": 4,
"is_closed": false
}
Exit codes map to standard monitoring conventions:
| Exit code | status |
Meaning |
|---|---|---|
0 |
healthy |
Manifest loaded and fresh |
1 |
degraded |
Manifest loaded but stale (age exceeds threshold) |
2 |
unhealthy |
Reader closed, manifest missing, or unexpected error |
Use --staleness-threshold SECONDS to control when stale manifests are reported as degraded:
# Degrade if manifest is older than 5 minutes
uv run shardy health --staleness-threshold 300
Without --staleness-threshold, the library's default threshold applies (staleness is not checked separately from overall health).
The machine-readable exit codes make health suitable for Kubernetes liveness/readiness probes and monitoring integrations:
# In a readiness probe or health-check script
shardy health --staleness-threshold 600 && echo "OK" || echo "DEGRADED/UNHEALTHY"
In the REPL, health accepts an optional staleness threshold in seconds:
shardy> health
shardy> health 300
--strict on get¶
By default, get exits 0 regardless of whether the key was found. Pass --strict to exit 1
when the key is not found, while still emitting the JSON result to stdout:
# Exits 0 if key exists, 1 if not found — result always goes to stdout
uv run shardy get --strict 42
# Script use: check presence without suppressing output
if ! uv run shardy get --strict 42 > result.json; then
echo "Key 42 not in snapshot" >&2
fi
This allows shell scripts to branch on key presence while still capturing the full result.
Rollback¶
Roll back the current pointer to a previous manifest. Exactly one targeting option is required:
# Roll back by manifest reference
uv run shardy rollback --ref s3://bucket/prefix/manifests/2026-03-13.../manifest
# Roll back by run ID
uv run shardy rollback --run-id abc123
# Roll back by offset (1 = previous, 2 = two versions ago)
uv run shardy rollback --offset 1
Rollback calls set_current() on the manifest store, updating the _CURRENT pointer. The old manifest data remains intact — rollback only changes which manifest is "current".
Warning
Rollback affects all readers loading from this _CURRENT pointer. Readers will pick up the rolled-back manifest on their next refresh().
Schema¶
Print the JSON Schema for manifest or current-pointer formats. Generated at runtime from the Pydantic models:
# Manifest schema (default)
uv run shardy schema
# Current-pointer schema
uv run shardy schema --type current-pointer
The schemas are useful for validating external tools that produce or consume shardyfusion manifests.
Loading a Specific Manifest¶
The --ref and --offset global options let you target a specific manifest for any subcommand:
# Look up a key against the previous manifest
uv run shardy --offset 1 get 42
# Inspect a specific manifest's metadata
uv run shardy --ref s3://bucket/prefix/manifests/.../manifest info
These options are mutually exclusive and non-mutating — the reader loads the targeted manifest for this invocation only, without updating the _CURRENT pointer. In contrast, rollback mutates _CURRENT to point at a different manifest, affecting all readers.
Batch Scripts¶
Execute a YAML file containing multiple commands:
uv run shardy --current-url s3://bucket/prefix/_CURRENT \
exec --script commands.yaml --output results.jsonl
Script Format¶
on_error: continue # stop (default) | continue
commands:
- op: get
key: 42
- op: get # --strict equivalent: result emitted, error record on not-found
key: 99
- op: multiget
keys: [1, 2, 3]
- op: info
- op: shards
- op: route
key: 42
- op: refresh
- op: health # check reader health
- op: health # check with staleness threshold
staleness_threshold: 300
- op: history # list recent manifests (default limit 10)
- op: history
limit: 5
Batch mode defaults output to jsonl — one JSON object per command, streamed
immediately.
Output Formats¶
| Format | Best for |
|---|---|
jsonl |
Machine processing, piping, batch scripts (default for exec and one-shot) |
json |
Pretty-printed; interactive exploration (default for REPL) |
table |
Human-readable multiget results in a terminal |
text |
Plain KEY=VALUE; simplest shell scripting |
Set via --output-format, [output] format in reader.toml, or mode defaults.
Local Development with MinIO¶
For local testing with a MinIO or S3-compatible store:
# Start MinIO (example)
docker run -p 9000:9000 -p 9001:9001 \
-e MINIO_ROOT_USER=minioadmin \
-e MINIO_ROOT_PASSWORD=minioadmin \
minio/minio server /data --console-address ":9001"
# Create credentials.toml for local MinIO
cat > credentials.toml << 'EOF'
[default]
endpoint_url = "http://localhost:9000"
region = "us-east-1"
access_key_id = "minioadmin"
secret_access_key = "minioadmin"
addressing_style = "path"
verify_ssl = false
EOF
chmod 600 credentials.toml
# Use it
uv run shardy \
--current-url s3://my-bucket/prefix/_CURRENT \
--credentials credentials.toml \
info