Python SDK¶
The rahcp Python SDK provides a lightweight, async-first client for the HCP Unified API. It is distributed as a uv workspace with seven installable packages:
graph TD
ROOT["rahcp (umbrella)"]
TRACKER["rahcp-tracker<br/><small>Transfer state tracking</small>"]
CLIENT["rahcp-client<br/><small>Async HTTP client</small>"]
CLI["rahcp-cli<br/><small>Typer CLI</small>"]
IIIF["rahcp-iiif<br/><small>IIIF image downloader</small>"]
LANCE["rahcp-lance<br/><small>LanceDB datasets</small>"]
ETL["rahcp-etl<br/><small>JetStream pipelines</small>"]
VAL["rahcp-validate<br/><small>File validation</small>"]
ROOT --> CLIENT
ROOT --> CLI
ROOT -.->|optional| IIIF
ROOT -.->|optional| LANCE
ROOT -.->|optional| ETL
ROOT -.->|optional| VAL
CLIENT --> TRACKER
IIIF --> TRACKER
IIIF -.->|optional| VAL
CLI --> CLIENT
CLI --> IIIF
CLI -.->|optional| VAL
LANCE --> CLIENT
ETL --> CLIENT
Installation¶
Requires Python >= 3.13 and uv.
# SDK + CLI (default)
uv pip install rahcp
# With OpenTelemetry tracing
uv pip install "rahcp-client[otel]"
# With Lance dataset support
uv pip install "rahcp[lance]"
# With ETL pipelines (NATS JetStream)
uv pip install "rahcp[etl]"
# With image validation (Pillow)
uv pip install "rahcp[validate]"
# With IIIF image downloader
uv pip install "rahcp[iiif]"
# Everything
uv pip install "rahcp[all]"
The default install includes both the Python SDK (rahcp-client) and the CLI (rahcp-cli). The heavier packages (Lance, ETL, validation) are opt-in.
For local development from the repository:
uv sync # install all workspace packages
uv run rahcp s3 ls # run CLI via uv
uv run rahcp auth whoami # check current identity
uv run vs rahcp
When developing locally, use uv run rahcp to run the CLI without installing globally. After uv pip install rahcp, you can use rahcp directly.
Packages¶
| Package | Documentation | Description |
|---|---|---|
| rahcp-client | Client library | Async HTTP client with auth, retries, presigned URLs, bulk transfers |
| rahcp-cli | CLI tool | Command-line interface for S3, IIIF, and namespace operations |
| rahcp-tracker | Transfer tracking | Resumable transfer state tracking with SQLite |
| rahcp-iiif | IIIF downloader | Async IIIF image downloader with parallel workers |
| rahcp-lance | LanceDB datasets | LanceDB dataset management on HCP S3 |
| rahcp-etl | ETL pipelines | NATS JetStream event-driven pipelines with checkpointing |
| rahcp-validate | File validation | Format-specific file validation with composable rules |
Comparison: SDK vs raw HTTP¶
The SDK eliminates boilerplate around authentication, retries, presigned URLs, and multipart uploads. Here is the same upload workflow with raw httpx vs the SDK:
from rahcp_client import HCPClient
from pathlib import Path
async with HCPClient.from_env() as client:
etag = await client.s3.upload("my-bucket", "data/file.bin", Path("file.bin"))
print(f"Uploaded: {etag}")
import httpx
BASE = "http://localhost:8000/api/v1"
async with httpx.AsyncClient(base_url=BASE) as c:
# 1. Authenticate
resp = await c.post("/auth/token", data={
"username": "admin", "password": "password", "tenant": "dev-ai",
})
token = resp.json()["access_token"]
c.headers["Authorization"] = f"Bearer {token}"
# 2. Get presigned upload URL
resp = await c.post("/presign", json={
"bucket": "my-bucket", "key": "data/file.bin", "method": "put_object",
})
url = resp.json()["url"]
# 3. Upload to presigned URL
data = Path("file.bin").read_bytes()
async with httpx.AsyncClient() as hcp:
resp = await hcp.put(url, content=data)
resp.raise_for_status()
print(f"Uploaded: {resp.headers['etag']}")
The SDK also handles automatic retries, token refresh, and multipart upload for large files -- none of which are shown in the raw example above.