API Benchmarks
Published P50 and P95 latencies for every SciRouter endpoint, measured from API gateway to response and excluding network transit.
Chemistry: 5 endpoints
Pharma: 1 endpoint
Proteins: 5 endpoints (async jobs, cold start adds ~30s; one endpoint depends on an external API)
Docking: 3 endpoints (async jobs, cold start adds ~45-60s)
Design: 3 endpoints (async jobs)
Antibodies: 2 endpoints (async jobs)
Generation: 2 endpoints (async jobs)
Labs: 4 endpoints (multi-model pipelines)
GPU Fleet
NVIDIA A100 80GB for heavy inference (ESMFold, Boltz-2, Chai-1, DiffDock). A24 / A5000 for lighter models (ProteinMPNN, ImmuneBuilder, AntiFold, REINVENT4). Hosted on RunPod serverless.
API Gateway
FastAPI on Railway with auto-scaling. PostgreSQL for state, Redis for rate limiting and caching. Sub-50ms overhead for CPU endpoints.
Cold Starts
GPU models use serverless workers. First request after idle may add 30-60s. Pro and Agentic tiers get priority GPU queue for faster warm-up.
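Clients calling GPU endpoints should therefore poll the async job with a timeout generous enough to absorb a cold start. A minimal client-side sketch; the status values ("queued", "running", "completed", "failed") and the result-dict shape are assumptions for illustration, not SciRouter's documented schema:

```python
import time

def wait_for_job(fetch_status, timeout_s: float = 180.0, poll_s: float = 5.0,
                 sleep=time.sleep):
    """Poll an async job until it finishes.

    `fetch_status` is any callable returning a dict like
    {"status": "...", "result": ...}. The default timeout is deliberately
    generous: a cold serverless worker can add 30-60s before the first
    status transition.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        job = fetch_status()
        if job["status"] == "completed":
            return job["result"]
        if job["status"] == "failed":
            raise RuntimeError(job.get("error", "job failed"))
        sleep(poll_s)  # still queued/running; back off before polling again
    raise TimeoutError("job did not finish within timeout (cold start adds 30-60s)")
```

Passing `fetch_status` as a callable keeps the loop independent of any particular HTTP client, and makes it easy to test against a fake status sequence.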
Methodology
Latencies are measured server-side from request receipt to response send, excluding network transit time. GPU benchmarks assume a warm worker (model already loaded in VRAM).
P50 = median latency (50th percentile). P95 = tail latency (95th percentile). Benchmarks are collected over a rolling 7-day window from production traffic.
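As a small worked example of the two statistics, using the nearest-rank percentile definition; the sample latencies below are illustrative, not published benchmark numbers:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample value such that
    at least p% of samples are at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Ten hypothetical latency samples (ms); one slow outlier.
latencies_ms = [42, 38, 45, 41, 300, 40, 44, 39, 43, 46]
p50 = percentile(latencies_ms, 50)  # -> 42: the typical request
p95 = percentile(latencies_ms, 95)  # -> 300: the tail captures the outlier
```

Note how a single slow request barely moves the P50 but dominates the P95, which is why both are reported.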
CPU endpoints (Chemistry, ADMET, alignment) run on the API gateway itself. GPU endpoints dispatch to RunPod serverless workers via async job queue.