Cell Atlas
The first aggregator API for single-cell foundation models. Geneformer + scGPT for cell-type annotation, embeddings, and marker gene identification — all through one API key.
What's inside
Geneformer Cell-Type Annotation
Classify single cells by type from sparse expression matrices. Uses canonical marker gene signatures for 30+ cell types, including T cells, B cells, fibroblasts, neurons, and hepatocytes.
scGPT Embeddings
Dense 128-dim embeddings for any single-cell expression profile. Feed them into UMAP for visualization, Louvain for clustering, or cosine similarity for retrieval.
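As a rough illustration of the retrieval use case, the sketch below ranks cells by cosine similarity to a query embedding. It assumes the embeddings have already been fetched and are plain lists of floats; the helper names `cosine_similarity` and `nearest_cells` are hypothetical, and toy 4-dim vectors stand in for the 128-dim ones.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def nearest_cells(query, embeddings, top_k=3):
    """Return (index, similarity) pairs for the top_k most similar cells."""
    scored = [(i, cosine_similarity(query, e)) for i, e in enumerate(embeddings)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 4-dim vectors standing in for real 128-dim embeddings.
toy_embeddings = [
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
hits = nearest_cells([1.0, 0.0, 0.0, 0.0], toy_embeddings, top_k=2)
```

For more than a few thousand cells, a vectorized library or an approximate nearest-neighbor index would likely be a better fit than this pure-Python loop.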
Marker Gene Identification
Per-cluster differential expression with log2 fold change, mean expression inside and outside each cluster (in-mean / out-mean), and ranked scores. Paste cluster labels, get top markers per group.
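To make the reported quantities concrete, here is a minimal sketch of one common way to compute in-mean, out-mean, and log2 fold change per cluster. It is not the service's implementation: the pseudocount of 1.0, the use of log2 fold change as the ranking score, and the function name `marker_genes` are all assumptions made for illustration.

```python
import math


def marker_genes(expression_matrix, gene_names, cluster_labels, top_n=2, pseudocount=1.0):
    """Rank genes per cluster by log2 fold change (in-cluster vs. the rest)."""
    if len(expression_matrix) != len(cluster_labels):
        raise ValueError("need exactly one cluster label per cell")
    results = {}
    for cluster in sorted(set(cluster_labels)):
        inside = [r for r, lab in zip(expression_matrix, cluster_labels) if lab == cluster]
        outside = [r for r, lab in zip(expression_matrix, cluster_labels) if lab != cluster]
        ranked = []
        for j, gene in enumerate(gene_names):
            in_mean = sum(r[j] for r in inside) / len(inside)
            out_mean = sum(r[j] for r in outside) / len(outside) if outside else 0.0
            # The pseudocount (an assumption) avoids division by zero.
            log2fc = math.log2((in_mean + pseudocount) / (out_mean + pseudocount))
            ranked.append(
                {"gene": gene, "in_mean": in_mean, "out_mean": out_mean, "log2fc": log2fc}
            )
        ranked.sort(key=lambda g: g["log2fc"], reverse=True)
        results[cluster] = ranked[:top_n]
    return results


# Toy data: 4 cells x 3 genes, two clusters.
matrix = [[5, 0, 1], [4, 0, 1], [0, 6, 1], [0, 7, 1]]
genes = ["CD3D", "MS4A1", "ACTB"]
labels = ["a", "a", "b", "b"]
markers = marker_genes(matrix, genes, labels)
```

In this toy example CD3D ranks first for cluster "a" and MS4A1 for cluster "b", while ACTB, expressed equally everywhere, gets a log2 fold change of zero.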
Weekend Hacker Ready
No need to install scvi-tools or anndata, or to set up a GPU. Paste up to 100 cells × 500 genes, get annotations back in seconds. Larger datasets will be handled by batch endpoints.
FAQ
What models are behind Cell Atlas?
Geneformer (Theodoris et al., pretrained on about 30M cells) for cell-type annotation, and scGPT (33M cells) for embeddings. Sprint 51 ships with a deterministic mock mode; real GPU workers will activate as RunPod endpoints are provisioned.
How many cells can I annotate per call?
Up to 100 cells × 500 genes per call. For larger datasets, use the batch endpoints (coming in Sprint 51B). The service is CPU-cheap, so bulk processing is affordable.
Are the annotations accurate?
Mock mode uses canonical marker gene signatures — CD4/CD3D for T cells, MS4A1/CD19 for B cells, COL1A1/VIM for fibroblasts, etc. Confidence scores are included for every prediction. Real Geneformer mode (coming soon) will provide foundation-model-quality annotations.
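The sketch below shows one plausible way signature-based annotation could work: score each cell type by the mean expression of its marker genes, then report the top type with a normalized confidence. Only the three signatures named above come from this page; the scoring rule, the confidence formula, the "unknown" fallback, and the function name `annotate_cell` are assumptions, and expression values are assumed to be non-negative.

```python
# Signatures quoted from the FAQ answer; the real service covers more types.
SIGNATURES = {
    "T cell": ["CD4", "CD3D"],
    "B cell": ["MS4A1", "CD19"],
    "Fibroblast": ["COL1A1", "VIM"],
}


def annotate_cell(expression, gene_names, signatures=SIGNATURES):
    """Return (cell_type, confidence) for one cell's expression vector."""
    # Upper-case the symbols so matching is case-insensitive.
    index = {g.upper(): i for i, g in enumerate(gene_names)}
    scores = {}
    for cell_type, marker_list in signatures.items():
        values = [expression[index[m]] for m in marker_list if m in index]
        scores[cell_type] = sum(values) / len(values) if values else 0.0
    total = sum(scores.values())
    if total <= 0:
        return "unknown", 0.0  # no marker expressed (hypothetical fallback)
    best = max(scores, key=scores.get)
    # Confidence as the winning share of the total score (an assumption).
    return best, scores[best] / total


cell_type, confidence = annotate_cell(
    [3.0, 1.0, 0.0, 1.0], ["cd3d", "CD4", "MS4A1", "VIM"]
)
```

Here the T cell signature averages 2.0 and the fibroblast signature 1.0, so the cell is labeled "T cell" with a confidence of about 0.67.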
What input format do I need?
A JSON object with expression_matrix (n_cells × n_genes, normalized) and gene_names (list of gene symbols matching the columns). Symbols are case-insensitive. See the code example in the tool catalog.
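A minimal sketch of assembling and checking such a request body is shown here. The field names `expression_matrix` and `gene_names` and the 100 × 500 limit come from this page; the helper `build_payload` and its validation rules are hypothetical, and no request is actually sent.

```python
import json

MAX_CELLS = 100  # per-call limit stated in the FAQ
MAX_GENES = 500


def build_payload(expression_matrix, gene_names):
    """Validate shapes and return a JSON-serializable request body."""
    n_cells, n_genes = len(expression_matrix), len(gene_names)
    if n_cells == 0 or n_genes == 0:
        raise ValueError("need at least one cell and one gene")
    if n_cells > MAX_CELLS or n_genes > MAX_GENES:
        raise ValueError(f"limit is {MAX_CELLS} cells x {MAX_GENES} genes per call")
    for i, row in enumerate(expression_matrix):
        # Each row is one cell; its columns must line up with gene_names.
        if len(row) != n_genes:
            raise ValueError(f"cell {i} has {len(row)} values, expected {n_genes}")
    return {
        "expression_matrix": [[float(v) for v in row] for row in expression_matrix],
        "gene_names": list(gene_names),
    }


# Two cells x three genes, already normalized.
payload = build_payload(
    [[2.1, 0.0, 0.7], [0.0, 3.4, 0.2]],
    ["CD3D", "MS4A1", "VIM"],
)
body = json.dumps(payload)
```

Checking the shape client-side catches ragged rows and oversized matrices before a call is spent on them.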
Can I cluster cells with this?
Yes. Use /v1/singlecell/embed to get 128-dim embeddings, then cluster client-side with scikit-learn KMeans or SciPy hierarchical clustering, or visualize the embeddings with UMAP. Feed the resulting cluster labels back to /v1/singlecell/marker-genes to identify what defines each cluster.
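The client-side step of that workflow might look like the sketch below. It assumes the embeddings were already returned by /v1/singlecell/embed, uses a small dependency-free k-means as a stand-in for sklearn KMeans, and builds a follow-up body for /v1/singlecell/marker-genes. The `cluster_labels` field name is hypothetical, as are the helper names; toy 2-dim vectors stand in for the 128-dim embeddings.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy 2-dim vectors standing in for 128-dim embeddings from the embed endpoint.
embeddings = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
labels = kmeans(embeddings, k=2)

# Follow-up request body; "cluster_labels" is a hypothetical field name.
marker_payload = {
    "expression_matrix": [[5.0, 0.0], [4.0, 0.0], [0.0, 6.0], [0.0, 7.0]],
    "gene_names": ["CD3D", "MS4A1"],
    "cluster_labels": labels,
}
```

For real data, sklearn KMeans or a graph-based method is likely the better choice; the point here is only that labels come back as one integer per cell, in the same order as the rows of the expression matrix.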