Every protein has a unique ID — here's how to find it
Protein names get messy. "Insulin" is really a family: there are bovine, porcine, and human versions, and the A and B chains are often described separately. "p53" goes by a half-dozen aliases (the Trp53 gene in mouse, the TP53 gene in human, or just "53 kDa tumor antigen" in older papers).
UniProt fixes this by giving each protein in each species its own entry with a stable accession ID (for example, P01308 for human insulin). This guide walks through how to find it.
Step 1: Know what you're searching for
- A common name (insulin, hemoglobin, spike protein). Easy start, but may return many hits.
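- A gene name (INS, HBB, ACE2). More precise: UniProt usually groups the protein products of one gene in one species into a single entry, though the same gene name recurs across species.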
- A UniProt accession (P01308). The most precise: one entry, one answer.
- A sequence. For novel or unfamiliar proteins. UniProt's BLAST tool compares your sequence against the database.
Step 2: Search UniProt
Go to uniprot.org and type your query. A list comes back. Two critical filters on the left side:
- Reviewed (Swiss-Prot) — manually curated. For well-studied proteins, always start here.
- Organism — narrow to Homo sapiens for human, Mus musculus for mouse, etc.
What you'll typically want: "Reviewed + Homo sapiens", which can narrow a couple hundred results to the one or two canonical human entries.
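The same two filters can also be expressed as a query string. The sketch below builds a search URL for UniProt's public REST service using only the Python standard library. The endpoint, the query fields (`gene`, `reviewed`, `organism_id`), and the returned column names are assumptions based on UniProt's REST documentation at the time of writing, and `build_search_url` is a hypothetical helper, so check the current docs before relying on it.

```python
from urllib.parse import urlencode

# Assumed base URL of UniProt's public REST search endpoint.
UNIPROT_SEARCH = "https://rest.uniprot.org/uniprotkb/search"


def build_search_url(term, taxon_id=9606, reviewed_only=True, size=10):
    """Return a search URL that mirrors the two sidebar filters."""
    clauses = [f"({term})"]
    if reviewed_only:
        clauses.append("(reviewed:true)")            # Reviewed (Swiss-Prot)
    if taxon_id is not None:
        clauses.append(f"(organism_id:{taxon_id})")  # 9606 = Homo sapiens
    params = {
        "query": " AND ".join(clauses),
        "fields": "accession,gene_names,organism_name",
        "format": "tsv",
        "size": size,
    }
    return f"{UNIPROT_SEARCH}?{urlencode(params)}"


url = build_search_url("gene:INS")
print(url)
```

Opening the printed URL in a browser should return a small tab-separated table. Pass `taxon_id=10090` for Mus musculus, or `taxon_id=None` to search across all organisms.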
Step 3: Grab the accession
The accession is in the column labeled "Entry". It is 6 or 10 characters long, made of uppercase letters and digits, and typically starts with P, Q, O, or A. Copy it.
Famous examples:
- P01308 — Insulin (human)
- P69905 — Hemoglobin alpha (human)
- P68871 — Hemoglobin beta (human)
- P04637 — p53 tumor suppressor (human)
- P0DTC2 — SARS-CoV-2 Spike glycoprotein
- Q9Y261 — FOXA2 (liver transcription factor)
- P38398 — BRCA1
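If you handle accessions in a script, a quick format check catches typos and mix-ups with gene names. The sketch below uses the regular expression UniProt publishes for its accession format; `is_accession` is a hypothetical helper name. It only tests the shape of the string, not whether the entry actually exists.

```python
import re

# Accession pattern as documented by UniProt: 6 or 10 characters.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_accession(text):
    """True if text has the shape of a UniProt accession."""
    return ACCESSION_RE.fullmatch(text.strip().upper()) is not None


examples = ["P01308", "P69905", "P68871", "P04637", "P0DTC2", "Q9Y261", "P38398"]
for acc in examples:
    print(acc, is_accession(acc))   # every example above prints True

print(is_accession("INS"))          # False: a gene name, not an accession
```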
Step 4: Look it up
With the accession in hand, open ProteinLab and paste it in. You get the plain-English summary: what it does, diseases it's linked to, drugs that target it, pathways it's part of, where it lives in the cell.
Bonus: finding proteins by sequence
Got an amino-acid sequence from somewhere (a paper, a sequencing result, a protein extract)? Run it through BLAST on UniProt. BLAST compares your sequence against the proteins in the database and returns the closest matches, each with an E-value (the number of matches this good that you would expect by chance; lower is better).
Typical workflow:
- Paste your sequence into the UniProt BLAST search.
- Pick the top hit (usually an exact match if your protein is already in the database).
- Follow its accession into the full entry.
- Copy that accession and plug it into ProteinLab for the plain-English view.
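The "pick the top hit" step can be sketched in a few lines. This is a minimal illustration, not UniProt's own ranking logic: the `Hit` class, the `best_hit` helper, and the 1e-5 cutoff are all assumptions, and the hit values are invented placeholders rather than real BLAST output.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    accession: str
    evalue: float      # expected number of chance matches this good
    identity: float    # percent identical residues in the alignment


def best_hit(hits, max_evalue=1e-5):
    """Return the most significant hit, or None if nothing passes the cutoff."""
    significant = [h for h in hits if h.evalue <= max_evalue]
    if not significant:
        return None
    # Lowest E-value first; break ties with the higher identity.
    return min(significant, key=lambda h: (h.evalue, -h.identity))


# Invented values for illustration only, not real BLAST output.
hits = [
    Hit("HYPO_B", 3e-40, 88.0),
    Hit("HYPO_A", 0.0, 100.0),
    Hit("HYPO_C", 2.5, 31.0),
]
print(best_hit(hits).accession)   # HYPO_A
```

Returning None when every hit is above the cutoff is deliberate: a top hit with a high E-value is probably a chance match, so it is safer not to follow its accession.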
Why bother learning this
Once you know the accession pattern, every scientific paper and biomedical news article gets more navigable. "A mutation in P04637 reduces binding to MDM2" becomes actionable — you know exactly what the paper means and where to learn more. UniProt is the scientific community's shared address book for proteins. ProteinLab is your quick-lookup tool on top of that.