UMLS Reference
About UMLS Reference
The UMLS Reference is a searchable cheat sheet for the Unified Medical Language System maintained by the U.S. National Library of Medicine (NLM). It covers all three UMLS knowledge sources: the Metathesaurus (integrating 200+ source vocabularies including ICD-10, SNOMED CT, MeSH, RxNorm, CPT, and LOINC into 4.4 million+ unified concepts), the Semantic Network (127 semantic types organized hierarchically under Entity and Event), and the SPECIALIST Lexicon (morphological analysis for biomedical NLP).
The reference documents the complete Metathesaurus identifier hierarchy: CUI (Concept Unique Identifier, C + 7 digits), LUI (Lexical Unique Identifier, grouping lexical variants of a term), SUI (String Unique Identifier, case-sensitive), and AUI (Atom Unique Identifier for each source-specific occurrence). It explains relationship types (REL: PAR, CHD, RB, RN, SIB, SY, RO) and relationship attributes (RELA: isa, finding_site_of, has_ingredient, may_treat), as well as term types (TTY: PT for Preferred Term, SY for Synonym, FN for Full form of descriptor) and source abbreviations (SAB: SNOMEDCT_US, ICD10CM, MSH, RXNORM, LNC).
Practical sections cover the RRF (Rich Release Format) file structure: MRCONSO.RRF for concept/name data, MRREL.RRF for relationships, MRSTY.RRF for semantic type assignments, MRDEF.RRF for definitions, and MRMAP.RRF for source-provided mappings. The API section documents the UMLS REST API at uts-ws.nlm.nih.gov/rest for CUI lookups, text search with configurable search types (exact/words/approximate), and programmatic access with Python examples. Cross-vocabulary mapping via CUI (e.g., ICD-10-CM to SNOMED CT) and the QuickUMLS concept extraction library are also covered. The reference itself runs entirely in your browser.
Key Features
- Complete Metathesaurus identifier hierarchy: CUI, LUI, SUI, AUI with format examples
- Source vocabulary abbreviations (SAB) for SNOMED CT, ICD-10, MeSH, RxNorm, LOINC, CPT, and more
- Relationship types (REL) and attributes (RELA) with parent/child, broader/narrower, and clinical relations
- Semantic Network reference: 127 semantic types (TUI), semantic groups, and permitted relations
- RRF file structure: MRCONSO, MRREL, MRSTY, MRDEF, MRMAP column definitions and usage
- UMLS REST API documentation: CUI lookup, text search, field selection, and Python code examples
- Cross-vocabulary mapping (crosswalk) workflow via CUI for ICD-10 to SNOMED CT and other systems
- MetamorphoSys configuration and QuickUMLS Python library for fast concept extraction
Frequently Asked Questions
What is a CUI and how does the UMLS identifier hierarchy work?
A CUI (Concept Unique Identifier) such as C0011849 (Diabetes Mellitus) is the central identifier in the UMLS Metathesaurus and represents a single biomedical concept. Each CUI contains one or more LUIs (Lexical Unique Identifiers, which group case and inflection variants of a term), each LUI contains one or more SUIs (String Unique Identifiers, one per exact case-sensitive string), and each SUI maps to one or more AUIs (Atom Unique Identifiers, one per occurrence of that string in a specific source vocabulary). CUI > LUI > SUI > AUI therefore forms the hierarchy from concept down to individual source term.
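The nesting described above can be sketched in Python by grouping MRCONSO.RRF rows. This is a minimal illustration, not an official loader: the function names are hypothetical, and in the sample rows only the CUI is real (the LUI, SUI, AUI, and CODE values are invented placeholders).

```python
import re
from collections import defaultdict

# Column order of MRCONSO.RRF (pipe-delimited; each row ends with a trailing pipe).
MRCONSO_FIELDS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]

CUI_PATTERN = re.compile(r"^C\d{7}$")  # "C" followed by exactly 7 digits


def is_valid_cui(value):
    """Check the CUI format only; this does not prove the concept exists."""
    return bool(CUI_PATTERN.match(value))


def parse_mrconso_line(line):
    """Split one RRF row into a dict keyed by column name."""
    values = line.rstrip("\n").split("|")
    return dict(zip(MRCONSO_FIELDS, values))  # zip drops the trailing empty field


def build_hierarchy(lines):
    """Nest atoms as CUI -> LUI -> SUI -> [AUI, ...]."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for line in lines:
        row = parse_mrconso_line(line)
        tree[row["CUI"]][row["LUI"]][row["SUI"]].append(row["AUI"])
    return tree


# Placeholder rows: every identifier except the CUI is invented for illustration.
SAMPLE_ROWS = [
    "C0011849|ENG|P|L0000001|PF|S0000001|Y|A0000001||||MSH|MH|CODE1|Diabetes Mellitus|0|N||",
    "C0011849|ENG|P|L0000001|PF|S0000001|Y|A0000002||||SNOMEDCT_US|PT|CODE2|Diabetes Mellitus|0|N||",
    "C0011849|ENG|P|L0000001|PF|S0000002|Y|A0000003||||SNOMEDCT_US|SY|CODE2|diabetes mellitus|0|N||",
    "C0011849|ENG|P|L0000002|PF|S0000003|Y|A0000004||||SNOMEDCT_US|SY|CODE2|DM|0|N||",
]

if __name__ == "__main__":
    hierarchy = build_hierarchy(SAMPLE_ROWS)
    for cui, luis in hierarchy.items():
        for lui, suis in luis.items():
            for sui, auis in suis.items():
                print(cui, lui, sui, auis)
```

In the sample, the same string contributed by two sources shares one SUI but gets two AUIs, while the lowercase variant gets its own SUI under the same LUI.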
What is the difference between Swiss-Prot and the UMLS Metathesaurus?
They serve entirely different domains. The UMLS Metathesaurus integrates biomedical terminology systems (ICD-10, SNOMED CT, MeSH, etc.) to unify clinical, pharmacological, and research vocabularies under common concept identifiers (CUIs). Swiss-Prot (part of UniProt) is a protein sequence database. The UMLS is used for clinical NLP, terminology mapping, and medical informatics, while UniProt/Swiss-Prot is used for proteomics and molecular biology research.
How do I map between ICD-10 and SNOMED CT using UMLS?
Use CUI-based crosswalking. First, find the CUI for your ICD-10-CM code by querying MRCONSO.RRF for rows where SAB=ICD10CM and CODE matches your code. Then find atoms with the same CUI but SAB=SNOMEDCT_US. This gives you the SNOMED CT codes that share the same concept. Note that mappings may not be 1:1: a single ICD-10-CM code might correspond to multiple SNOMED CT concepts, or vice versa. For official mappings, check MRMAP.RRF, which contains source-provided crosswalks.
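The two-step lookup can be sketched as a small Python function over MRCONSO rows that have already been parsed into dicts. The function name, the row layout (only the CUI, SAB, and CODE keys are used), and all sample identifiers are assumptions made for illustration; they are not real UMLS data.

```python
from collections import defaultdict


def crosswalk(rows, source_sab, source_code, target_sab):
    """Return {CUI: [target codes]} for concepts shared with the source code."""
    # Step 1: collect every CUI attached to the source code.
    cuis = {
        r["CUI"]
        for r in rows
        if r["SAB"] == source_sab and r["CODE"] == source_code
    }
    # Step 2: collect target-vocabulary codes that carry one of those CUIs.
    targets = defaultdict(set)
    for r in rows:
        if r["SAB"] == target_sab and r["CUI"] in cuis:
            targets[r["CUI"]].add(r["CODE"])
    return {cui: sorted(codes) for cui, codes in targets.items()}


# Invented rows: the CUIs and codes below are placeholders, not real mappings.
SAMPLE_ROWS = [
    {"CUI": "C0000001", "SAB": "ICD10CM", "CODE": "ICD_A"},
    {"CUI": "C0000001", "SAB": "SNOMEDCT_US", "CODE": "SCT_1"},
    {"CUI": "C0000001", "SAB": "SNOMEDCT_US", "CODE": "SCT_2"},
    {"CUI": "C0000002", "SAB": "SNOMEDCT_US", "CODE": "SCT_3"},
    {"CUI": "C0000001", "SAB": "MSH", "CODE": "MSH_1"},
]

if __name__ == "__main__":
    print(crosswalk(SAMPLE_ROWS, "ICD10CM", "ICD_A", "SNOMEDCT_US"))
```

Returning a list of codes per CUI keeps one-to-many results visible rather than silently picking one target code.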
What are Semantic Types and Semantic Groups?
Every CUI in the Metathesaurus is assigned one or more Semantic Types (identified by TUI, e.g., T047 = Disease or Syndrome, T121 = Pharmacologic Substance, T023 = Body Part, Organ, or Organ Component). The 127 semantic types are organized in a hierarchy under two root types: Entity and Event. Semantic Groups are higher-level clusters of semantic types: DISO (Disorders), CHEM (Chemicals & Drugs), ANAT (Anatomy), PROC (Procedures), GENE (Genes & Molecular Sequences), and others. These classifications let you filter and categorize large concept sets.
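As a rough sketch of that kind of filtering, the Python below selects CUIs from MRSTY.RRF rows by semantic group. The TUI-to-group table covers only the three types named above, the function name is hypothetical, and the sample rows are placeholders (only C0011849 and the TUIs are taken from this page).

```python
# Column order of MRSTY.RRF (pipe-delimited; each row ends with a trailing pipe).
MRSTY_FIELDS = ["CUI", "TUI", "STN", "STY", "ATUI", "CVF"]

# Partial lookup for illustration; a full table would cover all 127 types.
TUI_TO_GROUP = {
    "T047": "DISO",  # Disease or Syndrome
    "T121": "CHEM",  # Pharmacologic Substance
    "T023": "ANAT",  # Body Part, Organ, or Organ Component
}


def cuis_in_group(lines, group):
    """Return the sorted CUIs whose semantic type belongs to the given group."""
    selected = set()
    for line in lines:
        row = dict(zip(MRSTY_FIELDS, line.rstrip("\n").split("|")))
        if TUI_TO_GROUP.get(row["TUI"]) == group:
            selected.add(row["CUI"])
    return sorted(selected)


# Placeholder rows: STN and ATUI are left empty, and C0000002/C0000003 are invented.
SAMPLE_ROWS = [
    "C0011849|T047||Disease or Syndrome|||",
    "C0000002|T121||Pharmacologic Substance|||",
    "C0000003|T023||Body Part, Organ, or Organ Component|||",
]

if __name__ == "__main__":
    print(cuis_in_group(SAMPLE_ROWS, "DISO"))
```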
How do I use the UMLS REST API?
The REST API base URL is https://uts-ws.nlm.nih.gov/rest and requires an API key from a free UTS account. Key endpoints: GET /content/current/CUI/{CUI} for concept lookup, GET /search/current?string={term} for text search (with parameters for source filter, search type, and pagination), GET /content/current/CUI/{CUI}/relations for relationships, and GET /content/current/CUI/{CUI}/atoms?sabs=SNOMEDCT_US&ttys=PT for filtered atom retrieval. Responses are in JSON format.
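A minimal Python sketch of calling these endpoints with only the standard library follows. The helper names are hypothetical, and the query parameter names (apiKey, string, searchType, sabs, pageNumber) are assumptions to verify against the current UTS documentation; "YOUR_API_KEY" is a placeholder.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://uts-ws.nlm.nih.gov/rest"


def build_search_url(term, api_key, search_type="words", sabs=None, page_number=1):
    """Build a text-search URL; parameter names are assumed, not verified here."""
    params = {
        "string": term,
        "apiKey": api_key,
        "searchType": search_type,
        "pageNumber": page_number,
    }
    if sabs:
        params["sabs"] = ",".join(sabs)  # comma-separated source filter
    return f"{BASE_URL}/search/current?{urlencode(params)}"


def build_cui_url(cui, api_key, resource=""):
    """Build a concept URL; resource may be '', 'atoms', or 'relations'."""
    path = f"/content/current/CUI/{cui}"
    if resource:
        path += f"/{resource}"
    return f"{BASE_URL}{path}?{urlencode({'apiKey': api_key})}"


def fetch_json(url):
    """Perform the GET request and decode the JSON response body."""
    with urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    # Printing the URLs avoids a network call; pass one to fetch_json() to query.
    print(build_search_url("diabetes", "YOUR_API_KEY", sabs=["SNOMEDCT_US"]))
    print(build_cui_url("C0011849", "YOUR_API_KEY", resource="relations"))
```

Keeping URL construction separate from the request makes the parameters easy to inspect and test without a live API key.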
What is the difference between REL and RELA in MRREL.RRF?
REL is the general relationship type defined by UMLS (PAR=Parent, CHD=Child, RB=Broader, RN=Narrower, SIB=Sibling, SY=Synonym, RO=Other Related). RELA is a more specific relationship attribute defined by the source vocabulary (e.g., isa, finding_site_of, causative_agent_of, has_ingredient, may_treat). So REL tells you the broad category of relationship, while RELA gives you the precise clinical or ontological meaning from the original source.
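To see how the two columns relate in practice, the sketch below tallies (REL, RELA) pairs from MRREL.RRF rows in Python. The function name is hypothetical and every identifier in the sample rows is an invented placeholder; note that RELA is often empty, which the code reports as "(none)".

```python
from collections import Counter

# Column order of MRREL.RRF (pipe-delimited; each row ends with a trailing pipe).
MRREL_FIELDS = [
    "CUI1", "AUI1", "STYPE1", "REL", "CUI2", "AUI2", "STYPE2", "RELA",
    "RUI", "SRUI", "SAB", "SL", "RG", "DIR", "SUPPRESS", "CVF",
]


def count_rel_pairs(lines):
    """Count how often each (REL, RELA) combination occurs."""
    counts = Counter()
    for line in lines:
        row = dict(zip(MRREL_FIELDS, line.rstrip("\n").split("|")))
        counts[(row["REL"], row["RELA"] or "(none)")] += 1
    return counts


# Placeholder rows: all CUIs, AUIs, and RUIs are invented for illustration.
SAMPLE_ROWS = [
    "C0000001|A0000001|AUI|CHD|C0000002|A0000002|AUI|isa|R00000001||SNOMEDCT_US|SNOMEDCT_US|||N||",
    "C0000001|A0000001|AUI|RO|C0000003|A0000003|AUI|finding_site_of|R00000002||SNOMEDCT_US|SNOMEDCT_US|||N||",
    "C0000001|A0000004|AUI|RO|C0000004|A0000005|AUI|may_treat|R00000003||MED-RT|MED-RT|||N||",
    "C0000001|A0000006|AUI|RB|C0000005|A0000007|AUI||R00000004||MSH|MSH|||N||",
]

if __name__ == "__main__":
    for (rel, rela), n in sorted(count_rel_pairs(SAMPLE_ROWS).items()):
        print(rel, rela, n)
```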
What is MetamorphoSys and when do I need it?
MetamorphoSys is the UMLS installation and customization tool (requires Java). You need it when downloading the full UMLS release to create a local subset: you can select which source vocabularies to include or exclude, specify the output format (RRF), and generate a Level 0 Subset containing only sources with no license restrictions beyond the standard UMLS license. If you only need occasional lookups, the REST API is simpler. MetamorphoSys is necessary for building local databases for high-throughput NLP pipelines or offline clinical systems.
How does QuickUMLS work for concept extraction?
QuickUMLS is a Python library that performs fast, approximate string matching of clinical text against UMLS concepts. You first build a QuickUMLS index from your local UMLS installation, initialize the matcher with the path to that index, then call matcher.match(text) to find UMLS concepts in free text like "Patient has diabetes and hypertension." It returns the matched terms with their CUIs, similarity scores, and semantic types. It uses simstring for approximate matching, which lets it tolerate spelling and wording variations that exact string lookup would miss while staying fast enough for high-volume clinical NLP tasks.
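The Python sketch below shows one way to post-process matcher output. It assumes match() returns a list of candidate lists, one per matched span, where each candidate is a dict with keys such as ngram, cui, similarity, and semtypes; verify that shape against your installed QuickUMLS version. The helper name, the index path, and the sample data (including the second candidate's CUI) are illustrative only.

```python
def best_matches(matches, min_similarity=0.0):
    """Keep the highest-similarity candidate for each matched span."""
    results = []
    for candidates in matches:
        kept = [c for c in candidates if c["similarity"] >= min_similarity]
        if kept:
            best = max(kept, key=lambda c: c["similarity"])
            results.append((best["ngram"], best["cui"], best["similarity"]))
    return results


# With QuickUMLS installed and an index built, the input would come from:
#   from quickumls import QuickUMLS
#   matcher = QuickUMLS("/path/to/quickumls/index")  # hypothetical path
#   matches = matcher.match("Patient has diabetes and hypertension.")

# Hand-written stand-in for matcher output, used so the sketch runs offline.
SAMPLE_MATCHES = [
    [
        {"start": 12, "end": 20, "ngram": "diabetes", "term": "diabetes",
         "cui": "C0011849", "similarity": 1.0, "semtypes": {"T047"}},
        {"start": 12, "end": 20, "ngram": "diabetes", "term": "diabetic",
         "cui": "C0000001", "similarity": 0.7, "semtypes": {"T047"}},
    ],
    [
        {"start": 25, "end": 37, "ngram": "hypertension", "term": "hypertension",
         "cui": "C0020538", "similarity": 1.0, "semtypes": {"T047"}},
    ],
]

if __name__ == "__main__":
    for ngram, cui, score in best_matches(SAMPLE_MATCHES, min_similarity=0.8):
        print(ngram, cui, score)
```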