Medical Data, Standard Vocabularies, Communication Standards
6.872/HST950
Peter Szolovits (with some material from Chris Cimino)

Recall Children’s Clinicians’ Workstation Database
• Demographics
• Problems
• Allergies
• Medications
  – Immunizations
• Lab Data
• Clinical Measurements
  – Growth Charts
• Visit History
• Reports and Letters

The Database
• General Information
• CPT_CODE
• ICD9
• ICD9_PROCDR
• PROBLEM_NOSOLOGY
• Documents
• DOC_STORE
• DOC_ATTRIBUTES
• DOC_DESCRIPTION
• CHILD_DOCS
• Doctors
• PERSNL_PUBLIC
• PPR
• Patients
• PAT_DEMOGRAPH
• PAT_FIN_ACCT
• PAT_TEST_HISTV
• REMOTE_TEST
• PHARMACY_TABLE
• PROBLEMS

Vocabularies and Terminology
• Why?
  – Surrogate for “messy reality”
  – Uses
• How?
  – Flat list
  – Taxonomy (Hierarchy, Nosology, …)
  – Heterarchy
  – Combinatorial Language
    • Derivation rules
    • Inference
    • … knowledge representation

“Ontology” for Computer Folks
• An organization of concepts (hierarchy or heterarchy)
• (Some) concepts are defined in terms of others
  – A triangle is a polygon with exactly 3 sides
  – A dachshund is a dog (with ???)
• Automatic classification
  – If P is a 3-sided polygon with …, it is recognized automatically as a triangle

OWL: “Semantic Web”
• Description Logic
  – Concepts and Instances
  – Is-a in virtue of
    • Primitive assertion: “A dog is a mammal.”
    • Definition: “A triangle is a three-sided polygon.”
  – Limited logical power of definition language assures tractable inference
    • Slot restriction
    • Number restriction
    • But, no negation or disjunction
  – Subsumption inferences are central
  – Other logical assertions may be made, but are typically not enforced or utilized in DL

Definitions
• Word – a set of characters including punctuation delimited by white space.
• Term – one or more words used as a unit.
• Concept – an idea, action, or thing.
• Synonym – two terms for the same concept.

Vocabulary Uses
• Indexing
  – Finding what you want
• Cataloging
  – Putting away what you have
  – E.g., WHO, DRGs
• Knowledge Representation
  – Representing the facts
  – Blurring the facts
  – Creating new shades of meaning

Describe a Term for a Laboratory Test
• Where was it done?
• How was it done?
• Under what conditions was it done?
• How many minutes after eating carbohydrate was it measured?

Describe a Vocabulary for a Gene
• Whose gene?
• Gene fragment?
• Open Reading Frame?
• Promoter + all exons and introns
• Promoter + all exons + all introns + other binding sites affecting function?
• Final/draft/species/SNP/Alternative splicing?

Knowledge vs. Language
• Get two or more people to enumerate terms to describe the same set.
  – Do any terms match exactly?
  – Do terms differ by word order?
  – Do terms differ by word suffix or prefix?
  – Are there terms that some people think are synonyms that other people think are not?

History of 3 Vocabularies
• MeSH – Index
• ICD – Pre-coordinated
• SNOMED – Post-coordinated

History
• The modern history of medical controlled vocabularies begins with the U.S. Army General Surgeon who petitioned Congress to fund a medical library. (~Civil War)
• The position eventually became “The US Surgeon General” and the library the National Library of Medicine
  – http://www.nlm.nih.gov/

History
• Library collection was indexed with Index Medicus (created by NLM), which is published in book form.
• Index Medicus was extended to index medical literature articles.
• Index Medicus was extended further to provide on-line indexing (1960). This became the Medical Subject Headings (MeSH).

MeSH
• Purpose is to index the medical literature.
• Content of MeSH is driven by publications.
• Who “owns” MeSH?
• What impact do vocabulary changes have?

MeSH – Structure
http://www.nlm.nih.gov/mesh/
• MeSH is organized into a series of “trees”. (e.g. physical findings, diseases, chemicals)
• A MeSH main heading is a “concept”. (e.g.
“Neurologic Disease”, “Epilepsy”)
• Main Heading (MH) is often called a term. (Try to avoid doing this.)

MeSH – Structure
• Each MH has a unique identifier.
• Each MH may have multiple synonyms.
• Each MH may have multiple locations in multiple trees. Each of these “contexts” has a unique tree address. The concept of “context” is synonymous with “multiple inheritance”.

MeSH – Structure
• There is a small set of subheadings (50) that “modify” MH based on tree address. (e.g. “diagnosis” applies to MH in the “Disease” tree but not to the “Chemical” tree).
• There is a small set of tag terms (15) which exist unrelated to the rest of MeSH. (e.g., “Review Article”, “Human”, “Animal”)

MeSH – Structure
• Every article is indexed with tag terms.
• Every article is indexed with MH terms for focus (main index term) and mention (minor index term).
• Every index term is checked for subheadings.
• This is all done by trained reviewers.
• The MeSH Vocabulary is revised annually.
http://www.nlm.nih.gov/mesh/

MeSH Redux – The Genome “Ontology”
biological process, molecular function, cell components
Image showing genome ontology removed due to copyright restrictions.

International Classification of Disease (ICD)
• Any agency that dispenses funds for health care needs a way to assess needs and effectiveness.
• The United Nations World Health Organization (WHO) funds health care prevention projects worldwide and gathers statistics for member nations.
• Who “owns” ICD?
• What impact will changes have?

ICD – Structure
• ICD is divided into categories based on a 5-digit numeric code. (e.g., “133.21”)
• Usually round numbers are more general concepts (e.g., “100” subsumes “130” which subsumes “133”)
• The fourth and fifth digits are called modifiers, but they are not really modifiers.

ICD – Structure
• The code is both the concept and the unique identifier. Multiple terms are linked to the same code.
• Every patient is coded with as many terms as possible.
• Each term should be the most specific one that describes a particular problem.

ICD – Structure
• Coding scheme limits the size of the vocabulary.
• Obsolete codes must be reused.
• Base ten results in limited flexibility and the need for “other”, “NOS”, and “NOC” terms.

ICD – Structure
• Lack of
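
The round-number reading of ICD codes from the ICD – Structure slides (“100” subsumes “130” which subsumes “133”) can be sketched in a few lines of Python. This is a minimal illustration, not an ICD implementation: the function names are invented here, and the rule that trailing zeros in a bare three-digit category mean “any digit” is an assumption taken from the slide's example. The slide itself says “usually”, and real ICD chapters are ranges that do not always align with round numbers.

```python
def digits(code: str) -> str:
    """Drop the decimal point: "133.21" -> "13321"."""
    return code.replace(".", "")


def prefix_covered(code: str) -> str:
    """Digit prefix that a code covers under the round-number reading.

    Assumption (from the slide's example, not from the ICD standard):
    in a bare three-digit category, trailing zeros mean "any digit here",
    so "100" covers "1xx" and "130" covers "13x".
    """
    if "." in code:
        return digits(code)   # fourth/fifth digits present: keep every digit
    return code.rstrip("0")   # "100" -> "1", "130" -> "13", "133" -> "133"


def subsumes(general: str, specific: str) -> bool:
    """True if `general` covers `specific` under the assumption above."""
    return digits(specific).startswith(prefix_covered(general))


print(subsumes("100", "130"))     # True
print(subsumes("130", "133"))     # True
print(subsumes("133", "133.21"))  # True
print(subsumes("130", "140"))     # False
```

Because the code is both the concept and the identifier, the sketch has nothing but the digits to work with: it cannot tell whether “100” is meant as a general band or as one specific category, which is one face of the limited flexibility the slides attribute to a base-ten scheme.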
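
Returning to the MeSH – Structure slides: a main heading has one unique identifier, possibly several synonyms, and possibly several tree addresses (one per “context”). The Python sketch below shows one way such a record might be held in memory. Everything concrete in it is hypothetical: the class and function names, the identifier `MH-0002`, the synonym, and the tree addresses are invented for illustration and are not real MeSH data. It also assumes dot-separated addresses in which a child's address extends its parent's.

```python
from dataclasses import dataclass, field


@dataclass
class MainHeading:
    uid: str                                          # unique identifier of the concept
    name: str                                         # preferred name of the main heading
    synonyms: set = field(default_factory=set)        # other terms for the same concept
    tree_addresses: set = field(default_factory=set)  # one address per context

    def is_under(self, address: str) -> bool:
        """True if any context of this heading is at or below `address`."""
        return any(a == address or a.startswith(address + ".")
                   for a in self.tree_addresses)


def build_index(headings):
    """Map every name and synonym (lower-cased) to its main heading."""
    index = {}
    for mh in headings:
        for label in {mh.name, *mh.synonyms}:
            index[label.lower()] = mh
    return index


# Hypothetical data: the identifier, synonym, and addresses are invented.
epilepsy = MainHeading(
    uid="MH-0002",
    name="Epilepsy",
    synonyms={"Seizure Disorder"},
    tree_addresses={"DISEASE.10.20", "FINDING.05.30"},
)
index = build_index([epilepsy])

print(index["seizure disorder"].uid)    # MH-0002
print(epilepsy.is_under("DISEASE.10"))  # True
print(epilepsy.is_under("CHEMICAL"))    # False
```

Keeping the tree addresses as a set on a single record is what lets one concept appear in several trees without being duplicated, which is the “multiple inheritance” point made on the slide.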