
27f3f1be471a939c462bae6d48ec0f86.ppt
- Number of slides: 154
R T U New York State Center of Excellence in Bioinformatics & Life Sciences
VUB Leerstoel 2009-2010
Theme: Ontology for Ontologies, theory and applications
Ontologies in healthcare and the vision of personalized medicine
May 19, 2010; 17h00-19h00
Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Room D2.01
Prof. Werner CEUSTERS, MD
Ontology Research Group, Center of Excellence in Bioinformatics and Life Sciences and Department of Psychiatry, University at Buffalo, NY, USA
Context of this lecture series
[Diagram labels: Knowledge Representation; Informatics; Linguistics; Computational Linguistics; Medical Natural Language Understanding; Electronic Health Records; Translational Research; Medicine; Biology; Ontology; Philosophy; Realism-Based Ontology; Referent Tracking; Pharmacogenomics; Pharmacology; Performing Arts; Defense & Intelligence]
Today’s topic
• May 19: ontologies in healthcare and the vision of personalized medicine
  – Open Biomedical Ontologies Foundry
  – Example ontologies for eHealth
  – An ontologist’s view on data models
[Diagram labels: Electronic Health Records; Translational Research; Medicine; Biology; Pharmacogenomics; Pharmacology; Realism-Based Ontology; Referent Tracking]
Data, information and what it is about
A general belief: Better information → Better care
‘Information’ versus ‘informing’
[Diagram labels: Being better informed; Better information; Better care]
Being better informed
A general belief:
• Concerns primarily the delivery of information:
  – Timely,
  – Where required (e.g. bed-side computing),
  – What is permitted,
  – What is needed.
• Involves:
  – Connecting systems,
  – Making systems interoperable:
    • Syntactically (pretty well covered),
    • Semantically (long way to go).
Today’s data generation and use
[Diagram labels: observation & measurement; data organization; model development; use; =; outcome; add; Δ; (instrument and study optimization); verify; further R&D; Generic beliefs; application]
Example 1: clinician
[Diagram labels: observation & measurement; data organization; diagnosis; use; =; outcome; verify; add; Δ; Generic beliefs; treatment]
Example 2: researcher
[Diagram labels: observation & measurement; data organization; hypothesis; use; =; outcome; add; Δ; (instrument and study optimization); verify; further R&D; Generic beliefs]
Example 3: device manufacturer / supplier
[Diagram labels: observation & measurement; data organization; model development; use; =; outcome; add; Δ; (instrument and study optimization); verify; further R&D; Generic beliefs; application]
Slightly different: payer / health plan
[Diagram labels: data organization; model development; $; influence; use; =; outcome; add; Δ; verify; window dressing; Generic beliefs]
“Better Information” must cover …
• Patient-specific information
  – EHR-EMR-ENR-…
  – PHR
  – Various modality related databases (lab, imaging, …)
• Scientific “knowledge”
  – Textbooks
  – Classification systems
  – Terminologies
  – Ontologies
Means to structure the available information
Key question: on what should the structure be based?
What is the structure based on? (1)
• Classification systems: on ‘properties’ of patients which are of importance for the purposes for which the system has been designed
  http://www.who.int/classifications/en/
What is the structure based on? (2)
• Terminologies:
  – on ‘concepts’
• But terminologists fail to give a good answer to the question of what a concept is
What is the structure based on? (3)
• Ontologies (mainstream view):
  – on ‘concepts’
    • when designed by terminologists
  – on ‘classes’
    • when designed by software engineers and computer scientists
      – a class is a construct that is used as a blueprint to create objects of that class. (?)
      – a class is a cohesive package that consists of a particular kind of metadata. (??)
      – a class usually represents a noun, such as a person (???)
  http://en.wikipedia.org/wiki/Class_(computer_science)
Patients become victims of bad IT design
• ‘The Data Model That Nearly Killed Me’:
  – Joe Bugajski, http://tiny.cc/S1HWo
  – “If data cannot be made reliably available across silos in a single EHR, then this data cannot be made reliably available to a huge, heterogeneous collection of networked systems.”
• ‘Are Health IT designers, testers and purchasers trying to kill people?’
  – Scot M. Silverstein, http://tiny.cc/CKIW1
Our view: on realism-based ontology!
• In philosophy:
  – Ontology (no plural) is the study of what entities exist and how they relate to each other;
• In computer science and (biomedical) informatics applications:
  – An ontology (plural: ontologies) is a shared and agreed upon conceptualization of a domain;
• Our ‘realist’ view within the Ontology Research Group combines the two:
  – We use realism, a specific theory of ontology, as the basis for building high-quality ontologies, using reality as the benchmark.
OBO and OBO-Foundry: A reaction to inadequacy
US National Centers for Biomedical Computing
OBO Website
The OBO Foundry
• a family of interoperable biomedical reference ontologies built around the Gene Ontology at its core and using the same principles
• a modular annotation catalogue of English phrases
• each module created by experts from the corresponding scientific community
• http://obofoundry.org
OBO Foundry principles (1)
• The ontology must be open and available to be used by all without any constraint other than
  – (a) its origin must be acknowledged and
  – (b) it is not to be altered and subsequently redistributed under the original name or with the same identifiers.
• The ontology is in, or can be expressed in, a common shared syntax. This may be either the OBO syntax, extensions of this syntax, or OWL.
• Each Foundry ontology should be built on the basis of BFO top-level distinctions
  – caveat: OWL-DL is not capable of representing all BFO aspects
OBO Foundry principles (2)
• The ontologies have a unique identifier space within the OBO Foundry.
• The source of a representational unit (RU) from any ontology can be immediately identified by the prefix of the identifier of each term.
• The ontology provider has procedures for identifying distinct successive versions.
• The ontology has a clearly specified and clearly delineated content.
  – The ontology must be orthogonal to other ontologies already lodged within OBO.
  – community acceptance of a single ontology for one domain
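The prefix principle above can be illustrated with a minimal sketch. It assumes identifiers of the shape `PREFIX:local_part` (for example `GO:0008150`); that shape and the function name `source_ontology` are assumptions made for illustration and are not stated on the slide.

```python
def source_ontology(term_id: str) -> str:
    """Return the identifier-space prefix of a term id.

    Assumes the 'PREFIX:local_part' shape; this is an illustrative
    convention, not something the slide specifies.
    """
    prefix, sep, local = term_id.partition(":")
    if not sep or not prefix or not local:
        raise ValueError(f"not a prefixed identifier: {term_id!r}")
    return prefix


if __name__ == "__main__":
    # The prefix alone tells us which ontology the term comes from.
    print(source_ontology("GO:0008150"))
```

Because the prefix is unique within the Foundry, a lookup like this is enough to route a term to its source ontology without consulting the term's content.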
OBO Foundry principles (3)
• The ontologies include textual definitions for all RUs.
  – RUs should be defined so that their precise meaning within the context of a particular ontology is clear to a human reader.
  – Textual definitions will use the genus-species form: An A =def. a B which Cs, where B is the parent of the defined term A and C is the defining characteristic of those Bs which are also As.
• The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.
• The ontology is well documented.
• The ontology has a plurality of independent users.
• The ontology will be developed collaboratively with other OBO Foundry members.
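The genus-species form "An A =def. a B which Cs" can be sketched as a small record with three slots. The class name, field names and the `render` helper below are hypothetical; the example values paraphrase the definition of 'clinical phenotype' given elsewhere in this lecture (a clinically abnormal phenotype).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenusSpeciesDefinition:
    """One textual definition in the form: A =def. a B which Cs."""
    term: str         # A, the term being defined
    genus: str        # B, the parent of A
    differentia: str  # C, what distinguishes those Bs which are also As

    def render(self) -> str:
        return f"{self.term} =def. a {self.genus} which {self.differentia}"


if __name__ == "__main__":
    d = GenusSpeciesDefinition(
        term="clinical phenotype",
        genus="phenotype",
        differentia="is clinically abnormal",
    )
    print(d.render())
```

Keeping the genus as a separate field makes it possible to check mechanically that the parent named in the definition is the same as the asserted is_a parent of the term.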
OBO Foundry principles (4)
• Single is_a inheritance: ontologies will distinguish a backbone ('asserted') is_a hierarchy subject to the principle of single inheritance (each term in the ontology has maximally one is_a parent in this asserted hierarchy).
• Instantiability: RUs in an ontology should correspond to instances in reality.
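The single-inheritance principle lends itself to a mechanical check. The sketch below assumes the asserted hierarchy is available as a list of `(child, parent)` pairs; that input shape, the function name and the placeholder terms are illustrative assumptions, not part of the OBO Foundry tooling.

```python
from collections import defaultdict


def single_inheritance_violations(asserted_is_a):
    """Return {term: [parents]} for terms with more than one asserted is_a parent.

    `asserted_is_a` is an iterable of (child, parent) pairs (assumed format).
    """
    parents = defaultdict(set)
    for child, parent in asserted_is_a:
        parents[child].add(parent)
    return {term: sorted(ps) for term, ps in parents.items() if len(ps) > 1}


if __name__ == "__main__":
    # Placeholder terms: C has two asserted parents and is reported.
    edges = [("A", "B"), ("C", "B"), ("C", "D")]
    print(single_inheritance_violations(edges))
```

An empty result means every term has at most one parent in the asserted backbone; additional parents would then belong in an inferred hierarchy rather than the asserted one.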
Idea grew out of the Gene Ontology
• what cellular component?
• what molecular function?
• what biological process?
OBO Foundry ontologies in BFO-dress
GRANULARITY \ RELATION TO TIME | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
Ontology | Scope | URL | Custodians
Cell Ontology (CL) | cell types from prokaryotes to mammals | obo.sourceforge.net/cgi-bin/detail.cgi?cell | Jonathan Bard, Michael Ashburner, Oliver Hofman
Chemical Entities of Biological Interest (ChEBI) | molecular entities | ebi.ac.uk/chebi | Paula Dematos, Rafael Alcantara
Common Anatomy Reference Ontology (CARO) | anatomical structures in human and model organisms | (under development) | Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland
Foundational Model of Anatomy (FMA) | structure of the human body | fma.biostr.washington.edu | JLV Mejino Jr., Cornelius Rosse
Functional Genomics Investigation Ontology (FuGO) | design, protocol, data instrumentation, and analysis | fugo.sf.net | FuGO Working Group
Gene Ontology (GO) | cellular components, molecular functions, biological processes | www.geneontology.org | Gene Ontology Consortium
Phenotypic Quality Ontology (PaTO) | qualities of biomedical entities | obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value | Michael Ashburner, Suzanna Lewis, Georgios Gkoutos
Protein Ontology (PrO) | protein types and modifications | (under development) | Protein Ontology Consortium
Relation Ontology (RO) | relations | obo.sf.net/relationship | Barry Smith, Chris Mungall
RNA Ontology (RnaO) | three-dimensional RNA structures | (under development) | RNA Ontology Consortium
Sequence Ontology (SO) | properties and features of nucleic sequences | song.sf.net | Karen Eilbeck
http://ontologist.com
Ontology of General Medical Science
First ontology in which the L1/L2/L3 distinction is used
Remember Basic Formal Ontology
• The world consists of
  – entities that are
    • Either particulars or universals;
    • Either occurrents or continuants; and,
  – relationships between these entities of the form
    • Either dependent or independent;
Basic BFO distinctions
[Diagram labels: universals: Continuant (Independent Continuant; Dependent Continuant), Occurrent; particulars: thing, quality, process, event]
Basic BFO distinctions
[Diagram labels: universals: Continuant, Independent Continuant, Dependent Continuant, Occurrent; particulars: thing, process, event; relations: isa, has_participant, inheres_in, instance_of (at t)]
BFO Top-Level Ontology (partial)
• Continuant
  – Spatial Region
  – Independent Continuant
  – Dependent Continuant
    • SDC: Quality; Realizable (Role, Disposition, Function)
    • GDC: Information Content Entity
• Occurrent (always dependent on one or more independent continuants)
  – Process (Functioning)
  – Temporal Region
Three levels of reality in Realist Ontology
Terminology: Representation and Reference
• (1) First Order Reality: entities with objective existence which are not about anything (universals, particulars)
• (2) Cognitive entities which are our beliefs about (1) (cognitive units)
• (3) Representational units in various forms about (1), (2) or (3) (representational units, communicative units)
Universals and Defined Classes
[Diagram labels: Unconstrained reasoning; HUMAN BEING; instance_of at t; extension_of at t; E: all human beings at t; DC-x: patients at t; subclass_of at t; I-y; class_member_of at t; OWL-DL reasoning]
Goal of OGMS
• To be a consistent, logical and extensible framework (ontology) for the representation of
  – features of disease
  – clinical processes
  – results
Motivation
• Clarity about:
  – disease etiology and progression
  – disease and the diagnostic process
  – phenotype and signs/symptoms
Approach
• a disease is a disposition rooted in a physical disorder in the organism and realized in pathological processes.
[Diagram labels: etiological process; produces; disorder; bears; disposition; realized_in; pathological process; produces; abnormal bodily features; recognized_as; signs & symptoms; participates_in; interpretive process; produces; diagnosis]
Cirrhosis - environmental exposure
• Etiological process - phenobarbital-induced hepatic cell death – produces
• Disorder - necrotic liver – bears
• Disposition (disease) - cirrhosis – realized_in
• Pathological process - abnormal tissue repair with cell proliferation and fibrosis that exceed a certain threshold; hypoxia-induced cell death – produces
• Abnormal bodily features – recognized_as
• Symptoms - fatigue, anorexia; Signs - jaundice, splenomegaly
• Symptoms & Signs – used_in
• Interpretive process – produces
• Hypothesis - rule out cirrhosis – suggests
• Laboratory tests – produces
• Test results - documentation of elevated liver enzymes in serum – used_in
• Interpretive process – produces
• Result - diagnosis that patient X has a disorder that bears the disease cirrhosis
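The first half of the chain on this slide (from etiological process to signs and symptoms) can be sketched as a list of triples that is walked step by step. Encoding it as `(subject, relation, object)` tuples, and the names `OGMS_CHAIN`, `CIRRHOSIS` and `walk`, are illustrative assumptions; this is not the OGMS serialization.

```python
# Relation chain transcribed from the slide (pathological side only).
# The triple encoding is an assumption made for illustration.
OGMS_CHAIN = [
    ("etiological process", "produces", "disorder"),
    ("disorder", "bears", "disposition (disease)"),
    ("disposition (disease)", "realized_in", "pathological process"),
    ("pathological process", "produces", "abnormal bodily features"),
    ("abnormal bodily features", "recognized_as", "symptoms and signs"),
]

# Slide values for the cirrhosis example, keyed by the kind of entity.
CIRRHOSIS = {
    "etiological process": "phenobarbital-induced hepatic cell death",
    "disorder": "necrotic liver",
    "disposition (disease)": "cirrhosis",
    "symptoms and signs": "fatigue, anorexia; jaundice, splenomegaly",
}


def walk(chain, start):
    """Follow the chain from `start` and return the kinds of entity visited."""
    successor = {subj: obj for subj, _rel, obj in chain}
    path = [start]
    # The length guard keeps the walk finite even if a chain contained a cycle.
    while path[-1] in successor and len(path) <= len(chain):
        path.append(successor[path[-1]])
    return path


if __name__ == "__main__":
    for kind in walk(OGMS_CHAIN, "etiological process"):
        print(kind, "->", CIRRHOSIS.get(kind, "(see slide)"))
```

The same chain applies unchanged to the influenza and Huntington's disease slides; only the per-disease dictionary would differ.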
Influenza - infectious
• Etiological process - infection of airway epithelial cells with influenza virus – produces
• Disorder - viable cells with influenza virus – bears
• Disposition (disease) - flu – realized_in
• Pathological process - acute inflammation – produces
• Abnormal bodily features – recognized_as
• Symptoms - weakness, dizziness; Signs - fever
• Symptoms & Signs – used_in
• Interpretive process – produces
• Hypothesis - rule out influenza – suggests
• Laboratory tests – produces
• Test results - documentation of elevated serum antibody titers – used_in
• Interpretive process – produces
• Result - diagnosis that patient X has a disorder that bears the disease flu
But the disorder also induces normal physiological processes (immune response) that can result in the elimination of the disorder (transient disease course).
Huntington’s Disease - genetic
• Etiological process - inheritance of >39 CAG repeats in the HTT gene – produces
• Disorder - chromosome 4 with abnormal mHTT – bears
• Disposition (disease) - Huntington’s disease – realized_in
• Pathological process - accumulation of mHTT protein fragments, abnormal transcription regulation, neuronal cell death in striatum – produces
• Abnormal bodily features – recognized_as
• Symptoms - anxiety, depression; Signs - difficulties in speaking and swallowing
• Symptoms & Signs – used_in
• Interpretive process – produces
• Hypothesis - rule out Huntington’s – suggests
• Laboratory tests – produces
• Test results - molecular detection of the HTT gene with >39 CAG repeats – used_in
• Interpretive process – produces
• Result - diagnosis that patient X has a disorder that bears the disease Huntington’s disease
HNPCC - genetic pre-disposition (hereditary non-polyposis colorectal cancer)
• Etiological process - inheritance of a mutant mismatch repair gene – produces
• Disorder - chromosome 3 with abnormal hMLH1 – bears
• Disposition (disease) - Lynch syndrome – realized_in
• Pathological process - abnormal repair of DNA mismatches – produces
• Disorder - mutations in proto-oncogenes and tumor suppressor genes with microsatellite repeats (e.g. TGF-beta R2) – bears
• Disposition (disease) – to acquire non-polyposis colon cancer
Big Picture
Primitive Terms
• ‘bodily feature’ – may denote a physical component, a bodily quality, or a bodily process.
There are way more sorts of classes than universals
[Diagram labels: do not have corresponding universals; Quality; isa; Fever; Independent continuant; isa; Edema; Rash; Process; isa; Tremor; extension_of]
Primitive Terms
• clinically abnormal - some bodily feature that
  – (1) is not part of the life plan for an organism of the relevant type (unlike aging or pregnancy),
  – (2) is causally linked to an elevated risk either of pain or other feelings of illness, or of death or dysfunction, and
  – (3) is such that the elevated risk exceeds a certain threshold level.
Disposition
• A disposition is a realizable entity which is such that
  – (1) if it ceases to exist, then its bearer is physically changed, and
  – (2) its realization occurs, in virtue of the bearer’s physical make-up, when this bearer is in some special physical circumstances
Foundational Terms
• Disorder =def.
  – A causally linked combination of physical components that is (a) clinically abnormal and (b) maximal, in the sense that it is not a part of some larger such combination.
• Pathological Process =def.
  – A bodily process that is a manifestation of a disorder and is clinically abnormal.
• Disease =def.
  – A disposition (i) to undergo pathological processes that (ii) exists in an organism because of one or more disorders in that organism.
Dispositions and Predispositions
• All diseases are dispositions; not all dispositions are diseases.
• A predisposition is a disposition.
• Predisposition to Disease of Type X =def.
  – A disposition in an organism that constitutes an increased risk of the organism’s subsequently developing the disease X.
• HNPCC is caused by a
  – disorder (mutation) in a DNA mismatch repair gene that
  – disposes to the acquisition of additional mutations from defective DNA repair processes, and thus
  – predisposition to the development of colon cancer.
Etiology
• Etiological Process =def.
  – A process in an organism that leads to a subsequent disorder.
• Example:
  – toxic chemical exposure resulting in a mutation in the genomic DNA of a cell;
  – infection of a human with a pathogenic virus;
  – inheritance of two defective copies of a metabolic gene
• The etiological process creates the physical basis of that disposition to pathological processes which is the disease.
Evaluation related
• Sign =def.
  – A bodily feature of a patient that is observed in a physical examination and is deemed by the clinician to be of clinical significance. (Objectively observable features)
• Symptom =def.
  – A bodily feature of a patient that is observed by the patient and is hypothesized by the patient to be a realization of a disease. (A restricted family of phenomena, including pain, nausea, anger, drowsiness, which are of their nature experienced in the first person)
• Laboratory Test =def.
  – A measurement assay that has as input a patient-derived specimen, and as output a result representing a quality of the specimen.
• Laboratory Finding =def.
  – A representation of a quality of a specimen that is the output of a laboratory test and that can support an inference to an assertion about some quality of the patient.
Definitions - Qualities
• Manifestation of a Disease =def.
  – A bodily feature of a patient that is (a) a deviation from clinical normality that exists in virtue of the realization of a disease and (b) is observable.
  – Observability includes observable through elicitation of response or through the use of special instruments.
• Preclinical Manifestation of a Disease =def.
  – A manifestation of a disease that exists prior to its becoming detectable in a clinical history taking or physical examination.
• Clinical Manifestation of a Disease =def.
  – A manifestation of a disease that is detectable in a clinical history taking or physical examination.
• Phenotype =def.
  – A (combination of) bodily feature(s) of an organism determined by the interaction of its genetic make-up and environment.
• Clinical Phenotype =def.
  – A clinically abnormal phenotype.
Diagnosis
• Clinical Picture =def.
  – A representation of a clinical phenotype that is inferred from the combination of laboratory, image and clinical findings about a given patient.
• Diagnosis =def.
  – A conclusion of an interpretive process that has as input a clinical picture of a given patient and as output an assertion to the effect that the patient has a disease of such and such a type.
A well-formed diagnosis of ‘pneumococcal pneumonia’
• A configuration of representational units;
• Believed to mirror the person’s disease;
• Believed to mirror the disease’s cause;
• Refers to the universal of which the disease is believed to be an instance.
[Diagram labels: Disease; isa; Pneumococcal pneumonia; Instance-of at t1; #78 John’s portion of pneumococs; caused by; #56 John’s Pneumonia]
Some motivations and consequences (1)
• No use of debatable or ambiguous notions such as proposition, statement, assertion, fact, ...
• The same diagnosis can be expressed in various forms.
[Diagram, two forms: (a) Disease; isa; Pneumococcal pneumonia; Instance-of at t1; #78; caused by; #56. (b) Portion of pneumococs; caused by; isa; Pneumonia; Instance-of at t1; #56 caused by #78 at t1]
Some motivations and consequences (2)
• A diagnosis can be of level 2 or level 3, i.e. either in the mind of a cognitive agent, or in some physical form.
• Allows for a clean interpretation of assertions of the sort ‘these patients have the same diagnosis’: the configuration of representational units is such that the parts which do not refer to the particulars related to the respective patients refer to the same portion of reality.
Distinct but similar diagnoses
[Diagram: #56 John’s Pneumonia, Instance-of Pneumococcal pneumonia at t1, caused by #78 John’s portion of pneumococs; #956 Bob’s pneumonia, Instance-of Pneumococcal pneumonia at t2, caused by #2087 Bob’s portion of pneumococs]
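The idea that two patients "have the same diagnosis" when the patient-independent parts of the two configurations coincide can be sketched as follows. The class `Diagnosis`, its field names and `same_diagnosis` are hypothetical; the reading that #56 and #956 denote the pneumonias while #78 and #2087 denote the portions of pneumococs is an assumption drawn from the flattened diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnosis:
    """A configuration of representational units (illustrative structure)."""
    disease_id: str         # identifier of the patient's particular disease
    cause_id: str           # identifier of the particular cause
    disease_universal: str  # universal the disease is believed to instantiate
    cause_kind: str         # kind of entity the cause is believed to be

    def patient_independent_part(self):
        # Drop the references to particulars; keep what refers to the
        # shared portion of reality.
        return (self.disease_universal, "caused_by", self.cause_kind)


def same_diagnosis(a: Diagnosis, b: Diagnosis) -> bool:
    return a.patient_independent_part() == b.patient_independent_part()


if __name__ == "__main__":
    john = Diagnosis("#56", "#78", "pneumococcal pneumonia", "portion of pneumococs")
    bob = Diagnosis("#956", "#2087", "pneumococcal pneumonia", "portion of pneumococs")
    print(same_diagnosis(john, bob))  # same type of diagnosis
    print(john == bob)                # but distinct configurations
```

The two records are distinct objects about distinct particulars, yet they compare as the same diagnosis once the patient-specific identifiers are set aside.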
Some motivations and consequences (3)
• Allows equally clean interpretations for the wealth of ‘modified’ diagnoses:
  – With respect to the author of the representation:
    • ‘nursing diagnosis’, ‘referral diagnosis’
  – When created:
    • ‘post-operative diagnosis’, ‘admitting diagnosis’, ‘final diagnosis’
  – Degree of the belief:
    • ‘uncertain diagnosis’, ‘preliminary diagnosis’
ReMINE Adverse Event Ontology
ReMINE Project
To Err is Human: Building a Safer Health System
• 2000 Institute of Medicine Report:
  – estimated 98,000 deaths a year caused by adverse events. Other studies indicate that this is an underestimation.
• Since then various agencies started to fund projects to improve the quality of healthcare; many devoted to detecting and reporting adverse events.
• Need for a common ontology.
What is an adverse event?
• We asked Google
• We asked the experts
• And as so often in biomedical terminology, …
  – … we obtained many distinct and mutually incompatible answers
  – … creating a silo problem
What is an adverse event?
• a reaction …
• an effect …
• an event …
• a problem …
• an experience …
• an injury …
• a symptom …
• an illness …
• an occurrence …
• a change …
and also:
• something …
• an act …
• an observation …
as well as …
• a term !!!
The view of some experts
• D4 (BRIDG): an observation of a change in the state of a subject assessed as being untoward by one or more interested parties within the context of a protocol-driven research or public health.
• D5 (IOM): an event that results in unintended harm to the patient by an act of commission or omission rather than by the underlying disease or condition of the patient
• D6 (NCI): any unfavorable and unintended sign (including an abnormal laboratory finding), symptom, or disease temporally associated with the use of a medical treatment or procedure that may or may not be considered related to the medical treatment or procedure
• D7 (CDISC): any untoward medical occurrence in a patient or clinical investigation subject administered a pharmaceutical product and which does not necessarily have to have a causal relationship with this treatment
• D8 (JTC): an untoward, undesirable, and usually unanticipated event, such as death of a patient, an employee, or a visitor in a health care organization. Incidents such as patient falls or improper administration of medications are also considered adverse events even if there is no permanent effect on the patient.
• D9 (QUIC): an injury that was caused by medical management and that results in measurable disability.
Clearly, confusion reigns …
The question “What are adverse events?” cannot be answered directly, but needs to be reformulated as “What might the author of a particular sentence containing the phrase ‘adverse event’ be referring to when he uses that phrase?”.
At least one argument
• There is no single entity which each of these authors would be able to point to and say, faithfully and honestly,
  – “that is an observation” (definition D4),
  – “that is an injury” (definition D9),
  – “that is a laboratory finding” (definition D6).
• Clearly,
  – nothing which is an injury can be a laboratory finding, although, of course, laboratory findings can aid in diagnosing an injury.
  – nothing which is a laboratory finding can be an observation, although, of course, some observation must have been made if we are to arrive at a laboratory finding.
Current approaches to bringing clarity
• Building a consensus definition (and rejecting the others):
  – e.g. BRIDG (an observation of a change in a subject)
  – the other definitions do not disappear and will still be used
• Building an ontology of all of those things relevant to understanding any given use of ‘adverse event’
  – Done thus far, unfortunately, by using very weak principles underlying ‘concept’-orientation, e.g.
    • allowing ‘age’ and ‘gender’ to be subclasses of ‘patient’,
    • not allowing adequate treatment of temporal sequence
Our research questions
• Can realism-based ontology be of value
  – to identify the different sorts of entities that can be denoted by the term ‘adverse event’?
  – to establish how these entities relate to each other and to use these relations to identify to what extent the various definitions overlap?
  – to describe the portion of reality that is covered by all entities denoted by the terms that appear in the various definitions for ‘adverse event’?
Basic hypothesis
  – all the authors of the mentioned definitions use the term ‘adverse event’ in contexts which look quite similar
  – in each of them, more or less the same sorts of entities seem to be involved
• … there is some common ground (some portion of reality) which is such that the entities within it can be used as referents for the various meanings of ‘adverse event’.
Study design
• Goal:
  – to bring clarity into the wilderness that grew from current efforts.
• Methods:
  – analyze the literature and collect all relevant definitions,
  – study a variety of relevant classification systems, taxonomies, terminologies and concept-based ontologies,
  – apply the realism-based principles advocated in
    • Basic Formal Ontology (BFO)
    • Referent Tracking (RT)
    to build a representation for the relevant portion of reality,
  – assess whether the representation covers what is (or might be) expressed in the various definitions.
R T U New York State Center of Excellence in Bioinformatics & Life Sciences Ontology development in Re. MINE support annotation Re. MINE Taxonomies Description of specific adverse event domains (childbirth, patient transfer, . . ) as cognized by human beings support reasoning Re. MINE Adverse Event Domain Ontology Re. MINE Application Ontology Ontologies higher order logic Realism-based, purpose independent representation of the portion of reality described in the taxonomies description logic Purpose dependent reformulations of the parts of RAEDO which are relevant for a specific domain
R T U New York State Center of Excellence in Bioinformatics & Life Sciences Re. MINE Taxonomy Annotated Events Risk Manager’s Event Administration System
The cognitive model underlying the taxonomies
  [Figure: graph of the cognitive model. Node labels: Risky Parameters, Situation, Environment, SHEL entity, Software, Hardware, Liveware, Time Interval, Contributing Factor, Mitigation Factor, Adverse_Event, Incident Type, Process, Problem, Primary Diagnosis, Impact on Patient, Impact on Organization. Edge labels: has_quality, has_role, occurs_in, occurs_during, causes, prevents, results_in, has.]
ReMINE’s notion of adverse event
  1. an ‘incident [that] occurred during the past and [is] documented in a database of adverse events’
     – Stefano Arici, Paolo Bertele. ReMINE Deliverable D4.1 – RAPS Taxonomy: approach and definition. V1.0 (Final), August 8, 2008 (p 21)
     … which is a ‘perdurant’ (ibidem, p 26)
     … ‘that occurs to a patient’ (ibidem, p 23)
  2. an expectation of some future happening that can be prevented (ibidem, p 23)
Terminologists agree, ontologists think …
  • Can something which is an incident be at the same time an expectation?
  • Can something which is an incident at time t later become an adverse event simply because it [?] has been entered in a database?
  • Can adverse events really occur in software?
  • …
Intermediate conclusion
  • The ReMINE taxonomy (like all concept-based terminologies and ‘ontologies’ in general) provides a distorted view of reality.
  • For good reasons: the distortion is such that
    – it reflects a pragmatic view on what is relevant for the purposes for which it is designed,
    – it does away with complexities that do not help human beings in doing a better job.
  • But with some negative consequences:
    – reusability outside the ReMINE context is hampered,
    – integration with other descriptive systems becomes cumbersome, and
    – advanced reasoning turns out to be impossible.
[Figure: the three levels and the links between them (interpretation, documentation, guidance)]
  • Level 1: Patients, Clinicians, Drugs, Disorders, …; primary care processes; secondary processes (management, research)
  • Level 2: Diagnoses, Interpretations, Hypotheses, Risk assessments, …
  • Level 3: Risk Management Ontology, Patient documentation, Protocols, Guidelines, Event reports, Scientific literature, …
Using the 3 levels and the particular/universal/class distinctions
  • Level 1:
    – #1: an incident that happened in the past;
  • Level 2:
    – #2: the interpretation by some cognitive agent that #1 is an adverse event;
    – #3: the expectation by some cognitive agent that similar incidents might happen in the future;
  • Level 3:
    – #4: an entry in the adverse event database concerning #1;
    – #5: an entry in some other system about #3 for mitigation or prevention purposes.
Allows appropriate error management
  • Some possibilities:
    1. #1 with unjustified absence of #2:
       • #1 was not perceived at all, or not assessed as being an adverse event
    2. Unjustified presence of #2:
       • There was no #1 at all, or #1 was not an adverse event
    3. Unjustified absence of #4:
       • Same reasons as under (1) above
       • Justified presence of #2 but not reported in the database
    – …
  Ceusters W, Smith B. A Realism-Based Approach to the Evolution of Biomedical Ontologies. Proceedings of AMIA 2006, Washington DC, 2006:121-125.
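The checks listed above can be sketched in code. This is a minimal illustration and not part of ReMINE: the class names, field names and identifiers are hypothetical, and the functions only detect that a Level 2 or Level 3 entity is missing or dangling. Whether the absence is justified still needs human assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    """Level 1: something that happened in reality."""
    iui: str


@dataclass(frozen=True)
class Judgement:
    """Level 2: an agent's interpretation of an incident."""
    iui: str
    about: str              # identifier of the incident being judged
    is_adverse_event: bool


@dataclass(frozen=True)
class DatabaseEntry:
    """Level 3: a record concerning a Level 1 entity."""
    iui: str
    about: str


def unreported(incidents, judgements, entries):
    """Incidents judged to be adverse events but lacking a database entry."""
    judged = {j.about for j in judgements if j.is_adverse_event}
    recorded = {e.about for e in entries}
    return sorted(i.iui for i in incidents
                  if i.iui in judged and i.iui not in recorded)


def judgements_without_incident(incidents, judgements):
    """Judgements that refer to an incident which is not known to exist."""
    known = {i.iui for i in incidents}
    return sorted(j.iui for j in judgements if j.about not in known)
```

Keeping the three levels as separate types is what makes these checks expressible at all; a model that treats the incident and its database entry as one object cannot ask whether one exists without the other.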
Part of the ReMINE Domain Ontology
Higher order logical representation
  • an incident (#1) that happened at time t2 to a patient (#2) after some intervention (#3 at t1)
  • is judged at t3 to be an adverse event, thereby giving rise to a belief (#4) about #1 on the part of some person (#5, a caregiver as of time t6).
  • This requires the introduction (at t4) of an entry (#6) in the adverse event database (#7, installed at t0).
Back-linking of the ontology to the taxonomies
  • ‘ReM: Insufficient illumination’ is a ReMINE term representing a defined class whose members are all instances of the universal represented by the ReMINE term ‘ReM: illumination’, that universal enjoying an isa relation with the universal represented by the BFO term ‘BFO: Quality’
  • ‘ReM: international guideline’ is a ReMINE term representing a defined class whose members are all instances of the universal represented by the UCore-SL term ‘UCore: Plan’, that universal enjoying an isa relation with the universal represented by the BFO term ‘BFO: InformationContentEntity’
Advantages
  • Synchronisation of two distinct representations of the same reality:
    – taxonomies: user-oriented view, data annotation
    – ontologies: realism-based view, unconstrained reasoning
  • Domain ontology compatible with OBO-Foundry ontologies:
    – no overlap,
    – easier to re-use.
  • Not only tracking of incidents, but also:
    – how well individual clinicians and organizations manage adverse events,
    – how well one learns from past experiences.
Application in pharmacogenomics 101
Pharmacogenomics
  • What is it?
    – ‘The branch of pharmacology which deals with the influence of genetic variation on drug response in patients by correlating gene expression or single-nucleotide polymorphisms with a drug's efficacy or toxicity’.
    – ‘Pharmacogenomics is the whole genome application of pharmacogenetics, which examines the single gene interactions with drugs’.
    » http://en.wikipedia.org/wiki/Pharmacogenomics
Typical approach (1)
  • Building a huge matrix with patient cases in one dimension and patient characteristics in the other dimension
  [Figure: empty matrix with rows case 1, case 2, …, case 6, … (Cases) and columns ch 1, ch 2, …, ch 6, … (Characteristics)]
Typical approach (2)
  • Use statistical correlation techniques to find associations between characteristics and (dis)similarities between cases
  [Figure: the same Cases × Characteristics matrix]
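As an illustration of this typical approach, the sketch below stores a tiny Cases × Characteristics matrix and computes a Pearson correlation between two characteristic columns. The values are invented purely for the example, and Pearson correlation is only one of many techniques that could be meant here.

```python
from math import sqrt

# Toy matrix: rows are cases, columns are characteristics.
# All numbers are made up for illustration only.
matrix = {
    "case 1": {"ch 1": 1.0, "ch 2": 2.0, "ch 3": 5.0},
    "case 2": {"ch 1": 2.0, "ch 2": 4.0, "ch 3": 3.0},
    "case 3": {"ch 1": 3.0, "ch 2": 6.0, "ch 3": 4.0},
}


def column(m, characteristic):
    """Values of one characteristic across all cases, in row order."""
    return [row[characteristic] for row in m.values()]


def pearson(xs, ys):
    """Pearson correlation coefficient; undefined if either column is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


r_12 = pearson(column(matrix, "ch 1"), column(matrix, "ch 2"))
r_13 = pearson(column(matrix, "ch 1"), column(matrix, "ch 3"))
```

Nothing in this computation says what a column denotes, which is exactly the gap the questions on the following slides address.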
Fundamental questions
  1. What is a characteristic?
  2. What (sorts of) (pharmacogenomically relevant) characteristics go in here?
  3. How can we make distinct pharmacogenomic studies comparable?
  4. Because such matrices tend to become huge, how can we make analysis feasible?
  5. How can we make results re-usable?
Q1: what is a characteristic?
  – it is for sure not a category entities can belong to: there is no generic entity for which the name ‘characteristic’ would be appropriate on an exclusive basis;
  – there is also no particular entity that you could point to and state ‘that over there is the only existing characteristic’;
  – thus: there are no characteristics, there is just the term ‘characteristic’, which is used to describe that some entities are (acknowledged to be) in some way of interest in some context and for some purpose.
This requires rephrasing Q2
  • Original: What (sorts of) (pharmacogenomically relevant) characteristics go in here?
  • Rephrased: What entities described as being characteristic for pharmacogenomic purposes should be represented here?
Examples
  • Continuant, Independent Continuant
    – Universals: portion of C19H17ClN2O4; human being; gene
    – Particulars: portion of Glifanan in the tablet in front of me; the HTR2A gene on chromosome 13 of the most frontal cell in the tip of my nose
  • Continuant, Dependent Continuant
    – Universals: shape; temperature; length
    – Particulars: the shape of my nose; the temperature of the Glifanan tablet in front of me; the length of that HTR2A gene
  • Occurrent
    – Universals: change in shape; motion; rise in temperature
    – Particulars: unfolding of a DNA molecule; the circulation of a Glifanan molecule in my bloodstream; the rise of my body temperature while teaching this seminar
Two distinct (?) sorts of relevant entities
  [Figure: the Cases × Characteristics matrix, with the characteristic columns divided into ‘phenotypic’ and ‘genotypic’]
Genotype / Phenotype
  [Figure; labels: Gene Ontology, genes, gene products, Human Phenotype ‘Ontology’, features]
The Gene Ontology components
  • Molecular Function = elemental activity/task
    – the tasks performed by individual gene products; examples are carbohydrate binding and ATPase activity
  • Biological Process = biological goal or objective
    – broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions
  • Cellular Component = location or complex
    – subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and RNA polymerase II holoenzyme
Application of good ontological principles
Human Phenotype ‘Ontology’ http://www.humanphenotypeontology.org/index.php/hpo_home.html
Q3: How can we make distinct pharmacogenomic studies comparable?
  • Map any characteristic used to relevant, standard and high-quality ontologies
The positive effects of appropriate mappings
The positive effects of appropriate mappings
  • identification of ontological relations prior to statistical correlation:
    – ch 1 and ch 4
    – ch 1 and ch 5
    – ch 1 and ch 2
    – …
  • Contributes to answering ‘Q4: how can we make analysis feasible’
    – this method allows for data reduction without information loss.
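One simple reading of this idea can be sketched as follows: once each characteristic is mapped to an ontology term, characteristics that share a term are already known to be related and need not be tested statistically. The term labels below are placeholders, not identifiers from any real ontology, and ‘sharing a term’ stands in for the richer ontological relations the slide has in mind.

```python
from collections import defaultdict

# Hypothetical mapping from matrix columns to ontology terms.
mapping = {
    "ch 1": "TERM:A",
    "ch 2": "TERM:B",
    "ch 4": "TERM:A",
}


def group_by_term(column_to_term):
    """Group characteristics that are mapped to the same ontology term."""
    groups = defaultdict(list)
    for characteristic, term in column_to_term.items():
        groups[term].append(characteristic)
    return {term: sorted(chs) for term, chs in groups.items()}


def pairs_to_correlate(column_to_term):
    """Pairs of characteristics whose relation is not already given by the mapping."""
    chs = sorted(column_to_term)
    return [(a, b)
            for i, a in enumerate(chs)
            for b in chs[i + 1:]
            if column_to_term[a] != column_to_term[b]]
```

With the mapping above, the pair (ch 1, ch 4) drops out of the statistical step, which is the kind of reduction the slide refers to.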
Filling the grid
  • We know that labels from appropriate ontologies go here [the column headings]
  • But what goes here [in the cells]?
Remember we had this …
  • Continuant, Independent Continuant
    – Universals: portion of C19H17ClN2O4; human being; gene
    – Particulars: portion of Glifanan in the tablet in front of me; the HTR2A gene on chromosome 13 of the most frontal cell in the tip of my nose
  • Continuant, Dependent Continuant
    – Universals: shape; temperature; length
    – Particulars: the shape of my nose; the temperature of the Glifanan tablet in front of me; the length of that HTR2A gene
  • Occurrent
    – Universals: change in shape; motion; rise in temperature
    – Particulars: unfolding of a DNA molecule; the circulation of a Glifanan molecule in my bloodstream; the rise of my body temperature while teaching this seminar
Or after transposition …
  • Universals
    – Continuant, Independent Continuant: portion of C19H17ClN2O4; human being; gene
    – Continuant, Dependent Continuant: shape; temperature; length
    – Occurrent: change in shape; motion; rise in temperature
  • Particulars
    – Independent Continuant: portion of Glifanan in the tablet in front of me; the HTR2A gene on chromosome 13 of the most frontal cell in the tip of my nose
    – Dependent Continuant: the shape of my nose; the temperature of the Glifanan tablet in front of me; the length of that HTR2A gene
    – Occurrent: unfolding of a DNA molecule; the circulation of a Glifanan molecule in my bloodstream; the rise of my body temperature while teaching this seminar
… and for many patients
  [Figure: the Particulars row repeated for case 1, case 2, …, case 8, …]
Referent Tracking
  • unique identification by means of ‘codes’
  • unique identification by means of ‘instance unique identifiers’
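The contrast between the two kinds of identifier can be sketched as follows. This assumes, in line with the preceding slides, that codes identify universals while instance unique identifiers (IUIs) identify particulars. The use of UUIDs and the code string are choices made for this example only, not a description of any actual Referent Tracking system.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Particular:
    """A particular: typed by a code for its universal, identified by its own IUI."""
    universal_code: str
    iui: str = field(default_factory=lambda: str(uuid.uuid4()))


# Two distinct genes in two distinct cells: same code, different IUIs.
gene_in_cell_a = Particular(universal_code="EXAMPLE:gene")
gene_in_cell_b = Particular(universal_code="EXAMPLE:gene")
```

A code alone cannot distinguish the two genes; the IUI is what lets later statements refer to one of them and not the other.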
Data and Information Models: An ontological view
This holds also for data and information models
  • Patient-specific information
    – EHR-EMR-ENR-…
    – PHR
    – Various modality-related databases: lab, imaging, …
  • Scientific “knowledge”
    – Textbooks
    – Classification systems
    – Terminologies
    – Ontologies
What is an Information Model?
  • An information model is:
    – ‘a representation of concepts, relationships, constraints, rules, and operations to specify data semantics for a chosen domain of discourse that satisfy some industry need’.
  • A ‘quality’ information model is:
    – ‘an information model that is complete, sharable, stable, extensible, well-structured, precise, and unambiguous’.
  Y. Tina Lee. Information Modeling: From Design To Implementation. http://www.mel.nist.gov/msidlibrary/doc/tina99im.pdf
Why are there so many IM but no ‘quality’ IM?
  • An information model is:
    – ‘a representation of concepts, relationships, constraints, rules, and operations to specify data semantics for a chosen domain of discourse that satisfy some industry need’.
  • many domains,
  • different needs within the same domain,
  • selection of ‘concepts’, ‘relationships’, … relevant for the needs.
  • Consequence: can never be complete
Why are there so many?
  Blobel B, Pharow P: Analysis and Evaluation of EHR Approaches. MIE 2008, 26-28 May 2008, Göteborg, Sweden
Why are so many incompatible?
  • An information model is:
    – ‘a representation of concepts, relationships, constraints, rules, and operations to specify data semantics for a chosen domain of discourse that satisfy some industry need’.
  • confusion about:
    – what ‘concepts’ and ‘relationships’ are,
    – whether a ‘domain of discourse’ is:
      » what is or can be said, versus
      » that about which something is or can be said,
    – ‘semantics’.
  • Consequence: can never be unambiguous and precise
Two major problems in information modeling (1)
  • Tyranny of the use case:
    – ‘if most people wrongly believe that crocodiles are a kind of mammal, then most people would find it easier to locate information about crocodiles if it were located in a mammals grouping, rather than where it factually belonged’. (p 89)
  Huhns MN, Stephens LM. Semantic Bridging of Independent Enterprise Ontologies. In: Kosanke K, ed. Enterprise Inter- and Intra-Organizational Integration: Building International Consensus. Boston, MA: Kluwer Academic Publishers; 2002: 83-90.
Two major problems in information modeling (2)
  • Assumption of inherent classification:
    – we identify every thing by a specific class to which it belongs; and
    – there exists a preferred set of classes to describe a domain.
  • Sad consequences:
    – ‘the complexity of problems in schema integration, schema evolution, and interoperability,
    – violates philosophical and cognitive guidelines on classification
    – and is, therefore, inappropriate in view of the role of data modeling in representing knowledge about application domains’.
  Parsons J, Wand Y. Emancipating instances from the tyranny of classes in information modeling. ACM Trans. Database Syst. 25, 2 (June 2000), 228-268.
Both problems have a common ground
  • Confusion brought about by the (dis)similarity between information and what the information is about:
  [Figure: axes ‘space’ and ‘time’; labels: anamnesis, clinical examination, diagnosis, therapeutic interventions]
The worst of all: Health Level 7 RIM
HL7 EHR structure
  For HL7, a document is an act!
HL7 said for over 15 years …
  • ‘The truth about the real world is constructed through a combination (and arbitration) of such attributed statements only, and there is no class in the RIM whose objects represent "objective state of affairs" or "real processes" independent from attributed statements. As such, there is no distinction between an activity and its documentation.’
Then what about this advice from the Surgeon General
  • ‘The Nation must take an informed, sensitive approach to communicate with and educate the American people about health issues related to overweight and obesity.’
  • ‘ACTION: The Nation must take action to assist Americans in balancing healthful eating with regular physical activity.’
  http://www.surgeongeneral.gov/topics/obesity/calltoaction/fact_vision.htm
Because of HL7: Americans think that watching sports is as good as doing sports …
... or reading about sports
America’s future www.sfpix.com/health_saturdays/Heal_sat1.html
http://hl7-watch.blogspot.com/
Not much better: Microsoft HealthVault
  • an Allergic Episode = (a) a single piece of data, that is (b) in a health record that is (c) accessible through Microsoft HealthVault
  • Other Health Record Items: a blood pressure measurement, an exercise session, an insurance claim.
Even the very promising OpenEHR Model
  switching between data structures and what the data are about
  T Beale, S Heard, D Kalra, D Lloyd. EHR Information Model. Revision: 5.1.1. 16 Aug 2008
A bad ontology for this model
  Martínez-Costa, Menárguez-Tortosa, Fernández-Breis, Maldonado. A model-driven approach for representing clinical archetypes for Semantic Web environments. Journal of Biomedical Informatics 42(1), February 2009, 150-164
Not every term denotes (1)
  • ‘A well-known problem in clinical information recording is the problem of assigning “status”, including variants like “actual value of P” (P stands for some phenomenon), “family history of P”, “risk of P”, “fear of P”, as well as negation of any of these, i.e. “not/no P”, “no history of P” etc.
  • A proper analysis of these so called statuses shows that they are not “statuses” at all, …’
    – this is so true!
  • ‘… but different categories of information as per the ontology. The common statement types mentioned here are mapped as follows:
    • actual value of P ⇒ Observation (of P);
    • no/not P ⇒ Observation (of excluded P or types of P, e.g. allergies);
    • family history of P ⇒ Evaluation (that patient is at risk of P);
    • no family history of P ⇒ Evaluation (that P is an excluded risk);
    • risk of P ⇒ Evaluation (that patient is at risk of P);
    • no risk of P ⇒ Evaluation (that patient is not at risk of P);
    • fear of P ⇒ Observation (of FEAR, with P mentioned in the description);’
    – some of these P’s do not exist at all!
  T Beale, S Heard, D Kalra, D Lloyd. EHR Information Model. Revision: 5.1.1. 16 Aug 2008
Not every term denotes (2)
  • ‘Another set of statement types that can be confused in systems that do not properly model information categories concern interventions, e.g. “hip replacement (5 years ago)”, “hip replacement (planned)”, “hip replacement (ordered for next tuesday 10 am)”.’
    – this is so true!
  • ‘Ambiguity is removed here as well, with the use of the correct information categories, e.g. (I stands for an intervention):
    • I (distant past/unmanaged/passively documented) ⇒ Observation (of I present in patient);
    • I (recent past) ⇒ Action (of I having been done to/for patient);
    • I (proposed) ⇒ Evaluation, subtype Proposal (of I suggested/likely for patient);
    • I (ordered) ⇒ Instruction (of I for patient for some date in the future).’
    – some of these I’s do not exist at all!
  T Beale, S Heard, D Kalra, D Lloyd. EHR Information Model. Revision: 5.1.1. 16 Aug 2008
Schemas like this need to be corrected
  T Beale, S Heard, D Kalra, D Lloyd. EHR Information Model. Revision: 5.1.1. 16 Aug 2008
An appropriate view on reality …
  T Beale, S Heard, D Kalra, D Lloyd. EHR Information Model. Revision: 5.1.1. 16 Aug 2008
A bit less appropriate view on reality …
  K Bernstein, M Bruun-Rasmussen, S Vingtoft, SK Andersen, C Nøhr. Modelling and implementing electronic health records in Denmark. International Journal of Medical Informatics (2005) 74, 213-220.
… can still lead to an erroneous ‘ontology’
  Clinical Investigator Recording (CIR) ontology
  T Beale, S Heard, D Kalra, D Lloyd. EHR Information Model. Revision: 5.1.1. 16 Aug 2008
… and to leaving observed distinctions implicit
  ‘not knowing’ or ‘not specifying’ something is not a property of that which is not known, or of that about which a specification should be given, but a property of the agent involved.
  T Beale, S Heard, D Kalra, D Lloyd. The openEHR Architecture Support Terminology. Revision: 1.0.1; 04 Aug 2008
Another example
  Over the past 15 years, nearly 500 genes that contribute to inherited eye diseases have been identified. Disease-causing mutations are associated with many ocular diseases, including glaucoma, cataracts, strabismus, corneal dystrophies and a number of forms of retinal degenerations. This remarkable new genetic information highlights the significant inroads that are being made in understanding the medical basis of human ophthalmic diseases. As a result, gene-based therapies are actively being pursued to ameliorate ophthalmic genetic diseases that were once considered untreatable.
Objectives of the Network
  • provide easy access to genetic testing for patients diagnosed with ocular diseases by screening for these genes,
  • collect and maintain relevant information in secure databases
    – to help speed the progress toward developing treatments and
    – to identify those who are most likely to benefit from them,
  • maintain a genetic specimen repository.
The eyeGENE™ Database System
  • a repository of genotype and phenotype information of patients with eye diseases,
  • linked to a repository of DNA samples of patients with inherited eye diseases,
  • originally designed as a stand-alone application,
  • but now moving towards a system that can be ‘integrated’ with a variety of other health care IT systems.
Core medical information in eyeGENE™
  • Patient profile information:
    – DOB, contact info, race, sex, etc.
  • Family information:
    – presence of ‘the same’ disease in family members
  • Phenotype information:
    – one or more diagnoses (currently there are 21 potential diagnoses)
    – clinical findings data obtained through structured questions for each diagnosis.
  • Genetic test results:
    – result rows organized by gene (with unique GI#), exons screened, and lab procedures
    – for each gene, exons screened, and lab procedures, results are registered as either ‘negative’, ‘mutation’ or ‘variant’
    – for mutation or variant, results consist of exon, DNA changes, protein changes, and genotype.
Enhancements to core system capabilities
Objectives of our study
  1. to understand the type of view embedded in the eyeGENE™ database and,
  2. in case this view would differ from the realist one, to propose a migration path towards the latter.
Materials & methods
  • We studied
    – the available documentation about eyeGENE™’s core medical information, including parts of its information model and user interfaces,
    – some of the clinical questions (and corresponding possible-answer sets) that are put to eyeGENE™ users when they enter data in the system,
    – system-generated reports about lab procedures performed on genes.
  • We did not have access to a data dictionary with data definitions and corresponding business rules
    – and thus had to do some guessing about the exact meaning of data fields.
  • We checked
    – for design choices in the system that would cause the information collected not to match the corresponding structure of reality;
    – for structural and functional issues in eyeGENE™ that, in the absence of sufficient background information for disambiguation, would lead to difficulties in interpreting data once entered.
Realist framework used
  • Levels of reality:
    – L1 entities: such as specific patients, their relatives, the disorders they are suffering from, the lab tests that have been conducted, and so forth;
    – L2 entities: interpretations and opinions on the side of clinicians, including hypotheses and diagnoses;
      • thus being about entities in first-order reality, although not accessible to third parties without additional L3 references;
    – L3 entities: information elements about L1 or L2 entities, examples being entries in information systems such as the eyeGENE™ database.
  • The (type of) relationships that obtain between entities in each of these levels and across levels.
Results
  • the pragmatic design approach initially followed by eyeGENE™ exhibits several limitations:
    (1) conflating the three levels of reality as described above,
    (2) not representing faithfully the relevant portions of reality at each level,
    (3) forcing ‘data’ to be entered while there is nothing the data can be data about.
  some examples …
‘Required fields’
  • User must provide data for such fields, but what is the relation with reality?
    – country: each person for sure lives in some country
    – postal code, state:
      • not all countries use postal codes nor consist of states
    – phone number:
      • not everybody has a phone, or at the time of data entry the number might not be known.
  • No other option than entering fake data.
Reductionism (1)
  • Forced selection from incomplete list
    – 22 ocular disease types
      • aren’t there any more? are the others not of interest? Just now or never? …
  • Forced structure of data types
    – Belgian phone numbers are not structured the way US phone numbers are structured
eyeGENE™ core medical data schema
  [Figure: schema diagram; labels as extracted: Patient Clinical Encounter Patient Clinical Finding Patient Diagnosis Clinical Finding Diagnosis Finding Link Clinical Finding Unit Link Units Specimen Lab Result]
Reductionism (2)
  • Where are the disorders?
    – diagnoses are in the heads of e.g. physicians
    – disorders are in the body of the patient
  • L1-L3 confusions
  • The way clinical findings are linked to diagnoses does not allow one to study how findings are related to disorders.
Some recommendations (1)
  • For each table, data field and associated allowed values, and each hard- or soft-coded business rule that restricts data input:
    1. assess what (type of) entity in reality would be denoted by any data instance,
       – includes any ‘value’ from ‘value sets’, external terminologies, etc.
    2. represent how these entities in reality relate to each other as well as to other ontologically relevant entities that are not explicitly addressed in the information model,
       – the domain model proper, based on realism-based ontologies
    3. describe formally how the information model has to be interpreted in terms of the domain model.
       – ‘interpretation model’
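A minimal sketch of what an ‘interpretation model’ might look like in code is given below. The field names and descriptions are hypothetical and do not reproduce eyeGENE™’s actual schema; the point is only that every data field is paired with a statement of which level of reality its values are about, and that fields lacking such a statement can be listed mechanically.

```python
# Hypothetical interpretation model: data field -> (level, what a value denotes).
INTERPRETATION_MODEL = {
    "patient.date_of_birth": ("L1", "the date on which the patient was born"),
    "patient_diagnosis.diagnosis": ("L2", "a clinician's diagnosis concerning a disorder"),
}


def uninterpreted_fields(schema_fields, interpretation_model):
    """Fields of the information model that have no stated interpretation yet."""
    return sorted(f for f in schema_fields if f not in interpretation_model)


def fields_at_level(interpretation_model, level):
    """Fields whose values are about entities at the given level of reality."""
    return sorted(f for f, (lvl, _) in interpretation_model.items() if lvl == level)
```

Running `uninterpreted_fields` over a full schema gives a worklist for step 1 of the recommendations: each returned field still needs an answer to the question of what, in reality, its values denote.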
Some recommendations (2)
  • The (relevant parts of the) interpretation model should be part of any information exchange.
  • Change user interfaces and information model only when no ‘realist interpretation’ is possible or faithful data entry cannot be achieved.
    – certain fields should not be ‘required’,
    – formatting, e.g. phone numbers, is acceptable in a user interface when it satisfies local situations (not ‘requirements’), but not for exchange,
    – ‘unknown’ and ‘null values’ are acceptable, if suitable interpretations are provided in the interpretation model, not just as text in data dictionaries.
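The recommendation on ‘unknown’ and ‘null values’ can be sketched as a field type that holds either a value or an explicit, interpretable reason for its absence, so that nobody is forced to enter fake data. The two reasons below are assumptions distilled from the ‘Required fields’ examples (no postal code exists versus the phone number is not known at data entry); a real interpretation model would likely need more.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Absence(Enum):
    # Nothing in reality for the field to be about (e.g. a country without postal codes).
    NO_SUCH_ENTITY = "no such entity exists"
    # The entity may exist, but the person entering data does not know it.
    UNKNOWN_TO_AGENT = "not known to the agent at data entry"


@dataclass(frozen=True)
class FieldValue:
    """Either a value or a stated reason for absence, never both and never neither."""
    value: Optional[str] = None
    absence: Optional[Absence] = None

    def __post_init__(self):
        if (self.value is None) == (self.absence is None):
            raise ValueError("provide exactly one of 'value' or 'absence'")


postal_code = FieldValue(value="1050")
state = FieldValue(absence=Absence.NO_SUCH_ENTITY)
phone = FieldValue(absence=Absence.UNKNOWN_TO_AGENT)
```

Note that `UNKNOWN_TO_AGENT` describes the person entering the data rather than the phone number itself, in line with the earlier remark that ‘not knowing’ is a property of the agent involved.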
Conclusions
  • eyeGENE™ is successfully in use and by now processes over 100 samples per month.
  • the NIH roadmap goal to ‘require new ways to organize how clinical research information is recorded, new standards for clinical research protocols, modern information technology’ has not yet been reached. (Does any system reach it?)
  • making eyeGENE™ ‘reality-aware’ is feasible.
  • the hope that at some future time relevant phenotypic data can be automatically extracted from an electronic medical record will remain a dream as long as these systems do not change.