10ad9d2b9b63c23525c1a03fce22c381.ppt
- Number of slides: 20
Direction of Proposals for New Edition (E3) of ISO/IEC 11179
XMDR Working Group
Presentation to SC 32/WG 2 meeting, September 2005, Toronto, Canada
XMDR Presentation Page
Where have we been? Where are we now? …& where are we planning to go?
[Diagram labels: System manuals; Data dictionaries; 11179 E1; 11179 E2; XMDR Project; 11179 E3; Data engineering/XML; XML & related standards; Terminologies, ontologies, etc.; Semantics services (SSOA); Semantic grids; Data Standards/Data Administration; Semantics management for data; Complex semantics management]
Improvements in Semantic Management Technology
[Diagram: Semantics Management axis with Code sets, 11179 (E1), 11179 (E2), 11179 (E3), …; also shown: 20943 & 19763 & 20944, 24707]
The semantics challenge has evolved
Computer Era: 3rd Generation Languages
Challenge: Automated Data Processing – convert paper data systems to automated systems and improve processing. Coded data to save memory, disk & tape.
• Began to identify data with meaningful names
– Data naming methods were innovative and helpful
• Described data using unstructured text in manuals and/or with comments embedded in software
– Only visible & useful to programmers
• Text/documents were not computerized (remember typewriters, stencils, mimeographs, carbon paper?)
ISO/IEC JTC 1/SC 14 developed standard code sets (valid values)
• Focus: Nomenclature for data only
The semantics challenge has evolved
Computer Era: Early DBMS, 4GL query systems, word processing
Challenge: Manage data – schema integration, eliminate “bit twiddling”
• Document data in data dictionaries, software packages usually linked to a DBMS. Enforce “integrity constraints (e.g., valid values)”. Use “description” field to describe data
• Manage data life cycle
– Standard code sets (valid values) were useful, but difficult to manage; they tended to be left behind by programming changes required to keep up with real world changes.
– Data naming methods failed to achieve interoperation of content between applications and between organizations, but remain useful as human friendly identifiers
SC 14 began to develop methodology for data element standardization
11179 Edition 1, Part 3, written in text, had ~15 attributes for data elements (editor: Netherlands)
• Focus on standards for data elements
The semantics challenge has evolved
Computer Era: DBMS, query systems, word processing
Challenge: Manage data – DBMS schema integration, data quality (continued)
• Began to model data and processes. Modeling standards became useful: ERD, NIAM, UML
• Word processing began to capture text documents
• Keywords, glossaries, thesauri, and taxonomies became “machine readable”, but were treated as documents and were used manually
SC 14 developed methodology for data element standardization
• 11179 (E1) Parts 4 & 5 covered data definitions and names.
• Part 6 covered registration
• Part 2 WD suggested development of a global taxonomy, then changed to specify classification attributes (term, definition & identifier) in Part 2 (E1)
• All parts of 11179 were written in text.
• Focus on managing data elements and classification of data elements
The semantics challenge has evolved
Computer Era: Maturing Relational DBMS, Metadata Registries, XML, early WWW
Challenge: Manage metadata, use terminology for data integration, data interoperability, data provenance, XML schema integration
SC 14 -> SC 32/WG 2. Developed 11179 Edition 2:
• Broadened from data elements to management of all “administered items”.
• 11179 theme became “metadata registries”
• Part 3 was expressed as a metamodel.
• Part 3 included a “classification scheme region” (nodes & relationships) to improve semantics management
– Link terms in definitions and valid values to terms and definitions in vocabularies and terminologies
– Align concepts used in data with concepts used in text
– Use computers to create and manage terminologies, thesauri, taxonomies
• Part 2 (E2) restated the classification scheme region attributes from Part 3. (All Part 2 (E1) attributes were included in Part 3 (E2).)
• Focus on semantics for data and text
The semantics challenge has evolved
Computer Era: WWW, Concept systems, XMDR
Challenge: Semantics management & semantics services
SC 32/WG 2 is developing 11179 (E3). Proposals are made to extend semantics management and semantics services for MDR
ISO/IEC 11179 MDR Standard Goals
• Used to record and link:
– Data elements
– Data element concepts
– Conceptual Domains
– Value Domains: e.g., enumerated value domains
– Classification Schemes
– …
• Goal:
– To record the unambiguous meaning of data
• Human understandable semantics: Current paradigm is natural language definitions
• For E3: Machine “understandable”: Formal definitions (and axioms). Machine “understandable” in the sense that a computer can make use of concept systems for processing
2005-07-27
Advanced 11179 E3 Use Scenario
A user is concerned about a specific type of cancer
• Wants to discover any documents on the web (reliable and unreliable sources) about the disease, causes, treatment, victims, and researchers
• Wants to link concepts and individuals found in text to metadata and data in databases (where metadata/data relate to the concepts/individuals)
• Wants to find relevant information where the terms used for the concepts vary by region, discipline, scientific nomenclature, vernacular usage, language, and names of individuals.
• Wants to find information that is related through generalization, specialization, and other relationships.
• Note: No assumption of federation or central control over data and text generation. However, well managed concept systems and metadata (e.g., data definitions) help.
Finding Hidden Information
[Diagram: Waterfowl, Goose, Duck]
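The idea behind this slide, finding documents that never mention the query term but do mention one of its specializations, can be sketched as a query expansion over broader/narrower links. This is a minimal illustration, not part of 11179: the `NARROWER` table, the `expand` and `search` helpers, and the sample documents are all hypothetical; only the three concept names come from the slide.

```python
# Hypothetical broader -> narrower links; only the concept names come from the slide.
NARROWER = {"waterfowl": ["goose", "duck"]}

# Hypothetical documents used only to exercise the sketch.
DOCS = {
    "d1": "A goose landed on the pond",
    "d2": "The duck population is stable",
    "d3": "Sparrows nest in hedges",
}


def expand(concept, narrower=NARROWER):
    """Return the concept together with all of its transitive specializations."""
    seen = []
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(narrower.get(current, []))
    return seen


def search(query, documents, narrower=NARROWER):
    """Return ids of documents containing the query term or any narrower term."""
    terms = set(expand(query, narrower))
    return [
        doc_id
        for doc_id, text in documents.items()
        if terms & set(text.lower().split())
    ]


hits = search("waterfowl", DOCS)
```

A plain keyword search for "waterfowl" would match none of the sample documents; the expanded search reaches the goose and duck documents because the concept system records them as specializations.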
Concept Use and Integration with 11179 Part 3, Edition 2
• Classification Schemes: caDSRTraining
• Object Class: Chemopreventive Agent
• Conceptual Domain: Agent
• Valid Values: Cyclooxygenase Inhibitor, Doxercalciferol, Eflornithine, …, Ursodiol
• Data Element Concept: Chemopreventive Agent NSC Number
• Value Domain: NSC Code
• Property: NSCNumber
• Representation: Code
• Data Element: Chemopreventive Agent Name
• Context: caCORE
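The registry items on this slide can be sketched as a small data structure. This is an illustrative simplification, not the Part 3 metamodel: the class and field names are assumptions, and so is the pairing of each slide label with the value that follows it. The valid-value list is elided on the slide, so only the values it names are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementConcept:
    """What is being described: an object class plus one of its properties."""
    name: str
    object_class: str
    property_name: str
    conceptual_domain: str


@dataclass(frozen=True)
class ValueDomain:
    """How values are represented, with an enumerated set of valid values."""
    name: str
    representation: str
    valid_values: frozenset

    def permits(self, value: str) -> bool:
        """Return True if the value is one of the enumerated valid values."""
        return value in self.valid_values


@dataclass(frozen=True)
class DataElement:
    """A data element: a concept paired with a value domain, in a context."""
    name: str
    context: str
    classification_scheme: str
    concept: DataElementConcept
    value_domain: ValueDomain


element = DataElement(
    name="Chemopreventive Agent Name",
    context="caCORE",
    classification_scheme="caDSRTraining",
    concept=DataElementConcept(
        name="Chemopreventive Agent NSC Number",
        object_class="Chemopreventive Agent",
        property_name="NSCNumber",
        conceptual_domain="Agent",
    ),
    value_domain=ValueDomain(
        name="NSC Code",
        representation="Code",
        # The slide elides part of this list; only the named values appear here.
        valid_values=frozenset(
            {"Cyclooxygenase Inhibitor", "Doxercalciferol", "Eflornithine", "Ursodiol"}
        ),
    ),
)
```

Separating the concept from the value domain is what lets the same concept be recorded with different representations, which is the linkage the slide illustrates.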
Semantic Management Extensions
Goals for Edition 3
• Sharable data that can easily be identified, shared, integrated, and made interoperable across information systems and organizations (a continuing challenge)
– Unambiguous metadata characteristics to register semantic, syntactic and lexical information about data and text
• Human AND machine “understandable”
• Maintain backward compatibility with 11179 (E2) implementations.
• Registration and management of any semantic information useful for administering and managing the content of data and text
Semantic Management Extensions
Goals for Edition 3
• Specify a disciplined way to manage linkage of concept systems (KOS) to administered items.
• Improve the linkage of concept systems to data and text
• Enable users to find correspondences between concepts in text and in data, where these are found in dispersed documents and databases. Concepts may be given linguistic expression with terms that vary by synonymy, discipline, region, language, etc.
• Registration of semantics to facilitate concept (and data) mapping, inference, aggregation
• Manage metadata not only for DBMS & XML schemas, but also for knowledge bases, concept systems, …
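The linkage described in these goals, where varying terms resolve to one concept and that concept is linked to items in both text and data, can be sketched as a two-step lookup. This is a hedged illustration only: the `ConceptIndex` class, the concept identifier `C1`, the terms, and the linked item names are all hypothetical and do not come from 11179 or any registry.

```python
from collections import defaultdict


class ConceptIndex:
    """Maps varying terms to one concept, and concepts to linked items."""

    def __init__(self):
        self._concept_by_term = {}
        self._items_by_concept = defaultdict(set)

    def add_term(self, term, concept_id):
        """Register a term (synonym, regional or foreign-language form) for a concept."""
        self._concept_by_term[term.casefold()] = concept_id

    def link(self, concept_id, item):
        """Link a concept to an administered item or a document."""
        self._items_by_concept[concept_id].add(item)

    def find(self, term):
        """Return every item linked to the concept that the term expresses."""
        concept_id = self._concept_by_term.get(term.casefold())
        if concept_id is None:
            return set()
        return set(self._items_by_concept.get(concept_id, set()))


# Hypothetical registrations used only to exercise the sketch.
index = ConceptIndex()
index.add_term("cancer", "C1")
index.add_term("malignant neoplasm", "C1")
index.add_term("Krebs", "C1")  # German term for the same concept
index.link("C1", "database column: diagnosis_code")
index.link("C1", "document: clinic-report.txt")

matches = index.find("krebs")
```

Because the lookup goes through the concept rather than the term, a query in one vocabulary or language reaches the same database column and document as a query in another.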
Semantic Management Extensions
Goals for Edition 3 (Continued)
• Manage both data life cycle and ontology life cycle
• Help to harmonize ontologies
• Manage metamodels, reference ontologies & local ontologies
• Restate Part 3 as an ontology and in Common Logic to enable use in semantics technologies (Semantic Web, inference engines, reasoners, …).
• Restate Part 3 using MOF
• 11179 registries provide support for ISO/IEC 19763.
• Specify semantics services for a semantics service oriented architecture. Enabler for semantic computing, semantic agents, semantic grids.
– Semantic services needed for semantic web and semantic grids to become part of ISO/IEC 20944.
XMDR Intentions
• We want to try to capture existing thesauri, terminologies, and ontologies as sources for the semantic specification of data elements to be used in databases, XML documents, messages, etc.
• We want to incorporate more formal semantic specifications (e.g., ontologies, formal statements (axioms, sentences, ...)) to permit more precise semantic specifications than natural language definitions allow.
• We want to incorporate formal semantic specifications to facilitate machine processing of semantic specifications, e.g., by inference engines, agents, etc. Such machine processing of semantic specifications can be used in support of federated database access, web service identification and coordination, agent-based computations, etc.
• We want to provide a framework for the registration, harmonization, evolution and standardization of ontologies.
Conceptual vs. Information Centric Metadata Standards
[Diagram: Information Artifacts level: Metadata; OMG Standards: MOF, CWM, UML. Connections: ? ? ? Conceptual Level: Ontology Standards: OWL, KIF, CL, ...; Terminology Standards]
Space of Metadata Standards
ISO/IEC 11179 connects both conceptual models and information artifacts.
[Diagram: About information artifacts (data elements, schemas, UML models, ...): OMG Standards: MOF, UML, CWM. Connecting the two: MMF & ISO/IEC 11179 Edition 3; Metadata Registry Standards. Conceptual models of the “real world”: Terminology Standards; Ontology Standards: OWL, KIF, CL, XTM, ...]
ISO/IEC 11179 Metadata Registry Standard
• Connects both:
– Conceptual models of the real world:
• Concepts, data element concepts, classification schemes
• Terminologies, taxonomies, ontologies
– Information Artifacts
• Data elements, enumerated values, ...
• UML models (e.g., in caDSR)
“SEMANTICS” US


