
- Number of slides: 36
INLS 520 Information Organization INLS 520 Erik Mitchell
Review • Controlled vocabularies – Term Lists, Hierarchies, Trees, Paradigms, Facets, Folksonomies • Knowledge organization systems – Term Lists, Thesauri, Taxonomies, Ontologies
Today • Protégé tutorial – Create a thesaurus – Create an ontology • Ontologies – Basic concept – Building in Protégé – RDF (?) – OWL (?)
Assignment 1 recap • Required XML tags – <?xml ... ?> • Required DC elements – None, need a content wrapper
CV Concepts & definitions • Controlled Vocabularies – Organized Lists – Relationships between concepts • Knowledge organization systems – Typed relationships – Direct / inferable knowledge
Thesauri Definitions – “Guide to use of terms, showing relationships between them, for the purpose of providing standardized, controlled vocabulary for information storage and retrieval” (Monash) – “A list of words showing similarities, differences, dependencies, and other relationships to each other” (USG)
Thesauri Concepts • Preferred terms • Non-preferred terms • Semantic relations between terms • How to apply terms (guidelines, rules) • Scope notes • Adding terms (how to produce terms that are not listed explicitly in the thesaurus)
Common thesaural identifiers • SN Scope Note – Instruction, e.g. don’t invert phrases • USE Use (another term in preference to this one) • UF Used For • BT Broader Term • NT Narrower Term • RT Related Term
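The identifiers above map naturally onto one small record per term. The following is a minimal Python sketch, not a standard thesaurus format: the `Term` class, the `resolve` helper, and the four vehicle-related terms are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Term:
    """One thesaurus entry; each field mirrors a thesaural identifier."""
    label: str
    scope_note: str = ""                                  # SN
    use: Optional[str] = None                             # USE (preferred term)
    used_for: List[str] = field(default_factory=list)     # UF
    broader: List[str] = field(default_factory=list)      # BT
    narrower: List[str] = field(default_factory=list)     # NT
    related: List[str] = field(default_factory=list)      # RT


# Hypothetical entries, invented for this example.
thesaurus = {
    "Vehicles": Term("Vehicles", narrower=["Automobiles", "Motorcycles"]),
    "Automobiles": Term(
        "Automobiles",
        scope_note="Use the plural form; don't invert phrases.",
        used_for=["Cars"],
        broader=["Vehicles"],
        related=["Motorcycles"],
    ),
    "Motorcycles": Term("Motorcycles", broader=["Vehicles"], related=["Automobiles"]),
    "Cars": Term("Cars", use="Automobiles"),  # non-preferred term
}


def resolve(label: str) -> str:
    """Follow USE references until a preferred term is reached.

    Assumes the USE references contain no cycles.
    """
    term = thesaurus[label]
    while term.use is not None:
        term = thesaurus[term.use]
    return term.label


print(resolve("Cars"))
```

Note that USE and UF are inverses of each other (as are BT and NT), so a real tool would normally store one direction and derive the other.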
Thesauri Guides • National Information Standards Organization. (2005). Guidelines for the construction, format, and management of monolingual thesauri. ANSI/NISO Z39.19-2005. Bethesda, MD: NISO Press. – http://www.niso.org/standards/resources/Z39-19-2005.pdf?CFID=5559601&CFTOKEN=31747314 • Aitchison, Jean & Gilchrist, Alan. Thesaurus Construction: A Practical Guide. 3rd ed. London: Aslib, 1997. • Willpower Information Management Consultants – http://www.willpower.demon.co.uk/thesprin.htm
Ontology Definitions • “The study of being or existence” • “A specification of a conceptualization” (Gruber) • “An ontology formally defines a common set of terms that are used to describe and represent a domain.” (OWL)
Webster’s Dictionary • Webster’s Third New International Dictionary defines ontology as: 1. A science or study of being, specifically a branch of metaphysics* relating to the nature and relations of being. 2. A theory concerning the kinds of entities and specifically the kinds of abstract entities that are to be admitted to a language system. *Metaphysics: Nature of being “or” existence.
Ontology Concepts • Classes – Names of objects in the domain • Relationships between classes – Connections between classes • Properties of classes – Background or identifying knowledge of these objects • Constraints on these properties & relationships – Limits and parameters of the relationships
Class exercise • Protégé overview – Orientation – Object types (Classes, Slots, Instances) – Relationships (hierarchies, associative) • As a group, we will work through the Protégé training guide – http://protege.stanford.edu/doc/tutorial/get_started/get-started.pdf
What is the semantic web? • URI (Uniform Resource Identifier) • OWL/RDFS • All built on top of the regular web • RDF is the underlying language of the semantic web – XML represents data (document based) – RDF represents pure information (anyone can use, re-harvestable); you could call this knowledge • Examples – Swoogle – Goog411
Ontologies (review) • “A common set of terms that are used to describe and represent a domain” – Classes, Relationships, Properties, Constraints • A formal organization of knowledge – The primary role of an ontology is to define a language which people and computers in a given domain can share
A good ontology has • Features: – Meaningful – all classes have instances – Accurate / correct – Non-redundant – each class/instance is represented in a single way – Rich in description – context, content • Enabled functionality: – Able to use queries to connect new pieces of information – Use XML & definitions to integrate knowledge across domains
Ontology Continuum • Keyword Lists • Basic Thesauri • Complex Thesauri • Taxonomies • Simple Ontologies (WordNet) • Complex Ontologies (OWL)
SHOE Ontology project • Possible to build an ontology for anything – Simple HTML Ontology Extensions (SHOE) Project • http://www.cs.umd.edu/projects/plus/SHOE/html-pages.html • Sample projects – Beer Ontology • http://www.cs.umd.edu/projects/plus/SHOE/onts/index.html#beer – Document Ontology • http://www.cs.umd.edu/projects/plus/SHOE/onts/docmnt1.0.html
Ontology Concepts • Multiple inheritance • Vertical and horizontal relationships • Decomposed subject/object • Predicate-based description (isRelatedto, hasVersion) • First Order Predicate Logic – Statements broken down into subjects/predicates • Proposition – All men are mortal, Socrates is a man • Therefore – Socrates is mortal
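The Socrates syllogism can be worked through mechanically once the statements are decomposed into subject/predicate/object form. The sketch below is a deliberately simplified stand-in for first-order predicate logic: it supports only rules of the form "every X is a Y", and the `is_a` predicate name and `infer` function are assumptions made for this example.

```python
# Facts are (subject, predicate, object) statements.
facts = {("Socrates", "is_a", "man")}

# "All men are mortal", stored as (premise_class, conclusion_class).
rules = [("man", "mortal")]


def infer(facts, rules):
    """Forward chaining: apply every rule until no new fact appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            # Iterate over a snapshot because `known` grows inside the loop.
            for subject, predicate, obj in list(known):
                new_fact = (subject, "is_a", conclusion)
                if predicate == "is_a" and obj == premise and new_fact not in known:
                    known.add(new_fact)
                    changed = True
    return known


derived = infer(facts, rules)
print(("Socrates", "is_a", "mortal") in derived)  # True
```

The conclusion "Socrates is mortal" is never stored directly; it is inferred from the proposition, which is the kind of inferable knowledge an ontology is meant to support.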
Creating a CV review • Design methods – Re-use existing, start with content & desired use ideas – Committee / community approach • Top-down – Concept driven • Bottom-up – Document driven – Empirical approach • Deductive approach – Select terms, create relationships, perform term control • Inductive approach – Establish CV at outset, build hierarchies on an as-needed basis
Creating a CV review (2) • Top-Down – Identify audience – Identify all topics, concepts, uses, and context of the domain – Sort topics identified into an appropriate organization scheme (enumerative, hierarchical, faceted) – Solidify structure and clean up gaps & redundancies – Assign documents to categories, test retrieval • Bottom-up – Identify audience – Survey documents for topics/concepts – Build system on the fly – let content drive structure and limits of system – Identify gaps & redundancies in system – Test retrieval
Creating a CV review (3) • Think about scope, use, content, maintenance • Gather Terms – Based on existing systems, content – Based on user needs/expectations – Investigate issues of specificity, exhaustivity, granularity • Build hierarchies, relationships – Broader/narrower terms, Related terms, Use/Used for, see/see also • Establish Rules • Implement • Evaluate • Maintain • http://www.boxesandarrows.com/view/creating_a_controlled_vocabulary
Creating an Ontology • Determine Scope of field, define boundaries • Check for existing ontologies, vocabularies • Select a top-down/bottom-up approach – Identify concepts, vocabulary, parameters, constraints • Identify relationships – Multiple hierarchies, inheritance • Build, test, maintain
Class exercise • Design your own ontology – In groups, pick a domain of knowledge • Type of food (pizza, soup, beer), field of study (library science, math), etc. • Come up with a basic ontological framework and begin creating it in Protégé • Be prepared to share a brief overview with the class, which will include – Domain area – Top-level classes – Instance definitions – Relationships
Assignment 2 • Overview – In this assignment you will create an ontology on a topic of your choice. Your ontology should contain multiple classes and instances and be focused on a specific purpose. This assignment includes an implementation of the ontology in Protégé and a brief paper explaining your ontology. • Guidelines – Select a topic of interest and determine the top level (e.g. Basketball, Chocolate, etc.). – Define the scope (depth/breadth) and purpose of the ontology. Define specific classes and facets (known as slots in Protégé) that describe those classes. Your ontology should have between 5-10 classes with multiple (2-5) slots for each class. Think about the use of hierarchy and multiple inheritance in your ontology. – Summarize your ontology in a short paper (no more than two pages). Outline your ontology and discuss your rationale and key decisions (e.g. scope, purpose, classes and slots, defining relationships). – Implement the ontology in Protégé. Define your classes and instances. Create two queries that illustrate ways in which the data could be retrieved. • Dates & groupwork – Due November 6th – Groupwork is acceptable
RDF • Subject, property, object triples • Transmitted in XML • RDFS extends RDF with an ontology language – Properties, specialization • OWL – More powerful extension of RDFS – Uses the same syntax as RDF
RDF Model • Subject (Resource): Webpage http://www.stuff.com • Predicate (Property type): Author • Object (Value): “Saki Knafo” • “The author of the stuff webpage is Saki Knafo” – A literal, a triple, a statement
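The subject/predicate/object model above can be sketched as a plain three-field record. This is an illustration in Python rather than an RDF library: the `Triple` type and the `describe` and `objects` helpers are hypothetical, the second statement (a `title` of "Stuff") is invented, and real RDF would identify the predicate with a URI instead of the bare word `author`.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str    # the resource being described
    predicate: str  # the property type
    obj: str        # the value


# The statement from the slide.
statement = Triple("http://www.stuff.com", "author", "Saki Knafo")

# A description is just a collection of triples about resources.
# The second triple is invented for illustration.
graph = [statement, Triple("http://www.stuff.com", "title", "Stuff")]


def describe(t: Triple) -> str:
    """Read a triple back as an English sentence."""
    return f"The {t.predicate} of {t.subject} is {t.obj}"


def objects(graph: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every value recorded for a subject/predicate pair."""
    return [t.obj for t in graph if t.subject == subject and t.predicate == predicate]


print(describe(statement))
print(objects(graph, "http://www.stuff.com", "author"))
```

Because every statement has the same three-part shape, new facts about a resource can be added without changing any schema, which is what makes this kind of description easy to process automatically.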
How is RDF different? • RDF is a descriptive model that – Allows variable contextualized description – Deconstructs the descriptive process – Allows more granular automated processing of data – Uses exact markup to indicate the context of values (namespaces, schemas)
Encoding RDF in XML
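As an illustration of what an RDF/XML encoding looks like, here is a minimal sketch that serializes a single author statement with Python's standard `xml.etree.ElementTree`. Using Dublin Core's `dc:creator` as the author property is an assumption made for this example, and `to_rdf_xml` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"  # assumed vocabulary for "author"

# Ask ElementTree to emit readable prefixes instead of ns0, ns1, ...
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def to_rdf_xml(subject: str, creator: str) -> str:
    """Encode the triple (subject, dc:creator, creator) as RDF/XML."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    # rdf:Description names the subject through its rdf:about attribute.
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject}
    )
    # The child element is the predicate; its text is the object (a literal).
    creator_el = ET.SubElement(description, f"{{{DC_NS}}}creator")
    creator_el.text = creator
    return ET.tostring(root, encoding="unicode")


xml_text = to_rdf_xml("http://www.stuff.com", "Saki Knafo")
print(xml_text)
```

The namespaces carry the context of each value: the `rdf` prefix marks the structural elements and the `dc` prefix says which vocabulary defines the property.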
Iterative RDF description
RDFS • RDF Schema – Defines additional RDF elements that help type relationships • Special Classes – Based on RDF Classes / Properties / Attributes with additional • http://www.w3schools.com/rdf_reference.asp • Allows the creation of vocabularies / ontologies
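One way RDFS helps type relationships is through `rdfs:subClassOf`, which lets a program infer class membership that was never stated directly. The sketch below is a simplified illustration in plain Python, not a full RDFS reasoner: the beer class hierarchy, the instance name, and the helper functions are hypothetical, and only `rdfs:subClassOf` and `rdf:type` are handled.

```python
# Triples using two real vocabulary terms: rdfs:subClassOf and rdf:type.
# The classes and the instance are invented for illustration.
triples = {
    ("Lager", "rdfs:subClassOf", "Beer"),
    ("Beer", "rdfs:subClassOf", "Beverage"),
    ("PilsnerUrquell", "rdf:type", "Lager"),
}


def superclasses(cls, triples):
    """Collect every class reachable from `cls` via rdfs:subClassOf."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                stack.append(o)
    return found


def types_of(instance, triples):
    """Return the stated types of an instance plus all inherited ones."""
    direct = {o for s, p, o in triples if s == instance and p == "rdf:type"}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return inferred


print(sorted(types_of("PilsnerUrquell", triples)))
```

Only the membership in `Lager` is stated; membership in `Beer` and `Beverage` follows from the class hierarchy.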
OWL (Web Ontology Language) • An ontology language geared towards representing information on the web – Classes, properties, and relationships that describe URIs and their facets • Based on the triple concept – Subject, Predicate, Object – 3 versions: OWL-Lite, OWL-DL, OWL-Full • Formatted in RDF/XML – Uses RDF and RDFS as a foundation – Adds new elements in the owl namespace
OWL Versions • OWL-Lite – Simple hierarchies, constraints • OWL-DL – Uses description logics • Logic-based semantic markup based on first-order predicate logic – Still guarantees finite relationship processing – Best suited for automation • OWL-Full – Most complex – Open ended, possible to get into infinite processing
More OWL Examples • Airport • Pizza
Next Week(s) • Fall Break – Enjoy • 10/30 – Guest speaker Lorrie Eakin • 11/6 – First Group presentations