- Number of slides: 15
Vocabulary Workshop, RAL, February 25, 2009
NERC DataGrid
Controlled Vocabularies: What, Why, How?
Metadata
Ø Love it or hate it, without metadata automated data handling isn't possible
Ø For automated data handling to be possible across distributed data sources, metadata standards are required
Ø Standardised metadata comprises fields that represent real-world entities such as location, time, phenomena, etc.
Metadata
Ø These fields need to be populated
Ø Plaintext may be used. This makes population easy, but the result is next to useless for automated handling.
Ø Some real examples:
  · A wide variety of chemical and biological parameters
  · Amplitude de l'echo retrodiffuse
  · Cu, Zn, Fe, Pb, Cd, Cr, Ni in biota
  · MACR0-MEIOFAUNA, SED BIOCHEMISTRY, ZOOPLANKTON, CILIATES, BACT CELLS, BACT BIOMASS, LEUCINE UPT, PRIM. PROD, METABOL, COCCOLITH
Ø Plaintext should be confined to abstracts
Controlled Vocabularies
Ø Much better to use concepts labelled using universally agreed terms that have universally agreed meanings
Ø A collection of concepts designed to populate a given metadata field may be called a controlled vocabulary
Ø Controlled vocabularies
  · Ensure consistent spellings
  · Ensure consistent syntax
Ø Well-managed controlled vocabularies
  · Prevent metadata misunderstandings
  · Maintain a static relationship between metadata fields and the real world
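As a rough illustration of populating a metadata field from a controlled vocabulary rather than from plaintext, here is a minimal Python sketch. The key TEMPS901 and its label are taken from the SKOS example later in the deck; the second entry, the dictionary layout and the function names are assumptions made for this sketch only.

```python
# Hypothetical controlled vocabulary: concept key -> agreed label.
# TEMPS901 and its label come from the SKOS example later in the deck;
# "EXAMPLE01" is invented purely for illustration.
VOCABULARY = {
    "TEMPS901": "Temperature (ITS-90) of the water column by CTD or STD",
    "EXAMPLE01": "Hypothetical salinity parameter",
}


def populate_field(value: str) -> str:
    """Return the normalised concept key, or raise if it is not in the vocabulary."""
    key = value.strip().upper()
    if key not in VOCABULARY:
        raise ValueError(f"{value!r} is not a term in the controlled vocabulary")
    return key


def label_for(key: str) -> str:
    """Look up the agreed label for a concept key."""
    return VOCABULARY[populate_field(key)]
```

Free text such as "A wide variety of chemical and biological parameters" is rejected, while a known key resolves to one agreed spelling and meaning.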
Thesauri
Ø Concepts within a controlled vocabulary may be semantically connected using simple relationships:
  · Blue broader colour
  · Colour narrower blue
  · Colour related pigmentation
Ø Concepts from different controlled vocabularies describing the same type of thing may be semantically connected using simple mapping relationships:
  · Bacillariophyceae exactMatch diatoms
  · IPTS-68 temperature closeMatch ITS-90 temperature
  · Nutrients in rivers relatedMatch nitrate in water bodies
  · Salinity broadMatch physical oceanography
  · Physical oceanography narrowMatch salinity
Ø The results may be termed thesauri
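The broader/narrower pairing can be sketched in a few lines of Python. This is only an illustration: the "navy" concept is invented, and following broader links all the way up (treating the relationship as transitive) is an assumption of the sketch rather than something the slides state.

```python
# Each link is stored once as narrower_concept -> broader_concept.
BROADER = {
    "blue": "colour",  # from the slide: blue broader colour
    "navy": "blue",    # hypothetical extra concept, added for illustration
}


def narrower(concept: str) -> set[str]:
    """Derive the inverse 'narrower' relationship from the stored 'broader' links."""
    return {child for child, parent in BROADER.items() if parent == concept}


def ancestors(concept: str) -> list[str]:
    """Walk 'broader' links upwards, returning every broader concept in order."""
    chain = []
    while concept in BROADER:
        concept = BROADER[concept]
        chain.append(concept)
    return chain
```

Storing only one direction and deriving the other keeps the two relationships from drifting out of step.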
Ontologies
Ø But what if the controlled vocabularies describe different types of thing?
Ø We can relate them by increasing the semantic richness of the relationships
Ø For example:
  · We could have a controlled vocabulary of instruments
  · We could also have a controlled vocabulary of parameters
Ontologies
Ø We can link these up using relationships such as:
  · Thermosalinograph measures salinity
  · Fluorometer measures chlorophyll
  · Air temperature measuredBy psychrometer
Ø The result may be termed an ontology
Ontologies
Ø Ontology relationships are:
  · Semantically rich
  · Potentially abundant
Ø Software agents need some understanding of the relationships to exploit the knowledge encoded in the ontology
Ø This is achieved through rules: relationships that describe relationships
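A minimal Python sketch of a rule as a "relationship describing a relationship". The three facts are the ones given on the previous slide; the rule name "inverseOf" and the infer function are assumptions made for this sketch, not part of the deck.

```python
# Facts from the slides, stored as (subject, predicate, object) triples.
FACTS = {
    ("thermosalinograph", "measures", "salinity"),
    ("fluorometer", "measures", "chlorophyll"),
    ("air temperature", "measuredBy", "psychrometer"),
}

# A rule is itself a triple whose subject and object are relationships.
# "inverseOf" is an assumed rule name for this sketch.
RULES = {("measures", "inverseOf", "measuredBy")}


def infer(facts, rules):
    """Return the facts plus every triple implied by the inverseOf rules."""
    inferred = set(facts)
    for p, rule, q in rules:
        if rule != "inverseOf":
            continue  # this sketch understands only one kind of rule
        for s, pred, o in facts:
            if pred == p:
                inferred.add((o, q, s))
            elif pred == q:
                inferred.add((o, p, s))
    return inferred
```

With the rule in place, a software agent asked what measures air temperature can answer "psychrometer" even though that fact was only recorded in the measuredBy direction.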
Knowledge Representation
Ø Relationships between concepts may be expressed using Resource Description Framework (RDF)
Ø W3C standard with an XML encoding, having 'triples' as its basic building block
Ø Each triple has a subject, a predicate and an object. For example:
  · Colour related pigmentation
  · Thermosalinograph measures salinity
Ø Familiar?
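The subject/predicate/object structure can be sketched as a small Python type with a wildcard query. This is a simplification under stated assumptions: real RDF identifies subjects and predicates with URIs, whereas this sketch uses the plain strings from the slide, and the Triple class and match function are names invented here.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


# The two example triples from the slide.
TRIPLES = [
    Triple("colour", "related", "pigmentation"),
    Triple("thermosalinograph", "measures", "salinity"),
]


def match(triples, subject: Optional[str] = None,
          predicate: Optional[str] = None, obj: Optional[str] = None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    ]
```

For instance, match(TRIPLES, predicate="measures") selects only the thermosalinograph triple.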
Knowledge Representation
Ø Controlled vocabularies (concept collections) and thesauri may be represented using the Simple Knowledge Organization System (SKOS)
Ø W3C standard XML schema based on RDF
Ø Jointly developed by STFC and Manchester University Computer Science
Ø 2008 version is the one to use
Knowledge Representation
<?xml version="1.0" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <skos:Concept rdf:about="http://vocab.ndg.nerc.ac.uk/term/P011/116/TEMPS901">
    <skos:externalID>SDN:P011:116:TEMPS901</skos:externalID>
    <skos:prefLabel>Temperature (ITS-90) of the water column by CTD or STD</skos:prefLabel>
    <skos:altLabel>CTDTmp90</skos:altLabel>
    <skos:definition>Unavailable</skos:definition>
    <dc:date>2009-02-09T10:45:32.262+0000</dc:date>
    <skos:broadMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P021/37/TEMP" />
  </skos:Concept>
</rdf:RDF>
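A SKOS document like the one on this slide can be read with Python's standard-library XML parser. The sketch below embeds an abridged copy of the slide's document; the function name read_concepts and the shape of the returned dictionaries are assumptions of the sketch. A dedicated RDF library would be the more usual choice for real work.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}

# Abridged copy of the document shown on the slide.
SKOS_XML = """<?xml version="1.0" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://vocab.ndg.nerc.ac.uk/term/P011/116/TEMPS901">
    <skos:prefLabel>Temperature (ITS-90) of the water column by CTD or STD</skos:prefLabel>
    <skos:altLabel>CTDTmp90</skos:altLabel>
    <skos:broadMatch rdf:resource="http://vocab.ndg.nerc.ac.uk/term/P021/37/TEMP" />
  </skos:Concept>
</rdf:RDF>"""


def read_concepts(xml_text: str) -> list[dict]:
    """Extract URI, labels and broadMatch targets from each skos:Concept."""
    root = ET.fromstring(xml_text)
    concepts = []
    for concept in root.findall("skos:Concept", NS):
        concepts.append({
            # Attributes are addressed with the {namespace}name form.
            "uri": concept.get(f"{{{NS['rdf']}}}about"),
            "prefLabel": concept.findtext("skos:prefLabel", namespaces=NS),
            "altLabel": concept.findtext("skos:altLabel", namespaces=NS),
            "broadMatch": [
                m.get(f"{{{NS['rdf']}}}resource")
                for m in concept.findall("skos:broadMatch", NS)
            ],
        })
    return concepts
```

Each skos:broadMatch target is itself a concept URI, so the mapping can be followed from one vocabulary (P011) into another (P021).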
Knowledge Representation
Ø Ontologies may be represented using Web Ontology Language (OWL)
Ø W3C standard XML schema based on RDF
Ø Example OWL document: http://mida.ucc.ie/ont/20080124/theme.owl
Ø Alternative simple text encodings are available, such as Open Biomedical Ontologies (OBO)
Ø OBO used for NERC-related EnvO ontology
Knowledge Management Tools
Ø RDF
  · Tools abound; see for example http://planetrdf.com/guide/
  · Jena is one of the better known
Ø SKOS
  · See the SKOS Tool Shed: http://esw.w3.org/topic/SkosDev/ToolShed
  · Note this includes a Protégé plugin
Knowledge Management Tools
Ø OWL
  · Protégé with appropriate plugin is the most widely used
  · There are commercial alternatives such as TopBraid Composer
  · MMI (http://marinemetadata.org) has developed a vocabulary to OWL converter (voc2OWL)
Ø OBO
  · Text, so text tools work
  · OWL and SKOS converters available
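Because OBO is plain text, a few lines of ordinary string handling are enough to read it. The Python sketch below parses [Term] stanzas made of "tag: value" lines; the fragment, its identifiers and its names are invented for illustration and are not taken from EnvO, and the parser covers only the simplest parts of the format.

```python
# Hypothetical OBO fragment; identifiers and names are invented for illustration.
OBO_TEXT = """\
[Term]
id: EX:0000001
name: water body

[Term]
id: EX:0000002
name: river
is_a: EX:0000001 ! water body
"""


def parse_obo_terms(text: str) -> list[dict]:
    """Collect tag: value pairs from each [Term] stanza."""
    terms, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if line == "[Term]":
            current = {}
            terms.append(current)
        elif line.startswith("["):
            current = None  # some other stanza type; ignored in this sketch
        elif line and current is not None and ":" in line:
            tag, value = line.split(":", 1)
            # Drop the trailing "! comment" that may follow a value.
            value = value.split("!", 1)[0].strip()
            current.setdefault(tag, []).append(value)
    return terms
```

Values are kept as lists because a tag such as is_a may appear more than once in a stanza.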
Knowledge Management Tools
Ø Mapping
  · MMI have developed a mapping tool (VINE) to build maps from two OWL files
Ø Visualisation
  · Concept maps are useful
    * Cmap tools is very good
    * FreeMind (open source)