- Number of slides: 15
Vocabulary Workshop, RAL, February 25, 2009
NERC DataGrid
Controlled Vocabularies: What, Why, How?

Metadata
- Love it or hate it, without metadata automated data handling isn’t possible
- For automated data handling to be possible across distributed data sources, metadata standards are required
- Standardised metadata comprises fields that represent real-world entities such as location, time, phenomena, etc.

Metadata
- These fields need to be populated
- Plaintext may be used. It makes population easy, but it’s next to useless.
- Some real examples:
  - A wide variety of chemical and biological parameters
  - Amplitude de l'echo retrodiffuse
  - Cu, Zn, Fe, Pb, Cd, Cr, Ni in biota
  - MACR0-MEIOFAUNA, SED BIOCHEMISTRY, ZOOPLANKTON, CILIATES, BACT CELLS, BACT BIOMASS, LEUCINE UPT, PRIM. PROD, METABOL, COCCOLITH
- Plaintext should be confined to abstracts

Controlled Vocabularies
- Much better to use concepts labelled using universally agreed terms that have universally agreed meanings
- A collection of concepts designed to populate a given metadata field may be called a controlled vocabulary
- Controlled vocabularies
  - Ensure consistent spellings
  - Ensure consistent syntax
- Well-managed controlled vocabularies
  - Prevent metadata misunderstandings
  - Maintain a static relationship between metadata fields and the real world
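
As a minimal sketch of populating a metadata field from a controlled vocabulary rather than plaintext, the Python below checks a field value against a small concept collection. The identifiers (`P001` and so on) and the function names are made up for illustration and do not come from any NERC vocabulary.

```python
# Hypothetical controlled vocabulary for a "parameter" metadata field.
# Keys are invented concept identifiers; values are the agreed labels.
PARAMETER_VOCAB = {
    "P001": "salinity",
    "P002": "chlorophyll",
    "P003": "air temperature",
}


def validate_field(value, vocabulary):
    """Return True only if value is a concept identifier in the vocabulary."""
    return value in vocabulary


def label_for(value, vocabulary):
    """Look up the agreed label for a concept identifier."""
    if not validate_field(value, vocabulary):
        raise ValueError(f"{value!r} is not in the controlled vocabulary")
    return vocabulary[value]
```

A misspelt free-text entry such as `"Salinty"` fails validation, whereas `"P001"` always resolves to the same agreed label.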

Thesauri
- Concepts within a controlled vocabulary may be semantically connected using simple relationships:
  - Blue broader colour
  - Colour narrower blue
  - Colour related pigmentation
- Concepts from different controlled vocabularies describing the same type of thing may be semantically connected using simple mapping relationships:
  - Bacillariophyceae exactMatch diatoms
  - IPTS 68 temperature closeMatch ITS 90 temperature
  - Nutrients in rivers relatedMatch nitrate in water bodies
  - Salinity broadMatch physical oceanography
  - Physical oceanography narrowMatch salinity
- The results may be termed thesauri
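
The within-vocabulary relationships above can be sketched as a small data structure. This is an illustrative Python sketch, not the way any particular thesaurus tool stores its data; the `Thesaurus` class and its method names are assumptions. It records "broader" and "narrower" as a pair, so stating one direction makes the other available.

```python
from collections import defaultdict


class Thesaurus:
    """Toy thesaurus: (concept, relation) -> set of target concepts."""

    def __init__(self):
        self.relations = defaultdict(set)

    def add_broader(self, narrow, broad):
        # "narrow broader broad" implies "broad narrower narrow".
        self.relations[(narrow, "broader")].add(broad)
        self.relations[(broad, "narrower")].add(narrow)

    def add_related(self, a, b):
        # "related" is stored in both directions.
        self.relations[(a, "related")].add(b)
        self.relations[(b, "related")].add(a)

    def get(self, concept, relation):
        # .get avoids creating empty entries for unknown keys.
        return sorted(self.relations.get((concept, relation), set()))


colours = Thesaurus()
colours.add_broader("blue", "colour")
colours.add_related("colour", "pigmentation")
```

Here `colours.get("colour", "narrower")` returns `["blue"]` even though only the "broader" direction was stated.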

Ontologies
- But what if the controlled vocabularies describe different types of thing?
- We can relate them by increasing the semantic richness of the relationships
- For example:
  - We could have a controlled vocabulary of instruments
  - We could also have a controlled vocabulary of parameters

Ontologies
- We can link these up using relationships such as:
  - Thermosalinograph measures salinity
  - Fluorometer measures chlorophyll
  - Air temperature measuredBy psychrometer
- The result may be termed an ontology
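
A rough Python sketch of the idea: two separate vocabularies (instruments and parameters) linked by a "measures" relationship that can be queried from either side. The set and function names are illustrative assumptions; the instrument and parameter terms are the ones on the slide.

```python
# Two controlled vocabularies describing different types of thing.
INSTRUMENTS = {"thermosalinograph", "fluorometer", "psychrometer"}
PARAMETERS = {"salinity", "chlorophyll", "air temperature"}

# Cross-vocabulary links: (instrument, parameter) means "instrument measures parameter".
MEASURES = [
    ("thermosalinograph", "salinity"),
    ("fluorometer", "chlorophyll"),
    ("psychrometer", "air temperature"),
]

# Every link must join a known instrument to a known parameter.
assert all(i in INSTRUMENTS and p in PARAMETERS for i, p in MEASURES)


def parameters_measured_by(instrument):
    """Follow 'measures' from an instrument to its parameters."""
    return sorted(p for i, p in MEASURES if i == instrument)


def instruments_measuring(parameter):
    """Follow the link in the 'measuredBy' direction."""
    return sorted(i for i, p in MEASURES if p == parameter)
```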

Ontologies
- Ontology relationships are:
  - Semantically rich
  - Potentially abundant
- Software agents need some understanding of the relationships to exploit the knowledge encoded in the ontology
- This is achieved through relationships that describe relationships, called rules
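
One simple kind of rule states that two relationships are inverses of each other (OWL expresses this with `owl:inverseOf`). The Python below is a hedged sketch of how a software agent might use such a rule to derive facts that were never stated explicitly. The function name and the tuple representation are assumptions for illustration, not a real reasoner.

```python
def apply_inverse_rules(triples, inverse_pairs):
    """Return the stated triples plus those implied by inverse-relationship rules."""
    inferred = set(triples)
    for subject, predicate, obj in triples:
        for forward, backward in inverse_pairs:
            if predicate == forward:
                inferred.add((obj, backward, subject))
            elif predicate == backward:
                inferred.add((obj, forward, subject))
    return inferred


# Stated facts, using the examples from the slides.
facts = {
    ("thermosalinograph", "measures", "salinity"),
    ("air temperature", "measuredBy", "psychrometer"),
}

# The rule: a relationship about relationships.
INVERSE_PAIRS = [("measures", "measuredBy")]

closed = apply_inverse_rules(facts, INVERSE_PAIRS)
```

After applying the rule, `closed` also contains "salinity measuredBy thermosalinograph" and "psychrometer measures air temperature".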

Knowledge Representation
- Relationships between concepts may be expressed using the Resource Description Framework (RDF)
- W3C standard, usually encoded as XML, having ‘triples’ as its basic building block
- Each triple has a subject, a predicate and an object. For example:
  - Colour related pigmentation
  - Thermosalinograph measures salinity
- Familiar?
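
To make the subject, predicate, object structure concrete, here is a small Python sketch that writes the two example triples in N-Triples, a line-based plain-text RDF syntax, using only the standard library. The `http://example.org/vocab/` namespace is a placeholder assumption, not a real NERC URI, and real RDF would normally reuse predicates from published vocabularies.

```python
from urllib.parse import quote

BASE = "http://example.org/vocab/"  # hypothetical namespace for this sketch


def uri(name):
    """Wrap a local name as an N-Triples URI reference."""
    return f"<{BASE}{quote(name)}>"


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{uri(s)} {uri(p)} {uri(o)} ." for s, p, o in triples)


triples = [
    ("colour", "related", "pigmentation"),
    ("thermosalinograph", "measures", "salinity"),
]
doc = to_ntriples(triples)
```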

Knowledge Representation
- Controlled vocabularies (concept collections) and thesauri may be represented using the Simple Knowledge Organization System (SKOS)
- W3C standard built on RDF, usually encoded as XML
- Jointly developed by STFC and Manchester University Computer Science
- 2008 version is the one to use
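
A minimal sketch of what a SKOS document can look like, generated with Python's standard `xml.etree.ElementTree`. The RDF and SKOS namespace URIs are the published ones; the `http://example.org/colours/` concept URIs and the `add_concept` helper are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
XML_NS = "http://www.w3.org/XML/1998/namespace"
BASE = "http://example.org/colours/"  # hypothetical concept namespace

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("skos", SKOS_NS)


def add_concept(root, name, broader=None):
    """Append a skos:Concept with an English prefLabel and optional skos:broader link."""
    node = ET.SubElement(
        root, f"{{{SKOS_NS}}}Concept", {f"{{{RDF_NS}}}about": BASE + name}
    )
    label = ET.SubElement(node, f"{{{SKOS_NS}}}prefLabel", {f"{{{XML_NS}}}lang": "en"})
    label.text = name
    if broader is not None:
        ET.SubElement(
            node, f"{{{SKOS_NS}}}broader", {f"{{{RDF_NS}}}resource": BASE + broader}
        )
    return node


root = ET.Element(f"{{{RDF_NS}}}RDF")
add_concept(root, "colour")
add_concept(root, "blue", broader="colour")
skos_xml = ET.tostring(root, encoding="unicode")
```

Printing `skos_xml` shows an `rdf:RDF` element holding two `skos:Concept` elements, the second carrying a `skos:broader` link to the first.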

Knowledge Representation
- Ontologies may be represented using the Web Ontology Language (OWL)
- W3C standard built on RDF, usually encoded as XML
- Example OWL document: http://mida.ucc.ie/ont/20080124/theme.owl
- Alternative simple text encodings are available, such as Open Biomedical Ontologies (OBO)
- OBO is used for the NERC-related EnvO ontology

Knowledge Management Tools
- RDF
  - Tools abound: see for example http://planetrdf.com/guide/
  - Jena is one of the better known
- SKOS
  - See the SKOS Tool Shed: http://esw.w3.org/topic/SkosDev/ToolShed
  - Note this includes a Protégé plugin

Knowledge Management Tools
- OWL
  - Protégé with the appropriate plugin is the most widely used
  - There are commercial alternatives such as TopBraid Composer
  - MMI (http://marinemetadata.org) has developed a vocabulary to OWL converter (voc2OWL)
- OBO
  - Text, so text tools work
  - OWL and SKOS converters are available
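
Because OBO is plain text, a few lines of ordinary string handling are enough to read it. The Python below is a deliberately minimal sketch that only handles `[Term]` stanzas with `tag: value` lines and trailing `!` comments. The sample identifiers (`EX:0000001`, `EX:0000002`) are invented and are not real EnvO terms.

```python
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: EX:0000001
name: water body

[Term]
id: EX:0000002
name: river
is_a: EX:0000001 ! water body
"""


def parse_obo_terms(text):
    """Return one dict per [Term] stanza, mapping each tag to a list of values."""
    terms = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("["):
            # Start of a stanza; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
            continue
        if current is None:
            continue  # header line or a stanza type we ignore
        tag, _, value = line.partition(":")
        value = value.split(" ! ")[0].strip()  # drop trailing comment
        current.setdefault(tag.strip(), []).append(value)
    return terms


terms = parse_obo_terms(SAMPLE_OBO)
```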

Knowledge Management Tools
- Mapping
  - MMI have developed a mapping tool (VINE) to build maps from two OWL files
- Visualisation
  - Concept maps are useful
    - Cmap tools is very good
    - FreeMind (open source)