
- Number of slides: 25
Archaeology and Terminology
Ceri Binding
Hypermedia Research Unit, University of Glamorgan, Wales, UK
http://hypermedia.research.glam.ac.uk/

Archaeology and Terminology – Ceri Binding, University of Glamorgan, Wales, UK
STAR project - overview
• AHRC funded project in collaboration with English Heritage Centre for Archaeology, Portsmouth
• Aim: to investigate the potential of semantic technologies for widening access to digital archaeology resources, including disparate datasets and associated grey literature.
Mind the Lexical Gap – EUROVOC conference 2010

STAR - general architecture
• Applications – Server Side, Rich Client, Browser
• Data access layer - Web Services, SQL, SPARQL
• RDF Based Common Ontology Data Layer (CRM / CRMEH / SKOS)
• Indexing: Grey Literature reports
• Conversion (SKOS): EH thesauri, glossaries
• Data Mapping / Normalisation: Archaeological Datasets (STAN, RRAD, IADB, LEAP, RPRE)

The Archaeological Archipelagos [Keith May, English Heritage]

English Heritage controlled vocabularies
• 27 glossaries – from English Heritage recording manuals (2006)
• 6 main thesauri used:
  – Monument Types thesaurus
  – Archaeological Sciences thesaurus
  – Evidence thesaurus
  – Main Building Materials thesaurus
  – MDA Object Types thesaurus
  – Timelines thesaurus
• Converted to SKOS format for use within STAR

Expressive vs. controlled vocabulary
“…how many of those writing [grey literature] reports would think to describe what they are recording/writing about using the same thesauri? […] it would have been a lot quicker and easier if standardised terminology had been used in the report text when describing types of monument, event and artefact, as well as dates/periods etc.” [G. Falkingham]
“Grey Literature is very often the only place where field workers have any opportunity to engage in creating their own narrative of the site, both of the archaeological event and of the archaeological story of the site itself. I think it would be throwing the baby out with the bath water to concentrate solely on the data without continuing to offer highly skilled and experienced fieldworkers the opportunity to actually tell us what they think the data means...” [S. Jeffrey]

Descriptive, semi-controlled vocabulary…
Deposit Colour: (Reddy) Brown; 9 Reddy) brown; Brown; Brown red; Brown/reddy; Dark brown; Dark brown/orange; Dark grey brown; Dark orange brown; Dark orange brown with darker patches; Dark orange loam; Dark orange/brown; Dark red brown; Grey brown; Grey/brown; Light brown; Light yellow brown; Medium brown; Mid brown; Mid red brown; Orange brown; Orange/brown; Orangy brown; Orangy brown, very light brown on edges and sides of profile; Red /brown; Red brown; Red/brown; Reddish brown; Reddy brown; Varies; Very light brown; White; Yellow brown; Yellow/orange brown
Deposit Texture: Firm; Friable; Friable to loose; Friable/loose; Friable-loose; Loose; Loose/friabe; Loose/friable
Deposit Compaction: Plastic; Sticky; Sticky (wet); Sticky/firm; Varies
“…another of my examples has something about some flint that is ‘snuff coloured’ & I don’t know if I’ve ever seen snuff, let alone know what colour it is, or might have been over 150 years ago, and I would think it would make sense to take some kind of integrated approach from the outset, rather than the usual ‘bricolage’ of having one route for the archivists, another for those interested in searching spreadsheets, another for people interested in googling graphics, etc.” [G. Carver]
Worst of all worlds?

Terminology control for time periods
• Centuries
• BC / AD years
• 3 age system
• Monarchs / Roman emperors
• Cultural styles
• Geological periods
• Prefixes: pre, post, mid etc.
• Any combinations of these
Examples of periods encountered in data: MLC2-C3; AD 341-6; Iron Age; First half 1st century?; Antonine; LC2/EC3; Early C3; MLA

Time period alignment – data cleansing / semantic enrichment
Object No  Period                 MIN YEAR  MAX YEAR
1519       AD 354-64              354       364
1520       1st century AD         1         100
1538       2nd century            101       200
1548       1st century            1         100
1562       AD 367-75              367       375
1563       AD 348-50              348       350
1567       Mid 1st century AD     33        66
1571       First half 1st centu   1
1572       Mid first century AD   33
1580       c. AD 270              270       270
1583       First half first cen   1         50
1591       AD 341-6               341       346
1593       AD 287-93              287       293
1594       AD 43-44               43        44
1595       Medieval               1066      1540
1627       2nd century AD         101       200
1631       ?1st century           1         100
1635       AD 354-64              354       364
1664       AD 330-5               330       335
1681       Medieval               1066      1540
1701       Romano-British         43        410
1704       Modern?                1901      2100
98157      post-mediaeval         1540      1901
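
The label-to-year conversion shown in this table can be sketched roughly as follows. This is a minimal illustration in Python, not the STAR implementation: the function name `label_to_years`, the regular expressions and the lookup table are assumptions. The year ranges for named periods are copied from the table, and the century arithmetic (1st century = 1 to 100, mid = 33 to 66, first half = 1 to 50) is generalised from its rows, which is itself an assumption for later centuries.

```python
import re

# Year ranges for named periods, copied from the table on the slide.
NAMED_PERIODS = {
    "medieval": (1066, 1540),
    "post-medieval": (1540, 1901),
    "romano-british": (43, 410),
    "modern": (1901, 2100),
}

ORDINALS = {"first": 1, "second": 2, "third": 3, "fourth": 4}


def label_to_years(label):
    """Return (min_year, max_year) for a period label, or None if unrecognised."""
    # Normalise: lower-case, drop uncertainty markers, unify spelling variants.
    text = label.lower().replace("?", "").replace("mediaeval", "medieval")
    text = re.sub(r"^c\.\s*", "", text.strip()).strip()

    if text in NAMED_PERIODS:
        return NAMED_PERIODS[text]

    # Year ranges such as "AD 354-64"; a shortened end year borrows
    # its leading digits from the start year.
    m = re.fullmatch(r"ad\s*(\d+)\s*-\s*(\d+)", text)
    if m:
        start, end = m.group(1), m.group(2)
        if len(end) < len(start):
            end = start[: len(start) - len(end)] + end
        return int(start), int(end)

    # Single years such as "c. AD 270".
    m = re.fullmatch(r"ad\s*(\d+)", text)
    if m:
        return int(m.group(1)), int(m.group(1))

    # Centuries, optionally qualified by "mid" or "first half".
    m = re.fullmatch(
        r"(mid|first half)?\s*(\d+|first|second|third|fourth)"
        r"(?:st|nd|rd|th)?\s+century(?:\s+ad)?",
        text,
    )
    if m:
        qualifier, ordinal = m.group(1), m.group(2)
        n = int(ordinal) if ordinal.isdigit() else ORDINALS[ordinal]
        first_year = (n - 1) * 100 + 1
        if qualifier == "mid":
            return first_year + 32, first_year + 65
        if qualifier == "first half":
            return first_year, first_year + 49
        return first_year, first_year + 99

    return None
```

Labels the sketch does not recognise (for example "Iron Age") return `None`, so they can be set aside for manual review rather than guessed.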

Time period relationships
Relationships to a reference Period P1 along the time axis [* = Transitive]:
• occurs before P1*
• occurs after P1*
• meets P1
• met by P1
• overlaps P1
• overlapped by P1
• starts P1*
• started by P1*
• finishes P1*
• finished by P1*
• includes P1*
• occurs during P1*
• equal to P1*
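
The thirteen relationships listed here correspond to Allen's interval relations, and can be computed from start and end years alone. The sketch below is an illustration under stated assumptions, not the STAR code: the function name `relationship` is hypothetical, periods are (start year, end year) pairs at whole-year resolution, and two periods are taken to "meet" when the end year of one equals the start year of the other (as with Medieval 1066 to 1540 and Post Medieval 1540 to 1901 in the deck's data).

```python
def relationship(p, q):
    """Name the relationship of period p to period q.

    Each period is a (start_year, end_year) tuple.
    """
    (ps, pe), (qs, qe) = p, q
    if ps > pe or qs > qe:
        raise ValueError("a period must not end before it starts")
    if (ps, pe) == (qs, qe):
        return "equal to"
    if pe < qs:
        return "occurs before"
    if ps > qe:
        return "occurs after"
    if pe == qs:
        return "meets"
    if ps == qe:
        return "met by"
    if ps == qs:
        return "starts" if pe < qe else "started by"
    if pe == qe:
        return "finishes" if ps > qs else "finished by"
    if qs < ps and pe < qe:
        return "occurs during"
    if ps < qs and qe < pe:
        return "includes"
    # Only the two partial-overlap cases remain.
    return "overlaps" if ps < qs else "overlapped by"
```

Applied to values from the results slide later in the deck, `relationship((1555, 1623), (1567, 1625))` gives "overlaps" and `relationship((270, 284), (270, 275))` gives "started by", matching the relationships shown there.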

Time Period Comparison – Closeness Calculation
[Diagram: Periods P1, P2 and P3 on a time axis, annotated with the quantities IU, MP, NMP and D]
Match(P1, P2) = W1 (MP / IU) + W2 (IU / (NMP + IU)) + W3 (IU / (D + IU))
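
As a hedged illustration, the closeness formula can be written directly as a function. The slide text gives the formula but not how MP, IU, NMP and D are measured, nor the values of the weights W1 to W3, so the sketch below takes all of them as inputs. The function name `match_score` and the equal default weights are placeholders, not values from the STAR project.

```python
def match_score(mp, iu, nmp, d, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Evaluate W1*(MP/IU) + W2*(IU/(NMP+IU)) + W3*(IU/(D+IU)).

    mp, iu, nmp and d are the quantities labelled on the slide's diagram;
    how they are derived from two periods is not specified there.
    """
    if iu <= 0:
        raise ValueError("IU must be positive")
    if min(mp, nmp, d) < 0:
        raise ValueError("MP, NMP and D must not be negative")
    w1, w2, w3 = weights
    return w1 * (mp / iu) + w2 * (iu / (nmp + iu)) + w3 * (iu / (d + iu))
```

If MP never exceeds IU and the weights sum to 1, each term lies between 0 and 1, so the score does too; a score of 1 then requires MP = IU with NMP = 0 and D = 0.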

SKOS Concepts + CRM Entities
Time period concepts also have implicit spatio-temporal context
[Diagram: RDF graph. Classes: skos:Concept, crm:E2.Temporal.Entity, crm:E4.Period, crm:E52.Time-Span, crm:E53.Place. Period concepts: <#stuart>, <#jacobean>, <#caroline>, <#restoration>, <#williamandmary>, <#queenanne>. Properties: rdf:type, rdfs:subClassOf, skos:broader, crm:P4F.has_time-span, crm:P7F.took_place_at, crm:P116F.starts, crm:P119F.meets, crm:P118F.overlaps, crm:P115F.finishes]
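
The idea of one resource being both a SKOS concept and a CRM period can be sketched with plain (subject, predicate, object) triples. The flattened slide text does not fully preserve which arrow connects which nodes, so the edges below are one plausible reading of the diagram, stated as an assumption. The helper names `objects` and `subjects` are hypothetical, and the crm:P118F.overlaps edge is omitted because its endpoints cannot be recovered.

```python
# One assumed reading of the diagram, as (subject, predicate, object) triples.
TRIPLES = [
    ("crm:E4.Period", "rdfs:subClassOf", "crm:E2.Temporal.Entity"),
    # <#stuart> is typed both as a thesaurus concept and as a CRM period.
    ("#stuart", "rdf:type", "skos:Concept"),
    ("#stuart", "rdf:type", "crm:E4.Period"),
    # Thesaurus hierarchy: narrower periods point to the broader one.
    ("#jacobean", "skos:broader", "#stuart"),
    ("#caroline", "skos:broader", "#stuart"),
    ("#restoration", "skos:broader", "#stuart"),
    ("#williamandmary", "skos:broader", "#stuart"),
    ("#queenanne", "skos:broader", "#stuart"),
    # Temporal relationships between the same resources (assumed edges).
    ("#jacobean", "crm:P116F.starts", "#stuart"),
    ("#jacobean", "crm:P119F.meets", "#caroline"),
    ("#williamandmary", "crm:P119F.meets", "#queenanne"),
    ("#queenanne", "crm:P115F.finishes", "#stuart"),
]


def objects(subject, predicate):
    """Return every object o for which (subject, predicate, o) is stated."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def subjects(predicate, obj):
    """Return every subject s for which (s, predicate, obj) is stated."""
    return [s for s, p, o in TRIPLES if p == predicate and o == obj]
```

With this reading, `subjects("skos:broader", "#stuart")` lists the five narrower periods, while `objects("#jacobean", "crm:P119F.meets")` follows a temporal link between two of them.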

Time period alignment – data processing
• Align data relative to closest period concepts from English Heritage ‘Timelines’ thesaurus

Time period alignment - results
Data records relative to closest ‘known’ periods

Data record (dates deduced from labels): Label 1555-1623?, From 1555, To 1623
Calculated closest matching known periods:
Relationship    Label                        From  To    Closeness
overlaps        JAMES I AND VI               1567  1625  0.895
occurs during   POST MEDIEVAL                1540  1901  0.838
includes        ELIZABETHAN                  1558  1603  0.814
overlapped by   2nd HALF 16th CENTURY AD     1551  1600  0.808

Data record (dates deduced from labels): Label AD 270-284, From 270, To 284
Calculated closest matching known periods:
Relationship    Label                        From  To    Closeness
occurs during   LATE 3rd CENTURY             267   300   0.885
overlaps        4th QUARTER 3rd CENTURY AD   276   300   0.706
includes        PROBUS                       276   282   0.699
started by      AURELIAN                     270   275   0.665
overlapped by   3rd QUARTER 3rd CENTURY AD   251   275   0.610
met by          QUINTILLUS                   270         0.532

Data aligned to closest ‘known’ periods
Left: data record (dates deduced from labels). Right: closest controlled match based on dates.
ID    Label          From  To     ID      Label                       From  To
1315  AD 228-31      228   231    136122  ALEXANDER SEVERUS           222   235
1316  AD 364-78      364   378    900014  3RD QUARTER 4TH CENTURY AD  351   375
1317  AD 69-79       69    79     136087  VESPASIAN                   69    79
1318  AD 270-4       270   274    136164  TETRICUS I                  270   274
1319  AD 275-402     275   402    134825  4TH CENTURY AD              300   399
1320  AD 341-6       341   346    900013  2ND QUARTER 4TH CENTURY AD  326   350
1321  AD 268-70      268   270    136154  CLAUDIUS II GOTHICUS        268   270
1322  AD 367-75      367   375    900014  3RD QUARTER 4TH CENTURY AD  351   375
1324  AD 270-84      270   284    135952  LATE 3RD CENTURY            266   299
1325  AD 270-84      270   284    135952  LATE 3RD CENTURY            266   299
1326  AD 367-75      367   375    900014  3RD QUARTER 4TH CENTURY AD  351   375
1327  AD 383-8       383   388    900015  4TH QUARTER 4TH CENTURY AD  376   399
1328  AD 330-40      330   340    900013  2ND QUARTER 4TH CENTURY AD  326   350
1337  Post-medieval  1540  1901   134746  POST MEDIEVAL               1540  1901
1370  Medieval       1066  1540   134745  MEDIEVAL                    1066  1540
1371  AD 1943                     134848  SECOND WORLD WAR            1939  1945

Timeline service test client

Semantic enrichment
• Borderline between data cleansing and data creation…
“Possibly fragment of belt buckle or nail”
• BELT
• Belt Clasp -> use STRAP FITTING
• BUCKLE
• Buckle Plate -> use BUCKLE
• NAIL
• HOBNAIL
• SHOEING NAIL
“The single most useful thing you can do to ensure the long-term preservation of your data is to plan for it to be re-used” [Archaeology Data Service]
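
The step from a free-text description to candidate controlled terms might be sketched as below. This is a deliberately naive illustration, not the STAR enrichment process: the function name `candidate_terms`, the substring matching and the minimum word length are all assumptions. The vocabulary holds only the entries named on this slide, with "use" references resolved to their preferred term.

```python
import re

# Entries named on the slide. None marks a preferred term; otherwise the
# value is the preferred term that the entry says to use instead.
ENTRIES = {
    "BELT": None,
    "BELT CLASP": "STRAP FITTING",
    "BUCKLE": None,
    "BUCKLE PLATE": "BUCKLE",
    "NAIL": None,
    "HOBNAIL": None,
    "SHOEING NAIL": None,
}


def candidate_terms(description, min_word_length=3):
    """Suggest preferred terms whose entry label contains a word of the text."""
    words = {
        w
        for w in re.findall(r"[a-z]+", description.lower())
        if len(w) >= min_word_length  # skips short words such as "of", "or"
    }
    candidates = []
    for label, use in ENTRIES.items():
        if any(word in label.lower() for word in words):
            preferred = use or label
            if preferred not in candidates:
                candidates.append(preferred)
    return candidates
```

For the description quoted on the slide, the sketch proposes BELT, STRAP FITTING, BUCKLE, NAIL, HOBNAIL and SHOEING NAIL. Choosing among them still needs a person, which is the borderline between cleansing and creation that the slide points to.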

Aligning controlled vocabularies
• Different scope notes, same concepts?
• Different thesauri, same concepts?
Archaeological Objects
• SARCOPHAGUS
• SUNDIAL
• WALL PAINTING
• WHIPPING POST
RCHME Monument Types
• SARCOPHAGUS
• SUNDIAL
• WALL PAINTING
• WHIPPING POST
RCHMS Monument Types
RCHMW Monument Types
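
A first pass at finding alignment candidates between two thesauri is to compare labels, as in the minimal sketch below. The function name `shared_labels` is hypothetical, and an identical label is only a prompt for review: as the slide asks, the same label in two thesauri may or may not denote the same concept.

```python
def shared_labels(thesaurus_a, thesaurus_b):
    """Return the labels of thesaurus_a that also occur in thesaurus_b.

    Labels are compared case-insensitively, ignoring surrounding spaces.
    """
    normalised_b = {label.strip().lower() for label in thesaurus_b}
    return sorted(
        label for label in set(thesaurus_a)
        if label.strip().lower() in normalised_b
    )


# The four terms shown on the slide under both headings.
archaeological_objects = ["SARCOPHAGUS", "SUNDIAL", "WALL PAINTING", "WHIPPING POST"]
rchme_monument_types = ["SARCOPHAGUS", "SUNDIAL", "WALL PAINTING", "WHIPPING POST"]

candidates = shared_labels(archaeological_objects, rchme_monument_types)
```

Each candidate pair would then need its scope notes compared before the two concepts are treated as equivalent.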

STAR general architecture
• Windows applications
• Browser components
• Full text search
• Browse concept space
• Navigate via expansion
• Cross search archaeological datasets
[Diagram: STAR client applications; STAR web services; STAR datasets (English Heritage thesauri (SKOS), Grey literature indexing, Archaeological Datasets (CRM))]

Windows Client Applications
• Browse available thesauri
• Search across multiple thesauri
• Navigate via semantic expansion

Interactive tools to aid data entry

Controlled types used in main search interface
§ Interactive selection from glossary/thesaurus concepts
§ Filtered to concepts actually used in indexing
§ Group / context types – from (enhanced) cuts and deposits glossary
§ Context find materials – from building materials thesaurus
§ Context find types – from MDA Object types thesaurus
§ Context sample types – from existing data values...

Interactive tools to aid data entry

Summary
• Tension between expressive and controlled vocabulary
• Semantic enrichment process and terminology control (e.g. for time periods)
• Alignment of controlled vocabularies
• Web services and interactive tools to aid data entry and search
• Issues encountered are not about particular technologies – they are more fundamental knowledge organisation (KO) issues

Archaeology and Terminology
Ceri Binding
Hypermedia Research Unit, University of Glamorgan, Wales, UK
http://hypermedia.research.glam.ac.uk/