  • Number of slides: 37

Metadata issues and DOI

Metadata issues and DOI
Presentation based on one given at the IDF members conference, July 2000, and on forthcoming additional Handbook material on DOI Genres.

Metadata issues and DOI: overview of presentation
  • Background
  • Three <indecs> conclusions
  • The metadata landscape
  • Which schemes matter most to DOI?
  • DOI metadata - practical implications
  • DOI Genres
  • DOI Kernels
  • Handle and metadata
  • Conclusion

Definitions of metadata
  • Popular: “Metadata is data about data.” (Everyone)
  • Logical: “An item of metadata is a relationship that someone claims exists between two entities.” (<indecs> framework)
  • Functional: “Metadata is the life-blood of e-commerce.” (John Erickson, HP)

Three conclusions
#1: All metadata is just a view
e.g. views of a “person”: some (generic) ways in which you might be identified in metadata schemes:
Son, Legal person, Agent, Alien, Scholar, Library user, Composer, Credit card holder, Shoe purchaser, Author, Lottery entrant, Hospital patient, Citizen, Car driver, Rights owner, Marathon runner, Software licensee, Parent, Tax payer, Club member, e-consumer, Bank account holder, Husband, Charity giver, Hotel guest, Speeding ticket recipient, DisneyWorld visitor, Frequent Flyer, Concert-goer, Passenger, Employee, Voter, Dog owner.
In each of these roles “you” will have different IDs and attributes.

#1: All metadata is just a view
Creations are the same. An identifier for a published article may refer to:
  • A manuscript
  • The abstract work
  • A draft
  • A (class of) physical copy in a publication
  • A (class of) digital copy (not in a publication)
  • A (class of) digital copy in a publication
  • A (class of) digital format
  • A specific digital copy
  • A (class of) paper copy
  • A specific paper copy
  • An edition
  • A reprint
  • A translation
  • etc… and many combinations of the above
Similar views apply to other types of creations.

#1: All metadata is just a view
  • Views must not be confused for digital content and rights management. Mistaken identity can be catastrophic.
  • Increasingly, views need to be interoperable (e.g. production workflow, rights, marketing within one business).
  • The need for automated, interoperable views in d-commerce will be enormous.

#2: (Almost) all terms need identifiers
  • Each of the values of a view must be defined and identified if other views are to recognize them (what do you mean by an abstract work? an edition? a format? a scholar? a dog owner? a name?).
  • So views need comprehensive controlled vocabularies (note our reliance on ISO language, territory, currency and time codes).
  • Automation needs disambiguation.
  • Terms of rights must be unambiguous. Anything may be a term of an agreement.
  • Structured ontologies for commerce (like the indecs model) are emerging as valuable.

#3: Events are the key to interoperability
  • Most metadata is “thing” or “people” based: static views, e.g. “a creation”.
  • In the Web future, metadata interoperability will be achieved by describing “events” that relate things and people: dynamic views, e.g. “A created B”.
  • Event descriptions will also be the key to rights metadata (transactions are events).
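The contrast between a static view and an event-based view can be sketched in Python. This is only an illustration: the names `Event`, `agent`, `event_type`, `creation` and `creations_by` are assumptions made for the example and are not taken from the indecs model or any DOI specification.

```python
from dataclasses import dataclass

# Static, "thing"-based view: the creator is just an attribute of the creation.
static_record = {"creation": "B", "creator": "A"}


@dataclass(frozen=True)
class Event:
    """Dynamic view: an event relates a party to a thing ("A created B")."""
    agent: str       # the party taking part in the event
    event_type: str  # hypothetical values such as "created" or "licensed"
    creation: str    # the thing the event acts on


def creations_by(events, agent, event_type="created"):
    """Derive a "thing" answer (which creations?) from event descriptions."""
    return [e.creation for e in events
            if e.agent == agent and e.event_type == event_type]


events = [
    Event("A", "created", "B"),
    Event("A", "licensed", "B"),  # a rights transaction is also an event
    Event("C", "created", "D"),
]
print(creations_by(events, "A"))
```

The same list of events answers both a descriptive question (who created what) and a rights question (who licensed what), which is the sense in which events could link descriptions and rights.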

The metadata landscape
These conclusions are being reached increasingly often elsewhere. There is an explosion of metadata activity:
  • Models, Identifiers, Vocabularies, Dictionaries, Ontologies.
  • XML/RDF schemas.
  • Registries/Repositories/“crosswalks”.
  • Technical standards.

The metadata landscape for “creations”

The metadata landscape for “creations” (diagram)
Sectors shown: Libraries, Archives, Museums, Education, Technology, Newspapers, Magazines, Standards, Journals, Books, Texts, Audiovisual, Audio, Music, Copyright.

The metadata landscape for “creations”: 1980s (diagram)
Labels shown, in source order: Libraries, Archives, Museums, Education, MARC, Technology, Newspapers, UPC, EAN, Magazines, ISSN, Standards, ISO codes, ISBN, Journals, Books, Texts, Audiovisual, CAE, Audio, Music, Copyright.

The metadata landscape for “creations”: mid 90’s (diagram)
Labels shown, in source order: Libraries, MARC, Technology, Archives, Museums, IMS, Education, IIM, FRBR, Dublin Core, UPC, url, urn, Handle, Multimedia, Newspapers, EAN, Standards, DOI, Magazines, ISSN, ISO codes, ISBN, ISRC, Audio, ISMN, Music, CIS, Books, Texts, ISWC, Audiovisual, ISAN, Journals, CAE, Copyright.

The metadata landscape for “creations”: today (diagram)
Labels shown, in source order: Libraries, MARC, Technology, XML schema, EBooks, eBooks, MPEG 7, Multimedia, Archives, FRBR, RDF, ISO 11179, url, uri, Handle, urn, MPEG 21, Museums, CIDOC, IMS, Education, IIM, NITF, LOM, Dublin Core, Newspapers, UPC, Standards, ISO codes, Audiovisual, ISAN, SMPTE, DMCS, ISRC, Audio, DOI, ISMN, Music, ISSN, SICI, EPICS, Journals, Books, BICI, ISTC, Texts, ISWC, CIS, ISBN, ONIX, XrML, IPDA, PRISM, Magazines, CROSSREF, P/META, UMID, abc, EAN, CAE, Copyright.

Convergence
All serious schemes are becoming...
  • Granular (parts and versions)
  • Modular (creations within creations)
  • Multimedia
  • Multinational
  • Multilingual
  • Multipurpose
Schemes shown: EPICS/ONIX (text), SMPTE (audiovisual), eBooks, SDMI/DCMS (audio/music), DOI genres, CIDOC (museums/archives), FRBR (libraries), Dublin Core, CIS (copyright societies), PRISM (magazines), NITF (newspapers), MPEG 7 (multimedia).
Result: major “sector” schemes are now trying to define metadata with broadly the same scope, only different emphases.

Which initiatives matter most to DOI?
  • MPEG 21
  • SMPTE data dictionary
  • EPICS/ONIX
  • XrML
Criteria...
  • Strong underlying data model
  • Multi-purpose
  • Extensive, structured vocabulary
  • Commercial critical mass
  • Outward-looking

MPEG 21
  • Began 2000 (ISO Moving Picture Experts Group).
  • Possible umbrella for digital multimedia standards.
  • Place to bring technology and content standards together.
  • MPEG track record of disciplined standards development.
  • Most major players getting involved.
  • Not many lawyers (yet).
  • Short-term perception problem: “MPEG is audiovisual”.
  • Is the challenge too great?

SMPTE Data Dictionary/UMID
  • Began 1998 (Society of Motion Picture and Television Engineers).
  • Well-structured, technically-oriented multimedia data dictionary.
  • Based on an ISO 11179 metadata registry, with good governance and update procedure.
  • SMPTE track record of disciplined standards development.
  • UMID (Unique Material Identifier) for digital material, complementary to “editorial” identifiers like DOI.
  • Guaranteed implementation in “home” sector.
  • Start point for MPEG 7 metadata work.

EPICS & ONIX International
  • EDItEUR (EPICS) and AAP (ONIX) convergence (May 2000).
  • Substantial and extensible EPICS metadata dictionary, <indecs>-model based, from which “ONIX” XML-tagged subset(s) are taken.
  • Commerce-driven (Amazon etc.) with transatlantic industry support and International Steering Group.
  • Likely to be used by eBooks, ISTC.
  • ONIX for video (Amazon initiative)? ONIX for audio?
  • Best chance of e-commerce multimedia vocabulary and schema (and maybe d-commerce?).

Standard controlled vocabularies
Existing…
  • Territories, Language, Currency, Date/Time (ISO)
  • Measures (U.C.U.M)
Needed…
  • Creation types
  • Derivation types (adaptation, sample, compilation…)
  • Contributor roles (author, translator, cameraman…)
  • Title types (abbreviated, inverted, formal... etc)
  • Media types (formats)
  • Name types
  • Identifier types
  • Encoding types
  • Tools/instruments
  • User roles
  • etc...
and many identifiers need establishing or creating (Parties, Agreements, ISWC, ISTC, ISAN, UMID etc).

XrML and Rights metadata
  • DRM (Digital Rights Management) systems at present are for “unitary” rights: they do not deal with modularity.
  • Holdup 1: Rights vocabularies need descriptive vocabularies, which are not yet ready.
  • Holdup 2: An events model is needed to integrate descriptions and rights, and event-based tools are not yet developed.
  • XrML is the likely focal point for the next stage.
  • 2001+ before mature interoperable developments start to emerge.
  • DOI-R? Interested partners in a prototype?

Metadata issues and DOI
  • DOI metadata - practical implications
  • DOI Genres
  • DOI Kernels
  • Handle and metadata
  • Conclusion

DOI Genres
  • A genre is a DOI view: mechanism for “unity in diversity”.
  • Genre based on any interest group’s view of a type of creation.
  • Functional granularity: create a genre when you need it.
  • Genres can overlap: creations can be in multiple genres.
  • Genre has metadata kernel, Registration Agency, Genre Development / Steering Group?
  • Base Genre for new, unplaced DOIs.
  • Zero Genre for legacy DOIs.

DOI Metadata Kernel
  • Each Genre starts from the Base Genre kernel (8 elements) and may add whatever else it needs.
  • IDF/indecs to develop a kernel extension model this September.
  • DOI Genre vocabulary to be developed, in tandem with EPICS/ONIX?
  • Can/should coincide with or provide sector requirements (e.g. ISBN, ISRC, ISWC etc).
  • Different Genres’ metadata will interoperate if vocabularies are developed within the indecs/EPICS/ONIX model.

DOI Base Genre Kernel
  • Contains critical minimum metadata for basic recognition (but not complete disambiguation).
  • Standard base vocabulary (e.g. manifestation, version) means all DOI applications can expect base genre metadata.
  • DOI Genre type (e.g. “book”) must be analysable in terms of other attributes (e.g. media, mode, content, subject).
Example:
  DOI: 10.1000/ISBN 0141255559
  DOI Genre: Book
  Identifier: ISBN 0141255559
  Title: Two for the dough
  Type: Manifestation
  Derivation: Original
  Primary Agent: Janet Evanovich
  Agent Role: Author
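As a rough sketch, the eight-element kernel and a genre adding its own elements on top of it could be modelled in Python as follows. The field names, the `declare` helper and the `media` extension value are assumptions for illustration only; the slides do not define a serialisation, and the DOI string is kept as it appears in the slide text.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class BaseKernel:
    """The eight base kernel elements (field names are illustrative)."""
    doi: str
    doi_genre: str
    identifier: str
    title: str
    type: str
    derivation: str
    primary_agent: str
    agent_role: str


def declare(kernel, genre_extensions=None):
    """Return a declaration: kernel elements plus genre-specific additions.

    A genre may add elements but, in this sketch, may not redefine kernel ones.
    """
    declaration = asdict(kernel)
    for name, value in (genre_extensions or {}).items():
        if name in declaration:
            raise ValueError(f"{name!r} is already a base kernel element")
        declaration[name] = value
    return declaration


book = BaseKernel(
    doi="10.1000/ISBN 0141255559",  # spacing follows the slide text
    doi_genre="Book",
    identifier="ISBN 0141255559",
    title="Two for the dough",
    type="Manifestation",
    derivation="Original",
    primary_agent="Janet Evanovich",
    agent_role="Author",
)

# "media" and its value are a hypothetical genre extension, not from the slides.
declaration = declare(book, {"media": "print"})
print(sorted(declaration))
```

Rejecting extensions that reuse a kernel name is one possible way to keep base genre metadata predictable across genres; a real extension model might instead allow refinement.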

DOI Genre Kernel Extensions
IDF to develop an extended “catalogue” for all extended metadata requirements from the indecs/EPICS model and vocabulary, along these lines...
DOI Genre Identifier(s) Title(s) + Types, Languages Type Derivation Media Encoding Genre(s) Form(s) Subject(s) Content Language + Use Type Measures + Units of Measure Content Creations Content Link Sequence, Measure Related Creations + Link Type Creation Event + Type Primary Agent + Agent Role + Tool Source Creation Date(s) Location(s) Availability Event + Type Agent + Agent Role Date(s) Location(s) Price + Type

Metadata declarations
  • Either local webpage or central repository or both (as decided by Genre).
  • Base kernel metadata must be declared.
  • Genre-specific metadata is a matter for the Genre (Development Group/Registration Agency) to decide.
  • Automated access to metadata declaration via Handle data types.
  • XML schemas.

Roles of declared metadata = Functional spec of the DOI kernel
(a) to assign a unique DOI to the creation [DOI]
(b) to link the DOI to the principal local identifier of a creation (if any) to enable the integration of DOI-related applications and metadata with others [Identifier]
(c) to enable a searcher or application to identify the creation by its most common name and the party(ies) responsible for its creation or publication [Title, Main Creator, Role]

Roles of declared metadata (continued)
(d) to enable a searcher or application to distinguish the fundamental type of creation (abstract, physical, digital or spatio-temporal), and thereby also to distinguish between creations of different types with the same names and creators [Type]
(e) to enable a searcher or application to distinguish the mode in which the creation is expressed (abstract, audio, visual, audiovisual, multimedia) [Mode]
(f) to enable a searcher or application to determine to which DOI Genre the creation belongs [DOI Genre]

Roles of declared metadata (continued)
(g) to enable basic metadata about creations from different DOI Genres to be searched, stored or processed in common applications [All, plus controlled values of Role, Type, Mode, DOI Genre]
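Roles (a) to (g) suggest a simple check that a common application could run on any declaration, whatever its genre. The Python sketch below assumes lower-case field names, treats every element except Identifier as required (role (b) says "if any"), and checks only the Type and Mode values listed in roles (d) and (e); the Role and DOI Genre vocabularies are not enumerated in the slides, so they are left unchecked. The example record is entirely hypothetical.

```python
# Controlled values as listed in roles (d) and (e).
TYPE_VALUES = frozenset({"abstract", "physical", "digital", "spatio-temporal"})
MODE_VALUES = frozenset({"abstract", "audio", "visual", "audiovisual", "multimedia"})
CONTROLLED = {"type": TYPE_VALUES, "mode": MODE_VALUES}

# Assumption: "identifier" is optional because role (b) says "if any".
REQUIRED = ("doi", "title", "main_creator", "role", "type", "mode", "doi_genre")


def check_declaration(declaration):
    """Return a list of problems; an empty list means the checks passed."""
    problems = [f"missing {name}" for name in REQUIRED if not declaration.get(name)]
    for name, allowed in CONTROLLED.items():
        value = declaration.get(name)
        if value and value not in allowed:
            problems.append(f"{name}={value!r} is not a controlled value")
    return problems


# Hypothetical declaration, for illustration only.
example = {
    "doi": "10.1000/123456",
    "title": "Example title",
    "main_creator": "Example Creator",
    "role": "author",
    "type": "digital",
    "mode": "visual",
    "doi_genre": "book",
}
print(check_declaration(example))
```

Because the check depends only on kernel elements and controlled values, the same function would apply to declarations from any genre, which is the point of role (g).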

Handle and metadata
Handle data types can create a “distributed database”, e.g.
  metadata@10.1000/123456
  rights@10.1000/123456
  abstract@10.1000/123456
  sample@10.1000/123456
  buy@10.1000/123456
  license@10.1000/123456
  pdf@10.1000/123456
  etc
Data types (and results) must be consistent, so the Handle data type vocabulary must be developed with great care within an indecs-based model. Some data types could be genre specific.
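To make the `type@doi` notation concrete, here is a small Python sketch that splits such a reference and looks it up in an in-memory table. It does not call the Handle System: the `REGISTRY` dictionary, the function names and the `publisher.example` URLs are all hypothetical stand-ins for whatever a resolver would return.

```python
def parse_typed_reference(reference):
    """Split "datatype@doi" into (datatype, doi), e.g. "rights@10.1000/123456"."""
    data_type, at, doi = reference.partition("@")
    prefix, slash, suffix = doi.partition("/")
    # DOI prefixes start with "10."; the suffix follows the first slash.
    if not (at and data_type and slash and suffix and prefix.startswith("10.")):
        raise ValueError(f"not a type@doi reference: {reference!r}")
    return data_type, doi


# Hypothetical typed values; a real system would hold these in Handle records.
REGISTRY = {
    ("10.1000/123456", "metadata"): "https://publisher.example/123456/metadata.xml",
    ("10.1000/123456", "pdf"): "https://publisher.example/123456/article.pdf",
}


def resolve(reference):
    """Return the value stored for this DOI and data type, or None if absent."""
    data_type, doi = parse_typed_reference(reference)
    return REGISTRY.get((doi, data_type))


print(resolve("pdf@10.1000/123456"))
```

Keying the table on the pair of DOI and data type shows why the data type vocabulary has to be consistent: two genres using different names for the same kind of result would land on different keys.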

Conclusion: DOI - the Integrator?
G. Rust, 1998: “DOI is the most ambitious identifier in the history of the world”.
But now several things are becoming established...
  • …it has a persistent, granular, flexible, unique identifier which can be a “wrapper” for other IDs. Not competitive: it enhances legacy identifiers’ functionality in d-commerce.
  • …a strong, established metadata model and vocabulary.
  • …a controlled but flexible development structure.
  • …it does not confuse names with addresses.
  • …it allows multiple, standardised automated actions.
DOI as the integrating digital identifier? Nothing else comes close...

The DOI model (diagram): Identifier, Description, Action

The DOI model (diagram): Identifier, Description, Action, Rights
DOI for parties and events in future?

Metadata tasks for DOI?
  • Mapping ONIX to <indecs>: reconcile any differences
  • data dictionary: elements and iids tested in depth; for mappings
  • maintaining iid registry: a database available to anyone building a genre schema, but which need not be public
  • applications based on iid registry: technology tools to ease genre building
  • developing rights management aspects of dictionary
