
- Number of slides: 34
I. What other metadata schemes are available?
• Digital Object Identifier
• Resource Description Framework
• Persistent URL
II. What does the use of metadata mean for information management?
Introduction to Informatics - Fall 02

Digital object identifiers
http://www.doi.org/
• This is an initiative from international book and journal publishers, managed by the DOI Foundation
• It is a new identification system to be used for all digital content
• The DOI system provides a unique identification for that content, protecting intellectual property
• Close to 200 companies are using about 4 million DOIs
• The system provides a way to link users of materials to the rights holders to facilitate automated commerce on the net

• The DOI is a system for interoperably identifying and exchanging intellectual property
• It is an extensible framework for managing intellectual content in any form at any level of granularity
• It allows automated copyright management for all types of media
• It is used to record and return content and present links to multiple related materials
  • For example: articles, books, images, bibliographies, supporting data, videos, charts, tables, and audio and electronic files
• It is a persistent means to authenticate content, ensuring that what the customer is requesting is what is being sent

• The DOI is analogous to the Universal Product Code (UPC) bar code
  • It identifies the product and can be integrated into inventory control and reporting systems within the supply chain
• The DOI is NOT equivalent to a Uniform Resource Locator (URL)
  • The URL points to the location (or instance) of content
  • A DOI permanently names the content
  • It facilitates retrieving multiple instances of the content, or of other associated information, without regard for location

The DOI System has three parts: the identifier, the directory, and the database
Enumeration: the identifier has two components
• Prefix: assigned to the publisher by the Directory Manager
  • At this phase of the Prototype, the prefixes all begin with 10 to designate the Directory Manager making the assignment of the prefix
  • This is followed by a number designating the publisher who will be depositing the individual DOIs
  • Publishers may choose to request a prefix for each imprint or product line, or may use a single prefix

• Suffix: the second element, following a slash mark
  • This is the designation assigned by the publisher to the specific content being identified
  • Many publishers use recognized international standards for their suffixes
  • If they do so, they are encouraged to indicate the standard being used by preceding it with a code
  • The suffix can follow any system of the publisher's choosing, and be assigned to objects of any size (book, article, abstract, chart) or any file type (text, audio, video, image, or software)

• An object (book) may have one DOI, and a component within that object (chapter) may have another DOI
• The publisher decides the level of identification based on the nature of objects sold and distributed over the Internet
• The suffix can be as simple as a sequential number or a publisher's own internal numbering system
Example: 10.1002/[ISBN]0-471-58064-3
• Prefix: 10.1002
  • 10: Directory
  • 1002: Registrant Prefix
• Suffix: [ISBN]0-471-58064-3
  • [ISBN]: Code (Optional)
  • 0-471-58064-3: Item #

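The prefix/suffix layout can be illustrated with a small parser. This is only a sketch under stated assumptions: the names `Doi` and `parse_doi` are hypothetical, and the bracketed form of the optional standard code (`[ISBN]`) is inferred from the single example on the slide rather than from the DOI specification.

```python
import re
from typing import NamedTuple, Optional


class Doi(NamedTuple):
    directory: str         # directory code, "10" in the slide example
    registrant: str        # number designating the publisher
    scheme: Optional[str]  # optional code naming the standard used in the suffix
    item: str              # publisher-assigned designation of the content


def parse_doi(doi: str) -> Doi:
    """Split a DOI into the parts named on the slide (illustrative only)."""
    # The prefix ends at the first slash; the suffix may itself contain slashes.
    prefix, slash, suffix = doi.partition("/")
    if not slash or not suffix:
        raise ValueError(f"no suffix after the slash in {doi!r}")
    directory, dot, registrant = prefix.partition(".")
    if not dot or not registrant:
        raise ValueError(f"prefix {prefix!r} lacks a registrant number")
    # Assumption: an optional standard code appears in square brackets.
    match = re.match(r"\[([^\]]+)\](.+)", suffix)
    if match:
        return Doi(directory, registrant, match.group(1), match.group(2))
    return Doi(directory, registrant, None, suffix)


print(parse_doi("10.1002/[ISBN]0-471-58064-3"))
```

Because the identifier is a "dumb number", the parser recovers only the structure of the string; anything about the object itself still has to come from metadata.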
• The DOI is a unique "dumb number" assigned to an entity
• Metadata is needed for information to be determined from the enumeration
  • This metadata may change (e.g., when ownership changes), but the DOI remains persistent for the life of the object
• The DOI mandates a minimum level of public, structured, associated metadata to describe an entity:
  • DOI
  • Industry or proprietary identification code or number
  • Title, agent and role: publisher, producer, author
  • Type: digital file, physical object, abstract, performance
  • Mode: text, audio, visual, audiovisual

The Directory: the DOI system acts as a routing system
• Digital content may change ownership or location, so the DOI system uses a central directory
• When a user clicks on a DOI, a message is sent to the central directory, where the current web address associated with that DOI is looked up
• This location is sent back to the user's browser with a message telling it to "go to this particular net address."
• Within a second the user sees a "response screen" (a Web page) on which the publisher offers the reader either the content itself, or further information about the object and how to obtain it

• If the object moves to a new server or the copyright holder sells it, one change is made in the directory and users are sent to the new site
• The DOI is reliable and accurate because the link to the content or associated information is easily managed
The database
• Information about the object, beyond simply the response screen, is maintained by the publisher
• It might include the content, or information about where and how to obtain the content, or other related data
• The information that the user has access to in response to a DOI query is the third component of the DOI system

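The routing role of the directory can be sketched as a lookup table. This is a toy in-memory model, not the actual DOI resolution service: the class `DoiDirectory`, its method names, and the `example.org` / `example.com` addresses are all hypothetical.

```python
class DoiDirectory:
    """Toy central directory mapping each DOI to its current web address."""

    def __init__(self):
        self._locations = {}

    def register(self, doi, url):
        # The publisher deposits a DOI together with its response-screen URL.
        self._locations[doi] = url

    def move(self, doi, new_url):
        # One change in the directory is enough when content moves or is sold.
        if doi not in self._locations:
            raise KeyError(doi)
        self._locations[doi] = new_url

    def resolve(self, doi):
        # Return the address the user's browser is told to go to.
        return self._locations[doi]


directory = DoiDirectory()
doi = "10.1002/[ISBN]0-471-58064-3"
directory.register(doi, "http://publisher-a.example.org/books/58064")
print(directory.resolve(doi))

# The content changes hands; the DOI that users hold stays the same.
directory.move(doi, "http://publisher-b.example.com/catalog/58064")
print(directory.resolve(doi))
```

The point of the design is that links held by users never need to change; only the single directory entry does.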
The DOI conforms to, and takes advantage of, relevant international standards
• The syntax of the DOI is being proposed as NISO standard Z39.84
• DOI metadata will be expressed in RDF using XML
• DOI conforms to the syntax for URNs (IETF)
• DOI works with Interoperability of Data in ECommerce Systems (INDECS)
  • INDECS uses the current major initiatives of structured metadata (including Dublin Core) to define a common metadata model for ecommerce
• The DOI Foundation is a member of the W3C and works with standardization activities from ISO, WIPO, and others

Examples of DOI Usage
• An article reference found on the net is linked to an abstract and information about the availability of full text
• A reader of one article is linked to related material, including similar articles or books
• A reader using DOIs sees the full text of an article and the table of contents of the journal in which the article appeared
  • She could subscribe to the journal, purchase a book, or order the content for later delivery
• A user is able to use the DOI to automatically contact a help service, or download the current driver for a software product

RDF: Resource Description Framework
• 2/99 W3C (World Wide Web Consortium) initiative
• W3C's RDF provides a generic metadata architecture
• It is a standard that supports the definition of metadata across the web
• It describes how metadata for content is defined in web documents
  • This metadata is descriptive information about the structure and content of information in a document
• RDF is useful for describing information about indexing, navigating and searching a site, as well as push channel definitions and digital signatures
http://www.w3c.org/RDF/

• RDF is the instantiation of the Warwick Framework for the Web
• The basic RDF data model consists of three object types:
  • Resource: anything that can be specified by a URI, such as a web page, an entire web site, or a specific newsgroup message
  • Properties: characteristics or attributes of a resource, along with some notion of meaning, valid values, etc.
  • Statements: resource + named property + value of property
• A statement is expressed as a "tuple" {subject predicate object}

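The three object types map naturally onto a three-field record. The sketch below is an illustration rather than an RDF library: `Statement` and `values_of` are hypothetical names, and the `example.org` page and its property values are invented for the example.

```python
from typing import List, NamedTuple


class Statement(NamedTuple):
    subject: str    # the resource, identified by a URI
    predicate: str  # the named property
    obj: str        # the value of that property


def values_of(statements: List[Statement], subject: str, predicate: str) -> List[str]:
    """Return every value asserted for one property of one resource."""
    return [s.obj for s in statements
            if s.subject == subject and s.predicate == predicate]


page = "http://example.org/index.html"  # hypothetical resource
statements = [
    Statement(page, "Creator", "A. Author"),
    Statement(page, "Title", "An Example Page"),
    Statement(page, "Creator", "B. Coauthor"),
]

print(values_of(statements, page, "Creator"))
```

Keeping each statement as an independent tuple is what lets metadata from different sources about the same resource be pooled and queried together.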
It will be the foundation for an architecture for metadata on the Web
• Resource description
• Electronic commerce
• Site mapping
• Third party rating
• Digital signatures
• Search engine data collection (web crawling)
• Digital library collections
• Distributed authoring

Using XML, it might look like this:
<RDF xmlns:DC="http://purl.org/DC">
  <Description about="http://www.w3.org/folio.html">
    <DC:Title>The W3C Folio 1999</DC:Title>
    <DC:Creator>W3C Communications Team</DC:Creator>
    <DC:Date>1999-03-10</DC:Date>
    <DC:Subject>Web development, World Wide Web Consortium, Interoperability of the Web</DC:Subject>
  </Description>
</RDF>

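A record like the one on this slide can be read back into {subject predicate object} statements with Python's standard `xml.etree.ElementTree` module. This is a minimal sketch that handles only the simplified syntax shown on the slide, not full RDF/XML; the function name `statements_from_rdf` is hypothetical, and the record is shortened to three properties.

```python
import xml.etree.ElementTree as ET

RDF_XML = """<RDF xmlns:DC="http://purl.org/DC">
  <Description about="http://www.w3.org/folio.html">
    <DC:Title>The W3C Folio 1999</DC:Title>
    <DC:Creator>W3C Communications Team</DC:Creator>
    <DC:Date>1999-03-10</DC:Date>
  </Description>
</RDF>"""


def statements_from_rdf(xml_text):
    """Return (subject, predicate, object) tuples, one per property element."""
    root = ET.fromstring(xml_text)
    triples = []
    for description in root.iter("Description"):
        subject = description.get("about")  # the resource being described
        for prop in description:
            # ElementTree expands the DC: prefix to "{namespace}LocalName".
            triples.append((subject, prop.tag, (prop.text or "").strip()))
    return triples


for triple in statements_from_rdf(RDF_XML):
    print(triple)
```

Each child element of `Description` becomes one statement whose subject is the `about` URI, which is the tuple form of the data model.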
OCLC's Persistent URL (PURL) project
http://www.oclc.org/purl
Functionally, a PURL is a URL with three parts
• Protocol: this is used to access the PURL resolver
  • This protocol may differ from that used to access the resource associated with the PURL
• Resolver address: the IP address or domain name of the PURL resolver
  • This portion of the PURL is resolved by the Domain Name System (DNS)
• Name: user-assigned name
  • Note: this may differ from the name of the resource in the associated URL

• Instead of pointing directly to the location of an Internet resource, a PURL points to an intermediate resolution service
• The PURL resolution service associates the PURL with the actual URL and returns the URL to the client
• The client can then complete the URL transaction in the normal fashion
• The advantage of PURLs is that they persist over time no matter where the page moves
http://purl.oclc.org/your.address/yourfile.html
(protocol: http; resolver address: purl.oclc.org; filename: your.address/yourfile.html)

The model works something like this:
[Diagram: the CLIENT sends a PURL to the PURL SERVER, which returns a URL; the client then uses that URL to reach the RESOURCE]

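The three-part structure and the resolution step can be sketched with the standard `urllib.parse` module. This is a toy model rather than the OCLC resolver software: `split_purl`, `PurlResolver`, and the `example.edu` target addresses are hypothetical.

```python
from urllib.parse import urlsplit


def split_purl(purl):
    """Return (protocol, resolver address, name) for a PURL."""
    parts = urlsplit(purl)
    return parts.scheme, parts.netloc, parts.path.lstrip("/")


class PurlResolver:
    """Toy resolver that associates each user-assigned name with a URL."""

    def __init__(self):
        self._table = {}

    def assign(self, name, url):
        # Re-assigning a name is how the maintainer records that a page moved.
        self._table[name] = url

    def resolve(self, purl):
        # The resolver returns the actual URL; the client then fetches it.
        _, _, name = split_purl(purl)
        return self._table[name]


purl = "http://purl.oclc.org/your.address/yourfile.html"
print(split_purl(purl))

resolver = PurlResolver()
resolver.assign("your.address/yourfile.html", "http://www.example.edu/old/yourfile.html")
resolver.assign("your.address/yourfile.html", "http://www.example.edu/new/yourfile.html")
print(resolver.resolve(purl))
```

The client keeps using the same PURL before and after the move; only the resolver's table entry changes.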
I. What other metadata schemes are available?
• Digital Object Identifier
• Resource Description Framework
• Persistent URL
II. What does the use of metadata mean for information management?

II. What does the use of metadata mean for information management?
There are social and technical issues in the use of metadata
• Metadata use requires collaboration because there is little benefit if authors simply add whatever metadata they like
• People have to agree on the metadata schemes to use (purposes and values of the scheme)
• To reach agreement, people must be willing to abandon old procedures and adopt new methods
• It is critical to address the social aspects of metadata early and often

Issues:
• Implementing new ways of storing and retrieving information requires the cooperation of various stakeholders
  • Education is necessary for those who may never see a metadata record up close and personal
• It requires attention to staffing and work flow
  • It is difficult to find people with the specialized skills to evaluate, implement, and maintain systems that exploit metadata
  • Administrators must understand that use of metadata will require commitment of time and resources for staff training and education

Convincing creators of digital information to use metadata
• For an organization to use metadata, the creators of digital documents must embed it in the document
  • This is not trivial because adding metadata is an investment of time and effort
• Librarians can help creators work with metadata but should not take responsibility for putting it in documents and files
  • This is because metadata is embedded in the work itself and will rarely if ever be directly controlled by the librarian

Convincing people to understand metadata and tools which exploit it
• Metadata is a tool, not a solution
  • We must understand metadata and possess certain skills in order to make it useful for others
• Introducing entirely new systems for information access may undermine the goal of providing integrated access to the organization's information assets
  • It may require that people who handle the information learn new skills
• For these reasons and others, some may feel that it is not worthwhile to work with metadata

Who will be responsible for creating and maintaining metadata?
• Publisher side
  • Author
  • Webmaster
  • Institution
• Service side
  • Search service
  • Third party creators
How will it be done?
• Automatically generated
• Hand crafted

Technical Issues
Compatibility with present access mechanisms and data
• One issue is to retain compatibility with existing access mechanisms
  • For example, the catalog is the primary access point to the vast majority of library resources
  • MARC has become the accepted standard for exchange of library information and has influenced storage and display of information
• It is not wise to become dependent on a technology that is incompatible with MARC
  • There should be a compelling case that a technology will become dominant or that migration will be possible

There are no widely accepted metadata standards yet
• Some efforts have attracted interest, but use is infrequent and inconsistent
• Librarians are interested in the Dublin Core because its elements transfer relatively easily to MARC
  • But the DC does not do well with resources that don't behave like paper documents
  • Also, the DC defines only a minimal set of elements, and further development seems to have stopped
• There is no guarantee that metadata generated today will be useful for providing access to documents in the future
Banerjee, K. (1999). Practical Applications of Metadata at Oregon State University. http://ucs.orst.edu/~banerjek/papers/ola1999.html

Libraries Working Group Charter:
• Foster increased interoperability between DC and library metadata by identifying issues and solutions
• Keep the library community informed on DC developments
• Consider reasons to experiment more widely with Dublin Core in libraries
• Build a library Implementors community
• Explore the need for a cross domain namespace(s) to register non-DC elements and qualifiers needed by the library community
http://purl.oclc.org/dc/groups/libraries.htm

What will information professionals have to learn?
• The range of applicable metadata schemes, their strengths and weaknesses
• How to apply appropriate schemes to digital information
• The ways in which various metadata schemes facilitate resource discovery
• How they affect the administration of digital information, information security, documentation, data mining…
• The relationship between standards and metadata
• How to test and evaluate various metadata schemes

A challenge for information professionals is to work with different metadata schemes
• This involves developing metadata "crosswalks"
  • Fluid capability to work with the same data in different metadata structures
  • Requires agreement on semantics
  • Requires standardized mappings for interoperability
• It will involve working with metadata in two forms
  • Embedded with data
  • Independent of items

Here is an example of a metadata crosswalk from Dublin Core to MARC
• The conversion of a DC-style record involves
  • Skeletal record for enhancement
  • Incorporating the DC record into a MARC database
• Subject and Keywords
  • Simple: 653$a (Index term--Uncontrolled)
  • Complex:
    • If scheme=LCSH: 650$a
    • If scheme=LCC: 050$a
    • If scheme=DDC: 082$a
    • If scheme=(other): 650$a with $2 (code)
• This enables communication of the DC record in MARC

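The Subject and Keywords rules above amount to a small lookup. The sketch below encodes only the mapping listed on this slide; the function name `subject_to_marc`, the `(tag, subfields)` return shape, and the use of the scheme name itself as the $2 code are assumptions made for illustration.

```python
# MARC fields for the qualified schemes named on the slide.
SCHEME_TO_FIELD = {"LCSH": "650", "LCC": "050", "DDC": "082"}


def subject_to_marc(value, scheme=None):
    """Map one DC Subject value to a MARC tag and a list of (subfield, value)."""
    if scheme is None:
        # Simple (unqualified) subject: uncontrolled index term.
        return "653", [("a", value)]
    tag = SCHEME_TO_FIELD.get(scheme)
    if tag is not None:
        return tag, [("a", value)]
    # Any other scheme: 650 $a plus $2 carrying a code for the scheme.
    return "650", [("a", value), ("2", scheme)]


print(subject_to_marc("metadata"))
print(subject_to_marc("Metadata", scheme="LCSH"))
print(subject_to_marc("025.3", scheme="DDC"))
print(subject_to_marc("Metadata", scheme="LOCAL"))  # "LOCAL" is a made-up scheme
```

A full crosswalk would repeat this pattern for every DC element, which is why agreement on semantics and standardized mappings matter.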
Here are some other examples of crosswalks:
• DC/MARC/GILS Crosswalk http://www.loc.gov/marc/dccross.html
• MARC/FGDC http://alexandria.sdc.ucsb.edu/public-documents/metadata/fgdc2marc.html
• GILS/MARC http://www.usgs.gov/gils/prof_v2.html#annex_b
• Also:
  • Dublin Core to FGDC
  • MARC to SGML
Crosswalks allow resource discovery across syntaxes

Examples of metadata schemes
• Text Encoding Initiative (TEI) http://www.uic.edu/orgs/tei/
• Global Information Locator Service (GILS) http://www.gils.net/index.html
• Computer Interchange of Museum Information (CIMI) http://www.cimi.org/
• Encoded Archival Description (EAD) http://lcweb.loc.gov/ead/
• Content Standards for Digital Geospatial Metadata (CSDGM) http://www.fgdc.gov/metadata/contstan.html

• Nordic Metadata Project http://linnea.helsinki.fi/meta/
• HotOil: Distributed Searching over Heterogeneous Information Sources http://www.dstc.edu.au/Research/Projects/hotoil/
• National Biological Information Infrastructure (NBII) http://www.nbii.gov/index.html
• Categories for the Description of Works of Art (CDWA) http://www.ukoln.ac.uk/metadata/desire/overview/rev_03.htm
• Interoperability of Data in ECommerce Systems (INDECS) http://www.indecs.org/