  • Number of slides: 22

Digital Collections: Storage and Access
Jon Dunn
Assistant Director for Technology
IU Digital Library Program
jwd@indiana.edu

Storage
  • Why is storage an issue?
      • Space requirements
      • Persistence
      • Accessibility
  • Needs depend on purpose of storage
      • Capture/encoding
      • Access/delivery
      • Preservation
October 2, 2003 ALI Digital Library Workshop

Storage: Working Space
  • Space for storage of digital files during capture/encoding/quality control process
  • Possibilities
      • PC hard drive
      • File server / LAN
  • Issues
      • Capacity, backup, speed, accessibility

Storage: Access/Delivery
  • Storage of derivative files for web delivery
      • Image, audio, video, text files, etc.
  • Possibilities
      • Local web server
      • Commercially-hosted web site
      • Consortial service provider
  • Issues: capacity, backup, performance, software integration, maintenance/migration

Storage: Preservation
  • Much harder problem
  • Longer term
      • Issues of longevity of media, hardware, file format
      • “Where did we put the files?”
  • Larger files
      • Hard disk storage, traditional backup methods not cost-effective
  • Infrequency of access
      • Problems do not become immediately evident
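To make the "larger files" point concrete, here is a rough sizing sketch. The page dimensions, resolution, bit depth, and collection size are illustrative assumptions, not figures from the talk, and the estimate ignores file headers and compression.

```python
def uncompressed_image_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the size of an uncompressed raster scan, ignoring file headers."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Hypothetical example: a letter-size page scanned at 600 dpi in 24-bit color.
page_bytes = uncompressed_image_bytes(8.5, 11, 600, 24)
print(f"one page:     {page_bytes / 1e6:.1f} MB")

# Hypothetical collection of 10,000 such pages.
collection_bytes = page_bytes * 10_000
print(f"10,000 pages: {collection_bytes / 1e12:.2f} TB")
```

Under these assumptions a single master image is on the order of 100 MB, which is why preservation masters dominate storage planning compared with the much smaller delivery derivatives.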

Long-Term Storage Options
  • Removable media stored offline
      • Optical
          • CD-R (CD-Recordable)
          • DVD-R (DVD-Recordable), DVD+RW, DVD-RW, …
      • Tape
          • DLT, 8 mm, DAT, …
      • Pros: cheap, easy, produces tangible item
      • Cons: low capacity, physical space requirements, unknown longevity, migration, potential format obsolescence
  • Online/nearline storage systems
      • HSM: Hierarchical Storage Management
          • Combine disk and automated tape storage with software to keep track of where files are located
          • Locally managed or remote provider
      • Pros: high capacity, migration can be handled by software
      • Cons: expensive, complex, network bandwidth issues, must trust service provider, potential single point of failure
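The HSM idea (disk plus tape, with software tracking where each file lives) can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the class name `ToyHSM`, the least-recently-used migration policy, and the file names are all hypothetical, and real HSM products are far more elaborate.

```python
from dataclasses import dataclass, field


@dataclass
class ToyHSM:
    """Toy two-tier store: a small 'disk' tier in front of unlimited 'tape'."""
    disk_capacity: int                              # bytes available on disk
    sizes: dict = field(default_factory=dict)       # file name -> size in bytes
    tier: dict = field(default_factory=dict)        # file name -> "disk" or "tape"
    last_used: dict = field(default_factory=dict)   # file name -> logical clock
    clock: int = 0

    def _touch(self, name):
        self.clock += 1
        self.last_used[name] = self.clock

    def _disk_used(self):
        return sum(s for n, s in self.sizes.items() if self.tier[n] == "disk")

    def _make_room(self, keep):
        # Migrate least recently used files to tape until the disk tier fits.
        while self._disk_used() > self.disk_capacity:
            candidates = [n for n in self.sizes
                          if self.tier[n] == "disk" and n != keep]
            if not candidates:
                break
            victim = min(candidates, key=self.last_used.get)
            self.tier[victim] = "tape"

    def store(self, name, size):
        self.sizes[name] = size
        self.tier[name] = "disk"
        self._touch(name)
        self._make_room(keep=name)

    def read(self, name):
        """Return the tier the file was found on, recalling it to disk."""
        found_on = self.tier[name]
        self.tier[name] = "disk"
        self._touch(name)
        self._make_room(keep=name)
        return found_on


hsm = ToyHSM(disk_capacity=100)
hsm.store("a.tif", 60)
hsm.store("b.tif", 60)            # disk is over capacity, so a.tif moves to tape
first_read = hsm.read("a.tif")    # found on tape, recalled to disk
```

The caller only ever asks for a file by name; the catalog decides which tier to read from, which is the property that lets migration between media be handled by software.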

HSM Example: IU’s Massive Data Storage Service (MDSS)
  • HPSS (High Performance Storage System) software
      • Developed as collaboration of IBM and US national labs
  • Four tape robots
      • 2 in Bloomington, 2 in Indianapolis
      • Data can be mirrored
  • 540 terabytes (TB) total storage
      • ~75 TB used as of April 2001

A digital object is more than just a file!
  • Metadata
  • Delivery page image files (JPEG)
  • Hi-res page image files (TIFF)
  • Text file (TEI/XML)

A digital object is more than just a file!
  • EAD Finding Aid

DL Objects
  • Digital library “objects” have many parts
      • Metadata
      • Preservation/archival files
      • Delivery files
  • How do we keep them connected?
      • Now: Good practice in file naming, directory organization, project documentation (not scalable!)
      • Future: Digital object repository
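As an illustration of the "now" approach, the sketch below relies purely on a directory convention to tie an object's parts together and reports objects with missing parts. The layout `<object-id>/<role>/<filename>`, the role names, and the sample paths are hypothetical, not a convention described in the talk.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical convention: .../<object-id>/<role>/<filename>
ROLE_DIRS = {"metadata", "archival", "delivery"}


def group_by_object(paths):
    """Group file paths into objects using only the directory convention."""
    objects = defaultdict(lambda: defaultdict(list))
    for p in paths:
        parts = PurePosixPath(p).parts
        if len(parts) < 3 or parts[-2] not in ROLE_DIRS:
            continue  # path does not follow the convention; skip it
        object_id, role, filename = parts[-3], parts[-2], parts[-1]
        objects[object_id][role].append(filename)
    return objects


def missing_roles(objects):
    """Report objects that lack one of the expected parts."""
    report = {}
    for object_id, roles in objects.items():
        missing = sorted(ROLE_DIRS - set(roles))
        if missing:
            report[object_id] = missing
    return report


sample_paths = [
    "collection/item0001/metadata/item0001.xml",
    "collection/item0001/archival/item0001_001.tif",
    "collection/item0001/delivery/item0001_001.jpg",
    "collection/item0002/archival/item0002_001.tif",
]
objects = group_by_object(sample_paths)
problems = missing_roles(objects)
```

The weakness is visible in the code itself: the connection between parts exists only as long as every project follows the same naming rules, which is why this approach does not scale.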

Data Persistence
  • Key is migration
  • Keeping the bits alive
      • Physical media
      • Logical media format
  • Keeping the bits understandable
      • File format
      • Metadata
  • Small “pockets” of digital content pose a problem for migration
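One common way to confirm that the bits stayed alive across a media migration is to compare checksums before and after the copy. The talk does not mention checksums; the sketch below is an assumption about how such a check might look, using SHA-256 and hypothetical file and directory names.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute a SHA-256 digest by reading the file in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(src, dst):
    """Copy src to dst and verify that the copy is bit-identical."""
    before = sha256_of(src)
    shutil.copy2(src, dst)
    after = sha256_of(dst)
    if before != after:
        raise IOError(f"fixity check failed for {src}")
    return after


# Demonstration with a stand-in file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "old_media" / "page_001.tif"
    new = Path(tmp) / "new_media" / "page_001.tif"
    old.parent.mkdir()
    new.parent.mkdir()
    old.write_bytes(b"stand-in for image data")
    checksum = migrate(old, new)
    migrated_ok = new.read_bytes() == old.read_bytes()
```

A check like this covers only the first half of the slide (keeping the bits alive); keeping them understandable still depends on file format choices and metadata.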

DL Object Repository
[Diagram: users and applications connect to a repository system, which manages the preservation version in HSM, the delivery version(s) on a web server, and the metadata records.]
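A minimal sketch of the repository idea: one record per object ties together the metadata, the preservation copy, and the delivery versions, so users and applications ask the repository rather than guessing paths. The class names, fields, identifiers, and URLs below are hypothetical and do not describe any particular repository system.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    object_id: str
    metadata: dict                      # descriptive metadata record
    preservation_location: str          # where the master lives, e.g. in HSM
    delivery_urls: dict = field(default_factory=dict)  # version name -> URL


class Repository:
    """Single place that knows where every part of every object is stored."""

    def __init__(self):
        self._objects = {}

    def add(self, obj):
        self._objects[obj.object_id] = obj

    def delivery_url(self, object_id, version="default"):
        return self._objects[object_id].delivery_urls[version]

    def find(self, **criteria):
        """Return ids of objects whose metadata matches all given fields."""
        return [o.object_id for o in self._objects.values()
                if all(o.metadata.get(k) == v for k, v in criteria.items())]


repo = Repository()
repo.add(DigitalObject(
    object_id="item0001",
    metadata={"title": "Sample sheet music", "creator": "Unknown"},
    preservation_location="hsm://archive/item0001/master.tif",   # hypothetical
    delivery_urls={"default": "http://example.org/item0001.jpg"},
))
matches = repo.find(creator="Unknown")
url = repo.delivery_url("item0001")
```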

Web Delivery Functions
  • Searching
      • Metadata
      • Full text
  • Browsing
      • By subject, date, author, …
  • Navigation
      • Page turning, image panning/zooming, …
  • Streaming
      • For audio/video
  • Reuse
      • Downloading, format conversion
      • Linking, persistent naming
  • Access control
      • If necessary

Digital Collection Delivery Software
  • Very complex systems
  • Need to integrate data from databases, full-text search engines, file systems, and other sources
  • Cross-collection searching
  • Commercial
      • ContentDM, Luna Insight, various library management system add-ons
  • Open source
      • UMich DLXS, Greenstone, Eprints, MIT DSpace, …
  • Homegrown

Demonstration
  • Hoagy Carmichael Collection, IU Digital Library Program
  • http://www.dlib.indiana.edu/collections/hoagy/

Exposing Digital Resources Broadly
  • Pay services
      • RLG Cultural Materials, Archival Resources
  • Free services
      • University of Michigan OAIster
          • www.oaister.org
      • UIUC Digital Gateway to Cultural Heritage Materials
          • oai.grainger.uiuc.edu
  • OAI-PMH
      • Open Archives Initiative Protocol for Metadata Harvesting
      • www.openarchives.org
  • Google

OAI Metadata Harvesting
  • Extract metadata from various sources
  • Build services on local copies of metadata
[Diagram: a user search for “Indiana” goes to the service provider, which holds a local copy of metadata harvested offline from the data providers; all searching, browsing, etc. is performed on that local copy.]
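The harvesting model can be sketched in a few lines: a service provider issues an OAI-PMH `ListRecords` request to a data provider, stores the returned Dublin Core records locally, and answers user searches from that local copy. The base URL, identifiers, and sample records below are made up for illustration; the sketch parses an inline sample response instead of contacting a server, and it does not handle `resumptionToken` paging.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for a data provider."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# Hypothetical response from a data provider (trimmed to the essentials).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Indiana postcards</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Ohio river maps</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def harvest(xml_text):
    """Parse a ListRecords response into a local index: identifier -> titles."""
    root = ET.fromstring(xml_text)
    index = {}
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        titles = [t.text or "" for t in record.iterfind(".//dc:title", NS)]
        index[identifier] = titles
    return index


def search(index, term):
    """Search the local copy of the metadata; no data provider is contacted."""
    term = term.lower()
    return sorted(i for i, titles in index.items()
                  if any(term in t.lower() for t in titles))


request_url = list_records_url("http://example.org/oai")  # hypothetical endpoint
local_index = harvest(SAMPLE_RESPONSE)
hits = search(local_index, "Indiana")
```

Because searching runs against the harvested copy, results are only as fresh as the last harvest, which is the trade-off implied by "harvested offline."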

More Information
  • Bibliography to be made available at:
      • http://www.dlib.indiana.edu/workshops/alioct03/