- Number of slides: 22
Digital Collections: Storage and Access
Jon Dunn
Assistant Director for Technology
IU Digital Library Program
jwd@indiana.edu

Storage
- Why is storage an issue?
  - Space requirements
  - Persistence
  - Accessibility
- Needs depend on purpose of storage
  - Capture/encoding
  - Access/delivery
  - Preservation
October 2, 2003 ALI Digital Library Workshop

Storage: Working Space
- Space for storage of digital files during the capture/encoding/quality control process
- Possibilities
  - PC hard drive
  - File server / LAN
- Issues
  - Capacity, backup, speed, accessibility

Storage: Access/Delivery
- Storage of derivative files for web delivery
  - Image, audio, video, text files, etc.
- Possibilities
  - Local web server
  - Commercially hosted web site
  - Consortial service provider
- Issues: capacity, backup, performance, software integration, maintenance/migration

Storage: Preservation
- Much harder problem
- Longer term
  - Issues of longevity of media, hardware, file format
  - “Where did we put the files?”
- Larger files
  - Hard disk storage, traditional backup methods not cost-effective
- Infrequency of access
  - Problems do not become immediately evident

Long-Term Storage Options
- Removable media stored offline
  - Optical
    - CD-R (CD-Recordable)
    - DVD-R (DVD-Recordable), DVD+RW, DVD-RW, …
  - Tape
    - DLT, 8 mm, DAT, …
  - Pros: cheap, easy, produces tangible item
  - Cons: low capacity, physical space requirements, unknown longevity, migration, potential format obsolescence
- Online/nearline storage systems
  - HSM: Hierarchical Storage Management
    - Combine disk and automated tape storage with software to keep track of where files are located
    - Locally managed or remote provider
  - Pros: high capacity, migration can be handled by software
  - Cons: expensive, complex, network bandwidth issues, must trust service provider, potential single point of failure

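The HSM idea of combining disk and tape with software that tracks where each file lives can be illustrated with a toy catalog. This is only a sketch under stated assumptions: the class name `HsmCatalog`, the least-recently-used migration rule, and the file names are hypothetical and do not describe HPSS or any real HSM product.

```python
from dataclasses import dataclass, field


@dataclass
class HsmCatalog:
    """Toy catalog recording which tier holds each file (hypothetical model)."""
    disk_capacity: int                                # bytes available on the disk tier
    sizes: dict = field(default_factory=dict)         # file name -> size in bytes
    tier: dict = field(default_factory=dict)          # file name -> "disk" or "tape"
    last_access: dict = field(default_factory=dict)   # file name -> logical clock tick
    clock: int = 0

    def _disk_used(self):
        return sum(self.sizes[n] for n, t in self.tier.items() if t == "disk")

    def _touch(self, name):
        self.clock += 1
        self.last_access[name] = self.clock

    def _make_room(self, keep):
        # Migrate least recently used disk files to tape until the disk tier fits.
        while self._disk_used() > self.disk_capacity:
            candidates = [n for n, t in self.tier.items() if t == "disk" and n != keep]
            if not candidates:
                break
            victim = min(candidates, key=self.last_access.get)
            self.tier[victim] = "tape"

    def store(self, name, size):
        # New files land on disk; older files may be pushed to tape.
        self.sizes[name] = size
        self.tier[name] = "disk"
        self._touch(name)
        self._make_room(keep=name)

    def read(self, name):
        # A tape-resident file is staged back to disk before it is read.
        if self.tier[name] == "tape":
            self.tier[name] = "disk"
        self._touch(name)
        self._make_room(keep=name)
        return self.tier[name]


catalog = HsmCatalog(disk_capacity=100)
catalog.store("a.tif", 60)
catalog.store("b.tif", 60)   # disk is now over capacity, so a.tif moves to tape
print(catalog.tier)
```

The user only ever asks for a file by name; the catalog decides whether it is served from disk or first recalled from tape, which is why migration "can be handled by software."
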
HSM Example: IU’s Massive Data Storage Service (MDSS)
- HPSS (High Performance Storage System) software
  - Developed as collaboration of IBM and US national labs
- Four tape robots
  - 2 in Bloomington, 2 in Indianapolis
  - Data can be mirrored
- 540 terabytes (TB) total storage
  - ~75 TB used as of April 2001

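For a sense of scale, the two capacity figures on this slide can be turned into a utilization estimate. The snippet is a quick calculation only; the 75 TB figure is approximate in the source, so the result is approximate too.

```python
total_tb = 540   # total MDSS capacity stated on the slide
used_tb = 75     # approximate usage as of April 2001

free_tb = total_tb - used_tb
utilization = used_tb / total_tb

print(f"Used: {utilization:.1%} of capacity, about {free_tb} TB free")
```
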
A digital object is more than just a file!
- Metadata
- Delivery page image files (JPEG)
- Hi-res page image files (TIFF)
- Text file (TEI/XML)

A digital object is more than just a file!
- EAD Finding Aid

DL Objects
- Digital library “objects” have many parts
  - Metadata
  - Preservation/archival files
  - Delivery files
- How do we keep them connected?
  - Now: Good practice in file naming, directory organization, project documentation (not scalable!)
  - Future: Digital object repository

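As an illustration of the "now" approach, the sketch below groups files into objects purely from their names. The naming convention (`<object id>_<sequence>.<extension>`), the suffix-to-role mapping, and the sample paths are assumptions made for the example, not a convention stated in the slides.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to the part each file plays.
ROLE_BY_SUFFIX = {".tif": "preservation", ".jpg": "delivery", ".xml": "text"}


def group_by_object(paths):
    """Group file paths by the object id encoded at the start of each file name."""
    objects = defaultdict(lambda: defaultdict(list))
    for p in paths:
        path = PurePosixPath(p)
        object_id = path.stem.split("_")[0]          # "book01_0001" -> "book01"
        role = ROLE_BY_SUFFIX.get(path.suffix.lower(), "other")
        objects[object_id][role].append(p)
    return objects


files = [
    "book01/tiff/book01_0001.tif",
    "book01/jpeg/book01_0001.jpg",
    "book01/book01.xml",
    "book02/tiff/book02_0001.tif",
]
for object_id, parts in group_by_object(files).items():
    print(object_id, dict(parts))
```

The grouping works only as long as every file follows the convention, which is one way to see why this approach does not scale across many projects.
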
Data Persistence
- Key is migration
- Keeping the bits alive
  - Physical media
  - Logical media format
- Keeping the bits understandable
  - File format
  - Metadata
- Small “pockets” of digital content pose a problem for migration

DL Object Repository
[Diagram: users and applications connect through the repository system to:]
- Preservation version in HSM
- Delivery version(s) on web server
- Metadata records

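One way to picture the repository in this diagram is as a single record per object that points at each part, wherever it is stored. The sketch below is a minimal illustration; the class names, field names, identifier, path, and URL are all hypothetical and do not describe any particular repository system.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    object_id: str
    metadata: dict                                      # descriptive metadata record
    preservation_path: str                              # location in HSM (hypothetical path)
    delivery_urls: list = field(default_factory=list)   # derivatives on the web server


class Repository:
    """Keeps the parts of each object connected under one identifier."""

    def __init__(self):
        self._objects = {}

    def add(self, obj):
        self._objects[obj.object_id] = obj

    def deliver(self, object_id):
        # Users and applications see metadata plus delivery files only.
        obj = self._objects[object_id]
        return {"metadata": obj.metadata, "files": list(obj.delivery_urls)}

    def preservation_location(self, object_id):
        # The preservation master stays in long-term storage.
        return self._objects[object_id].preservation_path


repo = Repository()
repo.add(DigitalObject(
    object_id="example-0001",
    metadata={"title": "Sample sheet music"},
    preservation_path="/hsm/masters/example-0001.tif",
    delivery_urls=["http://example.org/images/example-0001.jpg"],
))
print(repo.deliver("example-0001"))
```
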
Web Delivery Functions
- Searching
  - Metadata
  - Full text
- Browsing
  - By subject, date, author, …
- Navigation
  - Page turning, image panning/zooming, …
- Streaming
  - For audio/video
- Reuse
  - Downloading, format conversion
  - Linking, persistent naming
- Access control
  - If necessary

Digital Collection Delivery Software
- Very complex systems
- Need to integrate data from databases, full-text search engines, file systems, and other sources
- Cross-collection searching
- Commercial
  - ContentDM, Luna Insight, various library management system add-ons
- Open source
  - UMich DLXS, Greenstone, Eprints, MIT DSpace, …
- Homegrown

Demonstration
- Hoagy Carmichael Collection, IU Digital Library Program
- http://www.dlib.indiana.edu/collections/hoagy/

Exposing Digital Resources Broadly
- Pay services
  - RLG Cultural Materials, Archival Resources
- Free services
  - University of Michigan OAIster
    - www.oaister.org
  - UIUC Digital Gateway to Cultural Heritage Materials
    - oai.grainger.uiuc.edu
  - OAI-PMH
    - Open Archives Initiative Protocol for Metadata Harvesting
    - www.openarchives.org
  - Google

OAI Metadata Harvesting
- Extract metadata from various sources
- Build services on local copies of metadata
[Diagram: a user searches for “Indiana” at the service provider. All searching, browsing, etc. is performed there on a local copy of the metadata, which is harvested offline from the data providers.]

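The harvesting flow can be sketched in a few lines: build a `ListRecords` request, parse the returned records into a local copy, and search that copy. The `verb`, `metadataPrefix`, and `resumptionToken` parameters and the XML namespaces come from the OAI-PMH specification; the base URL, the sample response, and the function names are made up for illustration, and the sketch parses a canned response rather than contacting a real data provider.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical response from a data provider, trimmed to the essentials.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Indiana postcards</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, resumption_token=None):
    """Build a ListRecords request; a resumption token replaces metadataPrefix."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract identifiers and Dublin Core titles, plus any resumption token."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        titles = [t.text for t in rec.iterfind(".//dc:title", NS)]
        records.append({"identifier": identifier, "titles": titles})
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)


def search(records, term):
    """Search the locally held metadata, not the data providers themselves."""
    term = term.lower()
    return [r for r in records if any(term in (t or "").lower() for t in r["titles"])]


print(list_records_url("http://example.org/oai"))
local_copy, next_token = parse_records(SAMPLE)
print(search(local_copy, "Indiana"))
```

A real service provider would fetch each URL over HTTP, keep requesting with the returned resumption token until none is given, and store the records in a database or search index.
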
More Information
- Bibliography to be made available at:
  - http://www.dlib.indiana.edu/workshops/alioct03/


