
  • Number of slides: 50

Securing your digital heritage: Practical tips and solutions for smaller archives
Joanne Anthony, University of London Computer Centre
CC: Working Together Teamwork Puzzle Concept, by lumaxart, Flickr, http://www.flickr.com/photos/lumaxart/2137737248/

USAGE RIGHTS: The contents of this PowerPoint presentation are provided under the following open source licence: http://creativecommons.org/licenses/by-sa/3.0/
In summary, you are free:
  • to share: to copy, distribute and transmit the work
  • to remix: to adapt the work
Under the following conditions:
  • Attribution. You must attribute the work by referring to ULCC http://www.ulcc.ac.uk/ (but not in any way that suggests that we endorse you or your use of the work).
  • Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license.

Coverage
  • Smaller archives: sustainability challenges
  • Digital Preservation: Why it matters
  • Salvaging your website: some options
  • [Digital Preservation] What you can do: some practical solutions
  • The Future?
  • Useful resources for small archives

Smaller archives: sustainability challenges
  • Lack of resources, e.g. technical, fundraising and bid-writing skills; small time-specific or project-based budgets; skills vacuum with staff turnover
  • Limited technological infrastructure and technical expertise to implement tools/software
  • Not always linked to or supported by a broader policy/mandate; varying or non-existent levels of commitment by an institutional partner
  • Project-based funding: difficult to integrate digital preservation as mainstream
  • Limited resources and plans to actively curate digital assets over the long term
  • Audience sustainability: “After the launch”
  • Access: online archive portals need resources to undertake updates and remain ‘linked up’
  • Organisations themselves: already stretched resources in addition to core operations

Source: Bernie Grant Archives website: Home Page: http://www.berniegrantarchive.org.uk/

Source: Bernie Grant Archives website: http://www.berniegrantarchive.org.uk/archive/showcase.asp, Audio clips within ‘The Archive; Showcase’.

Source: Bernie Grant Archives website: http://www.berniegrantarchive.org.uk/archive/showcase.asp, Video clips within ‘The Archive; Showcase’.

Digital Preservation: Why bother?
CC: Computer Says No_; by Benjibot: http://www.flickr.com/photos/benjibot/3141128891/sizes/m/

Digital Preservation: Why it matters
  • Increasing dependence on digital materials
  • They won’t take care of themselves…
  • Increasing risk of loss: rapid loss of cultural/corporate/community memories
  • Timeframe to salvage is short: most digital objects survive less than 5 years!
  • Archives, regardless of format, reveal what a society chooses to remember, and what it chooses to forget!
  • Community practitioners can make a vital contribution to informing and shaping archival practice (including digital preservation) in this digital age.

Web Archiving Options: What are they?
Before you start: ACKNOWLEDGE you own this web resource and that something needs to be done; it won’t take care of itself.
Options:
  • Do it yourself
  • Look for outside support

Do it yourself: … it’s within your reach!
  • It’s technically possible: web harvesting tools exist that are free and open source, giving more flexibility and control
  • Salvage the website’s core content/format types, not functionality
  • Save underlying databases in the non-proprietary CSV format
  • Keep supporting contextual documentation, e.g. web specifications/contracts, database design documents etc.
  • Take screenshots of the web interface to show the original ‘look and feel’
  • Save flat HTML files, i.e. keep the context of links between pages
  • Save image/audio/video files separately in preservation formats
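The advice to save underlying databases as CSV can be sketched in a few lines. This is only an illustration under stated assumptions: it assumes the website's database is available as (or has been converted to) a SQLite file, and the function name `export_tables_to_csv` and the paths are hypothetical. Other database systems would need their own driver.

```python
import csv
import sqlite3
from pathlib import Path


def export_tables_to_csv(db_path, out_dir):
    """Write every user table in a SQLite database to its own CSV file.

    Hypothetical helper: assumes the site's database is a SQLite file.
    Returns the list of CSV paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    conn = sqlite3.connect(db_path)
    try:
        # List the user tables, skipping SQLite's internal ones.
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name")]
        for table in tables:
            cursor = conn.execute(f'SELECT * FROM "{table}"')
            headers = [column[0] for column in cursor.description]
            target = out_dir / f"{table}.csv"
            with open(target, "w", newline="", encoding="utf-8") as fh:
                writer = csv.writer(fh)
                writer.writerow(headers)   # column names as the first row
                writer.writerows(cursor)   # then one row per record
            written.append(target)
    finally:
        conn.close()
    return written
```

A call such as `export_tables_to_csv("site.db", "csv_export")` would leave one plain-text file per table, which is worth storing alongside the database design documents so the column names can be interpreted later.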

Look for outside support
  • Nominate your site: UK Web Archive, British Library http://www.webarchive.org.uk/ukwa/info/nominate
    See tips on “Making Your Website Crawler-Friendly”: http://www.webarchive.org.uk/ukwa/info/technical
  • Huge burden lifted when national institutions are committed to capturing and sustaining these resources
  • “Selection policies may be inadequate, reactive or too broad in selection; so you must be proactive in archiving your sites.” [NB: personal opinion, not that of ULCC]

Let other people do it?
  • Internet Archive: http://www.archive.org/ (they possibly have snapshots already)
  • Good to have another backup here; some room for flexibility over the harvesting process
  • http://www.archive-it.org/: sign up and improve harvesting
  • Snapshots may be incomplete or sporadic, and dynamic elements may be missed in a web harvest, e.g. databases, audio/video clips etc.

Practical tips for digital preservation:
  • Identify digital objects and assess risks and solutions: What have you got, and do you need to keep everything? (e.g. identify formats, survey records, selection/appraisal, keep inventory/capture metadata, link cataloguing details to objects)
  • “Lots of copies keeps stuff safe”! Different media, in different places: spread the risks, but also document where you’ve put copies vs masters, and link them to their contextual inventory/cataloguing details
  • Different preservation solutions for different resources (migration, emulation, refresh, replicate etc.). There’s more than one way!
  • Develop a preservation plan: document everything, e.g. digital capture processes/workflow, formats/standards used, budget, responsibilities; migration, test and refresh procedures etc.
  • Stick to best practice and widely accepted standards, e.g. metadata, formats etc.: more chance your collections will be used, integrated with other resources, and preserved over the long term.

What you can do: take stock!
What digital assets have you got/are about to create?
  • Electronic documents: ‘digital paper’ (including email)
  • Spreadsheets
  • Databases (e.g. collection management system, underlying database of a website, research datasets)
  • Digital audio/video/images
  • Websites
  • Web 2.0: wikis, blogs etc.
  • Exotic forms: virtual worlds, games, programs etc.

Source: Dance Heritage Coalition (U.S.): http://www.danceheritage.org/
NB: Of note, see ‘Digital Video Preservation Reformatting Project’; and see link to ‘Dance Videotapes at Risk’ for inventory guidance.

Good Digital Preservation depends on:
Taking stock of what you can and can’t control operationally; and where you need outside help:
1. Organizational Infrastructure, e.g. policies, preservation plans, institutional commitment etc.
2. Technological Infrastructure, e.g. hardware/software, storage/formats, security, workflow, procedures, archival/technical skills etc.
3. Resources Framework, e.g. staff, technology, space, storage etc.

Preservation Strategies: there’s more than one way…
Migration
  • Obsolescence is our enemy! Transfer content from one format (such as a Word document) to a different format (such as PDF), so the resource remains functional and accessible.
Refreshment
  • Copying data onto another example of the same storage medium (such as from an old CD-ROM to a new CD-ROM). “Same file, new carrier”.
Emulation
  • Replicating the functionality of an obsolete application (often because the original system is no longer available), e.g. playing vintage computer games on a contemporary games emulator. Using virtual machines/programs to make new computers behave like old ones.

[Diagram. Refreshing: a Word V2 file stays a Word V2 file on a new carrier. Migrating: a Word V2 file becomes a PDF file.]

Quick ways to reduce loss:
  • Replicate data (another preservation strategy): keep lots of copies of digital objects on different storage media (and use different brands)
  • Store any CDs etc. produced in a secure, stable, and controlled environment
  • Handle media properly
  • Ensure off-site storage of copies for security purposes
  • Store archival-quality digital images on a server, if possible
  • Store copies in various locations, using a combination of offline and online storage media
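Replication is only useful if each copy is known to match the original. The sketch below copies one master file to several locations and compares SHA-256 checksums afterwards. The function names `sha256_of` and `replicate` and the folder names are hypothetical, and the destinations are assumed to be mounted folders (for example an external drive and a network share).

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(master, destinations):
    """Copy `master` into each destination folder and verify every copy.

    Returns (checksum of the master, {copy path: True if it matches}).
    """
    master = Path(master)
    expected = sha256_of(master)
    results = {}
    for dest_dir in destinations:
        dest_dir = Path(dest_dir)
        dest_dir.mkdir(parents=True, exist_ok=True)
        copy_path = dest_dir / master.name
        shutil.copy2(master, copy_path)  # copy2 also keeps timestamps
        results[str(copy_path)] = sha256_of(copy_path) == expected
    return expected, results
```

Recording the returned checksum in the inventory gives a reference value for later checks of any copy, wherever it is stored.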

More ways…
  • Maintain and refresh data, e.g. implement regular refreshment cycles to copy onto newer media
  • Migrate formats, e.g. every 3 to 5 years, and quality-check the integrity of data after each migration
  • At the point of creation of an object, make preservation copies (assuming licensing/copyright permission, i.e. engage with rights holders of software and hardware etc.)
  • Subject media to management routines, e.g. media testing; keep an inventory of what data is held where
Warning: Backups of networks aren’t preservation, and storage on disks etc. doesn’t mean permanence (not even gold CDs!), so have more than one approach…

Don’t forget your source material!
Videodrome street theater: CC: by Jima: http://www.flickr.com/photos/jima/3711736520/

Storage and Re-use
  • As a minimum you need to create a high-quality ‘master’ from which other versions of your digital material (for example, images you might make available over the Internet) can be made. This digital master should be stored independently, e.g. offline.
  • Link digital objects to, e.g., an inventory spreadsheet or collection management database.
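An inventory spreadsheet does not have to be typed by hand. As a rough sketch, the following walks a folder of digital objects and writes one CSV row per file. The column names and the function name `build_inventory` are assumptions made for illustration; a real inventory would add descriptive columns (title, identifier, rights) that only a person can supply.

```python
import csv
import hashlib
from pathlib import Path


def build_inventory(root, inventory_csv):
    """List every file under `root` in a CSV inventory.

    Hypothetical columns: relative_path, size_bytes, sha256.
    Write the CSV outside `root` so it does not list itself next time.
    """
    root = Path(root)
    rows = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rows.append({
            "relative_path": path.relative_to(root).as_posix(),
            "size_bytes": path.stat().st_size,
            # Reads the whole file into memory: fine for a sketch,
            # but very large masters should be hashed in chunks.
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
        })
    with open(inventory_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(
            fh, fieldnames=["relative_path", "size_bytes", "sha256"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Keeping masters and access copies in separate subfolders (say `masters/` and `access/`) makes the relationship between them visible in the `relative_path` column.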

Create once, use many times!
Floppy Pencil Box, CC: by alwright1, http://www.flickr.com/photos/alwright1/2791468894/
Nice Display: CC: by Mike-Andrews: http://www.flickr.com/photos/smaller-spaces/3284418116/sizes/m/

More to Preservation than Storage…
Curation of the whole life cycle of a digital object:
  • Ingest: accessioning/incoming donations, selection/appraisal etc.
  • Data management: metadata/cataloguing
  • Access and Delivery: dissemination
  • Storage
  • Preservation Planning
  • Administration
  • “Designated communities” you are serving

A matter of formats…
  • Understanding formats is crucial to long-term accessibility and preservation
  • Find out what you can about formats in use
  • Make informed decisions about preservation formats
      • Pick ones that conform to published standards
      • Contrast preservation with reuse
      • Databases, websites: keep supporting documentation to allow reconstruction
  • Buzzwords: WAV, AVI, TIFF, PDF, CSV
  • Avoid ‘lossy compression’ for preservation, e.g. JPEG, MP3

Access Vs Preservation
“Access is about current fashion; preservation is about enduring style”…
[Images, in slide order: Papyrus.gif (162k); CC: iPod shuffle 3G, by Ntr23: http://www.flickr.com/photos/ntr23/3348000167/sizes/s/; Audio.mp3; CC: RCA Dog, by HeatherL: http://www.flickr.com/photos/suzieblue/512881759/sizes/s/; Audio.wav]

Have a cunning plan... a Preservation Plan?
  • Identify what to preserve and define your selection criteria.
  • Identify roles and responsibilities.
  • Determine requirements for donation/accessioning/ingest (formats, metadata, storage media).
  • Degree of integration with storage, backup, and preservation for non-digital resources.
  • Maintenance strategies (backups, online and/or offline; monitoring; refreshing).
  • A prioritized plan is needed, with built-in review periods to assess potential changes to technology and storage media.
  • In-house versus outsourcing options. Outline any reliance on outside consulting and archiving services, any contract negotiation, etc.

Source: JISC Project Report: Digitisation Programme: Preservation Study April 2009 http://www.jisc.ac.uk/whatwedo/programmes/digitisationpreservationstudy.aspx

Digital Preservation in short: Act now!
  • There is no access without preservation!
  • Needs active and ongoing management
  • Preservation and its strategies need to be led by our values (research values, your users, your requirements/priorities), not simply by technology
  • Small steps are usually better than no steps at all
  • Big institutions and DP players don’t have it all figured out. Must act now!
  • Don’t need to do it all or know everything at once! Talk to ICT staff. Seek external advice, e.g. Digital Preservation Coalition (DPC)
  • Preservation should not be postponed until a perfect solution appears…

The Future?
  • Need more joined-up thinking on digital preservation between ICT staff, project managers, information creators, custodians, users, funders etc.
  • DP is everyone’s responsibility
  • Get talking to other departments/similar organisations to yours/natural partners: project managers, ICT managers, users, partners like Local Authorities (all stakeholders)
  • Digital Repositories / Digital Asset Management Systems: http://www.dcc.ac.uk/resource/briefing-papers/digital-repositories/
  • ArchivePress: Blog self-archiving; ULCC/British Library led project, JISC funded: http://archivepress.ulcc.ac.uk/
  • The DP community needs you! Contribute to archival and DP processes: you are close to, and know, your users (“designated communities”), and can advocate their needs and concerns.

Useful resources for smaller archives
  • JISC Digital Media http://www.jiscdigitalmedia.ac.uk/ : excellent source of introductory advice on still images, moving images and sound, from file format selection to access and preservation. See Introduction to Digital Preservation: http://www.jiscdigitalmedia.ac.uk/crossmedia/advice/an-introduction-to-digital-preservation/
  • Digital Preservation Coalition (see useful advice and publications) http://www.dpconline.org
  • Wordpress: Alan’s Notes and Thoughts on Digital Preservation: http://alanake.wordpress.com/so-you-want-to-keep-all-your-stuff
  • DigitalNZ: Make It Digital website: http://makeit.digitalnz.org/guidelines/preserving-digital-content/
  • Managing and Preserving Community Archives, National Preservation Office Te Tari Tohu Taonga, June 2005, http://www.natlib.govt.nz/catalogues/library-documents/managing-community-archives
  • “Rethinking Personal Digital Archiving, Part 1: Four Challenges from the Field”, Catherine C. Marshall, D-Lib Magazine, March/April 2008, Volume 14 Number 3/4, http://www.dlib.org/dlib/march08/marshall/03marshall-pt1.html
  • Digital Curation Centre Case Studies and Interviews: PrestoSpace: Preservation towards storage and access. Standardised Practices for Audiovisual Contents in Europe (http://www.prestospace.org), March 2008, http://www.dcc.ac.uk/webfm_send/110
  • International Association of Sound and Audiovisual Archivists: http://www.iasa-web.org/special_publications.asp
  • PARADIGM (Personal Archives Accessible in Digital Media) project: can be aligned better to community archives, e.g. practical tips and guidelines for creators and donors of personal digital archives: http://www.paradigm.ac.uk/workbook/appendices/guidelines-tips.html
  • US-based DuraSpace Blog: Small Archives: http://www.fedoracommons.org/confluence/display/FCCWG/Small+Archives
  • Around the World in 80 Gigabytes: Alexandra Eveleigh's Archives & Technology Blog: tag ‘small archives’: http://80gb.wordpress.com/tag/small-archives/
  • Preserving Your Personal Digital Archives, Heather Louise Mae Bowden, June 24th, 2008, The Long Now Blog, http://blog.longnow.org/2008/06/24/preserving-your-personal-digital-archives/
  • Wright, R (2008) Preservation of Digital Audiovisual Content. DPE Briefing Paper. Retrieved 19 October 2008 from: http://www.digitalpreservationeurope.eu/publications/briefs/audiovisual_v3.pdf
  • Film Archive Forum: http://bufvc.ac.uk/faf/guidance.htm

Source: Alan’s notes and thoughts on digital preservation: http://alanake.wordpress.com/so-you-want-to-keep-all-your-stuff/

Source: DigitalNZ: http://makeit.digitalnz.org/

General resources on digital preservation:
  • JISC Digital Media http://www.jiscdigitalmedia.ac.uk/
  • Digital Preservation Coalition (see useful advice and publications) http://www.dpconline.org
  • Digital Curation Centre http://www.dcc.ac.uk/
  • Digital Preservation Training Programme http://www.ulcc.ac.uk/dptp/
  • Digital Preservation Management Online Tutorial (Cornell University) http://www.icpsr.umich.edu/dpm/
  • AV Preservation: PrestoSpace http://prestospace.org/
  • AV Preservation: TAPE http://www.tape-online.net/
  • D-Lib Magazine http://www.dlib.org/
  • UKOLN http://www.ukoln.ac.uk/
  • Joint Information Systems Committee http://www.jisc.ac.uk
  • The National Archives http://www.nationalarchives.gov.uk/preservation/digital.htm
  • The British Library http://www.bl.uk/about/collectioncare/digpresintro.html
  • AHDS preservation handbooks http://ahds.ac.uk/preservation/ahds-preservation-documents.htm
  • Listserv for Digital Preservation https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=digital-preservation
  • PLANETS: http://www.planets-project.eu/

Additional resources to take home:

Video/Film & Audio Preservation
Significant characteristics to consider when storing, transmitting, and preserving:
  • Film/Video: resolution, size, aspect ratio, frame rate and fields, bit rate, bit depth and compression method (codec)
  • Audio: bit depth, sampling rate, compression method (codec), and number of channels
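These characteristics determine how much storage uncompressed material needs, which can be estimated with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the example figures (CD-quality stereo audio; 720×576 video at 25 frames per second with an average of 16 bits per pixel, as with 8-bit 4:2:2 sampling) are assumptions chosen for the example, not values taken from the slides.

```python
def audio_bytes_per_hour(sample_rate_hz, bit_depth, channels):
    """Storage for one hour of uncompressed (PCM) audio, in bytes."""
    bits_per_second = sample_rate_hz * bit_depth * channels
    return bits_per_second * 3600 // 8


def video_bytes_per_hour(width, height, frames_per_second, bits_per_pixel):
    """Storage for one hour of uncompressed video (picture only), in bytes."""
    bits_per_second = width * height * bits_per_pixel * frames_per_second
    return bits_per_second * 3600 // 8


# Assumed example values: 44.1 kHz, 16-bit, stereo audio.
cd_audio = audio_bytes_per_hour(44_100, 16, 2)
# Assumed example values: 720x576 pixels, 25 fps, 16 bits per pixel.
sd_video = video_bytes_per_hour(720, 576, 25, 16)

print(f"Audio: {cd_audio / 1e9:.2f} GB per hour")
print(f"Video: {sd_video / 1e9:.2f} GB per hour")
```

Under these assumptions the audio comes to roughly 0.64 GB per hour and the video to roughly 75 GB per hour, which shows why bit depth, sampling rate and resolution need to be decided before planning storage.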

DPE: Preservation of digital AV content
What to do: Despite the problems, some clear statements can be made about AV preservation:
  • Preserve the artefact: Keep the ‘original’, even if compressed. ‘Preserve the bits’, whatever else is done. AV content has one advantage: there is a lot of it, in a relatively small number of formats. Methods to ‘play the bits’ may exist.
  • Decode to uncompressed and save as uncompressed (in addition to keeping the original). This is a demanding requirement for video (100 GB/hr for 625-line TV), but storage is now very inexpensive.
  • Enhance the metadata: A file extension (e.g. .wav, .avi) is not sufficient. There are over 50 registered variants of encoding within the definition of .wav; MPEG-1 and MPEG-2 both use the extension .mpg. Ideally, there will be a metadata extraction tool; otherwise, manual testing and documentation is needed.
  • You are not alone: Use the file-type registries, software repositories, emulation platforms, and Preservation Guides listed in the [DPE] references.
Source: Wright, R (2008) Preservation of Digital Audiovisual Content. Digital Preservation Europe (DPE) Briefing Paper. Retrieved 19 October 2008 from: http://www.digitalpreservationeurope.eu/publications/briefs/audiovisual_v3.pdf
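The point that a .wav extension says little about the encoding inside can be checked directly: a WAVE file records its encoding as a numeric format tag in its 'fmt ' chunk. The sketch below reads that tag. The function name `wav_format_tag` is hypothetical, and the lookup table lists only a handful of well-known tags, not the full registry.

```python
import struct

# A few well-known WAVE format tags (deliberately incomplete).
KNOWN_FORMAT_TAGS = {
    0x0001: "PCM (uncompressed)",
    0x0003: "IEEE float",
    0x0006: "A-law",
    0x0007: "mu-law",
    0x0055: "MPEG Layer 3",
    0xFFFE: "WAVE_FORMAT_EXTENSIBLE",
}


def wav_format_tag(path):
    """Return the numeric format tag stored in a RIFF/WAVE file."""
    with open(path, "rb") as fh:
        header = fh.read(12)
        if len(header) < 12 or header[0:4] != b"RIFF" or header[8:12] != b"WAVE":
            raise ValueError("not a RIFF/WAVE file, whatever its extension")
        while True:
            chunk_header = fh.read(8)
            if len(chunk_header) < 8:
                raise ValueError("no 'fmt ' chunk found")
            chunk_id, chunk_size = struct.unpack("<4sI", chunk_header)
            if chunk_id == b"fmt ":
                tag_bytes = fh.read(2)
                if len(tag_bytes) < 2:
                    raise ValueError("truncated 'fmt ' chunk")
                (tag,) = struct.unpack("<H", tag_bytes)
                return tag
            # Skip other chunks; chunk data is padded to an even length.
            fh.seek(chunk_size + (chunk_size & 1), 1)
```

A call such as `KNOWN_FORMAT_TAGS.get(wav_format_tag("clip.wav"), "unrecognised")` gives a first indication of whether a file is uncompressed PCM; the result is worth recording as preservation metadata.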

Common reasons for data loss
  • Obsolete file formats / software / media
  • Insufficient catalogue information/context (“metadata”)
  • Corrupted files on portable media
  • Uncontrolled number of file formats
  • Insufficiently documented proprietary file formats
  • Inaccessible data at point of donation to archive
  • Software updates or emulations not fully compatible with data
  • Data physically lost
Source: Digital Preservation Coalition, Survey 2005

Storage Media
Variety of online and offline storage media:
  • CD-ROM
  • DVD-ROM
  • LTO (Linear Tape Open)
  • DLT (Digital Linear Tape)
  • Networked/managed server storage
  • Hard drives, e.g. online storage is often mirrored across multiple disks using redundant disk arrays (RAID)
  • Image and video hosting websites, e.g. Flickr http://www.flickr.com/photos/nga_researchlibrary/
Tips: Never use rewritable discs for long-term storage. Don’t buy media from one single supplier or name brand; spread the risks.

Media testing
  • All media needs periodic testing, e.g. random error checking
  • Use of brand names doesn’t guarantee longevity (use a variety of brands/suppliers)
  • Verify initial transfer to new media
  • Confirm continued viability of stored files
  • Spot degradation prior to permanent loss
  • Spot trends in media degradation
  • Support media refreshing and migration decisions
Confidence in longevity requires:
  • Initial testing of drives and media
  • Proper handling and storage
  • Periodic re-sampling
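Periodic re-sampling can be approximated in software by re-checking a random sample of stored files against checksums recorded when they were first written. This is a sketch under stated assumptions: it assumes a manifest CSV with `path` and `sha256` columns already exists, and the function names are hypothetical. It tests file readability and integrity, not the physical condition of the medium itself.

```python
import csv
import hashlib
import random


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def spot_check(manifest_csv, sample_size, seed=None):
    """Re-verify a random sample of files listed in a manifest.

    The manifest is assumed to have 'path' and 'sha256' columns.
    Returns the paths that are unreadable or no longer match.
    """
    with open(manifest_csv, newline="", encoding="utf-8") as fh:
        rows = list(csv.DictReader(fh))
    rng = random.Random(seed)
    sample = rng.sample(rows, min(sample_size, len(rows)))
    failures = []
    for row in sample:
        try:
            intact = file_sha256(row["path"]) == row["sha256"]
        except OSError:
            intact = False  # a missing or unreadable file also fails
        if not intact:
            failures.append(row["path"])
    return failures
```

Running this on a schedule and logging the dates and results gives the kind of trend information that supports refreshing and migration decisions.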

Other Digital Preservation considerations:
  • Best Practice Standards: OAIS (Open Archival Information System), Trusted Digital Repositories (TDR), PREMIS, DRAMBORA
  • Metadata: “data about data” (embedded, as in TIFF, or external, like cataloguing information). Note: preservation metadata is needed to preserve digital objects over time (PREMIS)
  • Intellectual Property Rights affect DP, e.g. database rights (expire after c. 15 years); find out from your website contractor: do you have the right(s) to make a preservation copy or migrate etc.?
  • Tools and technologies, e.g. media testing, format identification/migration, automatic metadata extraction
  • Formats: in one sense DP is about making calculated decisions about formats: which will last, which we can trust, what we know about them, and whether we can be assured of migrating to/from them in the future…

Preservation Planning Tools
Migration Decision-Making: PRONOM (UK National Archives)
File format registry, offers, e.g.:
  • Basic data about file formats
  • Searchable Web database by file format, product name, vendor name, date
  • Information about file formats and the software and hardware required to access them
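Before looking a format up in a registry, it helps to know what a file really is. Many formats begin with a fixed byte signature, and the sketch below checks a file's first bytes against a small hand-written table. The table and the function name `identify_format` are illustrative assumptions: this is not PRONOM's data, and a real survey would rely on a maintained signature set.

```python
# (leading bytes, label) pairs for a few common formats; hand-written,
# deliberately incomplete, and not taken from any registry.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
]


def identify_format(path):
    """Guess a file's format from its first bytes, ignoring its name."""
    with open(path, "rb") as fh:
        head = fh.read(16)
    # RIFF containers carry their type in bytes 8 to 11.
    if head[0:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "WAVE audio"
    if head[0:4] == b"RIFF" and head[8:12] == b"AVI ":
        return "AVI video"
    for magic, label in SIGNATURES:
        if head.startswith(magic):
            return label
    return "unknown"
```

Comparing the result with the file extension across a collection is a quick way to spot mislabelled files during an initial survey.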

Example: Database Preservation
National Digital Archive of Datasets (NDAD) http://www.ndad.nationalarchives.gov.uk/

Example: Storage Plan
  • Resources currently in use: kept online with regular backup, refreshment, and migration.
  • Whether online or not: all archival versions (highest resolution, fullest capture, lossless compression) are written to approved storage media and stored off-line in the Library Digital Program Division, with a schedule for regular refreshment and migration.
  • For archival versions which are not currently online: a duplicate offline copy is created for storage at a different site.
  • All versions, online and offline, are tracked through the CUL local asset management system.
Source: Columbia University Libraries Policy for Preservation of Digital Resources July 2000 (rev. 2006): http://www.columbia.edu/cu/lweb/services/preservation/dlpolicy.html

Before a Preservation Plan:
  • List resources: all types of digital resources that you currently create, own or subscribe to, or plan to
  • Identification: document risks for each resource type, e.g. website changes, software version changes, media degradation, hardware failure etc.
  • Implications: consider the implications for your service in a worst-case scenario. Which resources are ephemeral vs permanent?
  • Assessment: assess the value of groups of resources and the impact on your service if these no longer exist or are inaccessible.
  • Solutions: for each case, identify what the options are, how much they will cost and what they will require in terms of staff time and skills.
  • Decide: decide on the strategies which are most appropriate for each type of resource.
Source: CC: Adapted from UKOLN “Developing your Digital Preservation Policy” http://www.ukoln.ac.uk/cultural-heritage/documents/briefing-42/

Examples of Preservation Policies:
  • UKOLN Guidance: Developing Your Digital Preservation Policy http://www.ukoln.ac.uk/cultural-heritage/documents/briefing-42/briefing42.doc
  • Yale University Library: Digital Preservation Policy http://www.library.yale.edu/iac/DPC/final1.html
  • Columbia University Libraries Policy for Preservation of Digital Resources http://www.columbia.edu/cu/lweb/services/preservation/dlpolicy.html
  • Preservation Policy of “DiscoverArchive” http://discoverarchive.vanderbilt.edu/bitstream/handle/1803/2361/PreservationPolicy.pdf?sequence=1
  • Moving Here: Digitisation Guidelines (Audio and Video) (see page 13) http://www.movinghere.org.uk/help/documents/audiovideo_guidelines_2005.pdf
  • OCLC Digital Archive Preservation Policy and Supporting Documentation http://www.oclc.org/support/documentation/digitalarchive/preservationpolicy.pdf
