
  • Number of slides: 50

Introduction to Digital Libraries
Week 1: (Digital) Libraries Defined
Old Dominion University, Department of Computer Science
CS 695 Fall 2005
Michael L. Nelson
08/29/05
ODU CS 695 Fall 2005 Michael L. Nelson [email protected] odu. edu

Overview
• class info:
  • http://www.cs.odu.edu/~mln/teaching/cs695-f05/
  • http://list.odu.edu/listinfo/cs695-f05/
• class grading
  • 4 assignments that gradually culminate in the student building a digital library of their own
  • 1 mid-term, 1 final
  • both take home

What is a Library?
Main Entry: li·brary
Pronunciation: 'lI-"brer-E; British usually and US sometimes -br&r-E; US sometimes -brE, ÷-"ber-E
Function: noun
Inflected Form(s): plural -brar·ies
Etymology: Middle English, from Medieval Latin librarium, from Latin, neuter of librarius of books, from libr-, liber inner bark, rind, book
1 a : a place in which literary, musical, artistic, or reference materials (as books, manuscripts, recordings, or films) are kept for use but not for sale b : a collection of such materials
2 a : a collection resembling or suggesting a library b : MORGUE 2
3 a : a series of related books issued by a publisher b : a collection of publications on the same subject
4 : a collection of sequences of DNA and especially recombinant DNA that are maintained in a suitable cellular environment and that represent the genetic material of a particular organism or tissue
http://www.m-w.com/cgi-bin/dictionary?book=Dictionary&va=library&x=0&y=0

A Tool For Communicating With The Future…
SCROLLS FROM THE DEAD SEA: The Ancient Library of Qumran and Modern Scholarship
http://www.ibiblio.org/expo/deadsea.scrolls.exhibit/intro.html

A History of Libraries in 1 Slide
• Lyceum - Ancient Greece
  • http://en.wikipedia.org/wiki/Lyceum
• Alexandria - Ancient Egypt
  • http://en.wikipedia.org/wiki/Library_of_Alexandria
• (…skipping a bit…)
• Boston Public Library - First US public lending library (1848)
  • http://en.wikipedia.org/wiki/Boston_Public_Library
  • http://www.bpl.org/
  • “The commonwealth requires the education of the people as the safeguard of order and liberty”
more info: http://www.dlib.org/dlib/january00/01levy.html

“Lone Scientist” Stereotypes
Max Munk, H. J. E. Reid, Enrico Fermi
http://history.nasa.gov/SP-4103/ch4.htm
http://www.anl.gov/Media_Center/logos20-1/fermi01.htm
John Stack, Albert Einstein
http://www.hq.nasa.gov/office/pao/History/x1/stack.html
http://www.artnet.com/artist/92724/Vishniac_Roman.htm

Vannevar Bush (1890-1974)
• Director of the Office of Scientific Research and Development
  • led 6000 scientists in R&D for WWII
  • previously, science lacked large-scale teams
  • also director of NACA (1939)!
• Predicted many technological advances
  • the “memex” is one whose spirit we are implementing
  • the purpose was to provide scientists the capability to exchange information; to have access to the totality of recorded information
image from: http://www.ibiblio.org/pioneers/bush.html

Memex
• Integrated computer, keyboard, and desk
• “mechanized private file and library”
  • remove drudgery from information retrieval
  • suggested implementation was microfilm
  • various user operations are suggested
• Associative indexing was the main purpose
  • “the process of tying two items together is the important thing”
  • prelude to hypertext...
Image from: http://www.dynamicdiagrams.com/case_studies/mit_memex.html

Memex
• Information could come pre-associatively indexed, but the key point was user customization
  • WWW still does not provide that today
• Bush observes that tools change our way of doing, and expand the horizons before us
  • full impact of WWW and DLs still not known
• Interesting: Bush’s Atlantic Monthly article did not predict free-text searching...
  • knowledge trails only; DMOZ without keyword searching
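The associative indexing described in the two Memex slides can be sketched in a few lines of Python. This is only an illustration of "tying two items together" into user-built trails, not Bush's design or any real system; the class name `TrailStore`, its methods, and the item names in the usage note are all hypothetical.

```python
from collections import defaultdict


class TrailStore:
    """Toy associative index (hypothetical): items are tied into named trails."""

    def __init__(self):
        self.trails = defaultdict(list)  # trail name -> item ids in the order added
        self.links = defaultdict(set)    # item id -> ids it has been tied to

    def tie(self, trail, a, b):
        """Tie two items together and record both on the named trail."""
        items = self.trails[trail]
        for item in (a, b):
            if item not in items:
                items.append(item)
        self.links[a].add(b)
        self.links[b].add(a)

    def follow(self, trail):
        """Return the items on a trail in the order the user built it."""
        return list(self.trails[trail])

    def associated(self, item):
        """Return every item directly tied to `item`, sorted for stable output."""
        return sorted(self.links[item])
```

A user would call `store.tie("bows", "turkish-bow", "elasticity")` and later `store.follow("bows")` to replay the trail. Note that there is no keyword search at all, which matches the slide's observation that the article describes knowledge trails only.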

from Lesk, http://community.bellcore.com/lesk/columbia/session2/
digital preservation research: www.digitalpreservation.gov
“digital information lasts forever - or 5 years, whichever comes first” -- Jeff Rothenberg

from Lesk, http://community.bellcore.com/lesk/columbia/session1/

What is a Digital Library (DL)?
• “…a managed collection of information, with associated services, where the information is stored in digital formats and accessible over a network” (Arms, p. 2)
• there are any number of alternate definitions, but this seems fair enough
• no mention of architecture, implementation, content, etc.

How is a DL different from a database?
• A traditional SQL database has as its basic element data items in a relation:
    select name
    from employee, project
    where employee.deptnumber = '25' AND
          project.number = '100'
• databases exploit known structures and relations
• DBMS retrieval is not probabilistic (Frakes, Baeza-Yates, p. 3)
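The contrast between exact DBMS retrieval and best-match retrieval can be made concrete with a small sketch. The table contents, the document texts, and the `rank` function below are hypothetical, and the Jaccard term overlap is only a simple stand-in for the probabilistic models used in information retrieval, not one of them.

```python
import sqlite3

# Exact-match retrieval: a row either satisfies the predicate or it does not.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (name TEXT, deptnumber TEXT);
INSERT INTO employee VALUES ('Ada', '25'), ('Grace', '30');
""")
exact = [row[0] for row in conn.execute(
    "SELECT name FROM employee WHERE deptnumber = ?", ("25",))]


def rank(query, docs):
    """Best-match retrieval: order documents by Jaccard overlap with the query terms."""
    q = set(query.lower().split())
    scored = []
    for doc_id, text in docs.items():
        terms = set(text.lower().split())
        score = len(q & terms) / len(q | terms)
        if score > 0:  # drop documents that share no term with the query
            scored.append((score, doc_id))
    # Highest score first; ties broken by document id for a stable order.
    return [doc_id for score, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


docs = {
    "d1": "digital library preservation",
    "d2": "library of alexandria",
    "d3": "wolf spider",
}
ranked = rank("digital library", docs)
```

The SQL query returns a set with no notion of "better" or "worse" answers, while `rank` returns an ordering in which a partially matching document still appears, just lower down.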

How is a DL different from the WWW?
• The keyword is managed
• The WWW is not managed
• Some meta searchers (Yahoo, Lycos, DMOZ) attempt to add an organizational framework to their web holdings
• However, most are focused on keyword searching (e.g., Google)

A Garden vs. Desert Scrub
http://www.gingerbread-mansion.com/ourgarden.html
http://www.filmdeserts.com/Open%20Desert%20Areas_1.htm

How is a DL different from the WWW?
• Another key difference is who controls the input into the system
  • most meta searchers hunt down their holdings
    • Lycos is short for Lycosidae lycosa (the “wolf spider”), which pursues its prey and does not build a web (Mauldin, IEEE Expert, 1/97)
  • some (DMOZ, Yahoo) have humans in the loop for review and classification
• DLs are generally more tightly controlled, and have a targeted customer set

DL = Content + Services
• “Why not just use the WWW?”
  • WWW by itself has low archival & management characteristics
• “Why not use an RDBMS?”
  • In the same way that a card catalog is not a traditional library (TL), an RDBMS is not a DL; it is a candidate technology for use in DLs
• DL is the union of the content and services defined on the content

The Study of Digital Libraries is Multidisciplinary
• computer science
  • tools, protocols, transport
• information science
  • models of information access and storage
• human factors
  • usability, adaptability
• law
  • rights management
• economics
  • “it’s all about the benjamins…”

How is a DL Different from a Traditional Library?
• A traditional library (TL) has as its focus physical objects
  • even if the card catalog (metadata) is electronic, the purpose is to point you to a physical location
• trafficking in physical objects has both obvious and subtle implications
  • object can exist only in 1 place
  • if you have it, I can’t have it (zero-sum distribution)
  • I have to go to the object, or wait for it to come to me

TLs vs. DLs
• DLs are clearly better than TLs at:
  • dissemination, storage, variety of information
• However, TL objects are more survivable
• Who will archive the research information?
  • the publishers?
  • the institutions?
  • the authors?
• Will the average DL object still be accessible in 10 years?
  • see me for digital preservation related projects, seminars
image from: http://www.ancientegypt.co.uk/writing/rosetta.html

How is a DL Different from a Traditional Library?
• Digital Library
  • removing the physical restriction has obvious benefits
    • multiple access, multiple listings, electronic transmission
  • also complicates many other issues...
    • intellectual property, terms and conditions, etc.
• Note that a TL offers additional social and educational benefits
• Most TLs offer hybrid services too.

from Lesk, http://community.bellcore.com/lesk/columbia/session1/

TLs vs. DLs
• Where does publishing stop, and libraries begin?
  • there have always been tensions between TLs and traditional publishers, but the roles were fairly well defined
  • DLs can muddle the separation of these responsibilities
  • result: conflict, and/or new models

Traditional Players
[Diagram: book store, publisher, library, archive; axes: service, responsibility over time]

Does Content Influence Design?
• Which can you personally verify?
• Which would you accept from a stranger?

What is Scientific and Technical Information?
• STI is the collection of materials, independent of format, used in research, development, and other technical activities
  • reports, data sets, images, videos, software, etc.
• It is also the output of such R&D activities
• STI includes both white and grey literature

White and Grey Literature
• The line between the two is not always clear
• GreyNet offers an admittedly obsolete definition of grey literature:
  • “Information produced on all levels of government, academics, business and industry in electronic and print formats not controlled by commercial publishing”
  • http://www.greynet.org/

White and Grey Literature
• Intuitively:
  • White: author and publisher are often different, the work has been independently reviewed, obtaining the work is straightforward
  • Grey: may not be reviewed, often “published” from the source origin, may be difficult to obtain

Literature Examples
• White
  • Journals, books, edited conference proceedings, etc.
• Grey
  • technical reports, government reports, unedited proceedings, non-document STI, etc.

So Why Worry About Grey Literature?
• White is generally perceived as having a higher pedigree, being easier to obtain (in a sense), etc.
• However, white literature is generally less timely and is often a summary or abstract of a larger body of work
[Figure: Pyramid of STI]

History of STI Distribution
• Originally, scientists published books to document their findings
  • but the delay was terribly long
• Then, scientists exchanged personal letters among themselves for rapidity
  • but this is point-to-point communication, not broadcast

History of STI Distribution
• The current system of journals evolved in the 17th century as the synthesis of both previous models
  • more timely than books, more available than letters
  • in fact, some journals with the emphasis on “speed” still have “Letters” in their title
• historical information from (Odlyzko, 1995)

But Are Journals Still Relevant?
• People still publish in them (tenure and promotions are still largely “count the journal publications” exercises)
• But do people read them?
• The current use of journals is now:
  • “a medium for priority claiming, quality control, and archiving scientific work” (Bennion, 1994)

Unavailable, or Not Worth Citing?
from Lesk, http://community.bellcore.com/lesk/columbia/session13/ (figure 9.7 in text)

But Are Journals Still Relevant?
• How important is refereeing anyway?
• Most rejected papers end up published somewhere else (Lesk, p. 214)
• Referees have rejected many worthy papers, including some that are the most cited in their respective journals (Campanario, 1996)
  • this is another well-studied problem; contact me for more details

But Are Journals Still Relevant?
• Different disciplines have adapted:
  • physics - “the small amount of filtering provided by refereed journals plays no effective role in our research” (Ginsparg, 1994)
  • math - “it is rare for experts in any mathematical subject to learn of a major new development in their area through a journal publication” (Odlyzko, 1995)

But Are Journals Still Relevant?
• computer science
  • “in his area, journals have become irrelevant” (Odlyzko, quoting Rob Pike)
  • “if it did not happen at a conference, it didn’t happen” (Odlyzko, quoting Joan Feigenbaum)
• “if I read it in a journal, I’m not in the loop” (Grycz, 1992)

Solutions by Discipline
• Physics
  • pre-prints
• Mathematics
  • pre-prints
• Computer Science
  • technical reports, conference proceedings
• Chemistry
  • still mainly journals, but review is cursory (Quinn, 1995)
• Economics
  • working papers

Journal System - Economic Problems
• 20,000 primary research journals (Bennion, 1994)
• the number of scientific papers published annually doubles every 10-15 years (Price, 1956)
• STI does not enjoy economies of scale
  • intended audiences are generally static; the content becomes more specialized (Odlyzko, 1995)
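The doubling claim can be restated as a constant yearly growth rate: if output doubles every T years, the rate is 2^(1/T) - 1, which works out to roughly 7.2% per year for T = 10 and roughly 4.7% per year for T = 15. The short sketch below does that arithmetic; the function names are hypothetical, and constant exponential growth is an assumption of this sketch rather than something the slide states.

```python
import math


def annual_growth_rate(doubling_years):
    """Constant yearly growth rate that doubles a quantity every `doubling_years`."""
    return 2 ** (1 / doubling_years) - 1


def years_to_multiply(factor, doubling_years):
    """Years needed to grow by `factor`, given the doubling time."""
    return doubling_years * math.log2(factor)


fast = annual_growth_rate(10)  # doubling every 10 years
slow = annual_growth_rate(15)  # doubling every 15 years
```

Under the same assumption, `years_to_multiply(8, 10)` gives the time for an eightfold increase (three doublings), which is one way to see why a library with a flat budget falls behind.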

Journal System - Economic Problems
• Because of the academic pressures, journals tend to stay the same size, but the number of titles goes up (Quandt, 1996)
• The acquisition budget of a library is constant (or decreasing), so it must be more selective in which titles it provides
• If libraries cancel subscriptions, the cost to the remainder of the subscribers goes up

Journal System - Economic Problems
• The rising cost causes other libraries to cancel subscriptions, causing the price to go up further...
• Journals driving themselves out of business is a well-studied problem; contact me for more information
• Odlyzko estimates that American universities spend as much buying mathematics journals as the NSF spends doing mathematical research
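The cancellation spiral described above can be sketched as a toy model. It assumes, purely for illustration, that a journal has a fixed production cost split evenly among its subscribers and that each library cancels once the price exceeds the most it is willing to pay; the function name `price_spiral` and all numbers in the usage note are hypothetical, not data from the cited studies.

```python
def price_spiral(fixed_cost, budgets):
    """Return the (price, subscriber count) history as libraries cancel.

    fixed_cost: cost of producing the journal, split evenly among subscribers.
    budgets: the most each library is willing to pay for this title.
    """
    subscribers = list(budgets)
    history = []
    while subscribers:
        price = fixed_cost / len(subscribers)
        history.append((price, len(subscribers)))
        remaining = [b for b in subscribers if b >= price]
        if len(remaining) == len(subscribers):
            break  # nobody cancels at this price, so the market is stable
        subscribers = remaining  # fewer subscribers means a higher price next round
    return history
```

For example, `price_spiral(1000, [100, 150, 200, 300, 400])` sheds subscribers round after round until none remain, while `price_spiral(500, [100, 150, 200, 300, 400])` is stable from the start. The point is only that a small rise in cost can tip a stable title into the spiral.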

DL Economic Drivers
from Lesk, http://community.bellcore.com/lesk/columbia/session1/

original data from the ALA; slide from http://lib-www.lanl.gov/~herbertv/presentations/vala-2004-hvds.pdf

Journal System - Economic Problems
• Chemical Abstracts (Lesk, pp. 203-204)
  • begun in 1950s, used to cost dozens of dollars per year, and individual chemists subscribed
  • today, it costs $17,400 / year
• Okerson & Stubbs, 1992
  • university book purchases down 15% 1986-1991
  • journals/faculty 14 -> 12 in same period
  • by year 2017, libraries would buy nothing at all!

from Lesk, http://community.bellcore.com/lesk/columbia/session1/ (figure 9.2 in text)

Journal System - Coverage Problems
• But journals only cover a fraction of available STI
  • approximately 100K domestic, unrestricted STI technical reports (grey literature) produced annually (Esler & Nelson, 1998)
• Print journals, by definition, cannot provide access to non-report STI
  • software, datasets, etc.

Electronic Journals?
• An experiment that most scholars agree is good, is the eventual path, and is a great idea for everyone else’s papers...
• until tenure is given based on publications in electronic journals, they will not be fully accepted

Many DL Projects Are “Journal Centric”
• Many DL projects (JSTOR, TULIP, etc.) are focused on automating the traditional journal methods
  • this is acceptable for archiving past issues, but seems unsatisfying for future STI

My Prediction for Journals
• Highly specialized titles will go completely electronic, driven by the rising cost and static readership
  • economics and academic acceptance will determine when this happens
• “Popular” titles with broader appeal will exist in a hybrid format, both paper and electronic version
  • “subscribers” are likely to receive the value added material (soft copy, additional materials, etc.)

Common Shortcomings of Current DLs
• Focused on journals, despite their decreasing relevance to some fields
• Inadequate treatment of grey literature, the grist of technical exchange
• Non-document STI (software, datasets, etc.) not handled