
3ae45f0c1bcd510fa6447dc28f57a848.ppt
- Number of slides: 50
Introduction to Digital Libraries Week 1: (Digital) Libraries Defined Old Dominion University Department of Computer Science CS 695 Fall 2005 Michael L. Nelson <mln@cs.odu.edu> 08/29/05 ODU CS 695 Fall 2005 Michael L. Nelson mln@cs.odu.edu
Overview • class info: • http://www.cs.odu.edu/~mln/teaching/cs695-f05/ • http://list.odu.edu/listinfo/cs695-f05/ • class grading • 4 assignments that gradually culminate in the student building a digital library of their own • 1 mid-term, 1 final • both take-home
What is a Library? Main Entry: li·brary Pronunciation: 'lI-"brer-E; British usually and US sometimes -br&r-E; US sometimes -brE, ÷-"ber-E Function: noun Inflected Form(s): plural -brar·ies Etymology: Middle English, from Medieval Latin librarium, from Latin, neuter of librarius of books, from libr-, liber inner bark, rind, book 1 a: a place in which literary, musical, artistic, or reference materials (as books, manuscripts, recordings, or films) are kept for use but not for sale b: a collection of such materials 2 a: a collection resembling or suggesting a library <a library of computer programs> <wine library> b: MORGUE 2 3 a: a series of related books issued by a publisher b: a collection of publications on the same subject 4: a collection of sequences of DNA and especially recombinant DNA that are maintained in a suitable cellular environment and that represent the genetic material of a particular organism or tissue http://www.m-w.com/cgi-bin/dictionary?book=Dictionary&va=library&x=0&y=0
A Tool For Communicating With The Future… SCROLLS FROM THE DEAD SEA: The Ancient Library of Qumran and Modern Scholarship http://www.ibiblio.org/expo/deadsea.scrolls.exhibit/intro.html
A History of Libraries in 1 Slide • Lyceum - Ancient Greece • http://en.wikipedia.org/wiki/Lyceum • Alexandria - Ancient Egypt • http://en.wikipedia.org/wiki/Library_of_Alexandria • (…skipping a bit…) • Boston Public Library - First US public lending library (1848) • http://en.wikipedia.org/wiki/Boston_Public_Library • http://www.bpl.org/ • “The commonwealth requires the education of the people as the safeguard of order and liberty” more info: http://www.dlib.org/dlib/january00/01levy.html
“Lone Scientist” Stereotypes Max Munk, H. J. E. Reid, Enrico Fermi http://history.nasa.gov/SP-4103/ch4.htm http://www.anl.gov/Media_Center/logos20-1/fermi01.htm John Stack, Albert Einstein http://www.hq.nasa.gov/office/pao/History/x1/stack.html http://www.artnet.com/artist/92724/Vishniac_Roman.htm
Vannevar Bush (1890-1974) • Director of the Office of Scientific Research and Development • led 6,000 scientists in R&D for WWII • previously, science lacked large-scale teams • also director of NACA (1939)! • Predicted many technological advances • the “memex” is one whose spirit we are implementing • the purpose was to give scientists the capability to exchange information and to have access to the totality of recorded information image from: http://www.ibiblio.org/pioneers/bush.html
Memex • Integrated computer, keyboard, and desk • “mechanized private file and library” • remove drudgery from information retrieval • suggested implementation was microfilm • various user operations are suggested • Associative indexing was the main purpose • “the process of tying two items together is the important thing” • prelude to hypertext... Image from: http://www.dynamicdiagrams.com/case_studies/mit_memex.html
Memex • Information could come pre-associatively indexed, but the key point was user customization • the WWW still does not provide that today • Bush observes that tools change our way of doing things and expand the horizons before us • the full impact of the WWW and DLs is still not known • Interesting: Bush’s Atlantic Monthly (AM) article did not predict free-text searching... • knowledge trails only; like DMOZ w/o keyword searching
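The associative indexing described on the Memex slides can be pictured as a small data structure: items tied together by user-named trails. The following is only a minimal sketch under stated assumptions. The class name `AssociativeIndex`, its methods, and the sample items are all hypothetical and are not taken from Bush's article or from any real system.

```python
from collections import defaultdict


class AssociativeIndex:
    """Toy model of memex-style associative indexing (all names hypothetical)."""

    def __init__(self):
        # Maps each item to a list of (linked item, trail name) pairs.
        self._links = defaultdict(list)

    def tie(self, a, b, trail):
        """Tie two items together on a user-named trail, in both directions."""
        self._links[a].append((b, trail))
        self._links[b].append((a, trail))

    def follow(self, start, trail):
        """Walk one trail from `start`, visiting each item at most once."""
        seen, order, stack = {start}, [start], [start]
        while stack:
            current = stack.pop()
            for neighbor, name in self._links[current]:
                if name == trail and neighbor not in seen:
                    seen.add(neighbor)
                    order.append(neighbor)
                    stack.append(neighbor)
        return order


# Illustrative data only: one user builds two personal trails.
index = AssociativeIndex()
index.tie("bow article", "arrow article", trail="archery")
index.tie("arrow article", "elasticity text", trail="archery")
index.tie("bow article", "crusades history", trail="warfare")

trail_items = index.follow("bow article", "archery")
print(trail_items)
```

The point of the sketch is that the links belong to the user rather than to the publisher of the items, which is the customization the slide says the WWW does not provide.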
from Lesk, http://community.bellcore.com/lesk/columbia/session2/ digital preservation research: www.digitalpreservation.gov “digital information lasts forever - or 5 years, whichever comes first” -- Jeff Rothenberg
from Lesk, http://community.bellcore.com/lesk/columbia/session1/
What is a Digital Library (DL)? • “…a managed collection of information, with associated services, where the information is stored in digital formats and accessible over a network” (Arms, p. 2) • there are any number of alternate definitions, but this seems fair enough • no mention of architecture, implementation, content, etc.
How is a DL different from a database? • A traditional SQL database has as its basic element data items in a relation: • SELECT name FROM employee, project WHERE employee.deptnumber = '25' AND project.number = '100' • databases exploit known structures and relations • DBMS retrieval is not probabilistic (Frakes, Baeza-Yates, p. 3)
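To make the contrast on this slide concrete, here is a small sketch that runs an exact-match SQL query next to a crude ranked keyword search. Everything in it is an assumption made for illustration: the single `employee` table (the slide's query also names a `project` table, omitted here), the sample rows and documents, and the `rank` function, which scores by simple term overlap as a stand-in for the probabilistic ranking used by real retrieval systems.

```python
import sqlite3

# Exact retrieval: a row either satisfies the predicate or it does not.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, deptnumber TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ada", "25"), ("Grace", "25"), ("Alan", "30")],  # made-up rows
)
rows = conn.execute(
    "SELECT name FROM employee WHERE deptnumber = ? ORDER BY name", ("25",)
).fetchall()
exact = [row[0] for row in rows]


def rank(query, documents):
    """Score each document by how many distinct query terms it contains."""
    terms = set(query.lower().split())
    scored = []
    for doc_id, text in documents.items():
        score = len(terms & set(text.lower().split()))
        if score > 0:
            scored.append((doc_id, score))
    # Best match first; ties broken by document id for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


documents = {  # made-up documents
    "d1": "digital library services",
    "d2": "library of alexandria",
    "d3": "wolf spider",
}
ranked = rank("digital library", documents)

print(exact)
print(ranked)
```

The SQL result is a set of rows that all match equally, while the keyword search returns partial matches ordered by a score, so a document can be "somewhat relevant" in a way a database row cannot.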
How is a DL different from the WWW? • The keyword is managed • The WWW is not managed • Some meta searchers (Yahoo, Lycos, DMOZ) attempt to add an organizational framework to their web holdings • However, most are focused on keyword searching (e.g., Google)
A Garden vs. Desert Scrub http://www.gingerbread-mansion.com/ourgarden.html http://www.filmdeserts.com/Open%20Desert%20Areas_1.htm
How is a DL different from the WWW? • Another key difference is who controls the input into the system • most meta searchers hunt down their holdings • Lycos is short for Lycosidae lycosa (the “wolf spider”), which pursues its prey and does not build a web (Mauldin, IEEE Expert, 1/97) • some (DMOZ, Yahoo) have humans in the loop for review and classification • DLs are generally more tightly controlled, and have a targeted customer set
DL = Content + Services • “Why not just use the WWW?” • The WWW by itself has poor archival & management characteristics • “Why not use an RDBMS?” • In the same way that a card catalog is not a traditional library (TL), an RDBMS is not a DL; it is a candidate technology for use in DLs • A DL is the union of the content and the services defined on the content
The Study of Digital Libraries is Multidisciplinary • computer science • tools, protocols, transport • information science • models of information access and storage • human factors • usability, adaptability • law • rights management • economics • “it’s all about the benjamins…”
How is a DL Different from a Traditional Library? • A TL focuses on physical objects • even if the card catalog (metadata) is electronic, the purpose is to point you to a physical location • trafficking in physical objects has both obvious and subtle implications • an object can exist in only one place • if you have it, I can’t have it (zero-sum distribution) • I have to go to the object, or wait for it to come to me
TLs vs. DLs • DLs are clearly better than TLs at: • dissemination, storage, and variety of information • However, TL objects are more survivable • Who will archive the research information? • the publishers? • the institutions? • the authors? • Will the average DL object still be accessible in 10 years? • see me for projects and seminars related to digital preservation image from: http://www.ancientegypt.co.uk/writing/rosetta.html
How is a DL Different from a Traditional Library? • Digital Library • removing the physical restriction has obvious benefits • multiple access, multiple listings, electronic transmission • it also complicates many other issues... • intellectual property, terms and conditions, etc. • Note that a TL offers additional social and educational benefits • Most TLs also offer hybrid services.
from Lesk, http://community.bellcore.com/lesk/columbia/session1/
TLs vs. DLs • Where does publishing stop, and where do libraries begin? • there have always been tensions between TLs and traditional publishers, but the roles were fairly well defined • DLs can muddle the separation of these responsibilities • result: conflict, and/or new models
Traditional Players [figure: book store, publisher, library, and archive placed on axes of service vs. responsibility over time]
Does Content Influence Design? Which can you personally verify? Which would you accept from a stranger?
What is Scientific and Technical Information? • STI is the collection of materials, independent of format, used in research, development, and other technical activities • reports, data sets, images, videos, software, etc. • It is also the output of such R&D activities • STI includes both white and grey literature
White and Grey Literature • The line between the two is not always clear • GreyNet offers an admittedly obsolete definition of grey literature: • “Information produced on all levels of government, academics, business and industry in electronic and print formats not controlled by commercial publishing” • http://www.greynet.org/
White and Grey Literature • Intuitively: • White: author and publisher are often different, the work has been independently reviewed, obtaining the work is straightforward • Grey: may not be reviewed, often “published” by the originating source, may be difficult to obtain
Literature Examples • White • journals, books, edited conference proceedings, etc. • Grey • technical reports, government reports, unedited proceedings, non-document STI, etc.
So Why Worry About Grey Literature? • White is generally perceived as having a higher pedigree, being easier to obtain (in a sense), etc. • but it is generally less timely and is often a summary or abstract of a larger body of work [figure: Pyramid of STI]
History of STI Distribution • Originally, scientists published books to document their findings • but the delay was terribly long • Then, scientists exchanged personal letters among themselves for rapidity • but this is point-to-point communication, not broadcast
History of STI Distribution • The current system of journals evolved in the 17th century as the synthesis of both previous models • more timely than books, more available than letters • in fact, some journals with an emphasis on “speed” still have “Letters” in their title • historical information from (Odlyzko, 1995)
But Are Journals Still Relevant? • People still publish in them (tenure and promotions are still largely “count the journal publications” exercises) • But do people read them? • The current role of journals is: • “a medium for priority claiming, quality control, and archiving scientific work” (Bennion, 1994)
Unavailable, or Not Worth Citing? from Lesk, http://community.bellcore.com/lesk/columbia/session13/ (figure 9.7 in text)
But Are Journals Still Relevant? • How important is refereeing anyway? • Most rejected papers end up published somewhere else (Lesk, p. 214) • Referees have rejected many worthy papers, including some that are the most cited in their respective journals (Campanario, 1996) • this is another well-studied problem; contact me for more details
But Are Journals Still Relevant? • Different disciplines have adapted: • physics - “the small amount of filtering provided by refereed journals plays no effective role in our research” (Ginsparg, 1994) • math - “it is rare for experts in any mathematical subject to learn of a major new development in their area through a journal publication” (Odlyzko, 1995)
But Are Journals Still Relevant? • computer science • “in his area, journals have become irrelevant” (Odlyzko, quoting Rob Pike) • “if it did not happen at a conference, it didn’t happen” (Odlyzko, quoting Joan Feigenbaum) • “if I read it in a journal, I’m not in the loop” (Grycz, 1992)
Solutions by Discipline • Physics • pre-prints • Mathematics • pre-prints • Computer Science • technical reports, conference proceedings • Chemistry • still mainly journals, but review is cursory (Quinn, 1995) • Economics • working papers
Journal System - Economic Problems • 20,000 primary research journals (Bennion, 1994) • the number of scientific papers published annually doubles every 10-15 years (Price, 1956) • STI does not enjoy economies of scale • intended audiences are generally static; the content becomes more specialized (Odlyzko, 1995)
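The doubling claim on this slide can be turned into an annual growth rate with one line of arithmetic: if output doubles every T years, the yearly growth rate is 2^(1/T) - 1. The sketch below assumes smooth exponential growth, which the slide does not state, and the function names are hypothetical.

```python
def annual_growth_rate(doubling_years):
    """Yearly growth rate implied by a doubling period, assuming smooth exponential growth."""
    return 2 ** (1 / doubling_years) - 1


def growth_factor(years, doubling_years):
    """How many times larger the annual output is after `years` years."""
    return 2 ** (years / doubling_years)


for period in (10, 15):
    rate = annual_growth_rate(period)
    factor = growth_factor(30, period)
    print(f"doubling every {period} years: {rate:.1%} per year, x{factor:.0f} after 30 years")
```

Under that assumption, a 10-15 year doubling period corresponds to roughly 5% to 7% more papers each year, set against the static audiences noted on the slide.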
Journal System - Economic Problems • Because of academic pressures, journals tend to stay the same size, but the number of titles goes up (Quandt, 1996) • The acquisition budget of a library is constant (or decreasing), so it must be more selective in which titles it provides • If libraries cancel subscriptions, the cost to the remaining subscribers goes up
Journal System - Economic Problems • The rising cost causes other libraries to cancel subscriptions, causing the price to go up further... • Journals driving themselves out of business is a well-studied problem; contact me for more information • Odlyzko estimates that American universities spend as much buying mathematics journals as the NSF spends on mathematical research
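The cancellation spiral described on these slides can be illustrated with a toy simulation. This is only a sketch under strong assumptions that are not in the slides: the publisher recovers a fixed cost split evenly across subscribers, each library cancels as soon as the price exceeds its own budget for the title, and all numbers are made up. The function name `simulate_spiral` is hypothetical.

```python
def simulate_spiral(fixed_cost, budgets):
    """Return (subscriber count, price) for each round until prices stabilize or nobody is left.

    Assumes the publisher splits a fixed cost evenly over current subscribers
    and that a library cancels once the price exceeds its budget.
    """
    subscribers = sorted(budgets)
    history = []
    while subscribers:
        price = fixed_cost / len(subscribers)
        history.append((len(subscribers), price))
        remaining = [budget for budget in subscribers if budget >= price]
        if len(remaining) == len(subscribers):
            break  # Nobody cancelled, so the price is stable.
        subscribers = remaining
    return history


# Made-up numbers: a title costing 1000 per year to produce.
collapsing = simulate_spiral(1000, [80, 120, 150, 200, 400])
stable = simulate_spiral(1000, [250, 300, 400, 500, 600])
print(collapsing)
print(stable)
```

In the first run three libraries cancel at a price of 200, the price for the remaining two jumps to 500, and they cancel too. In the second run every budget covers the initial price, so nothing changes.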
DL Economic Drivers from Lesk, http://community.bellcore.com/lesk/columbia/session1/
original data from the ALA; slide from http://lib-www.lanl.gov/~herbertv/presentations/vala-2004-hvds.pdf
Journal System - Economic Problems • Chemical Abstracts (Lesk, pp. 203-204) • begun in 1950s, used to cost dozens of dollars per year, and individual chemists subscribed • today, it costs $17,400 / year • Okerson & Stubbs, 1992 • university book purchases down 15% 1986-1991 • journals/faculty 14 -> 12 in same period • extrapolating these trends, by year 2017 libraries would buy nothing at all!
from Lesk, http://community.bellcore.com/lesk/columbia/session1/ (figure 9.2 in text)
Journal System - Coverage Problems • But journals only cover a fraction of available STI • approximately 100K domestic, unrestricted STI technical reports (grey literature) produced annually (Esler & Nelson, 1998) • Print journals, by definition, cannot provide access to non-report STI • software, datasets, etc.
Electronic Journals? • An experiment that most scholars agree is good, is the eventual path, and is a great idea for everyone else’s papers... • until tenure is given based on publications in electronic journals, they will not be fully accepted
Many DL Projects Are “Journal Centric” • Many DL projects (JSTOR, TULIP, etc.) are focused on automating the traditional journal methods • this is acceptable for archiving past issues, but seems unsatisfying for future STI
My Prediction for Journals • Highly specialized titles will go completely electronic, driven by the rising cost and static readership • economics and academic acceptance will determine when this happens • “Popular” titles with broader appeal will exist in a hybrid format, with both paper and electronic versions • “subscribers” are likely to receive the value-added material (soft copy, additional materials, etc.)
Common Shortcomings of Current DLs • Focused on journals, despite their decreasing relevance to some fields • Inadequate treatment of grey literature, the grist of technical exchange • Non-document STI (software, datasets, etc.) not handled