d3fea2bf19d50fd30dba231ec1c7a5b0.ppt
- Number of slides: 45
OCLC Online Computer Library Center Interoperability Standards & Searching Multiple Repositories Ralph LeVan/OCLC Ray Denenberg/Library of Congress
The Problem How do I provide a common interface for my users? How do I combine results from multiple sources?
How do I provide a common interface for my users? How do I convert my queries into the Content Provider’s (CP’s) queries? How do I ask for 10 records? How do I ask for more records? How do I interpret their response?
How do I convert my queries into the CP’s queries? My user said “author=twain and title=huck finn” Google expects: +twain +"huck finn" Z39.50: twain/1=1003;4=2 "huck finn"/1=4;4=1 and Lucene: creator:twain and titlePhrase:"huck finn"
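The translation problem on this slide can be sketched in a few lines of Python. This is only an illustration: the helper names (`quoted`, `to_google`, `to_z3950`, `to_lucene`) and the field mappings are assumptions read off the slide's single example, not part of any CP's documented interface.

```python
# Bib-1 use attributes: 1003 = author, 4 = title (values as shown on the slide).
Z3950_USE = {"author": "1=1003", "title": "1=4"}
# Lucene field names copied from the slide's example; a real index may differ.
LUCENE_FIELD = {"author": "creator", "title": "titlePhrase"}


def quoted(term):
    """Wrap multi-word terms in double quotes so they are treated as phrases."""
    return f'"{term}"' if " " in term else term


def to_google(clauses):
    # Every term is required (+); the field names are simply dropped.
    return " ".join("+" + quoted(term) for _, term in clauses)


def to_z3950(clauses):
    # Postfix notation: operands first, then one "and" per extra operand.
    # Structure attribute 4=1 marks a phrase, 4=2 a single word.
    operands = [
        f"{quoted(term)}/{Z3950_USE[field]};{'4=1' if ' ' in term else '4=2'}"
        for field, term in clauses
    ]
    return " ".join(operands + ["and"] * (len(clauses) - 1))


def to_lucene(clauses):
    return " and ".join(f"{LUCENE_FIELD[field]}:{quoted(term)}" for field, term in clauses)


# The user's query, already split into (field, term) pairs.
clauses = [("author", "twain"), ("title", "huck finn")]
print(to_google(clauses))
print(to_z3950(clauses))
print(to_lucene(clauses))
```

Even this toy version needs one function and one mapping table per target, which is the maintenance cost the slide is pointing at.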
How do I ask for 10 records? Amazon won’t let you RedLightGreen: MAXRECORDS=n British Library: records=n
How do I ask for more records? Amazon: page=n RedLightGreen: STARTINDEX=n British Library: start=n
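A client that talks to all three CPs ends up with a per-site table of paging parameters. The sketch below assumes 1-based record positions and a fixed Amazon page size of 10; both are assumptions for illustration, and `paging_params` is a hypothetical helper.

```python
from urllib.parse import urlencode

# (count parameter, start parameter) per CP, as listed on the slides.
PARAM_NAMES = {
    "RedLightGreen": ("MAXRECORDS", "STARTINDEX"),
    "British Library": ("records", "start"),
}


def paging_params(cp, start, count, amazon_page_size=10):
    """Return the CP-specific paging parameters for records start..start+count-1."""
    if cp == "Amazon":
        # Amazon only accepts a page number; the record count cannot be requested.
        return {"page": (start - 1) // amazon_page_size + 1}
    count_name, start_name = PARAM_NAMES[cp]
    return {count_name: count, start_name: start}


# Ask each CP for records 11-20.
for cp in ("Amazon", "RedLightGreen", "British Library"):
    print(cp, urlencode(paging_params(cp, start=11, count=10)))
```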
How do I interpret their response? How many records did I retrieve? Did something go wrong? How do I convert the CP’s records into something my users will recognize?
How many records did I retrieve? Amazon: <a href="/gp/search/ref=sr_nr_i_0/002-2019116-8269663?%5Fencoding=UTF8&keywords=pratchett&rh=i%3Aaps%2Ck%3Apratchett%2Ci%3Astripbooks&page=1">Books</a><span class="narrowValue"> (334)</span> RedLightGreen: <b>Viewing: </b> 1-10 of 239 results British Library: <opensearch:totalResults>190</opensearch:totalResults>
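Pulling the hit count out of these three responses takes three different extractors. The sketch below is an assumption-laden illustration: the regular expressions are fitted to the fragments shown on the slide only, and the OpenSearch namespace URI is the standard 1.1 one rather than something shown in the slide.

```python
import re
import xml.etree.ElementTree as ET

OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def amazon_total(html):
    # Scrape the "(334)" that follows the narrowValue span's opening tag.
    m = re.search(r'class="narrowValue">\s*\(([\d,]+)\)', html)
    return int(m.group(1).replace(",", "")) if m else None


def rlg_total(html):
    # Scrape "... of 239 results" from the RedLightGreen banner.
    m = re.search(r"of\s+([\d,]+)\s+results", html)
    return int(m.group(1).replace(",", "")) if m else None


def opensearch_total(xml_text):
    # Structured response: look the element up by namespace and name.
    el = ET.fromstring(xml_text).find(f".//{{{OPENSEARCH_NS}}}totalResults")
    return int(el.text) if el is not None else None


amazon_html = '<span class="narrowValue"> (334)</span>'
rlg_html = "<b>Viewing: </b> 1-10 of 239 results"
bl_xml = (
    f'<rss xmlns:opensearch="{OPENSEARCH_NS}"><channel>'
    "<opensearch:totalResults>190</opensearch:totalResults>"
    "</channel></rss>"
)
print(amazon_total(amazon_html), rlg_total(rlg_html), opensearch_total(bl_xml))
```

The two scrapers break as soon as the page layout changes; the XML lookup does not depend on layout at all.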
Did Something Go Wrong? RedLightGreen: <span class=smallText>We didn't find any matches for <b>dog and</b>.</span> British Library: <item><title>Nothing found due to an error</title> <description>Too many hits. Refine your request.</description></item>
How do I convert the records? Amazon: <table class="searchresults" border="0" width="100%" cellpadding="0" cellspacing="0"> <tr><td width="100%" class="searchitem" id="Td:0"> <table border="0" width="100%" cellpadding="0" cellspacing="0"><tr valign="top"> <td> <table class="n2" border="0" cellpadding="0" cellspacing="0"> <tr> <td class="imageColumn" width="88"><table border="0" cellpadding="0" cellspacing="0"> <tr><td align="center" width="80"> <a href="http://www.amazon.com/gp/product/0060815221/sr=8-1/qid=1142436987/ref=pd_bbs_1/002-2019116-8269663?%5Fencoding=UTF8"><img src="http://ec1.images-amazon.com/images/P/0060815221.01._PIsitb-st-arrow,TopLeft,-14_SCTHUMBZZZ_.jpg" width="55" alt="Thud! (Discworld, Book 32)" height="82" border="0" /></a> </td><td width="8"></td></tr></table></td> <td class="dataColumn"><table cellpadding="0" cellspacing="0" border="0"><tr><td> <a href="http://www.amazon.com/gp/product/0060815221/sr=8-1/qid=1142436987/ref=pd_bbs_1/002-2019116-8269663?%5Fencoding=UTF8"><span class="srTitle">Thud! (Discworld, Book 32)</span></a> by Terry Pratchett (<span class="binding">Hardcover</span> - Sep 13, 2005)</td></tr> <tr><td class="brandLink"><span class="aliasName">Books: </span> <a href="/gp/search/ref=sr_nr_seeall_1/002-2019116-8269663?%5Fencoding=UTF8&keywords=pratchett&rh=i%3Aaps%2Ck%3Apratchett%2Ci%3Astripbooks">See all 334 items</a></td></tr> <tr><td><span class="priceType"><a href="http://www.amazon.com/gp/product/0060815221/sr=8-1/qid=1142436987/ref=pd_bbs_1/002-2019116-8269663?%5Fencoding=UTF8">Buy new</a>: </span> <span class="listprice">$24.95</span> <span class="saleprice">$15.72</span> <span class="priceType"> <a href="http://www.amazon.com/gp/offer-listing/0060815221/sr=8-1/qid=1142436987/ref=pd_bbs_1/002-2019116-8269663?%5Fencoding=UTF8">Used & new</a> </span> from <span class="otherprice">$3.76</span> <span class="avail">Usually ships in 24 hours</span> </td></tr><td colspan="2"><table cellpadding="0" cellspacing="0" border="0"> <tr><td class="excerptStart"><span class="excerptLead">Excerpt from</span> <a href="/gp/reader/0060815221/ref=sib_aps_pg/002-2019116-8269663?%5Fencoding=UTF8&keywords=pratchett&p=S00E&checkSum=y3glB4NEGJ6Ql3iAWFd6teZptAJmys3Uu8CCW9387%252BA%253D">page 2</a>: " <span class="excerpt">... Terry <b>Pratchett</b> "Most of the news is...</span>" </td></tr> <tr><td class="excerptSeeMore"><a href="/gp/reader/0060815221/ref=sib_aps_ref/002-2019116-8269663?%5Fencoding=UTF8&keywords=pratchett&v=searchinside">See more references</a> to <span class="excerptUserInput">pratchett</span> in this book. </td></tr><td style="padding-top: 5px; padding-bottom: 8px; "><span style="font-weight: bold; color: #339933; ">Surprise me!</span> <a href="http://www.amazon.com/gp/reader/0060815221/ref=sib_aps_sup/002-2019116-8269663?%5Fencoding=UTF8&p=random">See a random page</a> in this book. </td></tr></table></td></tr></table> </td></tr></table></td> </tr>
Converting Records Cont. RedLightGreen: <td class="highlightcell"><span class="titleText"><b><a title="View more information about this title." href="ucw.servlets.UCWController?ACTION=EDITION&WORKID=21537371&LANGUAGE=ENG&MATERIAL=books&FROMRSLT=3&FROMWORK=1&lang=english">Hogfather</a></b>, by Terry Pratchett 3 editions published between 1996 and 1998 in English.<br>Primary Subject: Discworld Imaginary Place - Fiction <img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) &#xA; and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/><img src="/ucwprod/web/images/gray.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) &#xA; and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/></span></td></tr></table><table xmlns="http://www.w3.org/TR/REC-html40" border="0" cellpadding="0" cellspacing="0" width="100%"><tr><td class="recordsepcell" colspan="2"><img src="/ucwprod/web/images/clear.gif" height="1"/></td></tr></table><table xmlns="http://www.w3.org/TR/REC-html40" border="0" cellpadding="3" cellspacing="0" width="100%"><tr valign="top"><td width="25" align="right" class="highlightcell"><span class="titleText">2.</span></td>
Converting Records Cont. British Library: <item><title>Thud! / Terry Pratchett.</title> <link>http://catalogue.bl.uk/F/-?func=directdocset&doc_number=013220851&l_base=BLL01&from=A9OpenSearch</link> <description>Pratchett, Terry. ; London : Doubleday, 2005. . ISBN 0385608675 (hbk.) : £17.99. (Added : 20050614 )</description></item>
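Of the three responses, the British Library item is the only one that can be converted without scraping. A minimal sketch of mapping such an item onto a common record follows; the output field names and the `item_to_record` helper are hypothetical, the sample link is shortened, and its ampersand is escaped so that the fragment is well-formed XML.

```python
import xml.etree.ElementTree as ET

ITEM = """<item>
<title>Thud! / Terry Pratchett.</title>
<link>http://catalogue.bl.uk/F/-?func=directdocset&amp;doc_number=013220851</link>
<description>Pratchett, Terry. ; London : Doubleday, 2005.</description>
</item>"""


def item_to_record(item_xml):
    """Map one RSS <item> onto a small common record (a plain dict)."""
    item = ET.fromstring(item_xml)
    statement = item.findtext("title", default="").strip()
    # Catalogue titles of this form read "title / statement of responsibility."
    title, _, responsibility = statement.partition(" / ")
    return {
        "title": title.strip(),
        "creator": responsibility.strip().rstrip("."),
        "link": item.findtext("link", default="").strip(),
        "description": item.findtext("description", default="").strip(),
    }


record = item_to_record(ITEM)
print(record["title"], "by", record["creator"])
```

Splitting on " / " is a guess based on this one example; titles that lack a statement of responsibility simply come back with an empty creator.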
How do I combine results from multiple sources? Things you might want the server to do for you: – Common Record Format – Common Sort Order – Common Rank Order
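If the servers do not provide these for you, the client has to do it. The sketch below assumes every source has already been converted to a common record format (a dict with a `title` key, a hypothetical choice) and shows a client-side merge into one sort order.

```python
def merge_results(result_sets, key="title"):
    """Merge per-source record lists into one list sorted on a common key.

    result_sets maps a source name to a list of records in the common format.
    Each merged record is tagged with the source it came from.
    """
    merged = [
        dict(record, source=source)
        for source, records in result_sets.items()
        for record in records
    ]
    # casefold() gives a case-insensitive ordering; missing keys sort first.
    return sorted(merged, key=lambda record: record.get(key, "").casefold())


results = {
    "British Library": [{"title": "Thud!"}, {"title": "Hogfather"}],
    "RedLightGreen": [{"title": "hogfather"}, {"title": "Going Postal"}],
}
for record in merge_results(results):
    print(record["title"], "-", record["source"])
```

A common rank order is harder than a common sort order: relevance scores from different servers are not comparable, so a merge like this one cannot reproduce it.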
Functional Matrix – Request Record Starting Point – Request Number of Records – Request Record Schema – Defined Query Grammar – Specify Sort Order – Specify Ranking Order – Diagnostic Messages – XML Response – Record Count In Response – Records In Known Schema
The Old Solutions – Screen Scraping – Private APIs – Z39.50
Screen Scraping A query has to be generated and embedded in a CP-specific URL Code has to be written to examine the HTML returned by a CP Prone to breakage – Web sites change formatting frequently Every site is unique – Separate code has to be maintained for every site
Private APIs Often only a slight improvement over screen scraping Provides documentation on how to construct the URL Might provide documentation on how to construct the query Might guarantee a stable response format Still requires unique code for each site
Z39.50 Guarantees a standard request and response But… – Not HTTP or HTML • Binary encoding over raw TCP/IP – Complicated • 11 services • 7 extended services – Easy to be compliant and not interoperable – Unfriendly • The response to a protocol error was to drop the connection
Why Use A Standard API? Defined requests and responses Reusable code across sites Open Source code
The New Solutions – OpenSearch 1.1 – MXG (Levels 0-2) – SRU
OpenSearch 1.1 From Wikipedia – OpenSearch is a collection of technologies that allow publishing of search results in a format suitable for syndication. It is a way for search engines to publish their search results in a standard and accessible format.
OpenSearch 1.1 (cont.) Defines a Description Record with information about the CP – ShortName and LongName – Description – Tags – URL template Example: http://herbie.bl.uk:9080/opensearch.xml
OpenSearch 1.1 (cont.) URL Template – Server indicates how to specify OpenSearch request parameters – Parameters not specified in the template are unavailable – The only mandatory parameter is {searchTerms} <Url type="application/rss+xml" template="http://herbie.bl.uk:9080/cgi-bin/OSxml1.cgi/?q={searchTerms}&start={startIndex?}&records={count?}&format=rss" />
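Filling in such a template is mechanical, which is the point of publishing it. A minimal sketch, assuming the usual OpenSearch convention that a trailing `?` marks a parameter as optional and that an unsupplied optional parameter becomes an empty string; `expand_template` is a hypothetical helper and does not handle namespace-prefixed parameter names.

```python
import re
from urllib.parse import quote

TEMPLATE = (
    "http://herbie.bl.uk:9080/cgi-bin/OSxml1.cgi/"
    "?q={searchTerms}&start={startIndex?}&records={count?}&format=rss"
)


def expand_template(template, **params):
    """Substitute {name} and {name?} placeholders with URL-encoded values."""

    def substitute(match):
        name, optional = match.group(1), match.group(2) == "?"
        if name in params:
            return quote(str(params[name]), safe="")
        if optional:
            return ""  # optional and not supplied: leave the value empty
        raise KeyError(f"missing mandatory template parameter: {name}")

    return re.sub(r"\{(\w+)(\??)\}", substitute, template)


print(expand_template(TEMPLATE, searchTerms="huck finn", count=10))
```

The same function works unchanged for any CP that publishes a template, which is what makes the client code reusable across sites.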
OpenSearch 1.1 (cont.) Request Parameters – {searchTerms} – {count} – {startIndex} – {startPage} – {language} – {outputEncoding} – {inputEncoding}
OpenSearch 1.1 (cont.) Uses RSS 2.0 with a few extra elements for the response – RSS defines the title, description and link elements – OpenSearch adds the totalResults, startIndex, itemsPerPage, link and Query elements http://herbie.bl.uk:9080/cgi-bin/OSxml1.cgi/?q=levan&format=rss
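Because the response is plain RSS plus a handful of namespaced elements, a generic XML parser is enough to read it. The sketch below uses a hand-written sample document (not actual British Library output) and the standard OpenSearch 1.1 namespace; `parse_opensearch_rss` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

OS = "{http://a9.com/-/spec/opensearch/1.1/}"

SAMPLE = """<rss version="2.0" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <channel>
    <title>Sample results</title>
    <opensearch:totalResults>190</opensearch:totalResults>
    <opensearch:startIndex>1</opensearch:startIndex>
    <opensearch:itemsPerPage>10</opensearch:itemsPerPage>
    <item><title>Thud! / Terry Pratchett.</title><link>http://example.org/1</link></item>
  </channel>
</rss>"""


def parse_opensearch_rss(text):
    """Return the paging metadata and item titles from an OpenSearch RSS feed."""
    channel = ET.fromstring(text).find("channel")

    def number(name):
        value = channel.findtext(OS + name)
        return int(value) if value is not None else None

    return {
        "totalResults": number("totalResults"),
        "startIndex": number("startIndex"),
        "itemsPerPage": number("itemsPerPage"),
        "titles": [item.findtext("title") for item in channel.findall("item")],
    }


parsed = parse_opensearch_rss(SAMPLE)
print(parsed)
```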
Functional Matrix OS 1.1 – Request Record Starting Point ● – Request Number of Records ○ – Request Record Schema – Defined Query Grammar – Specify Sort Order – Specify Ranking Order – Diagnostic Messages – XML Response ○ – Record Count In Response ○ – Records In Known Schema ○ Key: ●==Full Support ○==Limited Support
Cool Feature The RSS mechanism in OpenSearch provides the ability to have persistent and periodic queries!
NISO MetaSearch XML Gateway MXG has been designed to provide a low implementation barrier to content providers that want to make their databases available to metasearch engines. Interoperability across content providers was explicitly not a goal of MXG.
MXG Levels of Support Level 0: Requests are simple URLs using any query grammar and responses are XML records Level 1: Adds a description record for the database Level 2: Supports a limited subset of a standard query grammar: CQL
MXG Request – Version (mandatory) – Query (mandatory) – StartRecord – MaximumRecords http://alcme.oclc.org/MXG/search/ORPubs?version=1.1&query="levan"&startRecord=1&maximumRecords=10
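Building such a request needs nothing beyond URL encoding. A minimal sketch, using the parameter spellings from the example URL on the slide; `mxg_url` is a hypothetical helper, not part of any MXG toolkit.

```python
from urllib.parse import urlencode


def mxg_url(base, query, start_record=None, maximum_records=None, version="1.1"):
    """Build an MXG request URL; version and query are the mandatory parameters."""
    params = {"version": version, "query": query}
    if start_record is not None:
        params["startRecord"] = start_record
    if maximum_records is not None:
        params["maximumRecords"] = maximum_records
    # urlencode percent-encodes the quotes and any spaces in the query.
    return base + "?" + urlencode(params)


url = mxg_url(
    "http://alcme.oclc.org/MXG/search/ORPubs",
    '"levan"',
    start_record=1,
    maximum_records=10,
)
print(url)
```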
MXG Response <?xml version="1.0"?> <searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/"> <version>1.1</version> <numberOfRecords>10</numberOfRecords> <records> … </records> <nextRecordPosition>1</nextRecordPosition> <echoedSearchRetrieveRequest> <version>1.1</version> <query>"stuff"</query> </echoedSearchRetrieveRequest> </searchRetrieveResponse>
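Reading this response is a matter of looking elements up by namespace and name. The sketch below parses a copy of the slide's example with the elided records left empty; `parse_response` and its output keys are hypothetical names.

```python
import xml.etree.ElementTree as ET

SRW = "{http://www.loc.gov/zing/srw/}"

RESPONSE = """<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <version>1.1</version>
  <numberOfRecords>10</numberOfRecords>
  <records></records>
  <nextRecordPosition>1</nextRecordPosition>
  <echoedSearchRetrieveRequest>
    <version>1.1</version>
    <query>"stuff"</query>
  </echoedSearchRetrieveRequest>
</searchRetrieveResponse>"""


def parse_response(xml_text):
    """Extract the record count, next position and echoed query."""
    root = ET.fromstring(xml_text)
    next_position = root.findtext(SRW + "nextRecordPosition")
    return {
        "number_of_records": int(root.findtext(SRW + "numberOfRecords")),
        "next_record_position": int(next_position) if next_position else None,
        "echoed_query": root.findtext(f"{SRW}echoedSearchRetrieveRequest/{SRW}query"),
        "records_returned": len(root.findall(f"{SRW}records/{SRW}record")),
    }


info = parse_response(RESPONSE)
print(info)
```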
MXG Response Records <record> <recordSchema>info:srw/schema/1/dc-v1.1</recordSchema> <recordPacking>xml</recordPacking> <recordData> … </recordData> <recordPosition>1</recordPosition> </record>
MXG Response recordData <srw_dc:dc xmlns="http://www.w3.org/TR/xhtml1/strict" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:srw_dc="info:srw/schema/1/dc-v1.1"> <dc:identifier>rrl1234</dc:identifier> <dc:title>Dog and Cat</dc:title> </srw_dc:dc>
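Because the record arrives in a known schema, converting it is a generic loop rather than site-specific scraping. A minimal sketch that collects the Dublin Core elements from a record like the one above; `dc_fields` is a hypothetical helper, and values are kept in lists because Dublin Core elements may repeat.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

RECORD_DATA = """<srw_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:srw_dc="info:srw/schema/1/dc-v1.1">
  <dc:identifier>rrl1234</dc:identifier>
  <dc:title>Dog and Cat</dc:title>
</srw_dc:dc>"""


def dc_fields(record_data_xml):
    """Return {element name: [values]} for every Dublin Core element found."""
    fields = {}
    for element in ET.fromstring(record_data_xml).iter():
        if element.tag.startswith(DC):
            name = element.tag[len(DC):]
            fields.setdefault(name, []).append((element.text or "").strip())
    return fields


fields = dc_fields(RECORD_DATA)
print(fields)
```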
MXG Error Messages <diagnostics> <diagnostic xmlns="http://www.loc.gov/zing/srw/diagnostic/"> <uri>info:srw/diagnostic/1/51</uri> <details>66ntqk</details> </diagnostic> </diagnostics> http://www.loc.gov/z3950/agency/zing/srw/diagnosticslist.html
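Structured diagnostics answer the earlier "did something go wrong?" question without guessing from page text. A minimal sketch that lists the diagnostics in a fragment like the one above; `diagnostics` is a hypothetical helper, and the optional `message` element is looked up in case a server supplies one.

```python
import xml.etree.ElementTree as ET

DIAG = "{http://www.loc.gov/zing/srw/diagnostic/}"

FRAGMENT = """<diagnostics>
  <diagnostic xmlns="http://www.loc.gov/zing/srw/diagnostic/">
    <uri>info:srw/diagnostic/1/51</uri>
    <details>66ntqk</details>
  </diagnostic>
</diagnostics>"""


def diagnostics(xml_text):
    """Return one dict per <diagnostic>; an empty list means no error was reported."""
    root = ET.fromstring(xml_text)
    return [
        {
            "uri": d.findtext(DIAG + "uri"),
            "details": d.findtext(DIAG + "details"),
            "message": d.findtext(DIAG + "message"),
        }
        for d in root.iter(DIAG + "diagnostic")
    ]


problems = diagnostics(FRAGMENT)
print(problems)
```

The `uri` identifies the kind of failure and can be looked up in the diagnostics list linked from the slide, so the same error handling works for every server.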
Functional Matrix MXG Level 0 – Request Record Starting Point ● – Request Number of Records ● – Request Record Schema ○ – Defined Query Grammar – Specify Sort Order – Specify Ranking Order – Diagnostic Messages ● – XML Response ● – Record Count In Response ● – Records In Known Schema ● Key: ●==Full Support ○==Limited Support
MXG Level 1 Add a description record for the database http://www.loc.gov/z3950/agency/zing/srw/explain.html http://alcme.oclc.org/MXG/search/ORPubs
Functional Matrix MXG Level 1 – Request Record Starting Point ● – Request Number of Records ● – Request Record Schema ● – Defined Query Grammar – Specify Sort Order – Specify Ranking Order – Diagnostic Messages ● – XML Response ● – Record Count In Response ● – Records In Known Schema ● Key: ●==Full Support ○==Limited Support
MXG Level 2 Support a limited subset of a standard query grammar: CQL Supports indexes and Booleans http://www.loc.gov/z3950/agency/zing/cql/ http://alcme.oclc.org/srw/search/ORPublications?version=1.1&query=dc.author=levan&maximumRecords=1
Functional Matrix MXG Level 2 – Request Record Starting Point ● – Request Number of Records ● – Request Record Schema ● – Defined Query Grammar ○ – Specify Sort Order – Specify Ranking Order – Diagnostic Messages ● – XML Response ● – Record Count In Response ● – Records In Known Schema ● Key: ●==Full Support ○==Limited Support
SRU MXG Level 2 Plus: – Full Query Grammar (CQL) – Full Sort Specification
CQL: Common Query Language Loosely based on CCL Search Boolean & Proximity Operators Index Sets & Indexes String Indexes vs. Keyword Indexes Truncation Characters ‘*’, ‘#’ & ‘?’ Relations: ‘=’, all, any, exact, within Example: dc.title="harry potter" or bib1.isbn=123-456-78x
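A client can assemble queries like the example above from (index, term) pairs. The sketch below is a simplified assumption about quoting: it quotes any term containing whitespace or characters that CQL treats specially, and backslash-escapes embedded double quotes. `cql_clause` is a hypothetical helper, not a full CQL implementation.

```python
SPECIAL = set(' \t"=<>/()')


def cql_clause(index, term, relation="="):
    """Build one CQL search clause such as dc.title="harry potter"."""
    if term == "" or any(ch in SPECIAL for ch in term):
        term = '"' + term.replace('"', '\\"') + '"'
    # Symbolic relations are written tight; word relations need spaces.
    if relation.isalpha():
        return f"{index} {relation} {term}"
    return f"{index}{relation}{term}"


query = " or ".join(
    [
        cql_clause("dc.title", "harry potter"),
        cql_clause("bib1.isbn", "123-456-78x"),
    ]
)
print(query)
```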
Sort sortKeys parameter with the following comma-separated values specified: – Xpath (path to the element to be sorted on) – Schema (that the xpath comes from) – Ascending (value is 1==true or 0==false, default==true) – CaseSensitive (value is 1==true or 0==false, default==false) – missingValue (values are omit, abort, highValue or lowValue, default==highValue) e.g. &sortKeys=title,onix,0
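The positional format above is easy to get wrong by hand, so a small builder helps. This sketch covers a single sort key only and assumes that trailing values left at their defaults may be omitted, as in the slide's own example; `sort_key` is a hypothetical helper.

```python
MISSING_VALUES = {"omit", "abort", "highValue", "lowValue"}


def sort_key(xpath, schema="", ascending=None, case_sensitive=None, missing_value=None):
    """Build one sortKeys value: xpath,schema,ascending,caseSensitive,missingValue."""
    if missing_value is not None and missing_value not in MISSING_VALUES:
        raise ValueError(f"unknown missingValue: {missing_value}")

    def flag(value):
        # None means "use the server default" and is left empty.
        return "" if value is None else ("1" if value else "0")

    fields = [xpath, schema, flag(ascending), flag(case_sensitive), missing_value or ""]
    # Drop trailing empty fields so defaults do not have to be spelled out.
    while fields and fields[-1] == "":
        fields.pop()
    return ",".join(fields)


print("&sortKeys=" + sort_key("title", "onix", ascending=False))
```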
Functional Matrix SRU – Request Record Starting Point ● – Request Number of Records ● – Request Record Schema ● – Defined Query Grammar ● – Specify Sort Order ● – Specify Ranking Order ○ – Diagnostic Messages ● – XML Response ● – Record Count In Response ● – Records In Known Schema ● Key: ●==Full Support ○==Limited Support
Cool Feature Combining SRU response data and echoed data with JavaScript and stylesheets allows for thin, browser-based clients http://alcme.oclc.org/MXG/search/ORPubs?version=1.1&query="levan"&startRecord=1&maximumRecords=10
Functional Matrix
Feature                          OS 1.1   MXG L0   MXG L1   MXG L2   SRU
Request Record Starting Point    ●        ●        ●        ●        ●
Request Number of Records        ○        ●        ●        ●        ●
Request Record Schema                     ○        ●        ●        ●
Defined Query Grammar                                       ○        ●
Specify Sort Order                                                   ●
Specify Ranking Order                                                ○
Diagnostic Messages                       ●        ●        ●        ●
XML Response                     ○        ●        ●        ●        ●
Record Count In Response         ○        ●        ●        ●        ●
Records In Known Schema          ○        ●        ●        ●        ●
Key: ●==Full Support ○==Limited Support