

CIS 550 Handout 7 -- XPATH and XQuery
CIS 550 Handout 7 Fall 2001

URLs -- XPath
• http://www.w3.org/TR/xpath
  This is the “recommendation”. Dense. Few examples. Difficult to extract the “big picture” from the morass of detail.
• http://www.zvon.org/xxl/XPathTutorial/General/examples.html
  A tutorial with some simple examples. Maybe too simple. There are lots of tutorials on the web.

URLs -- XQuery
• http://www.w3.org/TR/xquery/
  The basic recommendation. Plenty of examples, so work through these first.
• http://www.w3.org/TR/query-semantics/
  A formal semantics for XQuery. Despite its forbidding title, it is remarkably readable. It also discusses a type system for XQuery.
• http://www.w3.org/TR/xmlquery-use-cases
  A bunch of example queries and their solutions in XQuery (not surprising, since XQuery is Turing-complete!)

How to Identify Nodes in a Tree -- Regular Path Expressions
[Figure: tree rooted at db, with element labels depts, dept, mgr, name, emps, emp and leaf values “Mary”, “John”, “Bill”]
In the normal syntax of regular expressions:
• db.emps.emp
• db._*.name
• db.(depts.dept.mgr|emps.emp)
N.B. Regular path expressions have nothing to do with regular expressions in DTDs.
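One way to get a feel for regular path expressions is to treat every root-to-node label path as a string and match it with an ordinary regular expression. The sketch below does that in Python. The list of paths is an assumption about the shape of the slide's tree, and the translation of `_` into `[^.]+` is one illustrative encoding, not part of any standard.

```python
import re

# Hypothetical root-to-node label paths for a tree like the one on the slide
# (the exact shape of the slide's tree is an assumption).
PATHS = [
    "db",
    "db.depts",
    "db.depts.dept",
    "db.depts.dept.mgr",
    "db.depts.dept.mgr.name",
    "db.emps",
    "db.emps.emp",
    "db.emps.emp.name",
]

def select(pattern):
    """Return the label paths matched in full by the given regular expression."""
    regex = re.compile(pattern)
    return [p for p in PATHS if regex.fullmatch(p)]

# db.emps.emp
emp_paths = select(r"db\.emps\.emp")
# db._*.name   ("_" stands for any single label)
name_paths = select(r"db(\.[^.]+)*\.name")
# db.(depts.dept.mgr|emps.emp)
mgr_or_emp_paths = select(r"db\.(depts\.dept\.mgr|emps\.emp)")
```

Matching whole paths with `fullmatch` mirrors the idea that an expression identifies a node by the complete sequence of labels leading to it.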

More examples
With the DTD:
  <!ELEMENT PERSON (NAME, FATHER, MOTHER)>
  <!ELEMENT MOTHER (PERSON?)>
  …
the regular path expression (PERSON.MOTHER)* identifies matrilineal ancestry.
XPATH is a “superset of a subset” of regular path expressions. (It cannot express this set of nodes.) However, it is not limited to moving “down” the tree.

XPath
• Primary goal = to permit access to some nodes from a given document
• XPath main construct: axis navigation
• An XPath path consists of one or more navigation steps, separated by /
• A navigation step is a triplet: axis + node-test + list of predicates
• Example
  – /descendant::node()/child::author[parent/attribute::booktitle = "XML"][2]
• XPath also offers some shortcuts
  – no axis means child
  – // ≡ /descendant-or-self::node()/
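The "axis + node-test + list of predicates" triplet can be made concrete with a small parser for a single step. This is a minimal sketch, not a full XPath parser: the name `parse_step` is hypothetical, and it assumes predicates contain no nested square brackets.

```python
import re

STEP = re.compile(
    r"^(?:(?P<axis>[a-z-]+)::)?"      # optional axis, e.g. child::
    r"(?P<test>[^\[\]]+)"             # node test, e.g. author or node()
    r"(?P<preds>(?:\[[^\[\]]*\])*)$"  # zero or more [predicate] groups
)

def parse_step(step):
    """Split one navigation step into (axis, node test, list of predicates)."""
    match = STEP.match(step)
    if match is None:
        raise ValueError(f"not a simple navigation step: {step!r}")
    axis = match.group("axis") or "child"   # no axis means child
    predicates = re.findall(r"\[([^\[\]]*)\]", match.group("preds"))
    return axis, match.group("test"), predicates
```

For instance, `parse_step("author")` yields the `child` axis with no predicates, which is the "no axis means child" shortcut.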

XPath -- child axis navigation
• author is shorthand for child::author. Examples:
  – aaa -- all the child nodes labeled aaa (1, 3)
  – aaa/bbb -- all the bbb grandchildren of aaa children (4)
  – */bbb -- all the bbb grandchildren of any child (4, 6)
  – . -- the context node
  – / -- the root node
[Figure: tree below the context node, with nodes numbered 1 to 7 and labeled aaa, bbb, or ccc]
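The three child-axis examples can be tried with Python's `xml.etree.ElementTree`, which supports a limited subset of XPath. The tree below is an assumption: it is one shape that reproduces the node numbers quoted on the slide, not necessarily the slide's actual figure. The `n` attribute and the `numbers` helper are hypothetical, added only to identify nodes.

```python
import xml.etree.ElementTree as ET

# Hypothetical tree chosen to reproduce the node numbers quoted on the slide;
# the shape of the slide's actual figure is an assumption.
context = ET.fromstring(
    "<ctx>"
    "<aaa n='1'><bbb n='4'/><ccc n='5'/></aaa>"
    "<ccc n='2'><bbb n='6'/></ccc>"
    "<aaa n='3'><ccc n='7'/></aaa>"
    "</ctx>"
)

def numbers(path):
    """Evaluate a path from the context node and return the matched node numbers."""
    return [int(e.get("n")) for e in context.findall(path)]

aaa_children = numbers("aaa")          # child::aaa
bbb_under_aaa = numbers("aaa/bbb")     # child::aaa/child::bbb
bbb_under_any = numbers("*/bbb")       # child::*/child::bbb
```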

XPath -- child axis navigation (cont)
  – /doc -- all the doc children of the root
  – ./aaa -- all the aaa children of the context node (equivalent to aaa)
  – text() -- all the text children of the context node
  – node() -- all the children of the context node (includes text nodes; attribute nodes are not children, so they are not included)
  – .. -- parent of the context node
  – .// -- the context node and all its descendants
  – // -- the root node and all its descendants
  – //para -- all the para nodes in the document
  – //text() -- all the text nodes in the document
  – @font -- the font attribute node of the context node

Predicates
  – [2] -- the second child node of the context node
  – chapter[5] -- the fifth chapter child of the context node
  – [last()] -- the last child node of the context node
  – chapter[title="introduction"] -- the chapter children of the context node that have one or more title children whose string-value is "introduction" (the string-value is the concatenation of all the text on descendant text nodes)
  – person[.//firstname = "joe"] -- the person children of the context node that have in their descendants a firstname element with string-value "joe"
  – From the XPath specification: NOTE: If $x is bound to a node-set, then $x = "foo" does not mean the same as not($x != "foo").
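The note from the specification follows from the fact that comparisons against a node-set are existential: `=` holds if some node matches, and `!=` holds if some node differs. A minimal Python sketch, modeling a node-set as a list of string-values (the function names are hypothetical):

```python
def node_set_eq(node_set, value):
    """XPath-style '=': true if SOME node's string-value equals value."""
    return any(s == value for s in node_set)

def node_set_ne(node_set, value):
    """XPath-style '!=': true if SOME node's string-value differs from value."""
    return any(s != value for s in node_set)

x = ["foo", "bar"]   # string-values of a hypothetical node-set bound to $x
empty = []           # an empty node-set
```

With `x` as above, both `x = "foo"` and `x != "foo"` are true, so `not(x != "foo")` is false even though `x = "foo"` is true. For the empty node-set the two expressions disagree in the other direction.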

Unions of Path Expressions
• employee | consultant -- the union of the employee and consultant nodes that are children of the context node
• For some reason person/(employee|consultant) -- as in regular path expressions -- is not allowed
• However person/node()[boolean(employee|consultant)] is allowed!!
• From the XPATH specification:
  – The boolean function converts its argument to a boolean as follows:
    • a number is true if and only if it is neither positive or negative zero nor NaN
    • a node-set is true if and only if it is non-empty
    • a string is true if and only if its length is non-zero
    • an object of a type other than the four basic types is converted to a boolean in a way that is dependent on that type
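The conversion rules quoted from the specification translate almost line by line into code. The sketch below is an illustration in Python, assuming a node-set is modeled as a list or tuple; `xpath_boolean` is a hypothetical name.

```python
import math

def xpath_boolean(value):
    """Convert a value to a boolean following the rules quoted above."""
    if isinstance(value, bool):            # already a boolean
        return value
    if isinstance(value, (int, float)):    # number: not zero and not NaN
        return value != 0 and not math.isnan(value)
    if isinstance(value, str):             # string: non-zero length
        return len(value) > 0
    if isinstance(value, (list, tuple)):   # node-set modeled as a sequence
        return len(value) > 0
    raise TypeError("conversion for this type is type-dependent")
```

Note that the string "0" converts to true, since only its length matters.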

Axis navigation
• So far, nearly all our expressions have moved us down the tree by moving to child nodes. Exceptions were
  – . -- stay where you are
  – / -- go to the root
  – // -- all descendants of the root
  – .// -- all descendants of the context node
• All other expressions have been abbreviations for child::… e.g. child::para. child:: is an example of an axis
• XPath has several axes: ancestor, ancestor-or-self, attribute, child, descendant, descendant-or-self, following, following-sibling, namespace, parent, preceding, preceding-sibling, self
  – Some of these (self, parent) describe single nodes, others describe sequences of nodes.
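To see what an axis computes, here is a minimal Python sketch of four axes over a tree with parent pointers. The `Node` class and the sample tree are assumptions made for illustration. The sibling axes are returned in document order here, whereas XPath numbers positions on reverse axes from the context node backwards.

```python
class Node:
    """A labeled tree node that knows its parent and its children."""
    def __init__(self, label, children=()):
        self.label = label
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self

def ancestor(node):
    """Nodes on the ancestor axis, nearest ancestor first."""
    result = []
    while node.parent is not None:
        node = node.parent
        result.append(node)
    return result

def descendant(node):
    """Nodes on the descendant axis, in document order."""
    result = []
    for child in node.children:
        result.append(child)
        result.extend(descendant(child))
    return result

def following_sibling(node):
    """Siblings after the node, in document order."""
    if node.parent is None:
        return []
    siblings = node.parent.children
    return siblings[siblings.index(node) + 1:]

def preceding_sibling(node):
    """Siblings before the node, in document order."""
    if node.parent is None:
        return []
    siblings = node.parent.children
    return siblings[:siblings.index(node)]

def labels(nodes):
    return [n.label for n in nodes]

# Hypothetical tree: root has children a, b, c; b has one child d.
d = Node("d")
a, b, c = Node("a"), Node("b", [d]), Node("c")
root = Node("root", [a, b, c])
```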

XPath Navigation Axes (thanks to Arnaud Sahuguet)
[Figure: diagram of the axes ancestor, preceding-sibling, following-sibling, self, child, preceding, attribute, following, namespace, descendant]

XPath abbreviated syntax
  (nothing)   child::
  @           attribute::
  //          /descendant-or-self::node()/
  ..          parent::node()
  /           (document root)
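The abbreviations can be expanded mechanically, one step at a time. The following Python sketch does so under stated assumptions: `expand` is a hypothetical helper, it only handles paths without predicates (a predicate containing / would break the naive split), and the rule that . stands for self::node() is taken from the XPath 1.0 recommendation.

```python
def expand(path):
    """Expand an abbreviated location path (without predicates) step by step."""
    if path == "/":
        return "/"                      # the document root on its own
    absolute = path.startswith("/")
    steps = path.split("/")
    if absolute:
        steps = steps[1:]               # drop the empty string before the leading /
    expanded = []
    for step in steps:
        if step == "":                  # an empty step comes from //
            expanded.append("descendant-or-self::node()")
        elif step == ".":
            expanded.append("self::node()")
        elif step == "..":
            expanded.append("parent::node()")
        elif step.startswith("@"):
            expanded.append("attribute::" + step[1:])
        elif "::" in step:              # already has an explicit axis
            expanded.append(step)
        else:                           # no axis means child
            expanded.append("child::" + step)
    return ("/" if absolute else "") + "/".join(expanded)
```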

XPath
• Reasonably widely adopted -- in XML-Schema and query languages.
• Neither more expressive nor less expressive than regular path expressions (can’t do (ab)*)
• Particularly messy in some areas:
  – defining order of results
  – overloading of operations,
    • e.g. [chapter/title = "Introduction"]
    • why not ["Introduction" IN chapter/title] ?

XQuery
proposed by Chamberlin, Robie and Florescu (from the authors’ slides)
• Leverage the most effective features of several existing and proposed query languages
• Design a small, clean, implementable language
• Cover the functionality required by all the XML Query use cases in a single language
• Write queries that fit on a slide

XQuery = XPath + “comprehension” syntax
• XML-QL
    where <pattern> in …     (bind variables)
    construct …              (use variables)
• Quilt
    for x in …, y in …       (bind variables)
    where …
    return …                 (use variables)

Examples from XQuery
List the titles of books published by Morgan Kaufmann in 1998.
  FOR $b IN document("bib.xml")//book
  WHERE $b/publisher = "Morgan Kaufmann"
    AND $b/year = "1998"
  RETURN $b/title
(XPath expressions are shown in orange on the original slide.)
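The FOR / WHERE / RETURN shape reads much like a list comprehension, which is the sense of "comprehension syntax". The Python sketch below mirrors the query with `xml.etree.ElementTree`. The contents of `BIB` are invented sample data standing in for bib.xml, and `findtext` looks only at the first matching child, which is a simplification of XPath's existential comparison.

```python
import xml.etree.ElementTree as ET

# Invented sample data standing in for bib.xml (not from the original slides).
BIB = ET.fromstring("""
<bib>
  <book><title>Book A</title><publisher>Morgan Kaufmann</publisher><year>1998</year></book>
  <book><title>Book B</title><publisher>Morgan Kaufmann</publisher><year>1999</year></book>
  <book><title>Book C</title><publisher>Other Press</publisher><year>1998</year></book>
</bib>
""")

titles = [
    b.findtext("title")                                  # RETURN $b/title
    for b in BIB.iter("book")                            # FOR $b IN ...//book
    if b.findtext("publisher") == "Morgan Kaufmann"      # WHERE ...
    and b.findtext("year") == "1998"                     #   AND ...
]
```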

Examples from XQuery (cont)
List each publisher and the average price of its books.
  FOR $p IN distinct(document("bib.xml")//publisher)
  LET $a := avg(document("bib.xml")//book[publisher = $p]/price)
  RETURN {$p/text()} {$a}
LET binds a variable to a value. It does not cause an iteration.
Does this create a (well-formed) XML document?
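The difference between FOR and LET shows up clearly when the query is written out as a loop: FOR runs the body once per publisher, while LET binds the whole collection of prices in one go. A minimal Python sketch over invented sample data (the publisher names and prices are assumptions):

```python
import xml.etree.ElementTree as ET

# Invented sample data standing in for bib.xml.
BIB = ET.fromstring(
    "<bib>"
    "<book><publisher>P1</publisher><price>10</price></book>"
    "<book><publisher>P2</publisher><price>40</price></book>"
    "<book><publisher>P1</publisher><price>30</price></book>"
    "</bib>"
)

# distinct(...//publisher): distinct publisher names, in document order
publishers = list(dict.fromkeys(p.text for p in BIB.iter("publisher")))

averages = {}
for p in publishers:                        # FOR: one iteration per publisher
    prices = [float(b.findtext("price"))    # LET: bind the whole list at once
              for b in BIB.iter("book")
              if b.findtext("publisher") == p]
    averages[p] = sum(prices) / len(prices)
```

Written this way, the document is rescanned once per publisher, so a naive evaluation does work proportional to publishers times books.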

Examples from XQuery (cont)
List the publishers who have published more than 100 books.
  { FOR $p IN distinct(document("bib.xml")//publisher)
    LET $b := document("bib.xml")//book[publisher = $p]
    WHERE count($b) > 100
    RETURN $p }
What about efficiency?

Examples from XQuery (cont)
Invert the structure of the input document so that each distinct author element contains a sequence of book-titles.
  { FOR $a IN distinct(document("bib.xml")//author)
    RETURN {$a/text()}
      { FOR $b IN document("bib.xml")//book[author = $a]
        RETURN $b/title } }
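The inversion amounts to grouping titles by author. The Python sketch below produces the same grouping in a single pass over the books, which is a different evaluation strategy from the nested FOR loops of the query. The sample data is invented.

```python
import xml.etree.ElementTree as ET

# Invented sample data standing in for bib.xml.
BIB = ET.fromstring(
    "<bib>"
    "<book><title>T1</title><author>Ann</author><author>Bob</author></book>"
    "<book><title>T2</title><author>Bob</author></book>"
    "</bib>"
)

titles_by_author = {}
for book in BIB.iter("book"):
    for author in book.findall("author"):
        # Each distinct author collects the titles of the books they appear in.
        titles_by_author.setdefault(author.text, []).append(book.findtext("title"))
```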

More Examples (Quilt)
(from http://db.cis.upenn.edu/Kweelt/useCases/R/Q1.qlt)
Relational data -- two DTDs: …

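The data
<items>
  <item_tuple>
    <itemno>1001</itemno>
    <description>Red Bicycle</description>
    <offered_by>U01</offered_by>
    <start_date>1999-01-05</start_date>
    <end_date>1999-01-20</end_date>
    <reserve_price>40</reserve_price>
  </item_tuple>
  <item_tuple>
    <itemno>1002</itemno>
    <description>Motorcycle</description>
    <offered_by>U02</offered_by>
    <start_date>1999-02-11</start_date>
    <end_date>1999-03-15</end_date>
    <reserve_price>500</reserve_price>
  </item_tuple>
</items>
<bids>
  <bid_tuple>
    <userid>U02</userid>
    <itemno>1001</itemno>
    <bid>35</bid>
    <bid_date>99-01-07</bid_date>
  </bid_tuple>
  <bid_tuple>
    <userid>U04</userid>
    <itemno>1001</itemno>
    <bid>40</bid>
    <bid_date>99-01-08</bid_date>
  </bid_tuple>
</bids>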

Query 1
  FUNCTION date() { "1999-02-01" }
  (
    FOR $i IN document("items.xml")//item_tuple
    WHERE $i/start_date LEQ date()
      AND $i/end_date GEQ date()
      AND contains($i/description, "Bicycle")
    RETURN
      $i/itemno ,
      $i/description
    SORTBY (itemno)
  )
Notes:
• simple function definitions
• dates are formatted so that lexicographic ordering gives the right result
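The remark about lexicographic ordering can be checked directly: with zero-padded year-month-day strings, plain string comparison agrees with date order. The Python sketch below mirrors Query 1 on that basis. The start and end dates for items 1003 and 1007 are invented, since only item 1001's dates appear in the data shown.

```python
import xml.etree.ElementTree as ET

# Item 1001 uses the dates from the slides; the dates for 1003 and 1007 are invented.
ITEMS = ET.fromstring(
    "<items>"
    "<item_tuple><itemno>1007</itemno><description>Racing Bicycle</description>"
    "<start_date>1999-01-20</start_date><end_date>1999-02-20</end_date></item_tuple>"
    "<item_tuple><itemno>1001</itemno><description>Red Bicycle</description>"
    "<start_date>1999-01-05</start_date><end_date>1999-01-20</end_date></item_tuple>"
    "<item_tuple><itemno>1003</itemno><description>Old Bicycle</description>"
    "<start_date>1999-01-05</start_date><end_date>1999-02-05</end_date></item_tuple>"
    "</items>"
)

TODAY = "1999-02-01"   # plays the role of FUNCTION date()

result = sorted(                                       # SORTBY (itemno)
    (i.findtext("itemno"), i.findtext("description"))
    for i in ITEMS.iter("item_tuple")
    # String comparison is valid only because dates are zero-padded YYYY-MM-DD.
    if i.findtext("start_date") <= TODAY <= i.findtext("end_date")
    and "Bicycle" in i.findtext("description")
)
```

The trick fails as soon as formats are mixed, for example a two-digit year such as 99-01-07 compared against 1999-02-01.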

Output from Q1
  1003  Old Bicycle
  1007  Racing Bicycle

Query Q2
For all bicycles, list the item number, description, and highest bid (if any), ordered by item number.
  (
    FOR $i IN document("items.xml")//item_tuple
    LET $b := document("bids.xml")//bid_tuple[itemno = $i/itemno]
    WHERE contains($i/description, "Bicycle")
    RETURN
      $i/itemno ,
      $i/description ,
      IF ($b) THEN NumFormat("#####.##", max(-1, $b/bid)) ELSE ""
    SORTBY (itemno)
  )
Note: lots of coercion
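Q2 is in effect an outer join followed by an aggregate: every bicycle is kept, whether or not it has bids. The Python sketch below shows the same logic on plain tuples. The rows for items 1001 and 1002 and the two bids follow the data excerpt; item 1008 without bids is added as an assumption. Because only the excerpt is used, the highest bid computed here is not the one in the full data set.

```python
# (itemno, description) rows; 1008 is a hypothetical item with no bids.
items = [("1001", "Red Bicycle"), ("1002", "Motorcycle"), ("1008", "Broken Bicycle")]
# (itemno, bid) rows, already converted to numbers to avoid implicit coercion.
bids = [("1001", 35.0), ("1001", 40.0)]

rows = []
for itemno, description in sorted(items):              # SORTBY (itemno)
    if "Bicycle" not in description:                   # WHERE contains(...)
        continue
    item_bids = [amount for no, amount in bids if no == itemno]   # LET $b := ...
    high_bid = max(item_bids) if item_bids else None   # IF ($b) THEN max ELSE ""
    rows.append((itemno, description, high_bid))
```

Keeping `None` for items without bids makes the "if any" case explicit instead of relying on an empty string.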

Output from Q2
<result> containing one <item_tuple> per bicycle:
  itemno  description      high_bid
  1001    Red Bicycle      55
  1003    Old Bicycle      20
  1007    Racing Bicycle   225
  1008    Broken Bicycle   (none)

Query Q3
Find cases where a user with a rating worse (alphabetically greater) than "C" offers an item with a reserve price of more than 1000.
  (
    FOR $u IN document("users.xml")//user_tuple,
        $i IN document("items.xml")//item_tuple
    WHERE $u/rating GT 'C'
      AND $i/reserve_price GT 1000
      AND $i/offered_by = $u/userid
    RETURN
      $u/name/text(),
      $u/rating/text(),
      $i/description/text(),
      $i/reserve_price
  )
Notes:
• Comparing sets with singletons. Same rules as in XPath?
• In this case the DTD gives uniqueness