Скачать презентацию Data Models and Query Languages CSE 590 DB Скачать презентацию Data Models and Query Languages CSE 590 DB

f739bc816281ebcdb36c97b9ee75ee84.ppt

  • Количество слайдов: 21

Data Models and Query Languages CSE 590 DB, Winter 1999 Theory of Databases Zack Data Models and Query Languages CSE 590 DB, Winter 1999 Theory of Databases Zack Ives January 10, 1999 1

The Relational Model z Database consists of relations, where columns are attributes and rows The Relational Model z Database consists of relations, where columns are attributes and rows are tuples z Declarative queries using an algebraic or logic language (or SQL, which we shall ignore) New. Movies Name The Phantom Menace Patch Adams Stepmom attribute Rating PG PG-13 New. Movies(Name, Rating, Director) Director George Lucas Tom Shadyac Chris Columbus arity = 3, cardinality = 3 tuple 2

Schema z Each attribute in a relation has a type: New. Movies(Name: String, Rating: Schema z Each attribute in a relation has a type: New. Movies(Name: String, Rating: Rating. Domain, Director: String), where Rating. Domain = {G, PG-13, R, NC-17} (In many cases, we will omit the type, and assume the type most appropriate to the expression) z Relation schema: relation name, attributes, and types z Relation instance: a set of tuples for a given schema z Database schema: set of relation schemas z Database instance: relation instance for every relation schema 3

More on Tuples z Formally, a tuple is a mapping from attribute names to More on Tuples z Formally, a tuple is a mapping from attribute names to (correctly typed) values: y Director -> “George Lucas” y Name -> “Stepmom” z May specify a tuple using values & schema notation: New. Movies(“The Phantom Menace”, “PG”, “George Lucas”) z Can have constraints on attribute values: y Integrity constraints y Keys y Foreign keys y Functional dependencies 4

Relational Algebra All operators take one or more sets of tuples as input, and Relational Algebra All operators take one or more sets of tuples as input, and produce a new set as output y. Basic set operators ( , , —, but normally no complement) y. Selection ( ) y. Projection ( ) y. Cartesian Product ( ) y. Join ( ) y. Division ( ) 5

Set Operations z Sets must be compatible! y. Same number of attributes y. Same Set Operations z Sets must be compatible! y. Same number of attributes y. Same types/domains z R 1 R 2 = {all tuples in either R 1 or R 2} z R 1 R 2 = {all tuples in both R 1 and R 2} z Complement? ? ? z Negation: R 1 — R 2 = {all tuples in R 1 not in R 2} z Query without unions is conjunctive 6

Selection Chooses a subset of tuples satisfying a predicate New. Movies Name The Phantom Selection Chooses a subset of tuples satisfying a predicate New. Movies Name The Phantom Menace Patch Adams Stepmom Rating PG PG-13 Director George Lucas Tom Shadyac Chris Columbus Rating = “PG-13” (New. Movies) Name Patch Adams Stepmom Rating PG-13 Director Tom Shadyac Chris Columbus 7

Projection Chooses a subset of attributes, removes duplicates New. Movies Name The Phantom Menace Projection Chooses a subset of attributes, removes duplicates New. Movies Name The Phantom Menace Patch Adams Stepmom Rating PG PG-13 Director George Lucas Tom Shadyac Chris Columbus Name, Director (New. Movies) Name The Phantom Menace Patch Adams Stepmom Director George Lucas Tom Shadyac Chris Columbus 8

Cartesian Product z Combine two relations - all pairings of tuples New. Movie Name Cartesian Product z Combine two relations - all pairings of tuples New. Movie Name Phantom Menace Patch Adams Stepmom Director George Lucas Tom Shadyac Chris Columbus New. Movie Metro. Shows Name Director Phantom Menace George Lucas Patch Adams Tom Shadyac … … Metro. Shows Name Patch Adams Stepmom Patch Adams … Time 7: 00 9: 40 6: 50 9: 00 7: 00. . . 9

Join z Combine tuples from two relations by predicate. If predicate is equality, remove Join z Combine tuples from two relations by predicate. If predicate is equality, remove duplicate attribute. New. Movie Name Phantom Menace Patch Adams Stepmom Director George Lucas Tom Shadyac Chris Columbus Metro. Shows Name Patch Adams Stepmom Time 7: 00 9: 40 6: 50 9: 00 New. Movie. Name = Metro. Shows. Name Metro. Shows Name Patch Adams Stepmom Director Tom Shadyac Chris Columbus Time 7: 00 9: 40 6: 50 9: 00 Equivalent to: Name, Director, Time ( New. Movie. Name = Metro. Shows. Name (New. Movie Metro. Shows)) 10

Division z Only returns results from dividend which match attributes of all of tuples Division z Only returns results from dividend which match attributes of all of tuples in divisor (“for all”) Customers name Johnson Smith Lindsay Green branch Downtown Brighton Redwood Brighton Downtown Branches branch Brighton Downtown Customers Branches name Green 11

Logic-Based Query Languages z Tuple-relational calculus y. Based on first-order predicate calculus, but imposes Logic-Based Query Languages z Tuple-relational calculus y. Based on first-order predicate calculus, but imposes restrictions (e. g. , no function symbols) y. Equivalent in power to relational algebra z Datalog y. More powerful than tuple-relational calculus Supports recursion and thus transitive closure y. Equivalent to set of Horn clauses 12

Predicates & Atoms z Relations represented by predicates z Tuples represented by atoms: Flight(“Alon”, Predicates & Atoms z Relations represented by predicates z Tuples represented by atoms: Flight(“Alon”, “British Airways”, 1234, “Seattle”, “Israel”, “ 1/9/1999”, “ 10: 00”) z Arithmetic atoms: X < 100, X + Y + 5 > Z * 2 z In certain cases, negated atoms: not Flight(“Zack”, “British Airways”, 1234, “Seattle”, “Israel”, “ 1/9/1999”, “ 10: 00”) 13

Datalog Rules & Programs z “Pure” datalog rules: head : - atom 1, atom Datalog Rules & Programs z “Pure” datalog rules: head : - atom 1, atom 2, atom 3 where all atoms are non-negated and relational z Datalog program is set of datalog rules z Conjunctive query has single rule: Hanks. Director(D) : - Actor. In(“Hanks”, M) & Directed. By(D, M) z Disjunctive query: Hanks. Director(D) : - Actor. In(“Hanks”, M) & Directed. By(D, M) Hanks. Director(D) : - Produced. By(“Hanks”, M) & Directed. By(D, M) 14

Intension & Extension z Distinction between EDB & IDB: y. Extensional DB xwhat’s stored Intension & Extension z Distinction between EDB & IDB: y. Extensional DB xwhat’s stored in the database xcan only be in the body of a Datalog rule y. Intensional DB xresult of a datalog rule xcan be in head or body of a Datalog rule 15

Evaluating Rules z Evaluating rules: y. Start with EDB, iteratively derive facts for IDB’s Evaluating Rules z Evaluating rules: y. Start with EDB, iteratively derive facts for IDB’s y. Repeat until cannot derive any new facts: x. Consider every possible assignment from database to variables in body x. If each atom in body is made true by assignment, add tuple for the head into the head’s relation z Conjunctive queries are inexpensive z Disjunctive are more expensive (even though they’re Horn clauses) 16

Transitive Closure z Suppose we to know all possible destination airports reachable from Sea. Transitive Closure z Suppose we to know all possible destination airports reachable from Sea. Tac, given an EDB Flight(From, To): SFO SEA LAX LGA DEN JFK We need to express the query: Find all airports reachable from SEA 17

Recursion in Datalog z Program: Dest(X, Y) : - Flight(X, Y) Dest(X, Y) : Recursion in Datalog z Program: Dest(X, Y) : - Flight(X, Y) Dest(X, Y) : - Dest(X, Z), Flight(Z, Y) z Evaluate unknown # of times until fixpoint ç Flight: {(SEA, SFO), (SEA, LAX), (SEA, JFK), (LAX, LGA), (JFK, LGA), (LGA, DEN)} Dest 0: {} ç Dest 1: Dest 0 {(SEA, SFO), (SEA, LAX), (SEA, JFK)} ç Dest 2: Dest 1 {(SEA, LGA), (LAX, DEN), (JFK, DEN)} ç Dest 3: Dest 2 {(SEA, DEN)} ç No more changes, so stop (minimum fixpoint) 18

Built-in Predicates z Rules may include atoms with built-in predicates: Expensive(X) : - Product(X, Built-in Predicates z Rules may include atoms with built-in predicates: Expensive(X) : - Product(X, Y, P) & P > 100 z But need to restrict use of built-in atoms: P(X) : - R(X) & X < Y æ When is X < Y? z We require that every variable from a built-in atom appear in a relational atom 19

Negated Subgoals z Restrict forms: P(X, Y) : - Between(X, Y, Z) & NOT Negated Subgoals z Restrict forms: P(X, Y) : - Between(X, Y, Z) & NOT Direct(X, Z) z Bad: Q(X, Y) : - From(X) & NOT To(Y) z Bad but fixable: P(X) : - From(X) & NOT Connects(X, Y) z Rewrite as: Connects’(X) : - Connects(X, Y) P(X) : - End(X) & NOT Connects’(X) 20

Stratified/Stratisfiable Rules z Predicate P depends on predicate Q if Q appears negated in Stratified/Stratisfiable Rules z Predicate P depends on predicate Q if Q appears negated in a rule defining P z If cycle in dependency graph, program is not stratifiable: p(X) : - r(X) & NOT q(X) : - r(X) & NOT p(X) z Suppose r has tuple {1} Intuitively, stratification provides a way of executing program P as sequence of subprograms P 1 … Pn defining IDB relations w/o “forward references”. Results independent of stratification. 21