
c8d89b26418cb3ef01b9e51d83805d34.ppt
- Количество слайдов: 23
SQL queries subqueries and joining RHS – SOC
SQL query – beyond one table • So far, we have only applied queries to a single table • It is possible – and very useful – to make queries applied to several tables • Can answer more complex questions… • …but queries also tend to become more complex themselves RHS – SOC 2
SQL query – beyond one table • We use a movie information database as example Movie movieid title country prodyear genre oscars Actor Casting movieid actorid RHS – SOC actorid name country birth living oscars 3
SQL query – beyond one table actorid name country birth 1 John Wayne USA 04 -06 -1911 No 2 2 Chow-Yun Fat HK 12 -11 -1959 Yes 0 3 Karl Stegger Denmark 30 -01 -1922 No 0 4 Jean Reno France 10 -10 -1949 Yes 0 5 Julia Roberts USA 28 -09 -1964 Yes 1 6 Natalie Portman Israel 02 -07 -1980 Yes 0 7 Marc Duret France 01 -08 -1966 Yes 1 RHS – SOC living oscars 4
SQL query – beyond one table movieid actorid 1 1 4 4 6 1 5 5 2 1 7 5 4 2 4 5 6 2 6 6 5 2 7 6 6 3 1 7 2 3 2 7 4 3 3 7 5 RHS – SOC 5
SQL query – subqueries • Suppose we wish to answer a (complex) question like this: • How many movies are from the country that Marc Duret is from • This question cannot be answered by a single query (as we know it…) • Information is spread out in multiple tables RHS – SOC 6
SQL query – subqueries SELECT country FROM Actor WHERE name = ’Marc Duret’ Result is ’France’ SELECT COUNT(title) AS Movie. Count FROM Movie WHERE Movie. country = ’France’ used as input NB ! RHS – SOC 7
SQL query – subqueries • We can run these two queries ”manually”, or use output from first query as input to second query – not optimal… • Better to make first query a subquery to the second query RHS – SOC 8
SQL query – subqueries Outer query SELECT COUNT(title) AS Movie. Count FROM Movie WHERE country = (SELECT country FROM Actor WHERE name = ’Marc Duret’) Inner query RHS – SOC 9
SQL query – subqueries • Result from inner query (the subquery) is used as input to outer query • The inner query produces a result table – just as any other query – which is then used by the outer query • Very convenient – but some complications – Name clash – Multiple results RHS – SOC 10
SQL query – subqueries • When we query more than one table, field names from different tables might be identical – a name clash • We can qualify a field name by prefixing it with the table name – Movie. country (country field in Movie table) – Actor. country (country field in Actor table) RHS – SOC 11
SQL query – subqueries • What if a subquery produces more than one result? SELECT country FROM Actor WHERE (oscars > 0) RHS – SOC Result: USA France 12
SQL query – subqueries • In such case, we must use the IN keyword SELECT COUNT(title) AS Movie. Count FROM Movie WHERE country IN (SELECT country FROM Actor WHERE (oscars > 0)) Try formulating this query in plain English… RHS – SOC 13
Exercise 6 – SQL queries • • Use the Movie. Information database, defined in exercise 1 Add records to the Actor table, as defined in the presentation Add records to the Casting table, as defined in the presentation With the data in place, run the below queries on the database – SELECT title, prod. Year, oscars FROM Movie WHERE oscars > (SELECT SUM(oscars)/5 FROM Movie) – SELECT * FROM Movie WHERE prod. Year > (SELECT MAX(prod. Year) FROM Movie WHERE (oscars > 0)) • Now formulate queries yourself, in order to retrieve the below data: – Get Name and Country for actors from the same country that ’E. T’ was made in – Get Title and Oscars for movies that have won at least as many Oscars as John Wayne has won – Get Name and Birth for actors having won more than twice the average of Oscars won for all actors RHS – SOC 14
SQL query – joining • Another approach to multi-table queries is to use join • To ”join” tables is to make a kind of ”multiplication” of the tables • Simple case: Join without any conditions: SELECT * FROM Movie, Actor RHS – SOC 15
SQL query – joining • If the previous query is run against the Movie. Information database, we get a result with 49 rows… • 49 = 7 x 7. All combinations of Movie records (7) and Actor records (7) • All fields from both tables are included • This is rarely what we want… RHS – SOC 16
SQL query – joining • In almost all cases, we wish to pick out those rows where certain fields match • Example: For each movie, find the title of the movie, and the names of the actors appearing in the movie • We already have this information in the Casting table (20 records), but only as identifiers… RHS – SOC 17
SQL query – joining • We clearly need information from all three tables in the Movie. Information database, so first attempt at a query could be: SELECT * FROM Movie, Casting, Actor • Result: 980 records with a lot of fields… RHS – SOC 18
SQL query – joining • First of all, we only need the Title field from the Movie table, and the Name field from the Actor table SELECT Movie. Title, Actor. Name FROM Movie, Casting, Actor • Result: 980 records with correct fields RHS – SOC 19
SQL query – joining • Next, we only need those records where the relevant information match • We are trying to ”substitute” identifiers in the Casting table with real data (titles and names) • In order to get the right titles and names, the identifiers for this data must match! RHS – SOC 20
SQL query – joining • Match identifier for titles and names: SELECT Movie. Title, Actor. Name FROM Movie, Casting, Actor WHERE ((Movie. Id = Casting. Movie. Id) AND (Actor. Id = Casting. Actor. Id)) • Result: 20 records with correct fields! RHS – SOC 21
SQL query – joining • This is a quite common ”pattern” for multitable queries – Tables representing relations only contain identifiers (keys in other tables) – ”Real data” is contained in tables representing entities (identifiers are keys) – Obtaining ”real data” for relations requires use of join, with matching of key fields RHS – SOC 22
Exercise 7 – SQL queries • • Use the Movie. Information database, defined in exercise 1, with data in all three tables With the data in place, run the below queries on the database 1) SELECT Movie. TITLE, COUNT(Actor. Id) AS Roles FROM Movie, Actor, Casting WHERE (Actor. COUNTRY = 'France') AND (Casting. ACTORID = Actor. ACTORID) AND (Casting. MOVIEID = Movie. MOVIEID) GROUP BY Movie. TITLE 2) SELECT Movie. Genre, SUM(Actor. Oscars) AS Total. Oscars FROM Movie, Actor, Casting WHERE (Actor. Living = 'Yes') AND (Casting. ACTORID = Actor. ACTORID) AND (Casting. MOVIEID = Movie. MOVIEID) GROUP BY Movie. Genre • Now formulate queries yourself, in order to retrieve the below data: – Get actor name, country and movie title for those actors appearing in a movie made in their own country – Get actor name and movie title for those actors that have not won an Oscar, but have appeared in Oscar-winning movies RHS – SOC 23
c8d89b26418cb3ef01b9e51d83805d34.ppt