0b4968edc9694f8fee839e977dec3de8.ppt
- Количество слайдов: 50
Introduction to databases Bent Thomsen
Who uses databases? • Everybody! – Every time you use your credit card, you (indirectly) use a database – When you book a flight you use a database – When you take out insurance or a mortgage, you use a database – When you contact public services – When you buy something online
Classes of database users • Workers on the scene – End user: The person who makes queries, updates and extracts reports – Database administrator: The person responsible for managing the database system – Database designer – Application programmer: Design and implement transactions for end-users
Database users (cont. ) • Workers behind the scene – Database designers and implementers – Tool developers – Operators and maintenance personnel • Run and maintain the hardware and software
Types of databases • Large database systems – Oracle, DB 2, Informix, Ingres, MS SQLServer – Huge amounts of data – high throughput • Small(ish) database systems – Excell Data. Lists – My. SQL – MS Access – Small companies, clubs, scientific data
When may you need/want to use a database? • To access data relevant to your project – E. g. KMS database of roads in Denmark • To store and retrieve data from scientific experiments • To store, organize and retrieve data from questionnaires • To organize your reference library • To manage the membership of your sports club
Concepts of Database – A database is a collection of records that are organized for a particular purpose – A student record book with addresses and grades – An inventory with software, books, and hardware – A field is a basic fact (or data element). • Name, address, … – A record is a set of fields. • A student, a book – A table is a set of records. – A database consists of one or more tables. – A primary key is a field that identify a single record. • In the primary key field, every record has a unique number. (Student ID Number)
Parts of a database Record Tables Attribute/Field • Records become “rows” • Attributes/fields become “columns” • Rules determine the relationship between the tables and tie the data together to form a database
Creating a database • What information are we trying to store? • How do we describe the information? • Phone Book/Contact entries – – – – – Name Address Company Phone Number URL/Web Page Age Height (in meters) Birthday When we added the entry
Data Types • Binary – Database specific binary objects – Pictures, digital signatures, etc. • Boolean – True/False values • Character – Fixed width or variable size • Numeric – Integer, Real (floating decimal point), Money • Temporal – Time, Date, Timestamp
Phone Book/Contact Record Name Address Company Phone Number URL/Web Page Age Height Birthday When we added the entry Character Character Integer Real (float) Date Timestamp
An example Contacts Name Company Address Phone 1 Phone 2 Phone 3 Zip. Code Joe ABC 123 5532 2234 3211 12345 Jane XYZ 456 3421 Chris PDQ 789 2341 14454 6655 14423
Example (cont. ) Contacts Id Name Company Address Phone Zip. Code 1 Joe ABC 123 5532 12345 1 Joe ABC 123 2234 12345 1 Joe ABC 123 3211 12345 2 Jane XYZ 456 3421 14454 3 Chris PDQ 789 2341 14423 3 Chris PDQ 789 6655 14423 Could be a Data. List table in Excell!
What is a relational database? • Originally developed by E. F. Codd in 1970 • Organizes data into tables where each item is a row and the attributes of the item are in columns. • Different from “flat file” databases because you can define “relationships” between items in different tables.
A non-relational database STAFF
A relational version LOCATIONS ROLES STAFF
Kinds of Relationships • “One to One” – One row of a table matches exactly to another • One person, one id number, one address • “One to Many” – One row of a table matches many of another • One person, many phone numbers • “Many to Many” – One row may match many of another or many rows match one row of another
Normal forms • E. F. Codd in 1972 wrote a paper on “Further Normalization of the Data Base Relational Model” • Normal forms reduce the amount of redundancy and inconsistent dependency within databases. • Codd proposed three normal forms and through the years two more have been added. Summarized from Barry Wise’s article on Database Normalization http: //www. phpbuilder. com/columns/barry 20000731. php 3? page=1
The Zero Form • No rules have been applied • Where most people start (and stop) • No room for growth • Usually wastes space Contacts Name Company Address Phone 1 Phone 2 Phone 3 Zip. Code Joe ABC 123 5532 2234 3211 12345 Jane XYZ 456 3421 Chris PDQ 789 2341 14454 6655 14423
First Normal Form • Eliminate repeating columns in each table • Create a separate table for each set of related data • Identify each set of related data with a primary key Contacts Id Name Company Address Phone Zip. Code 1 Joe ABC 123 5532 12345 1 Joe ABC 123 2234 12345 1 Joe ABC 123 3211 12345 2 Jane XYZ 456 3421 14454 3 Chris PDQ 789 2341 14423 3 Chris PDQ 789 6655 14423 Benefits: Now we can have infinite phone numbers or company addresses for each contact. Drawback: Now we have to type in everything over and over again. This leads to inconsistency, redundancy and wasting space. Thus, the second normal form…
Second Normal Form People • Create separate tables for sets of values that apply to multiple records • Relate these tables with a “foreign key”. Id Name Company Address Zip 1 Joe ABC 12345 2 Jane XYZ 456 14454 3 Chris PDQ 789 14423 Phone. Numbers Phone. ID Id Phone 1 1 5532 2 1 2234 3 1 3211 4 2 3421 5 3 2341 6 3 6655
Third Normal Form • Eliminate fields that do not depend on the primary key. Phone. Numbers Phone. ID Id Phone 1 1 5532 2 1 2234 3 1 3211 4 2 3421 5 3 3 6655 Id Name Address. ID 1 Joe 1 2 Jane 2 3 Chris 3 2341 6 People Address. ID Company Address Zip 1 ABC 12345 2 XYZ 456 14454 3 PDQ 789 14423 Is this enough? Codd thought so… What about “many to many”?
• Fourth relationship, Form Normal In a “many to many” independent entities cannot be stored in the same table. Phone. Numbers Phone. ID 1 5532 2 People Phone 2234 Id Name Address. ID 3 3211 1 Joe 1 4 3421 2 Jane 2 5 2341 3 Chris 3 6 6655 Phone. Relations Phone. Rel. ID Phone. ID 1 Address Id 1 1 Address. ID Company Address Zip 2 1 ABC 12345 3 1 3 2 XYZ 456 14454 4 2 4 3 PDQ 789 14423 5 6 3 6
Fifth Normal Form • The “very esoteric” one that is probably not required to get the most out of your database. • “The original table must be reconstructed from the tables into which it has been broken down. ” • The rule ensures that you have not created any extraneous columns and all the tables are only as large as they need to be.
The Normal Forms • First Form – Eliminate replicated data in tables – Create separate tables for each set of related data – Identify each set of related data with a primary key • Second Form – Create separate tables for sets of values that apply to multiple records – Relate the tables with a foreign key • Third Form – Eliminate fields that do not depend on the primary key • Fourth Form – In many-to-many relationships, independent entities cannot be stored in the same table
Why normalize? • • Increases the integrity of the data Reduces redundancy Improves efficiency Although normalization can be hard, it is worth it in the long run.
What do I need to remember? • Keep normalization in mind. • Don’t replicate data in a table. • If you break the rules, know why you are breaking the rules and do it for a good reason.
All you need to know about SQL in 30 minutes (or less) SQL = Structured Query Language
Basic SQL Commands • • • Creating tables with CREATE Adding data with INSERT Viewing data with SELECT Removing data with DELETE Modifying data with UPDATE Destroying tables with DROP
Creating tables with CREATE • Generic form CREATE TABLE tablename ( column_name data_type attributes…, … ) • Table and column names can’t have spaces or be “reserved words” like TABLE, CREATE, etc.
Phone Book/Contact Record Name Address Company Phone Number URL/Web Page Age Height Birthday When we added the entry Character Character Integer Real (float) Date Timestamp
Phone Book/Contact Table CREATE TABLE contacts ( Name VARCHAR(40), Address VARCHAR(60), Company VARCHAR(60), Phone VARCHAR(11), URL VARCHAR(80), Age INT, Height FLOAT, Birthday DATE, When. Entered TIMESTAMP ); Plan your tables very carefully! Once created, they are difficult to change!
Phone Book/Contact Table CREATE TABLE contacts ( Contact. ID INT PRIMARY KEY, Name VARCHAR(40), Address VARCHAR(60), Company VARCHAR(60), Phone VARCHAR(11), URL VARCHAR(80), Age INT, Height FLOAT, Birthday DATE, When. Entered TIMESTAMP ); If you are going to use the relational nature of a database, don’t forget you need to have a unique way to access records! There is a way to make the key automatically increment, so you don’t have to worry about which one is next.
Adding data with INSERT • Generic Form INSERT INTO tablename (column_name, …) VALUES (value, …)
Inserting a record into ‘contacts’ INSERT INTO contacts (contactid, name, address, company, phone, url, age, height, birthday, whenentered) VALUES (1, ‘Joe’, ’ 123 Any St. ’, ’ABC’, ’ 800 -555 -1212’, ‘http: //abc. com’, 30, 1. 9, ’ 6/14/1972’, now());
Inserting a partial record INSERT INTO contacts (contactid, name, phone) VALUES (2, ’Jane’, ’ 212 -555 -1212’);
Viewing data with SELECT • Generic Form SELECT column, … FROM table, … WHERE condition GROUP BY group_by_expression HAVING condition ORDER BY order_expression • The most used command • Probably the most complicated also • If used improperly, can cause very long waits because complex computations
A few simple SELECTs • SELECT * FROM contacts; – Display all records in the ‘contacts’ table • SELECT contactid, name FROM contacts; – Display only the record number and names • SELECT DISTINCT url FROM contacts; – Display only one entry for every value of URL.
Refining selections with WHERE • The WHERE “subclause” allows you to select records based on a condition. • SELECT * FROM contacts WHERE age<10; – Display records from contacts where age<10 • SELECT * FROM contacts WHERE age BETWEEN 18 AND 35; – Display records where age is 18 -35
Additional selections • The “LIKE” condition – Allows you to look at strings that are alike • SELECT * FROM contacts WHERE name LIKE ‘J%’; – Display records where the name starts with ‘J’ • SELECT * FROM contacts WHERE url LIKE ‘%. com’; – Display records where url ends in “. com”
Removing data with DELETE • Generic Form DELETE FROM table WHERE condition; DELETE FROM contacts WHERE age<13;
Modifying data with UPDATE • Generic Form UPDATE table SET column=expression WHERE condition; UPDATE contacts SET company=‘AOL’ WHERE company=‘Time Warner’;
Destroying tables with DROP • Generic Form DROP TABLE tablename; DROP TABLE contacts;
More about SELECT
“Normal Forms” and SELECT • Good database design using the normal forms requires data to be separated into different tables • SELECT allows us to join the data back together • We can use “views” to create virtual tables
Joining together tables • SELECT name, phone, zip FROM people, phonenumbers, address WHERE people. addressid=addressid AND people. id=phonenumbers. id; Phone. Numbers Phone. ID Id Phone 1 1 5532 2 1 2234 3 1 3211 4 2 3421 5 3 2341 6 3 6655 People Id Name Address. ID 1 Joe 1 2 Jane 2 3 Chris 3 Address. ID Company Address Zip 1 ABC 12345 2 XYZ 456 14454 3 PDQ 789 14423
General form of SELECT/JOIN SELECT columns, … FROM left_table join_type JOIN right_table ON condition; SELECT name, phone FROM people JOIN phonenumbers ON people. id=phonenumbers. id;
ORDER BY • The “ORDER BY” clause allows you to sort the results returned by SELECT * FROM contacts ORDER BY company; SELECT * FROM contacts ORDER BY company, name;
Views • You can use “CREATE VIEW” to create a virtual table from a SELECT statement. CREATE VIEW contactview AS (SELECT name, phone, zip FROM people, phonenumbers, address WHERE people. id=phonenumbers. id AND people. addressid=addressid);
Finishing off • SQL is a large language – You have seen some basic commands – Takes time and effort to learn – Mainly needed for powerusers – Lots of stuff we haven’t talked about • E. g. security, multiple users, program accessing DB, … – Many database systems now use graphical user interfaces instead of SQL directly
0b4968edc9694f8fee839e977dec3de8.ppt