
1c1ab02574638c09cb511bd2569f7000.ppt
- Количество слайдов: 72
DAMA, 2001 December. ORMv. ER 1 What’s Wrong With ER Modeling? Gordon C. Everest Carlson School of Management University of Minnesota
Problems and Solutions ORMv. ER 2 OBJECTIVES FOR THIS PRESENTATION: • Show several PROBLEMS with ER modeling schemes, (actually, any “record-based” modeling scheme). • Identify the ROOT CAUSE of the problem To stop there would be irresponsible, so… • Show you a better way – a SOLUTION using Object Role Modeling (ORM) • NOT asking you to abandon what you have learned about data modeling and are doing in practice • BUT to defer thinking in terms of entity records, and to begin doing data modeling at a richer, more conceptual level
Data Modeling DMOD 3 What is the Dominant Data Modeling Scheme today? What’s Wrong with ER Modeling? BEFORE WE CAN ANSWER THAT: Why Do Data Modeling? How do we do Data Modeling? Why do we need Normalization?
Database Design DMOD 4 Objective: (WHAT we are trying to do) TO ACCURATELY AND COMPLETELY MODEL SOME PORTION OF THE REAL WORLD UNIVERSE OF DISCOURSE (Uo. D) OF INTEREST TO SOME ORGANIZATION OR COMMUNITY OF USERS.
Logical Database Design Objective, Principles, Benefits DMOD 5 · OBJECTIVE of LOGICAL DATABASE DESIGN: TO ACCURATELY AND COMPLETELY MODEL SELECTED PORTIONS OF THE REAL WORLD OF INTEREST TO A COMMUNITY OF USERS. • USERS (COLLECTIVELY) WILL ALWAYS KNOW MORE ABOUT A DATA STRUCTURE THAN THE SYSTEM KNOWS, OR THAN COULD BE DEFINED TO THE SYSTEM. • WHAT IS NOT FORMALLY DEFINED TO THE SYSTEM, THE SYSTEM CANNOT MANAGE. . . THE USERS MUST! • THEREFORE, NEED TO CAPTURE RICH SEMANTICS WITH COMPREHENSIVE DATA MODELING and DEFINITION, INCLUDING INTEGRITY CONSTRAINTS AND OPERATIONS. FOR ==> GREATER QUALITY & RELIABILITY IN DATA ==> GREATER USER CONFIDENCE. ==> HIGHER USER / DEVELOPER EFFICIENCY Let the ‘system’ do it!
Purpose of Data Modeling (WHY we do it) DMOD DUAL, CONFLICTING PURPOSES DRIVE THE PROCESS: 6 USE R • Facilitate Human Communication, Understanding, & Validation – capture and present meaning, the semantics of a model – direct representation of only essential model semantics PRESENTATION CHARACTERISTICS: – scoping and presenting subparts of a Model – unfolding presentation at different levels of abstraction or detail – visual prominence in proportion to semantic importance SECONDARY: • Basis for Implementation - defining & creating a Database – complete in all the necessary details – construction/generation able to be fully automated SCHEMA DATABASE
Modeling DMOD 7 (Re). present. (ation) present Reality (mental models) MODELING PROCESS MODEL Re. present Knowledge in the head Knowledge in the world Knowledge externalized, formalized, shared. What drives or guides the process?
The Modeling Process DMOD 8 METHODOLOGY: Steps/Tasks + Milestones + Deliverables Real World Universe of Discourse + MODELING SCHEME perception selection/filtering REPRESENTATIONAL FORMS: Narrative, Graphical Diagram, Formal Language Statements (the Syntax) Context Constructs Composition Constraints MODELING PROCESS MODEL
A Data Modeling “Scheme” DMOD 9 DEFINES the: • Context • Constructs (ENTITIES, OBJECTS) • Collections, Compositions, Connections (RELATIONSHIPS) • Constraints, Characteristics WE LOOK FOR IN THE “REAL” WORLD Uo. D or Domain of Interest and USE IN BUILDING A DATA MODEL.
Data Modeling Constructs DMOD 10 What to look for: Relative emphasis differentiates Data Modeling approaches ENTITY RELATIONSHIP (OBJECT) IDENTIFIER ATTRIBUTE characteristics [ FOREIGN KEY ] characteristics
Student-Course Database - Table Diagram DMOD 11 Diagram of the Schema: COURSE Course# Title Description Credits INSTRUCTOR SSN Last. Name First. Name Address Phone Dept COURSE OFFERING Course# Year Term Section Building Room Days Time Start Control Enrollment Instructor SSN What if you move the arrow head to the other end of the arc? STUDENT Student ID Name Address Major GPA REGISTRATION Course ID Student ID Grade LEGEND: ENTITY NAME (upper case) Identifier (bold face) Attributes (not bold face) Foreign Key Identifier M: 1 relationship
Student-Course Database – Populated DMOD 12 Actual instances of data values: COURSE: Course# Title Credits ACC 101 Intro Accounting 4 ENG 101 English Composition 4 MIS 101 Intro MIS 4 MIS 103 Intro Database 4 MIS 403 Advanced Database 2 … COURSE OFFERING: CRSO# Course# Year Term Sect Room 1004 MIS 101 2000 Fall 001 1 142 1017 MIS 101 2001 Spr 002 2 224 3001 MIS 103 2000 Fall 001 2 207 … INSTRUCTOR: Instr. ID Name 33741 Allen, Lillian 85959 Boyd, Don 64578 Carlis, John 11248 Davis, Gordon 77004 Everest, Gordon 55432 Fine, Alan … Dept Eng ACC CSci IDS IDS Secondary (Composite) Key STUDENT: Student. ID Name Major GPA 1111111 Able, Emma MIS 3. 4 2222222 Bright, Sue MIS 3. 9 3333333 Challenger, X ACC 2. 7 4444444 Dummie, Noe ACC 3. 2 5555555 Everest, Monty MIS 3. 8 … Instr. ID 11248 55432 77004 Enroll 48 60 27 REGISTRATION: CRSO# Student. ID 1004 4444444 1017 3333333 3001 1111111 3001 2222222 3001 5555555 3001 7777777 … Grade B+ B A A A
Data Modeling – Schema Diagram DMOD 13 THINKING ABOUT ATTRIBUTES: Record-Based: ENTITY IDENTIFIER ATTRIBUTE . . .
Essentials of ER Modeling / Diagramming DMOD 14 ENTITY 1 1 RELATIONSHIP M ENTITY 2 Adding Attributes, omitting the Diamond: identifier ENTITY 2 ====== IDentifier 1 ----------Attribute 1. 1 Attribute 1. 2 Attribute 1. 3 : ====== IDentifier 2 ----------Attribute 2. 1 Attribute 2. 2 Attribute 2. 3 Foreign. ID 1 : ENTITY 1 ENTITY 2 ENTITY IDENTIFIER ATTRIBUTE . . . Attribute 1 Attribute 2 Attribute 3 relationship> ENTITY 1 Attribute
ORMv. ER 15 What’s wrong with ER Modeling? ____
ER / Record-based Modeling DMOD 16 VALUE DOMAIN . . . roles TABLE: ID X ATTRIBUTES. . . A B C D CLUSTERING of ATTRIBUTES into RECORDS/RELATIONS – NOT a necessary or desirable first step – gets us into trouble: if too much, must decompose to normalize
Record-based Design ORMv. ER 17 WHAT SEMANTICS ARE PRESUMED BY THE FOLLOWING RECORD STRUCTURE? X A B C • What does it say about X ? • What does it say about A ? • What does it say about the relationship X – A ? • What does it say about the relationship A – B ? There at least 14 distinct semantic statements you can make in answering these questions! • Do we know it is in Third Normal Form (3 NF)? How?
Record-based Design ORMv. ER 18 WHAT DOES IT SAY ABOUT X A X? B C
Record-based Design ORMv. ER 19 WHAT DOES IT SAY ABOUT X A A? B C
Record-based Design ORMv. ER 20 WHAT DOES IT SAY ABOUT THE RELATIONSHIP X A B C X–A ?
Record-based Design ORMv. ER 21 REPRESENTING THE RELATIONSHIP X A A N B D. . . C X–A ?
Record-based Design ORMv. ER 22 WHAT DOES IT SAY ABOUT THE RELATIONSHIP X A B C A–B ?
Record-based Design ORMv. ER 23 REPRESENTING COMPLEX RELATIONSHIPS AMONG X A A . . . C B ? B X, A, & B. A? . . . Separately consider the relationship between A and B. What if it is many-to-many? What if other information is functionally dependent on A–B ?
Record-based Design - Compound Key ORMv. ER 24 WHAT IS PRESUMED BY THE FOLLOWING RECORD STRUCTURE? X Y A B C
Major Data Modeling Schemes DMOD 25 (1) SINGLE FILE (E-A) FLAT FILE “TABLE” HIERARCHICAL - nested repeating groups e. g. , COBOL (M) MULTIFILE (E-R → E-A-R) NETWORK - hierarchical records RELATIONAL (E-A-[R]) - flat records (O) NO FILE (O-R) (No Clustering of Data Items into Records) NIAM/“Binary” Modeling ORM (Object-Role Modeling - Halpin) Everest DM 4 p. 121. RECORDBASED (Clustered Data Items)
Data Modeling Schemes DMOD 26 CLASSIFIED by Degree of Clustering: • No clustering – NIAM/ORM - Nijssen, Halpin • Clustering to One Level => Atomic Data Values – – – – Relational Modeling - Codd ER Modeling - Chen Extended ER (EER) - Teorey Information Engineering (IE) – Clive Finkelstein -> James Martin Oracle (Designer*2000) - Barker IDEF 1 X - Appleton, US Gov’t, ERwin (tool), Bruce (book) • Nested Objects – – – Hierarchical data structure (single file; COBOL) CODASYL Network (ANSI NDL) Nested Relations Semantic Object Modeling (SOM) – Kroenke, Salsa (tool) Object Modeling (UML) – Rational Rose (tool) – ANSI SQL: 1999
Data Modeling Schemes – Clustered DMOD 27 HIERARCHIC special case NETWORK ER single file nested repeating groups implicit hierarchical relationships multifile, hierarchical record defined relationships => semantic/ OBJECT models Focus on E & R, hidden record structure Usually flat records [optionally with attributes] Defined relationships (general M: N) Usually restricted to binary relationships Multifile; flat records only RELATIONAL Relationships as foreign keys so no M: N relationships
Taxonomy of “Clusterered” Data Structures DMOD 28 Single File Multiple Files Flat SINGLE FLAT FILE (“TABLE”) RELATIONAL (“TABLES”) Nested HIERARCHICAL FILE (CODASYL) NETWORK Intra-Record Structure Clustered
Stages of Data Modeling DMOD 29 Start at the highest Conceptual Level! USE R Domain Knowledge CONCEPTUAL ORM ER CLUSTERED “LOGICAL” Attribs in Records RELATIONAL • Objects Multi. Valued, • Obj. ID’s Nested - - -> Flat (1 NF) PHYSICAL • Roles/Relships Ternaries - - -> Binary only • Implementation • (Fnl. Dep) in/for a DBMS M: N - - - - -> 1: Many only NO clustering • Denormalize Normalized (2, 3, 4) Primary Keys (for performance) => NO “attributes” Relationships - - -> Foreign Keys + triggers, stored w/attributes procedures Sub/Sup. Types SCHEMA DATABASE
Data Modeling - Representation Stages DMOD A SECOND CUT: 30 USE R NEW • Conceptual (ORMHALPIN/NIJSSEN SUMMFULTON UDMCDMTG) – only what the user knows or needs to know – functional dependencies fully represented – Elementary Facts - no clustering of “attributes” into “records” • Clustered (ERCHEN EERTEOREY SDMMc. LEOD SOMKROENKE SQL: 99 ANSI UML) – identifiers (attributes or dependent relationships) – keep: M: N, ternary relationships, super/subtypes, attributed relationships, multi-valued items/rgroups • “Logical” (RELATIONALCODD SQLANSI ) – flat files/tables; – stored identifiers; – 3 NF (decompose) – resolve: M: N, ternary, super/subtype relationships – foreign keys to represent relationships SCHEMA • Denormalize (Recluster) - for performance • Physical (IMPLEMENTATION in a DBMS) – triggers, stored procedures, user code to DATABASE
Data Modeling Schemes - ER DMOD 31 • ENTITIES, that have ATTRIBUTES, and participate in RELATIONSHIPS. • Originated with Peter Chen, 1976, TODS (1: 1) • Notation has evolved, many variations – Drop diamond; attributes inside entity box or suppressed. • • • No standard syntax notation (but similar semantics) Common: attributes clustered into entity records. Most popular today Weak entity Association entity Relationship naming: one name, direction unstated, thus ambiguous; need direction (>) or rule (eg. left to right). EMPLOYEE Emp. No M Emp. Name works in … 1 Unit. No DEPT Name …
Data Modeling Schemes - Oracle DMOD 32 • • In Oracle Designer*2000 tool (R. Barker, A-W, 1990) A flavor of ER modeling ENTITY in rounded box; optionally ATTRIBUTES inside ATTRIBUTE flags: # - [part of] identifier * - mandatory o – optional • RELATIONSHIPS: - binary only - two names at end from which to be read - optional ---, mandatory —, many - identifying ———, fixed ——— EMPLOYEE Emp. No (#) Emp. Name (*) Address (o) works in employs DEPT
Data Modeling Schemes - IE DMOD 33 • • • Information Engineering (1970’s) Due to Clive Finkelstein, adapted by James Martin Used in several tools: IEF, IEW/ADW/Cool, ICES, … Widely used, many variations, no single standard ENTITIES: in boxes, optionally with ATTRIBUTES, in or out RELATIONSHIPS: - usually binary only - many ——— , at most one ——— - optional ——— (at the “other” end) - mandatory, at least one ——— EMPLOYEE DEPT
Data Modeling Schemes – IDEF 1 X DMOD 34 • U. S. Air Force/Defense (1970’s), Appleton e. Xtensions • • • NIST (U. S. Govt) standard – 1993; revised in IDEF 1 X 97; IEEE - 1998 Book by T. Bruce, 1992; Used in ERwin (now from CA), Visio, … Widely used in and for U. S. Govt work, some outside Some Relational restrictions: Foreign Keys, thus no M: N “Unnecessarily complex, confusing, and forgettable” - Halpin • ENTITY: independent , dependent • ATTRIBUTE flags: - Alternate Key - (AKi), Foreign Key (FK) • - optional (O) – mandatory is default RELATIONSHIPS: - binary only, “child” ——— (may be arbitrary) - First Name always read toward the child - identifying —— , non identifying ----EMPLOYEE - “cardinality” on child: P - one or more, Z - zero or one, n - exactly n DEPT Emp. No ----- Parent is optional (some allow many parents) Emp. Name SS# (AK 1) Address (O) Unit. No (FK) employs/ works in Dept. No Dept. Name :
Forming a Relational Data Structure RELSQL 35 Some rules: • Define a TABLE or “Relation” for each Entity type – Types of Entities: base/reference, dependent (“weak”), association/intersection, event/transaction – Assumes mutually exclusive (non-overlapping) populations • SINGLE-VALUED ITEMS (“flat” tables) – If multivalued or nested repeating group of items, put into a separate table • IDENTIFIER for every table (entity “integrity”) • FOREIGN IDENTIFIERS to represent all relationships 1: M - stored in the child / dependent entity 1: 1 - should probably merge into one table M: N - must introduce an association/intersection table • NORMALIZE to second and third normal form – important for good design – but not enforced by RDBMS. . . WHY?
Functional Dependency in Relationships RELSQL 36 Basis for Database functionally dependent on is Normalization. X A f (X) A determines X A A is dependent on X, and the Relationship is exclusive on A, multiple on X. Clustered into a Record/table for entity of X: X A … A for each X. There can be multiple Xs for a given A. There can be different As for the Xs. There can only be one
Database Normalization RELSQL 37 Start with ENTITIES, their IDENTIFIERS (unique keys) and their ATTRIBUTE FIELDS (facts about each entity). i. e. , start with data items clustered into records/tables. PROBLEM: we may do it wrong; cluster too much; some items in the wrong place, which can lead to redundancy & update anomalies. Any Flat File is a Relation, but… not all Relations are “well-formed. ” • NORMALIZATION is the test – a set of rules to perform internal validation of a data model • Record DECOMPOSITION is the remedy. – Removing attributes from the entity record, and placing them in a different, often a new entity record (1) First Normal Form: no multivalued items or rgroups. (2) Second Normal Form: no partial dependencies. (3) Third Normal Form: no transitive dependencies. “Every non-key data item must be single-valued, and dependent upon the key, the whole key, and nothing but the key… so help me Codd. ”
Anomalies RELSQL 38 Resulting from (clues to) poor database design: EMPLOYEE# o • • • EMPNAME SKILL PROFICIENCY … BOSSNAME DEPT# DEPTNAME and BOSSNAME stored redundantly if EMPLOYEE moves to another DEPT#, DEPTNAME and BOSSNAME would also change, needing update. If a DEPTNAME (or BOSSNAME) for a DEPT changes, must update all occurrences, else inconsistency. To delete a DEPT you must also delete all its EMPLOYEEs (unless null foreign keys allowed!) If you delete the last EMPLOYEE in a DEPT, you also delete that DEPT (unless null keys allowed!…multiple? ) No place to insert a DEPT# and its DEPTNAME, if there are no EMPLOYEEs there.
Summary of all Normal Forms RELSQL 39 GIVEN: – a set of attributes, clustered into tables/records with identifiers – all functional dependencies on the attributes • No multi-valued, non-key attributes (1 NF) • No partial dependencies on non-key attributes (2 NF) • No transitive dependencies in non-key attributes (3 NF) • No partial or transitive dependencies within any key (EKNF, BCNF), i. e. , consider all candidate keys. • No multiple, independent multi-valued attributes in the same table (4 NF) • No join dependencies, i. e. , a relation can be reconstructed without loss of information by joining some of its projections (5 NF). • No more than one table with the same key (“minimal”). • No transitive dependencies across tables (“optimal”). NOTE: number order is artificial, i. e. , there is no necessary sequence to the normal forms.
Normalization – Testing your Understanding RELSQL 40 Assuming that GIVEN: X A X A A is single valued with respect to X (i. e. 1 NF). Could you have a violation of: (if not, why not? ) 2 NF? 3 NF? 4 NF? B What does this diagram mean? How does this differ from diagram above, if any? MUST DISTINGUISH THE PRIMARY KEY.
Representing a M: N Relationship DMOD 41 Another Pattern: EMPLOYEE PROJECT • If you cannot store multiple Projects (or Project IDs) in an Employee record, or multiple Employees (or Employee IDs) in a Project record (as is the case in a Relational Database), then … you must introduce an “Intersection Entity” between them to represent the Many-to-Many Relationship. EMPLOYEE EMPL-ID PROJECT PROJ-ID • The Intersection Entity also provides the place to store additional attributes of the relationship e. g. , Hours Worked, Rate of Pay, … What is the problem with this representation?
Representing a Ternary Relationship DMOD 42 While we can develop a consistent notation for binary relationships, ternary relationships are a problem. EMPLOYEE SKILL PROFICIENCY • If one of the entities is single valued, is it really ternary? Or “attributed” binary? • What lends uniqueness to each instance of the relationship? • How to verbalize the relationship? Which order? • How to represent Multiplicity / Exclusivity ? • How to represent Dependency? Must have all 3?
What’s Wrong with ER Modeling? ORMv. ER 43 I will show you still a more excellent way – PAUL, I Cor 12. 31 N
Record-based Design B ORMv. ER 44 WHAT DOES THIS “RECORD” REPRESENT? X X A X B X C A B C Design minimal records with at most one non-key domain. Now what do these “records” represent? Perhaps Codd was right in naming it a _____! Avoids spurious associations, e. g. , A – B … Could there be any violations of normal forms? What about the representation of the entity What if A is related to other “entities”? X?
Transform Record-based (ER) Design ORMv. ER 45 TO REALLY REPRESENT THE ENTITY DOMAINS X A B C A X C X B B X Object Role Model: C
Data Modeling ORMv. ER 46 THINKING ABOUT ATTRIBUTES: Record-Based (ER): ENTITY IDENTIFIER ATTRIBUTE Object-Role (ORM): ENTITY ATTRIBUTE ENTITY . . . ENTITY (id) ENTITY ENTITIES have ATTRIBUTES / DESCRIPTORS by playing roles in relationships with other entities.
Record-Based Modeling ORMv. ER 47 GIVEN TWO FACTS (conceptually): • one about the CITY a PERSON lives in • another about the CITY a PERSON works in ASSUME: • every person has to live and work in a city • each person can live and work in only one city at a time • not interested in anything more about persons or cities EXAMPLE: • Gordon Everest lives in Falcon Heights and * works in Minneapolis DIAGRAM A CONCEPTUAL DATA MODEL – to represent this information (a database to contain these facts)
Record-Based Data Model ORMv. ER 48 for PERSON lives in / works in a CITY • What is the entity and what is the attribute? • Would it make any sense to say (to a novice layperson - a user): – CITY was an "attribute" of PERSON? • Doing more than is necessary at the conceptual level PERSON Person. ID [key] Live. City Work. City • • • cannot have CITY and CITY as attributes of PERSON column/attribute name reflects " entity + role " CITY as an entity/object is lost (not its own table) what if there is a CITY where no one lives or works some add concept of a DOMAIN
Object-Role Model ORMv. ER 49 for PERSON lives in / works in CITY lives in CITY PERSO N (id) (name) works in FORML language statements: • PERSON lives in CITY • Every PERSON lives in some CITY • Each PERSON lives in at most one CITY • . . . for works in FACT
Record-Based Modeling ORMv. ER 50 for an additional fact. • A PERSON makes sales calls in multiple CITIES DIAGRAM the extended conceptual data model • can you add an attribute "Sales. Call. Cities" to PERSON? FLAT Record-Based Modeling is even worse: • create a new table SALESCALLS with a compound key – Is this a real entity in the conceptual view? EXTEND THE OBJECT-ROLE DATA MODEL
Record-Based Data Modeling ORMv. ER 51 DISADVANTAGES: • no way to capture the conceptual view directly • must mentally map from conceptual view to the "logical" (record-based) view – by structural groupings of attributes and relationships • must choose unique, arbitrary names – for attributes in a record; for spurious new "entities" • • cannot reuse attributes in the same table must do your own normalization hides or ignores inter-attribute relationships creates (implies) spurious inter-attribute relationships
Object-Role (ORM) Data Modeling ORMINTRO 52 THE ESSENTIAL DIFFERENCE: • Three main constructs. . rolled into. . Two main constructs Record based modeling: ENTITY NIAM/ORM modeling: ? ? What to call it? ATTRIBUTE OBJECT ENTITY ENTRIBUTE! RELATIONSHIP Role in RELATIONSHIP
Data Modeling Terminology ORMINTRO 53 O-R E-R ("conceptual") ("logical") COBOL/DBTG ("physical" implementation) ENTITY (TYPE) RECORD TYPE ATTRIBUTE DATA ITEM OBJECT (ELEMENT) FACT SENTENCE INSTANCE RECORD IDENTIFIER PREDICATE RELATIONSHIP CONSTRAINT CHARACTERISTI CS RELATIONAL RELATION TABLE COLUMN FIELD ROW TUPLE KEY "SET" FOREIGN KEY CONSTRAINT
Fact Sentence - Verbalize ORMODLG 54 • A Fact = a Predicate + Object(s) => Sentence • THINK: Objects playing Roles in a Relationship • Naming: object instances versus object types – e. g. “Ann” is an instance of “Person” • Arity - the number of object “holes” in the Predicate – UNARY: - “Ann smiles” only 2 states: true/false, present/absent, yes/no making the closed world assumption – BINARY: - “Ann likes to run” most common has an inverse- “Running is liked by Ann” Inverse name is never the same (else symmetric, handled differently) – TERNARY: - “Ann married Bob in 1967” with types: - “PERSON married PERSON in YEAR” verbalizing can be difficult with more than 2 (sequence problem)
Symbolize: ORM Constructs ORMODLG 55 • OBJECT (ENTITY, CONCEPT) - NOUN … in an ellipse • PREDICATE (RELATIONSHIP) - verb = role name …in a box – unary, binary, ternary, +++ Binary Predicate: PREDICATE OBJECT 1 role 12 role 21 OBJECT 2 Elementary Binary Fact Sentence: works in employs PERSON Verbalization: “PERSON works in DEPARTMENT” “DEPARTMENT employs PERSON” DEPARTMENT
Adding ORM Constraints ORMODLG 56 works in employs PERSON DEPARTMENT Verbalization: “PERSON works in DEPARTMENT” “DEPARTMENT employs PERSON” DEPENDENCY (MANDATORY): “PERSON must work in some DEPARTMENT” EXCLUSIVITY (UNIQUENESS): “PERSON works in at most one DEPARTMENT”
ORMINTRO Methodology Steps in OR Modeling 57 • • • Familiarize with real world Universe of discourse Verbalize sentences of elementary facts Symbolize build the conceptual ORM model diagram Constrain the roles in predicates Validate the conceptual data model • Map into neutral, record-based, logical tables • Refine the table definitions • Generate physical database definition for target DBMS
Visio. Modeler Architecture ORMINTRO 58 FORML fact sentences Population Quick Facts Tables FACT EDITOR VERBALIZER CONCEPTUAL DATA MODEL BUILD DICTIONARY "REPOSITORY" "LOGICAL" DATA MODEL DIAGRAMMER correct VALIDATE (CHECK) refine (TABLES) BROWSER GENERATE PHYSICAL DATABASE STRUCTURE & DEFINITION for a target DBMS
Levels of Abstraction in NIAM/ORM DMODPRE 59 REMOVING (generally in order of importance): 1. Lexical Object Types (LOTS); Value Object Types 2. “Terminal” Object types – equivalent to / become “attributes” IF: – play only functionally dependent roles (often only one role) i. e. One: Many relationships; (disjunctive) mandatory (implied) 3. Common Object Types - generic value domains / ref. modes 4. “Event” Object Types 5. Dependent (“weak”) Object Types - Subtypes, Objectified Facts 6. User-defined priority levels on Object Types 7. Constraints and Reference Modes 8. Predicates
Sample, Simple ORM Data Model DMODPRE 60 earns EMPLOYEE (number) works in BOSS supervises SALARY (dollars) paid to DEPT (number) employs is headed by reports to superior to ac LIMIT may spend up to of spending for "Employee. Skill!" { 1000. . 9999 } possesses <=5 possessed by SKILL (code) has DESCRIPTION (name) is of { 1. . 10 } with proficiency of assigned to A major criticism of NIAM / ORM, both by protagonists and proponents, is that it is too detailed, a bottom-up design, BUT… ER Diagrams usually omit the details of attributes and most constraints. So, present the model using top-down abstractions. RATING Remove "Terminal" (M: 1) Objects
ORM Abstractions DMODPRE 61 • Removing "Terminal" (M: 1) Objects { 2000. . 2999 } EMPLOYEE DEPT works in employs (number) BOSS (number) supervises is headed by reports to superior to ac "Employee. Skill!" { 1000. . 9999 } possesses possessed by <=5 SKILL (code) Remove Constraints and Reference Modes
ORM Abstractions DMODPRE 62 • Removing Constraints and Reference Modes EMPLOYEE works in employs BOSS DEPT supervises is headed by reports to superior to possesses possessed by SKILL Remove Less Important Objects & Predicates – Subtypes, Objectified Predicates, Reflexive Relationships
ORM Abstractions DMODPRE 63 • Removing Less Important Objects & Predicates – Subtypes, Objectified Predicates, Reflexive Relationships EMPLOYEE works in employs supervises is headed by SKILL Remove Predicates DEPT
ORM Abstractions DMODPRE 64 • Removing Predicates EMPLOYEE SKILL DEPT . . . Leaving BASE Entities! A Top-Level Abstract Conceptual Data Model an ER Diagram ? ! ! !
Language Design Criteria ORMQURY 65 See: Halpin, “Conceptual Queries”. • Semantic Strength, Expressiveness – Able to model all relevant details in the domain – The range of queries that can be expressed – The “ 100% Principle” • Semantic Clarity – Ease of Understanding and Use; intuitive – Unambiguous, i. e. , only one possible meaning • Semantic Relevance – Only relevant information need be stated – Not dependent on artificial or spurious expressions • Semantic Stability, Independence – How well the model/query retains its original intent in the face of changes to the underlying application
Conceptual Query Language ORMQURY 66 See: Halpin, “Conceptual Queries”. • Con. Quer – Based on ORM – Need not be familiar with ORM or its notation “user can construct a query without any prior knowledge of the schema” but… – In the form of a textual outline Indentation is significant – Implemented in Visio Active. Query Object pick list – drag to the query window Roles pick list – drag to the query window – Projection – items to display marked with a tick ( ) – Mapping to SQL
Sample Con. Query (1) ORMQURY 67 See: Halpin, “Conceptual Queries”. “List Employees who live in the City that is the Location of Branch 52” Employee (number) Branch (number) City lives in / is located in / is location of / has City. Name U State Employee [number] / is in (code) +– lives in City +– is location of Branch [number =] 52 NOTE: City acts as a Join object type (the common “attribute”), i. e. Employee and Branch are joined through City. Semantic clarity (+), semantic relevance (+), semantic stability (+).
SQL for Sample Con. Query (1) ORMQURY 68 See: Halpin, “Conceptual Queries”. “List Employees who live in the City that is the Location of Branch 52” Employee Branch (number) is located in / is location of / has City. Name In SQL: (Where are the tables? ) SELECT FROM WHERE and City lives in / (number) U State (code) / is in Employee. Number Employee, Branch Employee. City. Name = Branch. City. Name Employee. State. Code = Branch. State. Code Branch. Number = 52 Could you do this in Access using the Query Form? Semantic clarity (-), semantic relevance (-), semantic stability (-) Suppose an Employee could live in more than one City? ? ? Suppose we now wish to record the Population of Cities? ? ?
Problems with ER Modeling - Summary ORMv. ER 69 • Too much clustering; attributes in the wrong place • Ignores (presumes) intra-record structure (that is, inter-attribute relationships) • Human modeler is responsible for normalization remedy is always record decomposition • Attribute migration… to become an entity - modeler must distinguish attributes and entities • Naming columns = domain + role, loses domain objects • Modeling dilemma: – Complete representation of an entity object - more clustering – Full normalization (1 NF) – decomposition, less clustering • Indirect representation of M: N relationships – Introduces artificial “new” entities • Difficulty representing Ternary relationships • Stability of the query language (SQL)
At the Root, ORMv. ER 70 What’s wrong with ER Modeling? CLUSTERING Gordon C. Everest Carlson School of Management University of Minnesota
Why NIAM/OR Modeling? ORMINTRO 71 • roots in both LOGIC & LINGUISTICS • based on one modeling construct: the fact sentence • more expressive, understandable - diagrams & verbalization • diagrams can be populated with actual data samples • abstraction levels equivalent to E-R modeling • more, richer semantics (than E-R, EER, IDEF 1 X) • capture and represent all functional dependencies • avoids normalization problems with record-based modeling • better meets criteria for good data modeling • organizations that switched wouldn’t go back to E-R • direction of Standards (SUMM, UDM, . . . ) • now supported with a viable PC-based CASE tool
Resources on ORMv. ER 72 BOOK: • Terry Halpin (now from Microsoft), Information Modeling and Relational Databases: From Conceptual Analysis to Logical Design, Morgan Kaufmann Publishers, San Francisco, 2001, 763 pages. WEB SITE for my course: • http: //webfoot. csom. umn. edu/faculty/everest/idsx 431 – with ORM intro and further reading – Info. Modeler software download – Usage Notes SPRING CLASSES: • IDSc 6431 (for MBAs) • IDSc 4431 (for CSOM Undergrads) • IDSc 4131 (for CCE and others) TRAINING and CONSULTING: • In. Concept, Inc. , Lake Elmo, MN www. inconcept. com
1c1ab02574638c09cb511bd2569f7000.ppt