College of Computing and InformaticsAssignment 1
Deadline: Sunday 09/04/2023 @ 23:59
[Total Mark for this Assignment is 8]
Student Details:
Name: ###
ID: ###
CRN: ###
Instructions:
• You must submit two separate copies (one Word file and one PDF file) using the Assignment Template on
Blackboard via the allocated folder. These files must not be in compressed format.
• It is your responsibility to check and make sure that you have uploaded both the correct files.
• Zero mark will be given if you try to bypass the SafeAssign (e.g. misspell words, remove spaces between
words, hide characters, use different character sets, convert text into image or languages other than English
or any kind of manipulation).
• Email submission will not be accepted.
• You are advised to make your work clear and well-presented. This includes filling your information on the cover
page.
• You must use this template, failing which will result in zero mark.
• You MUST show all your work, and text must not be converted into an image, unless specified otherwise by
the question.
• Late submission will result in ZERO mark.
• The work should be your own, copying from students or other resources will result in ZERO mark.
• Use Times New Roman font for all your answers.
Question One
Pg. 01
Learning
Outcome(s):
Explain the
concepts and
architectures
involved in the
database
development.
Question One
2 Marks
Use your own words to discuss the following:
“Using spreadsheets or typical computer-based filing systems looks a reasonable choice
for organizing personal data. However, it is not sufficient to use such systems when the
amount of data becomes huge. In such case, it may be time to implement a database.”
Answer:
Question Two
Pg. 02
Learning
Outcome(s):
Explain the
concepts and
architectures
involved in the
database
development.
Question Two
2 Marks
Define five mini-world Entities and five Relationships for a hospital database
system.
Answer:
Question Three
Pg. 03
Learning
Outcome(s):
Create EntityRelationship
model, Relational
model, and write
SQL queries.
Question Three
2 Marks
Draw an ER diagram for the following car sharing system:
In the car sharing system, a CarMatch application has record of anyone who would like
to share his/her car, known as a CarSharer. An Administrator registers all the potential
CarSharers and with their first name, last name, home address, and date of birth. Each
CarSharer is also assigned a unique id. A CarSharer can take policies issued by the
Insurance Company. Each policy has a number, premium, and a start date. A CarSharer
needs to know the start and destination address of each Journey.
Answer:
Question Four
Pg. 04
Learning
Outcome(s):
Question Four
2 Marks
Consider the following ER Diagram and fill the table by identifying the ER constructs in the
Create Entity-
diagram.
Relationship
model, Relational
model, and write
SQL queries.
Answer:
Constructs
Strong Entities
Weak Entities
Relationships
Key Attributes
Names
الجامعة السعودية االلكترونية
الجامعة السعودية االلكترونية
26/12/2021
College of Computing and Informatics
Data Science Pre-Master Program
IT244
Introduction to Database
IT244
Introduction to Database
Part 1
Module 1: Database Architecture and
Classification
Contents
1. Database System Concepts and Architecture
2. Classification of Database Management Systems
Weekly Learning Outcomes
1. Explain Database System Concepts and Architecture.
2. Explain types of Databases.
Required Reading
1. Chapter 2: Database System Concepts and
Architecture
2. Chapter 2: Classification of Database Management
Systems
(Fundamentals of Database Systems, Global Edition,
Recommended
Reading
7th Edition (2017)
by Ramez Elmasri & Shamkant
Database
System Concepts and Architecture:
Navathe)
https://cs.uwaterloo.ca/~tozsu/courses/CS338/lectures/14%20DB%20System.pdf
NoSQL databases: https://courses.cs.vt.edu/cs4604/Spring21/pdfs/12-nosql.pdf
This Presentation is mainly dependent on the textbook: Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by Ramez Elmasri & Shamkant Navathe
• Database System Concepts and Architecture
Data Models
• Data Model:
– A set of concepts to describe the structure of a database, the
operations for manipulating the data, and the constraints that the data
should follow.
• Data Model Structure and Constraints:
– Data Model constructs define the database structure
– Data model constructs often include: data elements and their data
types (often called attributes); grouping of related elements into
entities (also called objects or records or tuples); and relationships
among entities
– Constraints specify restrictions on the stored data; the data that
satisfies the constraints is called valid data
• Data Model Operations:
– These operations are used for specifying database retrievals and
updates by referring to the constructs of the data model.
– Operations on the data model may include basic model
operations (e.g. generic insert, delete, update) and user-defined
operations (e.g. compute_student_gpa, update_inventory)
Categories of Data Models
• Conceptual (high-level, semantic) data models:
– Provide concepts that are close to the way many users perceive
data.
• (Also called entity-based or object-based data models.)
• Physical (low-level, internal) data models:
– Provide concepts that describe details of how data is stored in
the computer. These are usually specified in an ad-hoc manner
through DBMS design and administration manuals
• Implementation (representational) data models:
– Provide concepts that fall between the above two, used by many
commercial DBMS implementations (e.g. relational data models
used in many commercial systems).
Database Schema versus Database
State(1)
• Database Schema:
– The description of a database.
– Includes descriptions of the database structure, relationships, data types, and
constraints
• Schema Diagram:
– An illustrative display of a database schema
• Schema Construct:
– A component of the schema or an object in the schema, e.g., STUDENT,
COURSE, Name
• Database State:
– The actual data stored in a database at a particular moment in time. This
includes the collection of all the data in the database.
– Also called a database instance, occurrence, or snapshot.
Database Schema vs. Database State(2)
• Initial Database State:
– Refers to the database state when it is initially loaded into
the system.
• Valid State:
– A state that satisfies the structure and constraints of the
database.
• Distinction
– The database schema changes very infrequently.
– The database state changes every time the database is
updated.
• Schema is also called intension.
• State is also called extension.
Example of a Database Schema
Three-Schema Architecture(1)
• Proposed to support DBMS characteristics of:
– Program-data independence.
– Support of multiple views of the data.
• Not explicitly used in commercial DBMS products, but has
been useful in explaining database system organization
• Defines DBMS schemas at three levels:
– Internal schema at the internal level to describe physical storage
structures and access paths (e.g. indexes).
• Typically uses a physical data model.
– Conceptual schema at the conceptual level to describe the structure
and constraints for the whole database for a community of users. Uses
an implementation (or a conceptual) data model.
– External schemas at the external level to describe the various user
views.
• Usually uses the same data model as the conceptual schema.
Three-Schema Architecture (3)
• Mappings among schema levels are needed to
transform requests and data.
– Users and programs refer to an external schema, and are
mapped by the DBMS to the internal schema for execution.
– Data extracted from the internal DBMS level is reformatted
to match the user’s external view (e.g. formatting the
results of an SQL query for display as a Web page)
Data Independence
• Logical Data Independence:
– The capacity to change the conceptual schema without having to
change the external schemas and their associated application
programs.
• Physical Data Independence:
– The capacity to change the internal schema without having to change
the conceptual schema.
– For example, the internal schema may be changed when certain file
structures are reorganized or new indexes are created to improve
database performance
• When a schema at a lower level is changed, only the mappings
between this schema and higher-level schemas need to be
changed in a DBMS that fully supports data independence.
• The higher-level schemas themselves are unchanged.
– Hence, the application programs need not be changed since they refer
to the external schemas.
DBMS Languages(DDL)
• Data Definition Language (DDL):
– Used by the DBA and database designers to specify the
conceptual schema of a database.
– In many DBMSs, the DDL is also used to define internal
and external schemas (views).
– Theoretically, separate storage definition language
(SDL) and view definition language (VDL) can used to
define internal and external schemas. In practice:
• SDL is typically realized via DBMS commands provided to the
DBA and database designers
• VDL is typically part of the same language as DDL
DBMS Languages (DML)
• Data Manipulation Language (DML):
– Used to specify database retrievals and updates
– DML commands (data sublanguage) can be embedded in a generalpurpose programming language (host language), such as COBOL, C,
C++, or Java
• A library of functions can also be provided to access the DBMS from a
programming language
– Alternatively, stand-alone DML commands can be applied directly
(called a query language).
• Types of DML:
– High-Level
Declarative
(Set-oriented,
Non-procedural)
Languages, such as the relational language SQL
• Specify “what” data to retrieve rather then “how” to retrieve it
• May be used in a standalone way or may be embedded in a programming
language
– Low Level or Procedural (Record-at-a-time) Languages:
• Must be embedded in a programming language
• Need programming language constructs such as looping
DBMS Interfaces
• Stand-alone query language interfaces
– Example: Typing SQL queries directly through the DBMS interactive SQL
interface (e.g. SQL*Plus in ORACLE)
• Programmer interfaces for embedding DML in programming
languages
• User-friendly interfaces (often Web-based)
– Menu-based, forms-based, graphics-based, etc.
• Menu-based, popular for browsing on the web
• Forms-based, designed for naïve users
• Graphics-based
• Supports Point and Click, Drag and Drop, etc.
• Natural language: requests in written English
• Combinations of the above:
• For example, both menus and forms used extensively in Web database interfaces
DBMS Programming Language
Interfaces
• Programmer interfaces for embedding DML in a
programming language:
– Embedded Approach: e.g. embedded SQL (for C, C++,
etc.), SQLJ (for Java)
– Procedure Call Approach: e.g. JDBC for Java, ODBC for
other programming languages
– Database Programming Language Approach: e.g.
ORACLE has PL/SQL, a programming language based
on SQL; language incorporates SQL and its data types
as integral components
Other DBMS Interfaces
• Speech as Input and Output
• Web Browser as an interface
• Parametric interfaces, e.g., bank tellers using function
keys.
• Interfaces for the DBA:
• Creating user accounts, granting authorizations
• Setting system parameters
• Changing schemas or storage structures/access paths (physical
database)
Database System Utilities
• To perform certain functions such as:
– Loading data stored in files into a database; includes data
conversion tools.
– Backing up the database periodically on tape.
– Reorganizing database file structures.
– Report generation utilities.
– Performance monitoring utilities.
– Other functions, such as sorting, user monitoring, data
compression, etc.
Other Tools
• Data dictionary/repository:
– Used to store schema descriptions and other information such as
design decisions, application program descriptions, user
information, usage standards, etc.
– Active data dictionary is accessed by DBMS software and
users/DBA.
– Passive data dictionary is accessed by users/DBA only.
• Application Development Environments and CASE
(Computer-aided software engineering) tools often have
a database design component
• Examples:
– PowerBuilder (Sybase)
– JBuilder (Borland)
– JDeveloper 10G (Oracle)
Typical DBMS Component Modules
DBMS Architectures
• Centralized DBMS Architecture:
– Combines everything into single computer system, including:
DBMS software, hardware, application programs, and user
interface processing software.
– User can still connect through a remote terminal – however, all
processing is done at centralized site (computer).
• Basic 2-tier Client-Server Architecture: Specialized Server
nodes with Specialized functions
– Print server
– File server
– DBMS server
– Web server
– Email server
• Client nodes can access the specialized servers as needed
A Physical Centralized Architecture
Logical two-tier client server architecture
DBMS Server
• Provides database query and transaction services to the clients
• Relational DBMS servers are often called SQL servers, query
servers, or transaction servers
• Applications running on clients utilize an Application Program
Interface (API) to access server databases via standard
interface such as:
– ODBC: Open Database Connectivity standard
– JDBC: for Java programming access
• Client and server must install appropriate client module and
server module software for ODBC or JDBC
Client nodes
• Provide appropriate interfaces through a client
software module to access and utilize the various
server resources.
• Clients may be PCs or Workstations (or even diskless
machines) with the client software installed.
• Connected to the servers via a network.
– LAN: local area network
– wireless network
Two Tier Client-Server DBMS
Architecture
• A program running at a client may connect to several
DBMSs (also called data sources).
• In general, data sources can be files or other nonDBMS software that manages data.
• Client focuses on user interface interactions and only
accesses database when needed.
• In some cases (e.g. some object DBMSs), more
functionality is transferred to clients (e.g. data
dictionary functions, optimization and recovery
across multiple servers, etc.)
Three Tier Client-Server DBMS
Architecture
• Common for Web applications
• Third intermediate layer (middle tier) called Application Server
or Web Server:
– Stores the web connectivity software and the business logic part
of the application
– Accesses and updates data on the database server
– Acts like a conduit for sending partially processed data between
the database server and the client.
• Three-tier Architecture Can Enhance Security:
– Database server only accessible via middle tier
– Clients cannot directly access database server
Three-tier client-server architecture
• Classification of Database Management Systems
Classification of DBMSs
•
Based on the data model used
– Traditional: Relational, Hierarchical, Network.
– Emerging: Object-oriented, Object-relational, NOSQL, Key-value.
•
Based on allowed users
– Single-user (typically used with personal computers) vs. Multi-user (most DBMSs).
– Centralized (uses a single computer with one database) vs. Distributed (uses
multiple computers, multiple databases)
•
Based on number of sites
– A DBMS is centralized if the data is stored at a single computer site.
– A distributed DBMS (DDBMS) can have the actual database and DBMS software
distributed over many sites connected by a computer network.
•
Based on type of DBMS software
– Homogeneous DDBMSs use the same DBMS software at all the sites.
– Heterogeneous DDBMSs can use different DBMS software at each site.
This may further lead to federated DBMS.
•
Big data systems (NOSQL systems)
–
–
–
–
Key-value model
Document-based model
Graph-based model
Column-based model
Cost considerations for DBMSs
• Cost Range: from free open-source systems to configurations
costing millions of dollars
• Examples of free relational DBMSs: MySQL, PostgreSQL, others
• Commercial DBMSs offer additional specialized modules, e.g.
time-series module, spatial data module, document module,
XML module
– These offer additional specialized functionality when purchased
separately
– Sometimes called cartridges (e.g., in Oracle) or blades
• Different licensing options: site license, maximum number of
concurrent users (seat license), single user, etc.
History of Data Models(1)
1. Network Model
2. Hierarchical Model
3. Relational Model
4. Object-oriented Data Models
5. Object-Relational Models
• Network Model:
– The first network DBMS was implemented by Honeywell in 196465 (IDS System).
– Adopted heavily due to the standard support by CODASYL
(Conference on Data Systems Languages) (CODASYL – DBTG report
of 1971).
– Later implemented in a large variety of systems – IDMS (Cullinet now Computer Associates), DMS 1100 (Unisys), IMAGE (H.P.
(Hewlett-Packard)), VAX -DBMS (Digital Equipment Corp., next
COMPAQ, now H.P.).
Example of Network Model Schema
Network Model
• Advantages:
– Can model complex relationships among records and represents
semantics of add/delete on the relationships.
– Can handle most situations for modeling using record types and
relationship types.
– Language is navigational; uses constructs like FIND, FIND
member, FIND owner, FIND NEXT within set, GET, etc.
• Programmers can do optimal navigation through the database.
• Disadvantages:
– Navigational and procedural nature of processing requires
programming access
– Intermixes storage structures with conceptual modeling
relationships
– Database contains a complex array of pointers that thread
through a set of records.
• Little scope for automated “query optimization”
Hierarchical Model
• Hierarchical Data Model:
– Initially implemented in a joint effort by IBM and North American Rockwell
around 1965. Resulted in the IMS family of systems.
– IBM’s IMS product had a very large customer base worldwide
– Hierarchical model was formalized based on the IMS system
– Other systems based on this model: System 2k (SAS inc.)
• Advantages:
– Can implement certain tasks very efficiently
– Easy to store hierarchically organized data, e.g., organization (“org”) charts
• Disadvantages:
– Navigational and procedural nature of processing
– Difficult to store databases where multiple relationships exist among the
data records
– Little scope for “query optimization” by system (programmer must
optimize the programs)
– Language is procedural: Uses constructs like GET, GET UNIQUE, GET NEXT,
GET NEXT WITHIN PARENT, etc.
Relational Model
• Relational Model:
– Proposed in 1970 by E.F. Codd (IBM), first commercial systems in
early 1980s.
– Now in many commercial products (e.g. DB2, ORACLE, MS SQL
Server, SYBASE, INFORMIX).
– Several free open source implementations, e.g. MySQL,
PostgreSQL
– Currently most dominant for developing database applications.
– SQL relational standards: SQL-89 (SQL1), SQL-92 (SQL2), SQL-99,
SQL3, SQL-2008
– Chapters 3 through 6 describe this model in detail
Object-oriented Models
• Object-oriented Data Models:
– Allow databases to be used seamlessly with object-oriented
programming languages.
– Can store persistent objects created in O-O Programming
Languages such as C++ (e.g., in OBJECTSTORE or VERSANT), and
Smalltalk (e.g., in GEMSTONE).
– Other experimental systems include O2, ORION (at MCC – then
ITASCA), IRIS (at H.P.- used in Open OODB).
– Object Database Standard: ODMG-93, ODMG-version 2.0,
ODMG-version 3.0.
Object-Relational Models
• Object-Relational Models:
– Relational systems incorporated concepts from object
databases leading to object-relational.
– Exemplified in the latest versions of Oracle-10i, DB2, and
SQL Server and other DBMSs.
– Standards started in SQL-99 and enhanced in SQL-2008.
Main Reference
1. Chapter 2: Database System Concepts and
Architecture
2. Chapter 2: Classification of Database Management
Systems
(Fundamentals of Database Systems, Global Edition,
Additional
References
7th Edition
(2017) by Ramez Elmasri & Shamkant
Navathe)
https://courses.cs.vt.edu/cs4604/Spring21/pdfs/1-intro.pdf
https://courses.cs.vt.edu/cs4604/Spring21/pdfs/12-nosql.pdf
This Presentation is mainly dependent on the textbook: Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by Ramez Elmasri & Shamkant Navathe
Thank You
الجامعة السعودية االلكترونية
الجامعة السعودية االلكترونية
26/12/2021
College of Computing and Informatics
Data Science Pre-Master Program
IT244
Introduction to Database
IT244
Introduction to Database
Part 2
Module 1: Relational Model
Contents
1. The Relational Data Model and Relational Database
Constraints
2. The Relational Algebra
Weekly Learning Outcomes
1. Create a Relational model of a Database.
2. Describe Relational Algebra.
Required Reading
1. Chapter 5: The Relational Data Model and Relational
Database Constraints
2. Chapter 8: The Relational Algebra
(Fundamentals of Database Systems, Global Edition,
7th Edition (2017) by Ramez Elmasri & Shamkant
Navathe)
Recommended Reading
Relational Model and Relational Algebra:
https://courses.cs.vt.edu/cs4604/Spring21/pdfs/2-dbmodel.pdf
This Presentation is mainly dependent on the textbook: Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by Ramez Elmasri & Shamkant Navathe
• The Relational Data Model and Relational Database
Constraints
Relational Model Concepts and its Origin
• The formal relational Model of Data is based on the concept of a
Relation
– Has a formal mathematical foundation provided by set theory and first
order predicate logic
• In practice, there is a standard model based on SQL (Structured
Query Language)
• There are several important differences between the formal model
and the practical model, as we shall see
• The model was first proposed by Dr. E.F. Codd of IBM Research in
1970 in the following paper:
– “A Relational Model for Large Shared Data Banks,” Communications of
the ACM, June 1970
• The above paper caused a major revolution in the field of database
management
• Dr. Codd earned the coveted ACM Turing Award in 1981
Informal Definitions (1)
• Informally, a relation looks like a table of values (see Figure 3.1 on next
slide).
• A relation contains a set of rows.
• The data elements in each row represent certain facts that correspond to
a real-world entity or relationship
– In the formal model, rows are called tuples
• Each column has a column header that gives an indication of the meaning
of the data items in that column
– In the formal model, the column header is called an attribute name (or just
attribute)
Informal Definitions (2)
• Key of a Relation:
– Each row (tuple) in the table is uniquely identified by the
value of a particular attribute (or several attributes
together)
• Called the key of the relation
– In the STUDENT relation, SSN is the key
– If no attributes posses this uniqueness property, a new
attribute can be added to the relation to assign unique
row-id values (e.g. unique sequential numbers) to
identify the rows in a relation
• Called artificial key or surrogate key
Formal Definitions – Relation Schema
• Relation Schema (or description) of a Relation:
– Denoted by R(A1, A2, …, An)
– R is the name of the relation
– The attributes of the relation are A1, A2, …, An
– n is the cardinality of the relation
• Example:
CUSTOMER (Cust-id, Cust-name, Address, Phone#)
– CUSTOMER is the relation name
– The CUSTOMER relation schema (or just relation) has four
attributes: Cust-id, Cust-name, Address, Phone#
• Each attribute has a domain or a set of valid values.
– For example, the domain of Cust-id can be 6 digit numbers.
Formal Definitions – Tuple
• A tuple is an ordered set of values (enclosed in angled
brackets ‘< … >’)
• Each value is derived from an appropriate domain.
• A row in the CUSTOMER relation is a 4-tuple and would consist
of four values, for example:
–
– Called a 4-tuple because it has 4 values
– In general, a particular relation will have n-tuples, where n is the
number of attributes for the relation
• A relation is a set of such tuples (rows)
Formal Definitions – Domain
• A domain of values can have a logical definition:
– Example: “USA_phone_numbers” are the set of 10 digit phone numbers
valid in the U.S.
• A domain also has a data-type or a format defined for it.
– The USA_phone_numbers may have a format: (ddd)ddd-dddd where each
d is a decimal digit.
– Dates have various formats such as year, month, date formatted as
yyyy-mm-dd, or as dd:mm:yyyy etc.
• The attribute name designates the role played by a domain in a relation:
– Used to interpret the meaning of the data elements corresponding to
that attribute
– Example: The domain Date may be used to define two attributes “Invoicedate” and “Payment-date” with different meanings (roles)
Formal Definitions – State of a Relation
• Formally, a relation state r(R) is a subset of the Cartesian product
of the domains of its attributes
– Each domain contains the set of all possible values the attribute can
take.
– The Cartesian product contains all possible tuples from the attribute
domains
– The relations state r(R) is the subset of tuples that represent valid
information in the mini-world at a particular time
• Formally (see Figure 3.1),
– Given relation schema R(A1, A2, ………., An)
– Relation state r(R) dom(A1) X dom(A2) X ….X dom(An)
• r(R): is a specific state (or “instance” or “population”) of relation R
– this is a set of tuples (rows) in the relation at a particular moment
in time
– r(R) = {t1, t2, …, tn} where each ti is an n-tuple
– ti = where each vj element-of dom(Aj)
Formal Definitions – Example
• Let R(A1, A2) be a relation schema:
– Let dom(A1) = {0, 1}
– Let dom(A2) = {a, b, c}
• Then: The Cartesian product dom(A1) X dom(A2) contains all
possible tuples from these domains:
{ , , , , , }
• The relation state r(R) dom(A1) X dom(A2)
• For example: One possible state r(R) could be { , ,
}
– This state has three 2-tuples: , ,
Relation Definitions Summary
Informal Terms
Formal Terms
Table
Relation
Column Header
Attribute
All possible Column Values or Data Type
Domain
Row
Tuple
Table Definition
Schema of a Relation
Populated Table
State of the Relation
Characteristics of a Relation (1)
• Ordering of tuples in a relation r(R):
– The tuples are not considered to be ordered, because a relation is a set of
tuples
• Ordering of attributes in a relation schema R (and of values within
each tuple):
– The attributes in R(A1, A2, …, An) and the values in each t= are
considered to be ordered
– However, a more general definition of relation does not require attribute ordering
– In this case, a tuple t = { , …, } is an unordered set of n
pairs – one pair for each of the relation attributes (see Figure 3.3)
Characteristics of Relations (2)
• Values in a tuple:
– All values are considered atomic (indivisible).
– Each value must be from the domain of the attribute for that
column
• If tuple t = is a tuple (row) in the relation state r of R(A1,
A2, …, An)
• Then each vi must be a value from dom(Ai)
– A special null value is used to represent values that are unknown
or inapplicable to certain tuples.
• Notation:
– We refer to component values of a tuple t by:
• t[Ai] or t.Ai
• This is the value vi of attribute Ai for tuple t
– Similarly, t[Au, Av, …, Aw] refers to the subtuple of t containing
the values of attributes Au, Av, …, Aw, respectively in t
Relational Integrity Constraints
• Constraints are conditions that must hold on all valid relation
states.
• Constraints are derived from the mini-world semantics
• There are three main types of built-in constraints in the
relational model:
– Key constraints
– Entity integrity constraints
– Referential integrity constraints
• Another implicit constraint is the domain constraint
– Every value in a tuple must be from the domain of its attribute
(or it could be null, if allowed for that attribute)
Key Constraints (1)
• Superkey SK of R:
– Is a set of attributes SK of R with the following condition:
• No two tuples in any valid relation state r(R) will have the same
value for SK
• That is, for any two distinct tuples t1 and t2 in r(R), t1.SK t2.SK
• This condition must hold in any valid state r(R)
• Key (also called Candidate key) K of R:
– Is a “minimal” superkey
– Formally, a key K is a superkey such that removal of any
attribute from K results in a set of attributes that is not a
superkey (or key) any more (does not possess the superkey
uniqueness property)
– Hence, a superkey with one attribute is always a key
Key Constraints (2)
• Example: Consider the CAR relation schema:
– CAR(State, Reg#, SerialNo, Make, Model, Year)
– CAR has two keys (determined from the mini-world constraints):
• Key1 = {State, Reg#}
• Key2 = {SerialNo}
– Both are also superkeys of CAR
– However, {SerialNo, Make} is a superkey but not a key.
• In general:
– Any key is a superkey (but not vice versa)
– Any set of attributes that includes a key is a superkey
– A minimal superkey is also a key
Key Constraints (3)
• If a relation has several keys, they are called candidate keys;
one is chosen to be the primary key; the others are called
unique (or secondary) keys
– The primary key attributes are underlined.
• Example: Consider the CAR relation schema:
– CAR(State, Reg#, SerialNo, Make, Model, Year)
– We choose License_number (which contains (State, Reg#)
together) as the primary key – see Figure 3.4
• The primary key value is used to uniquely identify each tuple in
a relation
– Provides the tuple identity
– Also used to reference the tuple from other tuples
• General rule: Choose the smallest-sized candidate key (in
bytes) as primary key
– Not always applicable – choice is sometimes subjective (as in
Figure 3.4 – see next slide)
Relational Database Schema
• Relational Database Schema:
– A set S of relation schemas that belong to the same database.
– S is the name of the whole database schema
– S = {R1, R2, …, Rn}
– R1, R2, …, Rn are the names of the individual relation schemas within the
database S
– Figure 3.5 shows a COMPANY database schema with 6 relation schemas
Example of Relational Database State
• Next slide show an example of a COMPANY database state
(Figure 3.6)
– Each relation has a set of tuples
• The tuples in each table satisfy key and other constraints
• If all constraints are satisfied by a database state, it is called a valid
state
– The database state changes to another state whenever the
tuples in any relation are changed via insertions, deletions,
or updates
Entity Integrity Constraint
• Entity Integrity:
– The primary key attributes PK of each relation schema R in
S cannot have null values in any tuple of r(R).
• This is because primary key values are used to identify the
individual tuples.
• t.PK null for any tuple t in r(R)
• If PK has several attributes, null is not allowed in any of these
attributes
– Note: Other attributes of R may be also be constrained to
disallow null values (called NOT NULL constraint), even
though they are not members of the primary key.
Referential Integrity Constraint (1)
• A constraint involving two relations
– The previous constraints (key, entity integrity) involve a single
relation.
• Used to specify a relationship among tuples in two relations:
– The referencing relation and the referenced relation.
• Tuples in the referencing relation R1 have attributes FK (called
foreign key attributes) that reference the primary key
attributes PK of the referenced relation R2.
– A tuple t1 in R1 is said to reference a tuple t2 in R2 if t1.FK = t2.PK
• Referential integrity can be displayed as a directed arc from
R1.FK to R2.PK – see Figure 3.7
Referential Integrity (or foreign key) Constraint
(2)
• Statement of the constraint
– For a particular database state, the value of the foreign key
attribute (or attributes) FK in each tuple of the referencing
relation R1 can be either:
• (1) An existing primary key (PK) value of a tuple in the referenced
relation R2, or
• (2) a null.
• In case (2), the FK in R1 should not be a part of its
own primary key, and cannot have the NOT NULL
constraint.
Other Types of Constraints
• Semantic Integrity Constraints:
– cannot be expressed by the built-in model constraints
– Example: “the max. no. of hours per employee for all
projects he or she works on is 56 hrs per week”
• A constraint specification language can be used to
express these
• SQL has TRIGGERS and ASSERTIONS to express some
of these constraints
Operations to Modify Relations (1)
• Each relation will have many tuples in its current relation state
• The relational database state is a union of all the individual relation
states at a particular time
• Whenever the database is changed, a new state arises
• Basic operations for changing the database:
– INSERT new tuples in a relation
– DELETE existing tuples from a relation
– UPDATE attribute values of existing tuples
• Integrity constraints should not be violated by the update
operations.
• Several update operations may have to be grouped together into a
transaction.
Operations to Modify Relations (2)
• Updates may propagate to cause other updates
automatically. This may be necessary to maintain
integrity constraints.
• In case of integrity violation, several actions can be
taken:
– Cancel the operation that causes the violation (RESTRICT or
REJECT option)
– Perform the operation but inform the user of the violation
– Trigger additional updates so the violation is corrected
(CASCADE option, SET NULL option)
– Execute a user-specified error-correction routine
INSERT operation
• INSERT one or more new tuples into a relation
• INSERT may violate any of the constraints:
– Domain constraint:
• if one of the attribute values provided for a new tuple is not of the
specified attribute domain
– Key constraint:
• if the value of a key attribute in a new tuple already exists in
another tuple in the relation
– Referential integrity:
• if a foreign key value in a new tuple references a primary key value
that does not exist in the referenced relation
– Entity integrity:
• if the primary key value is null in a new tuple
DELETE operation
• DELETE one or more existing tuples from a relation
• DELETE may violate only referential integrity:
– If the primary key value of the tuple being deleted is referenced
from other tuples in the database
• Can be remedied by several actions: RESTRICT, CASCADE, SET NULL
– RESTRICT option: reject the deletion
– CASCADE option: propagate the deletion by automatically deleting
the referencing tuples
– SET NULL option: set the foreign keys of the referencing tuples to
NULL (the foreign keys cannot have NOT NULL constraint)
– One of the above options must be specified during database
design for each referential integrity (foreign key) constraint
UPDATE operation
• UPDATE modifies the values of attributes in one or more
existing tuples in a relation
• UPDATE may violate domain constraint and NOT NULL
constraint on an attribute being modified
• Other constraints may also be violated:
– Updating the primary key (PK):
• Similar to a DELETE followed by an INSERT
• Need to specify similar options to DELETE
• The CASCADE option propagates the new value of PK to the foreign
keys of the referencing tuples automatically
– Updating a foreign key (FK) may violate referential integrity
– Updating an ordinary attribute (neither PK nor FK):
• Can only violate domain or NOT NULL constraints
Main Reference
1. Chapter 5: The Relational Data Model and Relational
Database Constraints
(Fundamentals of Database Systems, Global Edition,
7th Edition (2017) by Ramez Elmasri & Shamkant
Navathe)
Additional References
https://courses.cs.vt.edu/cs4604/Spring21/pdfs/2-dbmodel.pdf
This Presentation is mainly dependent on the textbook: Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by Ramez Elmasri & Shamkant Navathe
Thank You
الجامعة السعودية االلكترونية
الجامعة السعودية االلكترونية
26/12/2021
College of Computing and Informatics
Data Science Pre-Master Program
IT244
Introduction to Database
IT244
Introduction to Database
Module 2: Entity–Relationship Model-1
Contents
1. Using High-Level Conceptual Data Models for Database Design
2. A Sample Database Application
3. Entity Types, Entity Sets, Attributes, and Keys
4. Relationship Types, Relationship Sets, Roles, and Structural
Constraints
5. Weak Entity Types
6. Refining the ER Design for the COMPANY Database
Weekly Learning Outcomes
1. Explain the Entity-Relationship model.
Required Reading
1.
2.
3.
4.
5.
6.
Chapter 3: Using High-Level Conceptual Data Models for Database Design
Chapter 3: A Sample Database Application
Chapter 3: Entity Types, Entity Sets, Attributes, and Keys
Chapter 3: Relationship Types, Relationship Sets, Roles, and Structural Constraints
Chapter 3: Weak Entity Types
Chapter 3: Refining the ER Design for the COMPANY Database
(Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by Ramez
Elmasri & Shamkant Navathe)
Recommended Reading
Entity–Relationship Model-1:
https://courses.cs.vt.edu/cs4604/Spring21/pdfs/2-ermodel1.pdf
This Presentation is mainly dependent on the textbook: Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by Ramez Elmasri & Shamkant Navathe
• Using High-Level Conceptual Data Models for Database
Design
Overview of the Database Design
Process
• There are two main activities (see Figure 7.1 in the
next slide):
– Database schema design
– Application programs design
• Focus will be on database schema design
– Given the database requirements, design the
conceptual schema for a database
• Application programs design focuses on the programs
and interfaces that access and update the database
(considered part of software engineering discipline)
The Database Design Process
• Requirements Analysis and Specification is in the realm of
Systems Analysis and Design
• This and next chapter focus on Conceptual Design (see
Figure 7.1)
• Physical Design and Logical Design will not be
discussed here.
The Entity-Relationship (ER) Model
• ER model is a conceptual data model for database design
– Has an associated notation (ER schema diagrams) for
drawing/displaying the database schema
– Many variations of ER model exists
– Also, many extensions (E.g. EER model)
• Next slide (Figure 7.2) shows a complete ER schema diagram
for a COMPANY database
– We will explain gradually how this design is created
– First we introduce the requirements for the COMPANY database
– Then we present ER model concepts and diagrammatic notation
gradually, and design the schema step-by-step
• A Sample Database Application
Example: COMPANY Database (1)
• Create a database schema design based on the
following (simplified) requirements for a COMPANY
Database:
– The company is organized into DEPARTMENTs. Each
department has a name, number and an employee
who manages the department. We keep track of the
start date of the department manager. A department
may have several locations.
– Each department controls a number of PROJECTs. Each
project has a unique name, unique number and is
located at a single location.
COMPANY Database (cont.)
– The database will store each EMPLOYEE’s name, social
security number (unique for each employee), address,
salary, sex, and birthdate.
• Each employee works for one department, but may work on several
projects.
• We keep track of the number of hours per week that an employee
currently works on each project.
• We also keep track of the direct supervisor of each employee.
– An employee can have DEPENDENTs.
• For each dependent, the database keeps track of their first name,
sex, birthdate, and their relationship to the employee (child,
spouse, etc.).
• Entity Types, Entity Sets, Attributes, and Keys
ER Model Concepts
• Entities and Attributes
– Entities: Specific objects or things in the mini-world that are
represented in the database.
• Examples: the EMPLOYEE John Smith, the Research DEPARTMENT,
the ProductX PROJECT
– Attributes: Properties used to describe an entity.
• Examples: an EMPLOYEE entity may have the attributes Name, SSN,
Address, Sex, BirthDate
– Data values: A specific entity has a value for each of its
attributes.
• Example: An employee entity may have Name=’John Smith’,
SSN=’123456789′, Address =’731, Fondren, Houston, TX’, Sex=’M’,
BirthDate=’09-JAN-55‘
– Each attribute has a value set (or data type) associated with it –
e.g. integer, string, subrange, enumerated type, …
Types of Attributes (1)
• Simple attribute (sometimes called atomic):
– Each entity has a single value for the attribute. For example, the SSN
or Sex of an employee.
• Composite attribute (also called compound):
– The attribute may be composed of several components. For
example:
• Address(Apt#, House#, Street, City, State, ZipCode, Country), or
• Name(FirstName, MiddleName, LastName).
• Composition may form a hierarchy where some components are
themselves composite (Figure 7.4, next slide).
• Multi-valued attribute (also called repeating group or
collection):
– An single entity may have multiple values for that attribute. For
example, Color of a CAR or PreviousDegrees of a STUDENT.
• Denoted as {Color} or {PreviousDegrees}.
Types of Attributes (2)
• Composite and multi-valued attributes may be nested
(to any number of levels).
– Example: PreviousDegrees of a STUDENT can be a
composite multi-valued attribute denoted by
{PreviousDegrees (College, Year, Degree, Field)}
– Multiple PreviousDegrees values can exist for a
particular student
– Each has four subcomponent attributes:
• College, Year, Degree, Field
Entity Types and Key Attributes (1)
• Entities with the same basic attributes are grouped
(or typed) into an entity type.
– Examples: EMPLOYEE or PROJECT.
• Key attribute: an attribute of an entity type for which
each entity must have a unique (distinct) value.
– Example: SSN of EMPLOYEE, or PNUMBER of
PROJECT, or PNAME of PROJECT.
Entity Types and Key Attributes (2)
• A key attribute may be composite.
– Example: VehicleTagNumber (also known as
LicensePlateNo) of a CAR is a key with two
components (LicNumber, State).
• An entity type may have more than one key.
– The CAR entity type may have two keys:
• VehicleIdentificationNumber (popularly called VIN, unique
number stamped on each new car)
• VehicleTagNumber (Number, State)
• Each key is underlined in ER diagrams (see next
slides)
Displaying an Entity type
• In ER diagrams, the entity type name is displayed in a
rectangular box
• Attributes are displayed in ovals
– Each attribute is connected to its entity type
– Components of a composite attribute are connected to the
oval representing the composite attribute
– Each key attribute is underlined
– Multivalued attributes displayed in double ovals
• See CAR example (Figure 3.7(a)) on next slide
Entity Set
• Each entity type will have a collection of individual
entities stored in the database
– Called the entity set
– Previous slide (Figure 3.7(b) show three CAR entities in the
entity set for CAR
– Same name (CAR) refers to both entity type and entity set
– Object models (see Chapter 11) give different names to the
entity type and the entity set
– Entity set changes over time as entities are created and
deleted – represents current state of database
Initial Design of Entity Types for the
COMPANY Database Schema
• Based on the requirements, we can identify four initial
entity types in the COMPANY database:
– DEPARTMENT
– PROJECT
– EMPLOYEE
– DEPENDENT
• Initial design (Figure 3.8) on following slide, will be
refined into final design
• Initial attributes shown are derived from the
requirements description
– The company is organized into DEPARTMENTs. Each
department has a name, number and an employee
who manages the department. We keep track of the
start date of the department manager. A department
may have several locations.
– Each department controls a number of PROJECTs. Each
project has a unique name, unique number and is
located at a single location.
– The database will store each EMPLOYEE’s
name, social security number (unique for each
employee), address, salary, sex, and birthdate.
Each employee works for one department, but may work on
several projects.
We keep track of the number of hours per week that an employee
currently works on each project.
We also keep track of the direct supervisor of each employee.
– An employee can have DEPENDENTs.
For each dependent, the database keeps track of their first name,
sex, birthdate, and their relationship to the employee (child,
spouse, etc.).
• Relationship Types, Relationship Sets, Roles, and
Structural Constraints
Refining the initial design by introducing
Relationships
• The initial design is typically not complete
• Some aspects in the requirements will be
represented as relationships
• ER model has three main concepts:
– Entities (and their entity types and entity sets)
– Attributes (simple, composite, multivalued)
– Relationships (and their relationship types and
relationship sets)
• We introduce relationship concepts next
Relationships and Relationship Types
• A relationship relates two or more distinct entities, with a
specific meaning.
– For example, EMPLOYEE John Smith works on the ProductX
PROJECT, or EMPLOYEE Franklin Wong manages the Research
DEPARTMENT.
• Relationships of the same type are grouped or typed into a
relationship type.
– For example, the WORKS_ON relationship type in which EMPLOYEEs
and PROJECTs participate, or the MANAGES relationship type in
which EMPLOYEEs and DEPARTMENTs participate.
• The degree of a relationship type is the number of
participating entity types.
– Both MANAGES and WORKS_ON are binary relationships.
Relationship Type vs. Relationship Set
• Relationship Type:
– Is the schema description of a relationship
– Identifies the relationship name and the participating
entity types
– Also identifies certain relationship constraints
• Relationship Set:
– The current set of relationship instances represented in the
database
– The current state of a relationship type
Relationship Set
• A set of associations (or relationship instances)
between individual entities from the participating
entity sets:
– Example: Figure 3.9 (next slide) shows a relationship
set for WORKS_FOR
– {r1, r2, r3, r4, r5, r6, r7, …}
– Relationship instance r1=(e1, d1) means EMPLOYEE e1
WORKS_FOR DEPARTMENT d1
– Associates e1 with d1
Relationship Type
• Previous figure displayed the relationship set
• Each instance in the set relates individual participating
entities – one from each participating entity type
• In ER diagrams, we represent the relationship type as
follows:
– Diamond-shaped box is used to display a relationship type
– Connected to the participating entity types via straight lines
– Degree of a relationship type is the number of participating
entity types
Refining the COMPANY Initial Design
by Including Relationships
• By examining the requirements, attributes in the initial design
that refer to other entities are converted into relationships
(and removed from the entity types)
• Some of these relationship attributes (Figure 7.8, repeated on
next slide) are:
– The Department attribute of EMPLOYEE refers to the
DEPARTMENT entity that the employee WORKS_FOR
– The Manager attribute of DEPARTMENT refers to the EMPLOYEE
entity who MANAGES the DEPARTMENT
– The Supervisor attribute of EMPLOYEE refers to another
EMPLOYEE entity (this is called a recursive relationship)
– Several other similar attributes are converted into relationships –
can you identify those in next slide?
Refining the COMPANY Initial Design
by Including Relationships (2)
• Six relationship types are identified for the COMPANY
database schema (see Figure 3.2, repeated next slide)
• All are binary relationships (degree 2)
• Listed below with their participating entity types:
– WORKS_FOR (between EMPLOYEE, DEPARTMENT)
– MANAGES (also between EMPLOYEE, DEPARTMENT)
– CONTROLS (between DEPARTMENT, PROJECT)
– WORKS_ON (between EMPLOYEE, PROJECT)
– SUPERVISION (between EMPLOYEE (as subordinate), EMPLOYEE
(as supervisor))
– DEPENDENTS_OF (between EMPLOYEE, DEPENDENT)
Discussion on Relationship Types
• In the refined design, some attributes from the initial entity
types are refined into relationships:
– Manager of DEPARTMENT -> MANAGES
– Works_on of EMPLOYEE -> WORKS_ON
– Department of EMPLOYEE -> WORKS_FOR
– etc
• In general, more than one relationship type can exist between
the same participating entity types
– MANAGES and WORKS_FOR are distinct relationship types
between EMPLOYEE and DEPARTMENT
– Different meanings and different relationship instances.
Recursive Relationship Type
• A relationship type with the same entity type participating
twice in two distinct roles
• Example: the SUPERVISION relationship
• EMPLOYEE participates twice in two distinct roles:
– supervisor (or boss) role
– supervisee (or subordinate) role
– must distinguish the roles in a relationship instance
• Each relationship instance ri relates two distinct EMPLOYEE
entities (see Figure 3.11, next slide):
– One employee in supervisor role (labeled 1 in Fig. 3.11)
– One employee in supervisee role (labeled 2 in Fig. 3.11)
• Weak Entity Types
Weak Entity Types
• An entity type that does not have a key attribute on its own
• A weak entity must participate in an identifying relationship type with an
owner (or identifying) entity type
• Individual entities are identified by the combination of:
– A partial key of the weak entity type
– The particular entity they are related to in the identifying entity type
• Example (see Figure 7.2):
– A DEPENDENT entity is identified by the dependent’s first name, and
the specific EMPLOYEE with whom the dependent is related
– Name of DEPENDENT is the partial key
– DEPENDENT is a weak entity type
– EMPLOYEE is its identifying (owner) entity type via the identifying
relationship type DEPENDENT_OF
• Refining the ER Design for the COMPANY Database
Main Reference
1.
2.
3.
4.
5.
6.
Chapter 3: Using High-Level Conceptual Data Models for Database Design (3.1)
Chapter 3: A Sample Database Application (3.2)
Chapter 3: Entity Types, Entity Sets, Attributes, and Keys (3.3)
Chapter 3: Relationship Types, Relationship Sets, Roles, and Structural Constraints (3.4)
Chapter 3: Weak Entity Types (3.5)
Chapter 3: Refining the ER Design for the COMPANY Database (3.6)
(Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by Ramez
Elmasri & Shamkant Navathe)
Additional References
https://courses.cs.vt.edu/cs4604/Spring21/pdfs/2-ermodel1.pdf
http://eds.a.ebscohost.com.sdl.idm.oclc.org/eds/pdfviewer/pdfviewer?vid=3&s
id=dc1635b3-88ab-4a84-814c-5908ea9c232c%40sdc-v-sessmgr01
This Presentation is mainly dependent on the textbook: Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by Ramez Elmasri & Shamkant Navathe
Thank You
الجامعة السعودية االلكترونية
الجامعة السعودية االلكترونية
26/12/2021
College of Computing and Informatics
Data Science Pre-Master Program
IT244
Introduction to Database
IT244
Introduction to Database
Module 3: Entity–Relationship Model-2
Contents
1. ER Diagrams, Naming Conventions, and Design Issues
2. Example of Other Notation: UML Class Diagrams
3. Relationship Types of Degree Higher than Two
4. Another Example: A UNIVERSITY Database
5. Subclasses, Superclasses, and Inheritance
6. Specialization and Generalization
Weekly Learning Outcomes
1. Create an Entity-Relationship model.
Required Reading
1.
2.
3.
4.
5.
6.
Chapter 3: ER Diagrams, Naming Conventions, and Design Issues
Chapter 3: Example of Other Notation: UML Class Diagrams
Chapter 3: Relationship Types of Degree Higher than Two
Chapter 3: Another Example: A UNIVERSITY Database
Chapter 4: Subclasses, Superclasses, and Inheritance
Chapter 4: Specialization and Generalization
(Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by Ramez
Elmasri & Shamkant Navathe)
Recommended Reading
Entity–Relationship Model-2:
https://courses.cs.vt.edu/cs4604/Spring21/pdfs/3-ermodel2.pdf
This Presentation is mainly dependent on the textbook: Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by Ramez Elmasri & Shamkant Navathe
• ER Diagrams, Naming Conventions, and Design
Issues
Constraints on Relationships (1)
• Constraints on Relationship Types
– Two main types of constraints on binary relationships
– Cardinality Ratio (specifies maximum participation)
• One-to-one (1:1)
• One-to-many (1:N) or Many-to-one (N:1)
• Many-to-many (M:N)
– Existence Dependency Constraint (specifies minimum
participation) (also called participation constraint)
• zero (optional participation, not existence-dependent)
• one or more (mandatory participation, existence-dependent)
Constraints on Relationships (2)
• Cardinality ration specified by labeling 1, M, or N to
relationship lines in ER diagrams.
• See Figure 7.2, repeated in next slide
• Total participation specified by double line, partial
participation by single line.
• These constraints are derived from the real-world meaning
and characteristics of each relationship type
• In some ER diagrammatic notation, it is common to specify
cardinality ration and participation constraint jointly using
(min, max) notation
– Called (min, max) constraints or multiplicities
Displaying a Recursive Relationship Type
in ER Diagrams
• In a recursive relationship type.
– Both participations are same entity type in different roles.
– For example, SUPERVISION relationships between
EMPLOYEE (in role of supervisor or boss) and (another)
EMPLOYEE (in role of subordinate or worker).
• In ER diagram, need to display role names to
distinguish participations (see Figure 7.2).
• Role names can also be optionally displayed for other
relationship types
Attributes of Relationship Types
• A relationship type can have attributes:
– For example, HoursPerWeek of WORKS_ON
– Its value for each relationship instance describes the
number of hours per week that an EMPLOYEE works on a
PROJECT.
• A value of HoursPerWeek depends on a particular (employee,
project) combination
– Most relationship attributes are used with M:N
relationships
• In 1:N relationships, they can be transferred to the entity type on
the N-side of the relationship
Alternative (min, max) notation (1)
• Alternative way to specify relationship constraints; pecified on each
participation of an entity type E in a relationship type R
• Specifies that each entity e in E participates in at least min and at most
max relationship instances in R
• Default (no constraint): min=0, max=n (signifying no limits)
• Must have minmax, min0, max 1
• Derived from the knowledge of mini-world constraints
• Examples:
– A department has exactly one manager and an employee can manage
at most one department.
• Specify (0,1) for participation of EMPLOYEE in MANAGES
• Specify (1,1) for participation of DEPARTMENT in MANAGES
– An employee can work for exactly one department but a department
can have any number of employees.
• Specify (1,1) for participation of EMPLOYEE in WORKS_FOR
• Specify (0,n) for participation of DEPARTMENT in WORKS_FOR
Alternative (min, max) notation (2)
Important: In some popular diagrammatic notations, the
placement of (min, max) are reversed (placed on the other side
of the binary relationship) – for example, in UML class diagrams
(see later in this chapter)
Summary of ER Diagrams (1)
• Example of Other Notation: UML Class Diagrams
Alternative diagrammatic notation
• ER diagrams (as described here) is one popular
method for displaying database schemas
• Many other diagrammatic notations exist in the
literature and in various database design and
modeling tools
• Appendix A illustrates some of the alternative
notations that have been used
• UML class diagrams is representative of an alternative
way of displaying ER concepts that is used in several
automated design tools
UML Class Diagrams (1)
• UML (Universal Modeling Language) is a popular
language/methodology for object-oriented software design
• Part of software design is specifying classes using class
diagrams – this is somewhat similar to ER design
• Classes (similar to entity types) as displayed as large rounded
boxes with three sections:
– Top section includes entity type (class) name
– Second section includes attributes
– Third section includes class operations (operations are not in
basic ER model)
• Relationships (called associations) represented as lines
connecting the classes
– Other UML terminology also differs from ER terminology
• UML has many other types of diagrams for software design
(see Chapter 10)
UML Class Diagrams (2)
• Next slide (Figure 7.16) shows example of UML class diagrams
for the COMPANY database schema
• Multiplicities (similar to (min, max) constraints) placed on
opposite end when compared to our previous notation:
– Displayed as min..max
– * represents no maximum limit on participation (like N)
• Two kinds of relationships
– Association: Relationship between two independent objects;
displayed as lines
– Aggregation: Relationship between object and its parts;
displayed as lines with small diamond at object end
• Weak entity can be represented using concept of qualified
association/aggregation (discriminator similar to partial key)
• Relationship names are optional; relationship instances called
links; relationship attributes called link attributes
• Relationship Types of Degree Higher than Two
Relationships of Higher Degree
• Recall that degree of a relationship type is the number of
participating entities in each instance
• Relationship types of degree 2 are called binary, degree 3
are ternary, and degree n are n-ary
• Example: A relationship instance in SUPPLY (Figure 7.10 , next
slide) relates three entities (s, p, j) where s is a SUPPLIER, p a
PART, j a PROJECT such that s currently supplies part p to
project j (with Quantity items per month)
• In general, an n-ary relationship (where n > 2) is not
equivalent to n binary relationships
• Constraints are harder to specify for higher-degree
relationships (n > 2) than for binary relationships
Discussion of n-ary relationships (1)
• In general, 3 binary relationships can represent different
information than a single ternary relationship (see Figure
7.17a and b on next slide)
• If needed, the binary and n-ary relationships can all be
included in the schema design
• In some cases, a ternary relationship can be represented as a
weak entity type if the data model allows multiple identifying
relationships (and hence multiple owner entity types) (see
Figure 7.17c)
Discussion of n-ary relationships (2)
• If a particular binary relationship can be derived from
a higher-degree relationship at all times, then it is
redundant
• For example, TAUGHT_DURING binary relationship in
Figure 7.18 (see next slide) can be derived from the
ternary relationship OFFERS (based on the meaning
of the relationships)
• It all depends on the meaning of the relationships in
the real world
Displaying constraints on higher-degree
relationships
• The (min, max) constraints can be displayed on the edges –
however, they do not fully describe the constraints
• Displaying a 1, M, or N indicates additional constraints
– An M or N indicates no constraint
– A 1 indicates that an entity can participate in at most one
relationship instance that has a particular combination of the
other participating entities
• In general, both (min, max) and 1, M, or N are needed to
describe fully the constraints
Why Extended Entity-Relationship (EER)
Model?
• The basic ER model described so far does not support
specialization and generalization abstractions
• Next chapter illustrates how the ER model can be
extended with
– Type-subtype and set-subset relationships
– Specialization/Generalization Hierarchies
– Notation to display them in EER diagrams
• Another Example: A UNIVERSITY Database
Additional Examples (2)
Additional Examples (3)
Additional Examples (4)
• Subclasses, Superclasses, and Inheritance
Subclasses and Superclasses (1)
• An entity type may have additional meaningful subtypes (or
specializations) of its entities
– Example: EMPLOYEE may be further specialized into:
• SECRETARY, ENGINEER, TECHNICIAN, …
– Based on the EMPLOYEE’s Job
• MANAGER
– EMPLOYEEs who are managers
• SALARIED_EMPLOYEE, HOURLY_EMPLOYEE
– Based on the EMPLOYEE’s method of pay
Subclasses and Superclasses (2)
• EER diagrams extend ER diagrams to represent these
additional subgroupings, called subclasses or subtypes (see
Figure 8.1, next slide)
• Important Note: As with ER diagrams, there are many different
diagrammatic notations for each concept
• We show some alternatives in Appendix A
• UML class diagrams notation for subclasses is presented later
in this chapter
Subclasses and Superclasses (3)
• Each of these subgroupings (ENGINEER, TECHNICIAN,
MANAGER, SALARIED_EMPLOYEE, etc.) will hold a subset of
EMPLOYEE entities
• Each is called a subclass of EMPLOYEE
• EMPLOYEE is called the superclass
• The relationships are called superclass/subclass relationships:
– EMPLOYEE/SECRETARY
– EMPLOYEE/TECHNICIAN
– EMPLOYEE/MANAGER
– …
Subclasses and Superclasses (4)
• These are also called IS-A relationships
– Based on Knowledge Representation terminology in Artificial
Intelligence field
– SECRETARY IS-A EMPLOYEE, TECHNICIAN IS-A EMPLOYEE, ….
• Note: An entity that is member of a subclass represents the
same real-world entity as some member of the superclass:
– The subclass member is the same entity in a distinct specific role
– Entity cannot exist in database merely by being a member of a
subclass; it must also be a member of the superclass
– A member of the superclass can be optionally included as a
member of any number (zero or more) of its subclasses
Subclasses and Superclasses (5)
• Examples:
– A salaried employee who is also an engineer belongs to the two
subclasses:
• ENGINEER, and
• SALARIED_EMPLOYEE
– A salaried employee who is also an engineering manager belongs
to the three subclasses:
• MANAGER,
• ENGINEER, and
• SALARIED_EMPLOYEE
• It is not necessary that every entity in a superclass be a
member of some subclass
Attribute Inheritance in Superclass/
Subclass Relationships
• An entity that is member of a subclass inherits
– All attributes of the entity as a member of the
superclass
– All relationships of the entity as a member of the
superclass
• Example (Figure 8.1):
– SECRETARY (as well as TECHNICIAN, MANAGER,
ENGINEER, etc.) inherit the attributes Name, SSN, …,
from EMPLOYEE
– Every SECRETARY entity will have values for the
inherited attributes
• Specialization and Generalization
Specialization (1)
• Is the process of defining a set of subclasses of a
superclass – the set must be based upon some
distinguishing characteristics of the entities in the
superclass
– Example: {SECRETARY, ENGINEER, TECHNICIAN} is a
specialization of EMPLOYEE based upon job type.
• May have several specializations of the same superclass
• {SALARIED_EMPLOYEE, HOURLY_EMPLOYEE} is another
specialization of EMPLOYEE based on method of pay
Specialization (2)
• Specialization can be diagrammatically represented in EER diagrams
as (see Figure 8.1, repeated in next slide))
– The subclasses are connected to a circle that represents the
specialization (using lines with the subset symbol)
– The circle is also connected to the superclass
– Attributes of a subclass are called specific or local attributes.
• For example, the attribute TypingSpeed of SECRETARY
– The subclass can also participate in specific relationship types.
• For example, a relationship BELONGS_TO of HOURLY_EMPLOYEE
(see Figure 8.1)
Generalization
• Generalization is the reverse of the specialization process
• Several classes with common features are generalized into a
superclass;
– original classes become its subclasses
• Example (Figure 8.3, next slide): CAR, TRUCK generalized into
VEHICLE;
– Both CAR, TRUCK become subclasses of the superclass VEHICLE
because they have several common attributes.
– VEHICLE includes the common attributes
– Can view {CAR, TRUCK} as a specialization of VEHICLE
– Alternatively, we can view VEHICLE as a generalization of CAR
and TRUCK
Generalization vs. Specialization (1)
• Diagrammatic notations sometimes distinguish
generalization and specialization
– Arrow pointing to the generalized superclass
represents a generalization
– Arrows pointing to the specialized subclasses
represent a specialization
– We do not use this notation because it is often unclear
or subjective as to which process was used to reach
the final design
Generalization vs. Specialization (2)
• Data Modeling with Specialization and Generalization
– A superclass or subclass represents a type of entity, as
well as the collection (or set or grouping) of entities of
that type
– Subclasses and superclasses are displayed in rectangles
in EER diagrams (like entity types)
– We can call all entity types classes, whether they are
entity types, superclasses, or subclasses (objectoriented terminology)
Main Reference
1.
2.
3.
4.
5.
6.
Chapter 3: ER Diagrams, Naming Conventions, and Design Issues
Chapter 3: Example of Other Notation: UML Class Diagrams
Chapter 3: Relationship Types of Degree Higher than Two
Chapter 3: Another Example: A UNIVERSITY Database
Chapter 4: Subclasses, Superclasses, and Inheritance
Chapter 4: Specialization and Generalization
(Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by Ramez
Elmasri & Shamkant Navathe)
Additional References
https://courses.cs.vt.edu/cs4604/Spring21/pdfs/3-ermodel2.pdf
http://eds.a.ebscohost.com.sdl.idm.oclc.org/eds/pdfviewer/pdfviewer?vid=4&sid=dc163
5b3-88ab-4a84-814c-5908ea9c232c%40sdc-v-sessmgr01
This Presentation is mainly dependent on the textbook: Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by Ramez Elmasri & Shamkant Navathe
Thank You
الجامعة السعودية االلكترونية
الجامعة السعودية االلكترونية
26/12/2021
s
College of Computing and
Informatics Information Technology
IT403
Fundamentals of Database
IT403
Fundamentals of Databases
Module 4: Enhanced Entity-Relationship Model
Contents
1. Constraints and Characteristics of Specialization and
Generalization Hierarchies
2. Modeling of UNION Types Using Categories
3. A Sample UNIVERSITY EER Schema, Design Choices, and
Formal Definitions
4. Example of Other Notation: Representing Specialization and
Generalization in UML Class Diagrams
5. Relational Database Design Using ER-to-Relational Mapping
Weekly Learning Outcomes
1. Explain the Enhanced Entity–Relationship Model.
Required Reading
1.
2.
3.
4.
5.
Chapter 4: Constraints and Characteristics of Specialization and Generalization
Hierarchies
Chapter 4: Modeling of UNION Types Using Categories
Chapter 4: A Sample UNIVERSITY EER Schema, Design Choices, and Formal
Definitions
Chapter 4: Example of Other Notation: Representing Specialization and
Generalization in UML Class Diagrams
Chapter 9: Relational Database Design Using ER-to-Relational Mapping
(Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by
Ramez Elmasri & Shamkant Navathe)
Recommended Reading
Enhanced Entity–Relationship Model:
https://courses.cs.vt.edu/cs4604/Spring21/pdfs/3-ermodel2.pdf
This Presentation is mainly dependent on the textbook: Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by Ramez Elmasri & Shamkant Navathe
• Constraints and Characteristics of Specialization and
Generalization Hierarchies
Constraints on Specialization and
Generalization (1)
• Two basic constraints can apply to a specialization/
generalization:
– Disjointness Constraint: d (disjoint) vs. o (overlapping)
– Completeness Constraint: partial (double line to superclass)
vs. total (single line)
– Default is overlapping, partial
– Decision on which constraint to choose is based on situation
being modeled in mini-world
Constraints on Specialization and
Generalization (2)
• Disjointness Constraint:
– Specifies that the subclasses of the specialization must be
disjoint:
• an entity can be a member of at most one of the subclasses of the
specialization
– Specified by d in EER diagram (Figure 8.4)
– If not disjoint, specialization is overlapping:
• same entity may be a member of more than one subclass of the
specialization
– Specified by o in EER diagram (Figure 8.5)
Constraints on Specialization
and Generalization (3)
• Completeness Constraint:
– Total specifies that every entity in the superclass must be a
member of some (at least one) subclass
– Shown in EER diagrams by a double line connected to the
superclass (Figure 8.5)
– Partial allows an entity not to belong to any of the
subclasses
– Shown by a single line (Figure 8.4)
Constraints on Specialization
and Generalization (4)
• Hence, we have four types of specialization/
generalization:
– Disjoint, total
– Disjoint, partial
– Overlapping, total
– Overlapping, partial
• Note: Generalization usually is total because the
superclass is derived from the subclasses.
More Constraints on Specialization
and Generalization (1)
• If a boolean condition (predicate) can determine
exactly those entities that will become members of a
subclass,it is called predicate-defined (or conditiondefined) subclass:
– Condition is like a constraint that determines subclass
members
– Can display the predicate condition next to the line
attaching the subclass
More Constraints on Specialization
and Generalization (2)
• If one attribute defines conditions for all subclasses in a
specialization, it is called attribute-defined specialization:
– Attribute is called the defining attribute of the specialization
– Example: JobType is the defining attribute of the specialization
{SECRETARY, TECHNICIAN, ENGINEER} of EMPLOYEE (see Figure
8.4)
• If no condition determines membership, the subclass is called
user-defined specialization:
– Membership in a subclass is determined by the database users by
explicitly adding an entity to a subclass
Specialization/Generalization Hierarchies,
Lattices and Shared Subclasses (1)
• A subclass may itself have further subclasses specified
on it:
– forms a hierarchy or a lattice
• Hierarchy has a constraint that every subclass has only
one superclass (called single inheritance); this is
basically a tree structure
• In a lattice, a subclass can be subclass of more than
one superclass (called multiple inheritance) (see
Figure 8.6, next slide)
Specialization/Generalization Hierarchies,
Lattices and Shared Subclasses (2)
• In a lattice or hierarchy, a subclass inherits attributes not only
of its direct superclass, but also of all its ancestor superclasses:
–
all the way to the root class
• Can have:
–
–
–
specialization hierarchies or lattices, or
generalization hierarchies or lattices,
depending on how they were derived
• In general, can just use the term specialization (to stand for the
end result of either specialization or generalization)
Example of Specialization
• Figure 8.7 (next slide) shows an example specialization
of different types of PERSONs in a UNIVERSITY
database
– STUDENT_ASSISTANT is the only shared subclass
– Note: A shared subclass inherits attributes only once from a
common ancestor; in Figure 8.7, STUDENT_ASSISTANT
inherits PERSON attributes only once
Specialization/Generalization Hierarchies,
Lattices and Shared Subclasses (3)
• In specialization, start with an entity type and then
define subclasses of the entity type by successive
specialization
– called a top down conceptual refinement process
• In generalization, start with many entity types and
generalize those that have common properties
(attributes and relationships)
– Called a bottom up conceptual synthesis process
• In practice, a combination of both processes is usually
employed
• Modeling of UNION Types Using Categories
Categories (UNION TYPES-1)
• All of the superclass/subclass relationships we have seen thus
far have a single superclass
• A shared subclass is a subclass in:
– more than one distinct superclass/subclass relationships
– each relationships has a single superclass
– shared subclass leads to multiple inheritance
• In some cases, we need to model a single superclass/subclass
relationship with more than one superclass
• The superclasses can represent different entity types
• Such a subclass is called a category or UNION TYPE
Categories (UNION TYPES-2)
• Example: In a database for vehicle registration, a vehicle owner
can be a PERSON, a BANK (holding a loan on a vehicle) or a
COMPANY (see Figure 8.8, next slide)
– A category (UNION type) called OWNER is created to represent a
subset of the union of the three superclasses COMPANY, BANK,
and PERSON
– A category member must exist in at least one of its superclasses
• Difference from shared subclass, which is a:
– subset of the intersection of its superclasses
– shared subclass member must exist in all of its superclasses
• A Sample UNIVERSITY EER Schema, Design
Choices, and Formal Definitions
Example of EER Schema DiagramsUNIVERSITY database
Example of EER Schema DiagramsSMALL_AIRPORT database
• Example of Other Notation: Representing
Specialization and Generalization in UML Class
Diagrams
Alternative diagrammatic notations
• ER/EER diagrams are a specific notation for displaying
the concepts of the enhanced model diagrammatically
• DB design tools use many alternative notations for the
same or similar concepts
• One popular alternative notation uses UML class
diagrams
• see next slide (Figure 8.10) for UML class diagrams
notation
General Conceptual Modeling Concepts
• GENERAL DATA ABSTRACTIONS
– CLASSIFICATION and INSTANTIATION
– AGGREGATION and ASSOCIATION (relationships)
– GENERALIZATION and SPECIALIZATION
– IDENTIFICATION
• CONSTRAINTS
– CARDINALITY (Min and Max)
– COVERAGE (Total vs. Partial, and Exclusive (disjoint) vs.
Overlapping)
Summary, Formal Definitions of
EER Model (1)
• Specialization Z: Z = {S1, S2,…, Sn} is a set of subclasses
with same superclass G; hence, G/Si is a superclass
relationship for i = 1, …., n.
– G is called a generalization of the subclasses {S1, S2,…,
Sn}
– Z is total if we always have:
• S1 ∪ S2 ∪ … ∪ Sn = G;
• Otherwise, Z is partial.
– Z is disjoint if we always have:
• ( (Si ∩ S2) = empty-set ) for i ≠ j;
• Otherwise, Z is overlapping.
Summary, Formal Definitions of
EER Model (2)
• Subclass S of C is predicate defined if predicate (condition) p
on attributes of C is used to specify membership in S;
– that is, S = C[p], where C[p] is the set of entities in C that satisfy
condition p
• A subclass not defined by a predicate is called user-defined
• Attribute-defined specialization: if a predicate A = ci (where A is
an attribute of G and ci is a constant value from the domain of
A) is used to specify membership in each subclass Si in Z
– Note: If ci ≠ cj for i ≠ j, and A is single-valued, then the attributedefined specialization will be disjoint.
Summary, Formal Definitions of
EER Model (3)
• Category (UNION type) T
– A class that is a subset of the union of n defining
superclasses
D1, D2,…Dn, n>1:
• T ⊆ (D1 ∪ D2 ∪ … ∪ Dn)
– Can have a predicate pi on the attributes of Di to specify
entities of Di that are members of T.
– If a predicate is specified on every Di: T = (D1[p1] ∪ D2[p2]
∪…∪ Dn[pn])
• Relational Database Design Using ER-to-Relational
Mapping
Outline
• Schema Mapping (Logical Database Design) step of Database
Design
• ER-to-Relational Mapping Algorithm
– Step 1: Mapping of Regular Entity Types
– Step 2: Mapping of Weak Entity Types
– Step 3: Mapping of Binary 1:1 Relation Types
– Step 4: Mapping of Binary 1:N Relationship Types.
– Step 5: Mapping of Binary M:N Relationship Types.
– Step 6: Mapping of Multivalued attributes.
– Step 7: Mapping of N-ary Relationship Types.
• Mapping EER Model Constructs
– Step 8: Options for Mapping Specialization or Generalization.
– Step 9: Mapping of Union Types (Categories).
Data Model Mapping Phase of
Relational DB Design
• DB designers use ER/EER or other conceptual data model to
produce a conceptual schema design (independent from any
specific DBMS) during the Conceptual Database Design phase
• In Logical Database Design Phase (see Figure 7.1, next slide)
conceptual schema design is converted (Mapped) to the data
model of the DBMS
– Typically relational model, or object/object-relational models
– Data model mapping is usually automated or semi-automated in
many database design tools
• In this chapter, we study the various options for mapping ER/
EER model constructs to relational model constructs
– Object and object-relational mapping discussed in Chapter 11
Overview of ER-to-Relational
Mapping Algorithm
• We present the concepts of a general mapping algorithm
• Algorithm has 7 steps:
– Step 1: Mapping of regular (strong) entity types
– Step 2: Mapping of weak (dependent) entity types
– Steps 3, 4, 5: Mapping of binary relationship types of different
cardinality ratios (1:1, 1:N, M:N)
– Step 6: Mapping of multi-valued attributes
– Step 7: Mapping of n-ary relationship types, n > 2
• Example: We use the COMPANY ER schema diagram (Figure
9.1, next slide) to illustrate the mapping steps
• Additional steps (Steps 8, 9) for mapping EER model constructs
(specialization/generalization, UNION types) presented later
ER-to-Relational Mapping Algorithm (1)
• Step 1: Mapping of Regular Entity Types
– For each regular (strong) entity type E in the ER schema, create a
relation R that includes all the simple attributes (or simple
components of composite attributes) of E.
– Choose one of the key attributes of E as primary key for R.
– If the chosen key of E is composite, the set of simple attributes
that form it will together form the primary key of R.
• Example: We create the relations EMPLOYEE, DEPARTMENT, and
PROJECT in the relational schema corresponding to the regular
entity types in Figure 9.1
– SSN, DNUMBER, and PNUMBER are chosen as primary keys for the
relations EMPLOYEE, DEPARTMENT, and PROJECT (Figure 9.3(a),
next slide).
– Note: Additional attributes will be added to these tables in later
mapping steps
ER-to-Relational Mapping Algorithm (2)
• Step 2: Mapping of Weak Entity Types
– For each weak entity type W with owner entity type E, create a relation
R that includes all simple attributes (or simple components of
composite attributes) of W as attributes of R.
– Include as foreign key attribute(s) in R the primary key attribute(s) of
the relation(s) that corresponds to the owner entity type(s).
– The primary key of R is the combination of the primary key(s) of the
owner(s) and the partial key of the weak entity type W, if any.
• Example: Create the relation DEPENDENT in this step to
correspond to the weak entity type DEPENDENT.
– see Figure 9.3(b)
– Include the primary key SSN of the EMPLOYEE relation as a foreign key
attribute of DEPENDENT (renamed to ESSN in Fig.).
– The primary key of DEPENDENT is the combination {ESSN,
DEPENDENT_NAME} because DEPENDENT_NAME is the partial key of
DEPENDENT.
ER-to-Relational Mapping Algorithm (3)
• Step 3: Mapping of Binary 1:1 Relationship Types
■
For each binary 1:1 relationship type R in the ER schema, identify the
relations S and T that correspond to the entity types participating in R.
• Three possible approaches:
– Foreign Key approach: Choose one of the relations (say S) and include as
foreign key in S the primary key of T (it is better to choose an entity type
with total participation in R in the role of S).
■
■
■
Example (see Figure 9.2): 1:1 relationship MANAGES (Fig. 9.1) is mapped by
choosing DEPARTMENT to serve in the role of S (because its participation in
the MANAGES relationship type is total)
Mgr_SSN of DEPARTMENT is foreign key referencing EMPLOYEE
Attributes of MANAGES become attributes of DEPARTMENT
– Merged relation option: Merge the two entity types and the
relationship into a single relation (possible when both participations are
total).
– Cross-reference or relationship relation option: Set up a third relation R
for cross-referencing the primary keys of the two relations S and T
representing the entity types.
ER-to-Relational Mapping Algorithm (4)
• Step 4: Mapping of Binary 1:N Relationship Types
– For each regular binary 1:N relationship type R, identify the
relation S that represent the participating entity type at the Nside of the relationship type.
– Include as foreign key in S the primary key of the relation T that
represents the other entity type participating in R.
– Include any simple attributes of the 1:N relation type as attributes
of S.
• Examples (Figures 9.1, 9.2): 1:N relationship types are
WORKS_FOR, CONTROLS, and SUPERVISION.
– For WORKS_FOR we include the primary key DNUMBER of the
DEPARTMENT relation as foreign key in the EMPLOYEE relation
and call it DNO
– (cont. on next slide)
ER-to-Relational Mapping Algorithm (5)
• Examples (cont.):
– For CONTROLS, we include the primary key DNUMBER of
DEPARTMENT as foreign key in PROJECT and call it DNUM.
– For SUPERVISION, we include the primary key SSN of EMPLOYEE
as foreign key in EMPLOYEE itself and call it SuperSSN (this is a
recursive relationship)
• All three 1:N relationship examples (Figures 9.1, WORKS_FOR,
CONTROLS, and SUPERVISION) are mapped using the foreign
key option in Figure 9.2
– Can also use the cross-reference option (create a separate
relation that has the primary keys of both relations as foreign
keys).
ER-to-Relational Mapping Algorithm (6)
• Step 5: Mapping of Binary M:N Relationship Types
– For each regular binary M:N relationship type R, create a new relation S
to represent R.
– Include as foreign key attributes in S the primary keys of the relations that
represent the participating entity types; their combination will form the
primary key of S.
– Also include any simple attributes of the M:N relationship type (or simple
components of composite attributes) as attributes of S.
• Example: The M:N relationship type WORKS_ON (Figure 9.1) is
mapped by creating a relation WORKS_ON in the relational
database schema (Figure 9.3(c), Figure 9.2).
– The primary keys of PROJECT and EMPLOYEE are foreign keys in
WORKS_ON and renamed PNO and ESSN, respectively.
– Attribute HOURS in WORKS_ON represents the HOURS attribute of the
relation type.
– The primary key of WORKS_ON is the combination {ESSN, PNO}.
ER-to-Relational Mapping Algorithm (7)
• Discussion of Mapping of Binary Relationship Types (steps 3, 4, and 5):
– Foreign key option is preferred for 1:1 and 1:N relationships, but cannot be
used for M:N relationships.
– Relationship relation option can be used for any cardinality ratio, but the
primary key will be different:
• Combination of both foreign keys for M:N
• Either foreign key for 1:1
• Foreign key in the N-side relation for 1:N
– Attributes of relationship type are included in the relationship relation (for
cross-referencing option), or in the relation that includes the foreign key
(for foreign key option).
ER-to-Relational Mapping Algorithm (8)
• Step 6: Mapping of Multivalued attributes.
– For each multivalued attribute A, create a new relation R.
– This relation R will include an attribute corresponding to A, plus the
primary key attribute K (as a foreign key in R) of the relation that
represents the entity type that has A as an attribute.
– The primary key of R is the combination of A and K. If the multivalued
attribute is composite, we include its simple components.
• Example (Figure 9.3(d)): The relation DEPT_LOCATIONS is created.
– The attribute DLOCATION represents the multivalued attribute Locations of
DEPARTMENT (Figure 9.1), while DNUMBER is foreign key to the
DEPARTMENT relation (Figure 9.2).
– The primary key of DEPT_LOCATIONS is the combination of {DNUMBER,
DLOCATION}.
ER-to-Relational Mapping Algorithm (9)
• Step 7: Mapping of N-ary Relationship Types.
– For each n-ary relationship type R, where n>2, create a new
relationship relation S to represent R.
– Include as foreign key attributes in S the primary keys of the
relations that represent the participating entity types.
– Also include any simple attributes of the n-ary relationship type
(or simple components of composite attributes) as attributes of S.
• Example: The relationship type SUPPLY (Figure 7.17(a), next
slide)
– This can be mapped to the relation SUPPLY (Figure 9.4, following slide),
whose primary key is the combination of the three foreign keys {SNAME,
PARTNO, PROJNAME}
Main Reference
1.
2.
3.
4.
5.
Chapter 4: Constraints and Characteristics of Specialization and Generalization
Hierarchies
Chapter 4: Modeling of UNION Types Using Categories
Chapter 4: A Sample UNIVERSITY EER Schema, Design Choices, and Formal
Definitions
Chapter 4: Example of Other Notation: Representing Specialization and
Generalization in UML Class Diagrams
Chapter 9: Relational Database Design Using ER-to-Relational Mapping
(Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by
Ramez Elmasri & Shamkant Navathe)
Additional References
https://courses.cs.vt.edu/cs4604/Spring21/pdfs/3-ermodel2.pdf
This Presentation is mainly dependent on the textbook: Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by Ramez Elmasri & Shamkant Navathe
Thank You
الجامعة السعودية االلكترونية
الجامعة السعودية االلكترونية
26/12/2021
College of Computing and Informatics
Data Science Pre-Master Program
IT244
Introduction to Database
IT244
Introduction to Database
Module 5: Basic SQL-1
Contents
1. Mapping EER Model Constructs to Relations
2. SQL Data Definition and Data Types
3. Specifying Constraints in SQL
Weekly Learning Outcomes
1. Explain ER-to-Relational Mapping.
2. Explain data definition and constraints in SQL.
Required Reading
1. Chapter 9: Mapping EER Model Constructs to Relations
2. Chapter 6: SQL Data Definition and Data Types
3. Chapter 6: Specifying Constraints in SQL
(Fundamentals of Database Systems, Global Edition, 7th
Edition (2017) by Ramez Elmasri & Shamkant Navathe)
Recommended Reading
SQL-I: https://courses.cs.vt.edu/cs4604/Spring21/pdfs/3-sql1.pdf
Learn to create database objects and manipulate data:
https://github.com/VTCourses/CS4604_Labs/tree/master/1.ddl_dml
This Presentation is mainly dependent on the textbook: Fundamentals of Database Systems, Global Edition, 7th Edition (2017) by Ramez Elmasri & Shamkant Navathe
• Mapping EER Model Constructs to Relations
Mapping EER Model Constructs to
Relations (1)
• We add two steps 8 and 9 to map EER model
constructs
– Step 8 is for mapping specialization/generalization and
subclasses
– Several options exist in step 8
– Step 9 is for mapping categories (UNION types)
– Step 9 can involve creating surrogate (artificial) key
attributes for the relation representing the UNION type
Mapping EER Model Constructs to
Relations (2)
• Step8: Options for Mapping Specialization (or
Generalization)
– Convert each specialization with m subclasses {S1, S2,….,Sm}
and generalized superclass C, where the attributes of C are
{k,a1,…an} and k is the (primary) key, into relational schemas
using one of the four following options:
• Option 8A: Multiple relations-Superclass and subclasses
• Option 8B: Multiple relations-Subclass relations only
• Option 8C: Single relation with one type attribute
• Option 8D: Single relation with multiple type (or mapping)
attributes
Mapping EER Model Constructs to
Relations (3)
• Option 8A: Multiple relations-Superclass and subclasses (see
Figure 9.5(a), next slide)
– Create a relation L for superclass C with attributes Attrs(L) = {k,a1,…an}
and PK(L) = k. Create a relation Li for each subclass Si, 1 < i < m, with
the attributes Attrs(Li) = {k} U {attributes of Si} and PK(Li)=k. This
option works for any specialization (total or partial, disjoint of overlapping).
• Option 8B: Multiple relations-Subclass relations only (see Figure
9.5(b), next slide)
– Create a relation Li for each subclass Si, 1 < i < m, with the attributes
Attr(Li) = {attributes of Si} U {k,a1…,an} and PK(Li) = k. This option only
works for a specialization whose subclasses are total (every entity in
the superclass must belong to (at least) one of the subclasses)
– Works best if subclasses are also disjoint
Mapping EER Model Constructs to
Relations (4)
• Option 8C: Single relation with one type attribute
– Create a single relation L with attributes Attrs(L) = {k,a1,…an} U
{attributes of S1} U…U {attributes of Sm} U {t} and PK(L) = k. The
attribute t is called a type (or discriminating) attribute that indicates
the subclass to which each tuple belongs
– Works for disjoint subclasses (see Figure 9.5(c))
• Option 8D: Single relation with multiple type attributes
– Create a single relation schema L with attributes Attrs(L) = {k,a1,…an}
U {attributes of S1} U…U {attributes of Sm} U {t1, t2,…,tm} and PK(L) =
k. Each ti, 1 < I < m, is a Boolean type attribute indicating whether or
not a tuple belongs to the subclass Si.
– Works for overlapping subclasses (see Figure 9.5(d))
Mapping EER Model Constructs to
Relations (5)
• Mapping of Shared Subclasses (Multiple Inheritance)
– A shared subclass, such as STUDENT_ASSISTANT, is a subclass of
several classes, indicating multiple inheritance.
– These classes must all have the same key attribute; otherwise,
the shared subclass would be a category.
– We can apply any of the options discussed in Step 8 to a shared
subclass, subject to the restriction discussed in Step 8 of the
mapping algorithm.
– In Figure 9.6 (next slide), option 8D is used for the shared
subclass STUDENT_ASSISTANT (from Figure 8.7).
Mapping EER Model Constructs to
Relations (6)
• Step 9: Mapping of Union Types (Categories).
– For mapping a category whose defining superclasses have
different keys, it is customary to specify a new (artificial)
key attribute, called a surrogate key, when creating a
relation to correspond to the category.
– In Figure 9.7 (next slide), the relation OWNER corresponds to the
OWNER category (from Figure 8.8). The primary key of the
OWNER relation is the surrogate key, which we called OwnerId.
– Unique values of OwnerId can be created by the system (similar
to ObjectId in Object databases)
– Useful to add an attribute OwnerType to OWNER relation to
indicate if a record represents a BANK, COMPANY, or PERSON
(not shown in Figure 9.7)
• SQL Data Definition and Data Types
Basic SQL
• Overview of SQL
• Data Definition Language (DDL) for Creating a Relational Database Schema
– CREATE statements
– Specifying Constraints
• Basic Retrieval Queries in SQL
– SELECT … FROM … WHERE … statements
• Basic Database Modification in SQL
– INSERT, DELETE, UPDATE statements
Overview of SQL
• SQL is a standard, comprehensive language, based on
the relational model
• SQL includes capabilities for many functions:
– DDL statements for creating schemas and specifying data
types and constraints
– Statements for specifying database retrievals
– Statements for modifying the database
– Statements for specifying views, triggers, and assertions
– Many other capabilities
• We introduce the first three capabilities in this chapter
SQL Features in This Chapter
• Basic SQL DDL
– Includes the CREATE statements
– Has a comprehensive set of SQL data types
– Can specify key, referential integrity, and other constraints
• Basic Retrieval Queries in SQL
– SELECT … FROM … WHERE … statements
• Basic Database Modification in SQL
– INSERT, DELETE, UPDATE statements
SQL: Origins and History
• Evolution of the SQL Standard
– First standard approved in 1989 (called SQL-89 or SQL1)
– Comprehensive revision in 1992 (SQL-92 or SQL2)
– Third big revision in 1999 (SQL-99, a trimmed-down version of a
more ambitious revision known as SQL3)
– Other revisions known as SQL:2003, SQL:2006, SQL:2008
• Origins of SQL
– Originally called SEQUEL (Structured English Query Language),
then SQL (Structured Query Language)
– Developed at IBM Research for experimental relational DBMS
called System-R in the 1970s
SQL Data Definition
• CREATE statement can be used to:
– Create a named database schema
– Create individual database tables and specify data
types for the table attributes, as well as key, referential
integrity, and NOT NULL constraints
– Create named constraints
• Other commands can modify the tables and
constraints
– DROP and ALTER statements
CREATE TABLE (1)
• In its simplest form, specifies a new base relation by giving it a
name, and specifying each of its attributes and their data types
(INTEGER, FLOAT, DECIMAL(i, j), CHAR(n), VARCHAR(n), DATE,
and other data types)
• A constraint NOT NULL may be specified on an attribute
CREATE TABLE DEPARTMENT (
DNAME
VARCHAR(15) NOT NULL,
DNUMBER
INT
NOT NULL,
MGRSSN
CHAR(9)
NOT NULL,
MGRSTARTDATE DATE );
CREATE TABLE (2)
• CREATE TABLE can also specify the primary key, UNIQUE keys, and
referential integrity constraints (foreign keys)
• Via the PRIMARY KEY, UNIQUE, and FOREIGN KEY phrases
CREATE TABLE DEPARTMENT (
DNAME
ARCHAR(15)NOT NULL,
DNUMBER INT NOT NULL,
MGRSSN CHAR(9) NOT NULL,
MGRSTARTDATE DATE,
PRIMARY KEY (DNUMBER),
UNIQUE (DNAME),
FOREIGN KEY(MGRSSN) REFERENCES EMPLOYEE);
Example: The COMPANY Database
• Figure 3.7 (next slide) shows the COMPANY database
schema diagram
• Referential integrity constraints shown as directed edges
from foreign key to referenced relation
• Primary key attributes of each table underlined
Example: The COMPANY DDL
• Figure 4.1 (following two slides) shows example DDL for
creating the tables of the COMPANY database
• Circular reference problem:
– In Figure 3.7, some foreign keys cause circular references:
• EMPLOYEE.Dno → DEPARTMENT.Dnumber
• DEPARTMENT.Mgr_ssn → EMPLOYEE.Ssn
– One of the tables must be created first without the FOREIGN KEY
in the CREATE TABLE statement
• The missing FOREIGN KEY can be added later using ALTER TABLE
Continued next page…
The CREATE SCHEMA Statement
• A DBMS can manage multiple databases
• DBA (or authorized users) can use CREATE SCHEMA to have
distinct databases; for example:
CREATE SCHEMA COMPANY AUTHORIZATION 'Smith';
• Each database has a schema name (e.g. COMPANY)
• User 'Smith' will be “owner” of schema, can create tables,
other constructs in that schema
• Table names can be prefixed by schema name if multiple
schemas exist (e.g. COMPANY.EMPLOYEE)
• Prefix not needed if there is a “default schema” for the user
Basic SQL Data Types (1)
• Basic numeric data types:
– Integers: INTEGER (or INT), SMALLINT
– Real (floating point): FLOAT (or REAL), DOUBLE PRECISION
– Formatted: DECIMAL(i, j) (or DEC (i, j) or NUMERIC(i, j))
specify i total decimal digits, j after decimal point
• i called precision, j called scale
• Basic character string data types:
– Fixed-length: CHAR(n) or CHARACTER(n)
– Variable-length: VARCHAR(n) or CHAR VARYING(n) or
CHARACTER VARYING(n)
SQL Data Types (2)
• Basic boolean data types:
– BIT(n), BIT VARYING (n)
• Large object data types:
– Binary large objects: BLOB(n)
– Can be used to store attributes that represent images, audio,
video, or other large binary objects
– Character large objects: CLOB(n)
– Can be used attributes that store articles, news stories, text files,
and other large character objects
SQL Data Types (3)
• DATE data type:
– Standard DATE formatted as yyyy-mm-dd
– For example, DATE '2010-01-22'
– Older formats still used in some systems, such as 'JAN-22-2010'
– Values are ordered, with later dates larger than earlier ones
• TIME data type:
– Standard TIME formatted as hh:mm:ss
– E.g., TIME '15:20:22' represents 3.20pm and 22 seconds
– TIME(i) includes and additional i-1 decimal digits for fractions of
a second
– E.g., TIME(5) value could be '15:20:22.1234'
SQL Data Types (4)
• TIMESTAMP data type:
– A DATE combined with a TIME(i), where the default i=7
– For example, TIMESTAMP '2010-01-22 15:20:22.123456'
– A different i>7 can be specified if needed
• INTERVAL represents a relative YEAR/MONTH or
DAY/TIME (e.g. 2 years and 5 months, or 5 days 20
hours 3 minutes 22 seconds)
• Other SQL data types exist; we presented the most
common ones
• Specifying Constraints in SQL
Specifying Referential Integrity Options
and Naming of Constraints
• We can specify RESTRICT (the default), CASCADE, SET NULL or
SET DEFAULT on referential integrity constraints (foreign keys)
• Separate options for ON DELETE, ON UPDATE
• Figure 4.2 (next slide) gives some examples
• A constraint can be given a constraint name; this allows to DROP
the constraint later
• Figure 4.2 illustrates naming of constraints
Specifying DOMAIN Constraints Using
the CHECK Clause
• Another type of constraint…