What You Need to Know Before You Start

Starts 4 June 2026 03:52

Ends 4 June 2026


Data Management for Analytics Part 1

Master database fundamentals, ER modeling, UML, relational design, and normalization to build robust data management systems supporting analytics and AI applications.
Northeastern University via Coursera

Northeastern University

26 Courses


Northeastern is a globally recognized research university with campuses in Boston and around the world. It provides an experiential learning system that encourages students to learn from real-world experience.

9 hours 6 minutes

Optional upgrade available

Intermediate

Progress at your own speed

Free Online Course (Audit)

Overview

This course offers you an opportunity to learn the fundamental concepts and emerging technologies in database design, database modeling, and database systems. It balances theory and practice and covers the entity relationship (ER) model, the UML model, the relational model, and relational databases.

By the end of this Part 1 course on data management for analytics, you will have a foundational understanding of the theory and applications of database management to support data analytics, data mining, machine learning, and artificial intelligence.

Syllabus

  • Fundamental Concepts of Database Management
  • In this module, we will introduce the fundamental concepts of database management, review applications of database technology, and define key concepts. We will also contrast the file-based approach to data management with the database approach. Finally, we will examine the elements of a database system and the advantages of database design.
  • Architecture and Categorization of Database Management Systems (DBMSs)
  • In this module, we take a quick look at what is under the hood of a database management system. We will examine the key components of DBMS architecture and how these components work together for data storage, processing, and management. We also check how DBMSs can be categorized based on data models, degree of simultaneous access, architecture, and usage.
  • Conceptual Data Modeling, Part 1
  • In this module, we first review the database design process from conceptual and logical to physical database design and elaborate on the data requirements of a business process. We then introduce the Entity Relationship (ER) model for conceptual data modeling. The fundamental building blocks of the ER model include entity types, attribute types, and relationship types. We discuss attribute type details such as domains, key attribute types, simple versus composite attribute types, single-valued versus multi-valued attribute types, and derived attribute types. For relationship types, we also examine the degree and roles, cardinalities, weak entity types, and ternary relationship types. Various examples are included for clarification.
  • Conceptual Data Modeling, Part 2
  • In this module, we will learn three additional semantic data modeling concepts: specialization/generalization, categorization, and aggregation. These concepts enhance and extend the ER model discussed in the previous module. We will also introduce an alternative conceptual model: the Unified Modeling Language (UML) class diagram. The UML is a modeling language that assists in the specification, visualization, construction, and documentation of artifacts of a software system. The UML also offers use case diagrams, sequence diagrams, package diagrams, deployment diagrams, etc. Here we use the UML for conceptual data modeling.
  • Organizational Aspects of Data Management
  • In this module, we focus on some organizational aspects of data management, including the DBMS catalog, the roles of metadata, and metadata modeling. We also discuss data quality, data governance, and different roles in data management. By the end of this module, you will understand what data management entails: the proper management of data and the corresponding data definitions, or metadata. The objective of data management is to ensure that data and metadata are of good quality and thus a key resource for data analytics tasks and for effective and efficient managerial decision-making.
  • Relational Model
  • As discussed in the previous modules, designing a database takes multiple steps. Once the conceptual data model is finalized, the database designer maps it to a logical data model during the logical design step. Note that, unlike the conceptual data model, the logical data model is tied to the data model used by the implementation DBMS environment. In other words, a logical data model is intended for a specific type of DBMS. Since relational DBMSs such as Oracle, MySQL (open-source), and Microsoft SQL Server usually dominate the top ten DBMSs in use, we will focus on the relational model, which serves as the logical data model for relational DBMSs.
  • Normalization of Relational Model and Mapping of the EER Model to Relational Model
  • This module first presents an overview of the insertion, deletion, and update anomalies in an unnormalized relational model and discusses informal normalization guidelines. Two key concepts used in the normal forms are defined and examined: functional dependency and prime attribute type, along with various special cases of functional dependency, including full versus partial, transitive, trivial, and multivalued dependencies. The process and the formal procedures for the normalization of a relational model are discussed in detail via the first normal form (1NF), the second normal form (2NF), the third normal form (3NF), the Boyce-Codd normal form (BCNF), and the fourth normal form (4NF).
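
To give a feel for the normalization module, here is a minimal Python sketch of a functional dependency check and a decomposition that removes a partial dependency. It is not taken from the course materials: the `enrollment` relation, its attribute names, and the helper functions `fd_holds` and `project` are all hypothetical and chosen only for illustration. Note also that a dependency holding in one sample of rows does not prove it holds for the schema in general.

```python
from typing import Iterable

Row = dict[str, str]


def fd_holds(rows: Iterable[Row], lhs: tuple[str, ...], rhs: tuple[str, ...]) -> bool:
    """Return True if the functional dependency lhs -> rhs holds in these rows."""
    seen: dict[tuple[str, ...], tuple[str, ...]] = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same lhs values must always map to the same rhs values.
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows: Iterable[Row], attrs: tuple[str, ...]) -> list[Row]:
    """Project rows onto attrs, dropping duplicate tuples and keeping order."""
    seen: set[tuple[str, ...]] = set()
    result: list[Row] = []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attrs, values)))
    return result


# Hypothetical relation whose key is assumed to be (student_id, course_id).
# student_name repeats for every course a student takes, which invites
# update anomalies.
enrollment = [
    {"student_id": "S1", "student_name": "Ana", "course_id": "C1", "grade": "A"},
    {"student_id": "S1", "student_name": "Ana", "course_id": "C2", "grade": "B"},
    {"student_id": "S2", "student_name": "Ben", "course_id": "C1", "grade": "A"},
]

# student_name depends on only part of the key: a partial dependency,
# which is what the second normal form rules out.
partial = fd_holds(enrollment, ("student_id",), ("student_name",))

# grade needs the whole key; student_id alone does not determine it.
grade_by_student = fd_holds(enrollment, ("student_id",), ("grade",))

# Decompose so that each student name is stored exactly once.
students = project(enrollment, ("student_id", "student_name"))
grades = project(enrollment, ("student_id", "course_id", "grade"))
```

After the decomposition, renaming a student touches a single row in `students` instead of one row per enrollment, which is the update anomaly the module description refers to.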

Taught by

Xuemin Jin


Subjects

Computer Science