What You Need to Know Before You Start

Starts 8 June 2025 02:08

Ends 8 June 2025


Data Munging to Wrangling - 7 Steps to Mastering Data Preparation for Data Science

PASS Data Community Summit via YouTube


1 hour 16 minutes

Optional upgrade available

Not Specified

Progress at your own speed

Conference Talk

Overview

Discover essential techniques for data preparation in AI/ML projects. Learn to source, clean, and transform data effectively, enhancing the quality and predictive power of machine learning models.

Syllabus

  • Introduction to Data Preparation
      • Importance of data preparation in AI/ML
      • Overview of the 7-step process
  • Step 1: Data Sourcing
      • Identifying data needs
      • Exploring various data sources
      • Data collection techniques
  • Step 2: Data Understanding
      • Exploring data structure and content
      • Statistical data exploration
      • Identifying data outliers and anomalies
  • Step 3: Data Cleaning
      • Handling missing data
      • Techniques for dealing with noise and errors
      • Data deduplication methods
  • Step 4: Data Transformation
      • Data normalization and standardization
      • Feature scaling and selection
      • Encoding categorical variables
  • Step 5: Data Enrichment
      • Data integration from multiple sources
      • Augmentation techniques
      • Use of external datasets for enrichment
  • Step 6: Data Reduction
      • Dimensionality reduction techniques
      • Feature extraction and selection
      • Data summarization
  • Step 7: Data Validation and Testing
      • Ensuring data quality and integrity
      • Data validation techniques
      • Creating and using validation datasets
  • Conclusion and Best Practices
      • Recap of key techniques and tools
      • Tips for efficient data preparation
      • Common pitfalls and how to avoid them
  • Practical Project
      • Apply the 7-step process on a real-world dataset
      • Present findings and insights
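
To give a feel for what Steps 3 and 4 of the syllabus involve, here is a minimal sketch in plain Python. It is not taken from the talk: the toy records, the field names (`id`, `age`, `city`), the helper function names, and the specific choices (median imputation, min-max scaling, one-hot encoding) are all assumptions made for illustration, and a real project would more likely rely on a data-frame library.

```python
from statistics import median

# Hypothetical toy records; field names and values are invented for illustration.
raw_rows = [
    {"id": 1, "age": 34, "city": "Oslo"},
    {"id": 2, "age": None, "city": "Lima"},
    {"id": 2, "age": None, "city": "Lima"},  # exact duplicate of the row above
    {"id": 3, "age": 52, "city": "Oslo"},
    {"id": 4, "age": 40, "city": "Pune"},
]


def deduplicate(rows):
    """Step 3 (deduplication): drop exact duplicate rows, keeping the first."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the row
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def impute_median(rows, field):
    """Step 3 (missing data): replace None in `field` with the median of known values."""
    known = [r[field] for r in rows if r[field] is not None]
    fill = median(known)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


def min_max_scale(rows, field):
    """Step 4 (normalization): rescale `field` linearly into the range [0, 1]."""
    values = [r[field] for r in rows]
    lo, hi = min(values), max(values)
    span = hi - lo
    # If every value is identical there is nothing to scale; use 0.0.
    return [{**r, field: (r[field] - lo) / span if span else 0.0} for r in rows]


def one_hot(rows, field):
    """Step 4 (encoding): replace a categorical field with 0/1 indicator columns."""
    categories = sorted({r[field] for r in rows})
    encoded = []
    for r in rows:
        new = {k: v for k, v in r.items() if k != field}
        for c in categories:
            new[f"{field}={c}"] = int(r[field] == c)
        encoded.append(new)
    return encoded


def prepare(rows):
    """Chain the cleaning and transformation helpers; the input is not mutated."""
    rows = deduplicate(rows)
    rows = impute_median(rows, "age")
    rows = min_max_scale(rows, "age")
    return one_hot(rows, "city")


prepared = prepare(raw_rows)
```

Each helper returns new dictionaries instead of modifying its input, so the raw data stays available for comparison during Step 7 (validation). The order matters in this sketch: duplicates are removed before imputation so that a repeated row does not skew the median.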

Subjects

Conference Talks