What you should know before you begin

Starts 4 June 2026 11:53

Ends 4 June 2026

Data Selection - Data Challenges when Training Generative Models

Scalable Parallel Computing Lab, SPCL @ ETH Zurich via YouTube

6076 courses

1 hour

Optional upgrade available

Not Specified

Learn at your own pace

Free Video

Overview

Syllabus

  • Introduction to Data Selection in Generative Model Training
    Importance of Data Selection
    Overview of Generative Models
  • Filtering Methods for Pre-training
    Data Quality Assessment
    Data Deduplication Techniques
    Noise Reduction Strategies
  • Strategic Data Selection Techniques
    Importance Sampling
    Submodular Optimization Approaches
    Active Learning for Data Curation
  • Optimal Transport Approaches for Fine-tuning
    Principles of Optimal Transport
    Applications in Model Fine-tuning
    Case Studies in Reduced Data Requirements
  • Balancing Data Efficiency and Model Performance
    Trade-offs in Data Selection
    Performance Metrics and Evaluation
  • Case Studies and Industry Applications
    Real-world Examples
    Success Stories and Lessons Learned
  • Tools and Frameworks for Data Selection
    Overview of Available Tools
    Practical Exercises and Tutorials
  • Future Trends and Research Directions
    Emerging Techniques in Data Selection
    Opportunities for Innovation
  • Conclusion and Recap
    Key Takeaways
    Final Thoughts on Data Selection for Generative Models
  • Practical Project
    Design a Data Selection Pipeline
    Implement Filtering and Fine-tuning Strategies

Subjects

Computer Science