What you should know before you begin

Starts 4 June 2026 03:12

Ends 4 June 2026

Sample-based Learning Methods

University of Alberta via Coursera

University of Alberta

6 courses


The University of Alberta is a research university in Edmonton, Canada. It is known for its teaching, research, and innovation, and for its commitment to community involvement.

Not specified

Optional upgrade available

All levels

Learn at your own pace

Free

Optional upgrade available

Overview

Sample-based Learning Methods is a course offered by the University of Alberta on Coursera. It covers algorithms that learn near-optimal policies through trial-and-error interaction with their environment, that is, directly from the agent's own experience.

The course introduces Monte Carlo methods, which are intuitively simple yet powerful, and temporal difference learning methods, including Q-learning.

It also shows how model-based planning can be combined with temporal difference updates to speed up learning. By the end of the course, participants will be able to:

  • Understand how Temporal-Difference learning and Monte Carlo methods estimate value functions from sampled experience.
  • Recognize why exploration matters when learning from sampled experience rather than from dynamic programming sweeps.
  • Draw connections between Monte Carlo, Dynamic Programming, and TD methods.
  • Implement and apply the TD algorithm to estimate value functions.
  • Apply Expected Sarsa and Q-learning techniques for control purposes.
  • Distinguish between on-policy and off-policy control mechanisms.
  • Explore planning strategies that use simulated experience.
  • Implement a model-based approach to Reinforcement Learning (RL) through Dyna, enhancing sample efficiency with simulated experiences.
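
As a rough illustration of two control methods named in the list above, the sketch below computes the one-step update targets for Q-learning and Expected Sarsa in a tabular setting. It is not taken from the course materials: the function names, the epsilon-greedy behaviour policy, and the example numbers are all assumptions chosen for illustration.

```python
from typing import List


def q_learning_target(reward: float, next_q: List[float], gamma: float) -> float:
    """Off-policy target: bootstrap from the greedy (maximum) next action value."""
    return reward + gamma * max(next_q)


def expected_sarsa_target(
    reward: float, next_q: List[float], gamma: float, epsilon: float
) -> float:
    """Target that averages next action values under an epsilon-greedy policy.

    The epsilon-greedy policy is an assumption of this sketch; any policy
    with known action probabilities could be used instead.
    """
    n = len(next_q)
    greedy = max(range(n), key=lambda a: next_q[a])
    # Every action gets epsilon / n; the greedy action gets the remainder.
    probs = [epsilon / n] * n
    probs[greedy] += 1.0 - epsilon
    expected = sum(p * q for p, q in zip(probs, next_q))
    return reward + gamma * expected


def td_update(q_value: float, target: float, alpha: float) -> float:
    """Move the current estimate a fraction alpha toward the target."""
    return q_value + alpha * (target - q_value)


# Hypothetical numbers: two actions available in the next state.
next_values = [1.0, 3.0]
q_target = q_learning_target(reward=1.0, next_q=next_values, gamma=0.5)
es_target = expected_sarsa_target(
    reward=1.0, next_q=next_values, gamma=0.5, epsilon=0.2
)
new_q = td_update(q_value=0.0, target=q_target, alpha=0.1)
```

With epsilon set to 0 the epsilon-greedy policy becomes greedy, so the Expected Sarsa target coincides with the Q-learning target. This is one way to see how the on-policy and off-policy methods in the list relate.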

This course is listed under Artificial Intelligence Courses, Reinforcement Learning Courses, and Q-learning Courses, so it suits anyone who wants to build skills in these areas.


Taught by

Martha White and Adam White


Subjects