What you need to know before you start

Start 4 June 2026 03:18

End 4 June 2026


Sample-based Learning Methods

Explore Sample-based Learning Methods in a full course offered by the University of Alberta on Coursera. The course covers algorithms that learn near-optimal policies through trial-and-error interaction with their environment, learning directly from the agent's own experience.
University of Alberta via Coursera

University of Alberta

6 Courses


The University of Alberta is a leading research university located in Edmonton, Canada. It is known for its excellence in teaching, research, and innovation, and for its commitment to community engagement.

Not specified

Optional upgrade available

All levels

Self-paced

Free

Optional upgrade available

Summary

This course, offered by the University of Alberta on Coursera, covers algorithms that can learn near-optimal policies through trial-and-error interaction with the environment, learning directly from an agent's own experience.

It introduces Monte Carlo methods, which are intuitively simple yet powerful, and temporal difference (TD) learning methods, including Q-learning.

It also shows how to combine model-based planning with temporal difference updates to speed up learning. By the end of the course, participants will be able to:

  • Understand Temporal-Difference learning and Monte Carlo as two strategies for estimating value functions from sampled experience.
  • Understand why exploration matters when learning from sampled experience rather than from dynamic programming sweeps over a model.
  • Draw connections between Monte Carlo, Dynamic Programming, and TD methods.
  • Implement and apply the TD algorithm to estimate value functions.
  • Implement and apply Expected Sarsa and Q-learning, two TD methods for control.
  • Distinguish between on-policy and off-policy control.
  • Understand planning with simulated experience, as opposed to classic planning strategies.
  • Implement a model-based approach to Reinforcement Learning (RL), called Dyna, which uses simulated experience to improve sample efficiency.
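
For readers unfamiliar with the control methods named above, the following is a minimal sketch, not course material, of how the Q-learning and Expected Sarsa update targets differ in the tabular case. All function names, the dictionary-of-lists Q-table layout, and the epsilon-greedy policy are assumptions made for illustration.

```python
def epsilon_greedy_probs(q_values, epsilon):
    """Action probabilities of an epsilon-greedy policy for one state."""
    n = len(q_values)
    best = max(range(n), key=lambda a: q_values[a])  # first maximal action
    probs = [epsilon / n] * n
    probs[best] += 1.0 - epsilon
    return probs


def q_learning_target(reward, next_q, gamma):
    # Bootstrap from the largest action value in the next state.
    return reward + gamma * max(next_q)


def expected_sarsa_target(reward, next_q, gamma, epsilon):
    # Bootstrap from the expected next action value under the
    # (assumed) epsilon-greedy policy.
    probs = epsilon_greedy_probs(next_q, epsilon)
    return reward + gamma * sum(p * q for p, q in zip(probs, next_q))


def td_update(q, state, action, target, alpha):
    # Move Q(state, action) a fraction alpha toward the target.
    q[state][action] += alpha * (target - q[state][action])


# Hypothetical two-state, two-action table used only for this example.
q = {0: [0.0, 0.0], 1: [1.0, 3.0]}
ql = q_learning_target(reward=1.0, next_q=q[1], gamma=0.9)
es = expected_sarsa_target(reward=1.0, next_q=q[1], gamma=0.9, epsilon=0.2)
td_update(q, state=0, action=0, target=ql, alpha=0.5)
```

Both methods share the same update rule and differ only in the target: Q-learning uses the maximum over next-state action values, while Expected Sarsa averages them under the policy's action probabilities.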

This course is categorized under Artificial Intelligence Courses, Reinforcement Learning Courses, and Q-learning Courses.


Taught by

Martha White and Adam White


Subjects