What you need to know before you begin

Start 13 June 2026 13:44

End 13 June 2026


Introducing Terminal-Bench - Evaluating LLM Agents in Realistic Terminal Settings

Discover Terminal-Bench, a challenging benchmark for evaluating LLM agents in real-world terminal environments, addressing gaps in current agent evaluation methods.
Anyscale via YouTube

31 minutes

Optional upgrade available

Not Specified

Progress at your own pace

Free Video

Subjects

Artificial Intelligence