What You Need to Know Before You Start

Starts 6 July 2025 14:28

Ends 6 July 2025

Test AI & LLM App with DeepEval, RAGAs & more using Ollama

Roadmap to becoming an AI QA Engineer who tests LLMs and AI applications using DeepEval, RAGAs and HF Evaluate with local LLMs
via Udemy

10 hours

Optional upgrade available

Not Specified

Progress at your own speed

Paid Course

Overview

What you'll learn:

  • Understand the purpose of testing LLMs and LLM-based applications
  • Understand DeepEval and RAGAs in detail, from the ground up
  • Understand the different metrics and evaluations used to assess LLMs and LLM-based apps with DeepEval and RAGAs
  • Understand the advanced concepts of DeepEval and RAGAs
  • Test RAG-based applications using DeepEval and RAGAs
  • Test AI agents using DeepEval to understand how tool calls can be tested

Testing AI & LLM App with DeepEval, RAGAs & more using Ollama and Local Large Language Models (LLMs)

Master the essential skills for testing and evaluating AI applications, particularly Large Language Models (LLMs). This hands-on course equips QA and AI QA engineers, developers, data scientists, and AI practitioners with techniques to assess AI performance, identify biases, and build robust applications.

Topics Covered:

Section 1: Foundations of AI Application Testing (Introduction to LLM testing, AI application types, evaluation metrics, LLM evaluation libraries).

Section 2: Local LLM Deployment with Ollama (Local LLM deployment, AI models, running LLMs locally, Ollama implementation, GUI/CLI, setting up Ollama as API).

Section 3: Environment Setup (Jupyter Notebook for tests, setting up Confident AI).

Section 4: DeepEval Basics (Traditional LLM testing, first DeepEval code for AnswerRelevance, Context Precision, evaluating in Confident AI, testing with local LLM, understanding LLMTestCases and Goldens).

Section 5: Advanced LLM Evaluation (LangChain for LLMs, evaluating Answer Relevancy, Context Precision, bias detection, custom criteria with GEval, advanced bias testing).

Section 6: RAG Testing with DeepEval (Introduction to RAG, understanding RAG apps, demo, creating GEval for RAG, testing for conciseness & completeness).

Section 7: Advanced RAG Testing with DeepEval (Creating multiple test data, Goldens in Confident AI, actual output and retrieval context, LLMTestCases from dataset, running evaluation for RAG).

Section 8: Testing AI Agents and Tool Callings (Understanding AI Agents, working with agents, testing agents with and without actual systems, testing with multiple datasets).

Section 9: Evaluating LLMs using RAGAS (Introduction to RAGAS, Context Recall, Noise Sensitivity, MultiTurnSample, general purpose metrics for summaries and harmfulness).

Section 10: Testing RAG applications with RAGAS (Introduction and setup, creating retrievers and vector stores, MultiTurnSample dataset for RAG, evaluating RAG with RAGAS).

Syllabus

  • Section 1: Foundations of AI Application Testing
    Introduction to LLM Testing
    Types of AI Applications
    Evaluation Metrics for AI Applications
    Overview of LLM Evaluation Libraries
  • Section 2: Local LLM Deployment with Ollama
    Local LLM Deployment Strategies
    Overview of AI Models
    Running LLMs Locally
    Implementing Ollama for LLM Deployment
    Using Ollama's GUI/CLI
    Setting Up Ollama as an API
  • Section 3: Environment Setup
    Testing Environment in Jupyter Notebook
    Setting Up Confident AI Platform
  • Section 4: DeepEval Basics
    Traditional LLM Testing Methods
    Developing First DeepEval Code
    Evaluating Answer Relevancy and Context Precision
    Using Confident AI for Evaluation
    Testing with Local LLMs
    Understanding LLMTestCases and Goldens
  • Section 5: Advanced LLM Evaluation
    Using LangChain with LLMs
    Evaluating Answer Relevancy and Context Precision
    Detecting and Evaluating Bias
    Custom Evaluation Criteria with GEval
    Advanced Bias Testing Techniques
  • Section 6: RAG Testing with DeepEval
    Introduction to RAG (Retrieval-Augmented Generation)
    Understanding RAG Applications
    Demonstration of RAG Testing
    Creating GEval Tests for RAG
    Evaluating Conciseness and Completeness
  • Section 7: Advanced RAG Testing with DeepEval
    Creating Multiple Test Datasets
    Using Goldens in Confident AI
    Analyzing Actual Outputs and Retrieval Contexts
    Generating LLMTestCases from Datasets
    Running RAG Evaluations
  • Section 8: Testing AI Agents and Tool Callings
    Introduction to AI Agents
    Working with AI Agents
    Testing AI Agents with and without Actual Systems
    Using Multiple Datasets for Agent Evaluation
  • Section 9: Evaluating LLMs using RAGAS
    Introduction to RAGAS (Retrieval-Augmented Generation Assessment)
    Context Recall and Noise Sensitivity Metrics; Working with MultiTurnSample
    Evaluating General Purpose Summaries and Harmfulness
  • Section 10: Testing RAG Applications with RAGAS
    Introduction and Setup for RAGAS Testing
    Creating Retrievers and Vector Stores
    Using MultiTurnSample Dataset for RAG Evaluation
    Comprehensive RAG Evaluation with RAGAS

Taught by

Karthik KK


Subjects

Computer Science