AI Evaluation & Reliability

Measuring, testing, and trusting AI systems.

AI evaluation is the practice of measuring whether an AI system actually works, using evals, benchmarks, and reliability testing to catch hallucinations and regressions before they reach production.

26 episodes

Guests on this topic

Loïc Houssier, Alex Ratner, Sudhir Hasbe, Dan Klein, Vikram Chatterji, Maxime Labonne, Aishwarya Srinivasan, Malte Ubl, Hamel Husain, Andreas Cleve, Mikiko Chandrasekhar, Philipp Krenn, Giovanna Carofiglio, João Moura, Atindriyo Sanyal, Denny Lee, Siva Surendira, Olga Beregovaya, Maryam Ashoori, Manoj Agarwal, Yash Sheth, Andrew Zigler, Mehmet Murat Ezbiderli, Vinnie Giarrusso, Grant Ledford, Chip Huyen, Vivienne Zhang, Brian Raymond, Bob van Luijt