cover of episode 706: Large Language Model Leaderboards and Benchmarks

706: Large Language Model Leaderboards and Benchmarks

2023/8/18
logo of podcast Super Data Science: ML & AI Podcast with Jon Krohn

Super Data Science: ML & AI Podcast with Jon Krohn

Frequently requested episodes will be transcribed first

Shownotes Transcript

In this episode, Caterina Constantinescu dives deep into Large Language Models (LLMs), spotlighting top leaderboards, evaluation benchmarks, and real-world user perceptions. Plus, discover the challenges of dataset contamination and the intricacies of platforms like HELM and Chatbot Arena.Additional materials: www.superdatascience.com/706)Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast) for sponsorship information.