706: Large Language Model Leaderboards and Benchmarks

2023/8/18

Super Data Science: ML & AI Podcast with Jon Krohn


In this episode, Caterina Constantinescu dives deep into Large Language Models (LLMs), spotlighting top leaderboards, evaluation benchmarks, and real-world user perceptions. Plus, discover the challenges of dataset contamination and the intricacies of platforms like HELM and Chatbot Arena.

Additional materials: www.superdatascience.com/706

Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.

Episode length: 33:27
