Hey everyone, this is Alex Volkov, the host of ThursdAI, welcome to yet another recap of yet another incredibly fast-paced week.
I want to start with a ThursdAI update: we now have a new website http://thursdai.news and a new dedicated twitter account @thursdai_pod as we build up the ThursdAI community and brand a bit more.
As always, a reminder that ThursdAI is a weekly X space, a newsletter, and 2(!) podcasts: the short form (Apple, Spotify) and the unedited long-form space recordings (RSS, Zealous page) for those who’d like the nitty-gritty details (and are on a long drive somewhere).
Open Source LLMs & Finetuning
Honestly, the speed with which LLaMa 2 finetunes are taking over state-of-the-art performance is staggering. We literally talk about a new model topping the LLM benchmark leaderboard every week, and it hasn’t even been a month since LLaMa 2’s release day 🤯 (July 18, for those who are counting)
Enter Platypus 70B (đź”—)
Platypus 70B-instruct is currently the highest-ranked open source LLM, and the other Platypus versions rank highly as well
We’ve had the great pleasure to chat with new friends of the pod Arielle Lee and Cole Hunter (and long-time friend of the pod Nataniel Ruiz, co-author of DreamBooth and StyleDrop, which we’ve covered before) about this incredible effort to finetune LLaMa 2, the open dataset they curated and released as part of this effort, and how quick and cheap it is to train the smaller 13B version of Platypus (just 5 hours on a single A100 GPU, roughly $6 of compute on Lambda 🤯)
We had a great interview with Garage bAInd, the authors of Platypus, and we’ll be posting that as a special Sunday episode of ThursdAI, so make sure you are subscribed to receive it when it drops.
Open Orca + Platypus = OrctyPus 13B? (đź”—)
We told you about OpenOrca just last week, from our friends at @alignment_lab, and not only is Platypus the best-performing 70B model, the open source community has also come through with an incredible collaboration to bring you the best 13B model: a merge of OpenOrca and Platypus.
This 13B model is now very close to the original LLaMa 70B in many of the metrics, LESS THAN A MONTH after the initial open source release. It’s quite a remarkable achievement and we salute the whole community for this immense effort 👏 Also, accelerate! 🔥
Join the SkunksWorks
Speaking of fast-moving things, in addition to the above interview, we had a great conversation with folks from the so-called SkunksWorks OS Discord, namely Far El, Prateek Yadav, Alpay Ariak, Teknium and Alignment Labs, and our recurring guest hosts Yam Peleg and Nisten, covering two very exciting community efforts, all happening within the SkunksWorks Discord.
The first effort is Open MoE, an open source attempt at replicating the Mixture of Experts architecture, which is widely credited as a big part of why GPT-4 is so much better than GPT-3 (see the toy sketch below for the core idea).
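To make the MoE idea concrete, here’s a toy routing sketch. This is not the SkunksWorks implementation, just the concept: a learned gate scores each expert for a given input, and only the top-scoring expert runs, so the model gets more total parameters without more compute per input. All names here are illustrative.

```typescript
// Toy top-1 mixture-of-experts routing. Real MoE layers do this per
// token inside a transformer block; this just shows the gating concept.
type Expert = (x: number[]) => number[];

function moeForward(x: number[], experts: Expert[], gate: number[][]): number[] {
  // One gate logit per expert: dot product of the input with that gate row.
  const scores = gate.map(row => row.reduce((sum, w, i) => sum + w * x[i], 0));
  const winner = scores.indexOf(Math.max(...scores));
  // Only the winning expert executes: more total parameters, same FLOPs.
  return experts[winner](x);
}

// e.g. two tiny "experts" that transform the input differently
const experts: Expert[] = [x => x.map(v => v * 2), x => x.map(v => -v)];
moeForward([1, 0], experts, [[1, 0], [0, 1]]); // gate picks expert 0 -> [2, 0]
```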
The second effort is a set of ablation studies, led by Teknium, to understand once and for all what the best, cheapest, and highest-quality way to finetune open source models is: QLoRA, LoRA, or a full finetune.
If you're interested in any of these, either by helping directly or by providing resources such as GPU compute, please join the SkunksWorks Discord. They will show you how to participate, even if you don't have prior finetuning knowledge! And we’ll keep you apprised of the results once they release any updates!
Big Co LLMs + API updates
In our Big CO corner, we start with an incredible paper from Meta AI, announcing:
Self-Alignment w/ Backtranslation method + Humpback LLM - Meta AI
Summarized briefly (definitely listen to the full episode and @yampeleg’s detailed overview of this method): it’s a way for an LLM to create high-quality training datasets for itself, in an unsupervised way, starting from only a small amount of “seed” data from a high-quality dataset. Think of it this way: fine-tuning a model requires a lot of “question → response” pairs in your dataset, and backtranslation proposes generating the dataset in the “response → question” direction, asking “what instruction would have made an LLM generate this response?”
This results in a model that effectively learns to learn better and create its own datasets without humans (well, at least human labelers) in the loop.
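Here’s a minimal sketch of that generation loop, assuming a hypothetical `complete()` helper wrapping whatever instruction-tuned LLM you have on hand; the actual Humpback recipe also has the model score and filter the generated pairs, which this sketch skips:

```typescript
// Backtranslation sketch: turn unlabeled responses into
// "question -> response" training pairs. `complete` is a hypothetical
// helper wrapping any instruction-tuned LLM.
type Complete = (prompt: string) => Promise<string>;

async function backtranslate(
  responses: string[],
  complete: Complete,
): Promise<{ question: string; response: string }[]> {
  const pairs: { question: string; response: string }[] = [];
  for (const response of responses) {
    // The "response -> question" step: ask the model what instruction
    // would most likely have produced this text.
    const question = await complete(
      'Write the instruction or question that would most likely lead ' +
        `an assistant to produce the following response:\n\n${response}`,
    );
    pairs.push({ question, response });
  }
  return pairs;
}
```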
Here is some more reading material on X for reference.
OpenAI’s new JS SDK (X link)
OpenAI has partnered with Stainless to release a major new version 4 of their TS/JS SDK, with the following incredible DX improvements for AI engineers (see the streaming sketch after the list):
Streaming responses for chat & completions
Carefully crafted TypeScript types
Support for ESM, Vercel edge functions, Cloudflare workers, & Deno
Better file upload API for Whisper, fine-tune files, & DALL·E images
Improved error handling through automatic retries & error classes
Increased performance via TCP connection reuse
Simpler initialization logic
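For a taste of the new DX, here’s a minimal streaming sketch based on the SDK’s documented v4 usage (the model and prompt are just placeholders):

```typescript
import OpenAI from 'openai';

// The v4 client reads OPENAI_API_KEY from the environment by default.
const openai = new OpenAI();

async function main() {
  // With `stream: true` the SDK returns an async iterable of chunks
  // instead of a single final response.
  const stream = await openai.chat.completions.create({
    model: 'gpt-3.5-turbo',
    messages: [{ role: 'user', content: 'Say hi to the ThursdAI crowd!' }],
    stream: true,
  });

  for await (const chunk of stream) {
    process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
  }
}

main();
```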
The most exciting part for me: it’s now very easy to get started with AI projects and get streaming on the incredible Cloudflare Workers platform (Targum is part of the first Cloudflare Workers launchpad but is not affiliated, we’re just superfans 🫶)
Vision & Multi Modality
There’s been some really cool stuff happening in computer vision and multi-modal AI recently. First up, a new method called 3D Gaussian Splatting shows an incredibly clear and smooth way to generate 3D scenes from just a few images.
Compared to neural radiance fields (NeRFs), Gaussian splatting produces much smoother results without the grainy voxel artifacts NeRFs often have, and it achieves this improved quality without sacrificing NeRF-level speed. So Gaussian splatting gives a big boost in realism over NeRF renderings, cleaning up those grainy “clouds” while maintaining real-time rendering.
Supervision from Roboflow (and Piotr)
Btw, our own friend of the pod and AI vision expert @skalskiP (who reviewed Gaussian Splatting for us) is also having a crazy ThursdAI week, with their open source library SuperVision, a computer vision toolkit, trending #2 on GitHub đź‘Ź
Apple stepping up their Vision (not the headset) Transformer game
Apple has open sourced ml-fastvit, their general-purpose Vision Transformer model, which they claim runs at ~1ms on mobile devices, with code and pre-trained weights available on GitHub 🔥
This is great to see from Apple ML teams: not only the open sourcing itself, but also how it prepares all of us for the world of spatial computers (Vision Pro is coming, remember?), where many new computer-vision-heavy apps will run at those incredible speeds.
This is also great for on-device inference, running these models in Node / on the edge (as friend of the pod @visheratin demonstrated with WebAI)
Additional updates include Nvidia releasing a web playground for NeVa, their MLLM (Multimodal LLM, get used to seeing this term everywhere), which you can play with here, and Link-Context Learning for MLLMs
Agents
OpenAI also announced that Global Illumination is joining OpenAI. That team’s CEO created Instagram’s Stories algorithm and contributed to the feed, and the team is behind a massive open-world Minecraft clone. Will we see OpenAI release agents into that world? We know that they are working on agents
A16Z - AI Town (đź”—)
Speaking of agents roaming free and interacting, we covered the open sourcing of Smallville just last week ↴, and now we see AI Town, a new open source framework from Andreessen Horowitz’s AI division for letting agents roam and interact with each other.
AI Town (GitHub) is a web framework written in TypeScript, built to be run and customized with different LLMs (even open source ones) in mind, and you can see the AI agents running around in a live demo here
This ThursdAI was so packed with great information that it’s really worth listening to the whole recording. You can do this on our Zealous page, RSS, and on Twitter (all those links can always be found at thursdai.news)
If you found this valuable, join our community and let your friends know! This is a great way to support us, as is participating in the discussion on social: tag #thursdAI on anything you feel is worthwhile for us to summarize. This is a public episode; if you’d like to discuss it with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe