From New York Times Opinion, this is the Ezra Klein Show. We are already awash in crappy AI content. Some of it is crappy commercial AI content that wants to sell you things. Some of it is crappy AI art.
And amidst all this complaining, it got me interested: What does it mean right now to be making good AI art? And so I read this profile of the artist and musician Holly Herndon in The New Yorker. And then separately, this DJ I met mentioned her work to me. She said I should check it out. And so I went and listened to her 2019 album Proto, which was made alongside an AI voice trained on her voice and others. And I was walking to work when the song Fear, Uncertainty, Doubt came on.
I just stopped walking. What makes so much AI art so bad, in my opinion, is that it's so generic. These are generative systems. We keep calling them generative, but generative is so... When we use that term, it usually means it helped you get somewhere new. But these systems are mimics. They help you go somewhere old. They can help us write or draw or compose like anyone else.
But I find it much harder to use them to become more like yourself. And most of what I see coming out of people using them, it's all riffing on others in this very obvious way. What I like about Herndon's art is that she uses AI to become weirder, stranger, more uncanny, more personal. It's going in the exact opposite direction.
And some of her art questions the entire way these systems work. She and her partner Mat Dryhurst did this project at the Whitney Biennial this year where they created an image generator based on images of Herndon, or at least what the AI system seemed to think she looked like. She's got this very striking copper hair, and so the way it understood her was really around the striking copper hair. She is, as she put it, a haircut.
And so they manipulated these images and they made this AI system where anybody can generate any image in the style of what AI systems think Holly Herndon is. So you can generate an image of a house and it'll have this long flowing copper hair and it'll tag itself as an image of Holly Herndon. And because it's on the Whitney Biennial, these images have a certain authority in the way these AI scrapers work.
And so as they are scraping the internet for images in the future, she is potentially poisoning their idea of what she is. She is taking control over the AI's idea of Holly Herndon. I find that fascinating. AI art that is acting as a kind of sabotage of AI systems and the lack of voice we have in how we appear in them. Along with a bunch of collaborators, Herndon has a lot of projects trying to blaze a trail into not just good AI art, but fair AI economics and ethics. And so I wanted to have her on the show to talk about it.
As always, my email is ezrakleinshow@nytimes.com. Holly Herndon, welcome to the show. Thanks. It's great to be here. So something I find fascinating about you is that you grew up singing in church choirs. Then you moved to Berlin after college. You got deep into Berlin techno. And I think those are, respectively, the most human and the most inhuman forms of music that human beings make. So how did they shape you?
Yeah, that's a really good question. I mean, I feel like I'm such a product of the environments that I've spent a lot of time in. So I'm really interested in folk singing traditions, coming from East Tennessee. Of course, growing up in a town next to where Dolly Parton's from, she always loomed large. Then I spent a lot of time in Berlin, and so of course electronic music and techno have played a really big part in my story. And then also moving to the Bay Area, where I got really deeply interested in technology.
I feel like even though techno might sound and does kind of have a synthetic palette and does sound maybe inhuman, I feel like the rituals that happen around the music are very human and very sweaty and very embodied. So I think if you experience that culture in person, it feels less inhuman. But why does that magic happen? So I was in Berlin recently.
And I was down in the sort of big room, the bunker, I would call it, which is the way it felt to me. And I would say the music felt like being inside of a machine gun, but in a good way. Yeah.
And meanwhile, like, as you say, what's happening around it, I mean, it was actually the most inhuman music I've ever heard. And I like electronic music. But what's happening around it is so human. I mean, all these people engaged in this most physical, sweaty, smelly ritual of dancing together. How do you understand both, like, the meaning and the function of it? Why does music like that create that kind of transcendence? I mean, this might sound strange, but music is a kind of coordination technology. So a 4/4 techno beat is maybe the most clear communication of that. It's so easy to participate in. It's fairly easy to make. It's also fairly easy to dance to and understand. So I feel like, as a kind of protocol, if I want to call it that, it's an easy way to communicate what to do in that scenario. So I think that's why people have organized around it so much. When I go out and listen to the further reaches of techno in Berlin, or in New York, where I live, I'll often find myself at some point in the night thinking: every piece of sound in this music is a choice.
And when that choice sounds very artificial, right, when it sounds like something so removed from, you know, somebody playing strings or somebody singing, I think this person wanted to communicate in this extraordinarily machine-like way. And this has been happening for a long time. I mean, talk boxes and synthesizers and all these technologies. And I'm curious, as somebody who's made some of that music, or is at least deeply within the culture that has made it,
What is appealing about that? I mean, you said it creates this very sweaty human ritual. But first there is this transition of the person into something that does not sound like people. It sounds like music that robots might make. It sounds like music from a faraway culture. Yeah.
Maybe there's something about living in such a technologically mediated world that makes us want to find how we fit into that as humans. And music is such an innate part of being a human. I mean, as a laptop performer,
I was always trying to find a way to make the laptop feel really embodied because at the time when I started performing a lot, you know, there was this criticism that, oh, you could be checking your email or this doesn't really feel like a lively performance. So I started using my voice as a kind of input stream. ♪
And the thing that I found really liberating about using my voice in that way is that I could kind of do anything to digitally manipulate my voice to make it be so much more than it is physically. But what I really enjoyed was using my voice as a kind of controller or data stream, and then it could do things that I couldn't imagine once I put it in the laptop and was able to process it in specific ways. So there's something about trying to come to terms with the systems around us by using
working through them, and working with them, collaborating with them, maybe helps us kind of understand where we sit in that feedback loop. So in a minute, I want to play a clip of a piece of music you made. But first, I want to talk about how you made it. So tell me about Spawn. Spawn was our kind of, like, AI baby experiment.
Proto was released in 2019, and Spawn came about two years before that. It was a very different time, especially for audio; a lot of the visual models were developed earlier. Eventually things got better. We started playing with a project called SampleRNN and some other software. And you'll still hear, from the stems that we might play later, the vocal quality, the sound quality, of 2017, 2018. It's changed since.
And, you know, to me, it sounded like the really early recordings that you can find on YouTube. I think it's like the earliest audio recording. It sounds really scratchy and super low fidelity. That's what the audio sounded like back in the day. And so it was this real issue of trying to get the high fidelity recordings that I was doing with my ensemble in the studio to live in the same universe as this really scratchy lo-fi audio that I was generating through Spawn.
Well, why don't we play a bit of that because you kindly shared the stems for the song Swim. And maybe we should start here by playing the ensemble, the sort of chorus you brought together to sing for the album. So that's really beautiful and really human. And now on the other side, I want to play the Spawn track on its own. ♪
What am I hearing when I hear that somewhat nightmarish Spawn there? So Spawn was trained on the voices of the ensemble. And back then we couldn't deal with polyphony, which means more than one note at a time. So what we had to do was break each part into an individual line, and then we would feed that line to Spawn first,
who would then sing it back through the voice of our ensemble. And I think we were feeding it through with either a voice synthesizer or a piano. I can't remember. It's been so long. So we basically used this idea, which is called timbre transfer. So that's where the computer learns the logic of one sound and kind of superimposes that onto the performance of another. So that's what we did. We had the ensemble sing a variety of phrases. We trained...
on their voices. And then we did a timbre transfer. We fed her the line that we wanted her to sing, and then she sang it back to us. And I think, hearing that, one question you could have is, well, what do you need Spawn for? Why not just have a human being sing into a talk box, or use a synthesizer, or Ableton, right? We can make people's voices sound strange already with Auto-Tune. What is
the value of Spawn here? I think overall Spawn has a unique timbral quality that I actually really love because it really is a snapshot in time. It doesn't sound like that anymore. It sounds really clean and yeah, really high fidelity. But at that period of time, I almost have this like romanticism around that, almost like a vinyl hiss or a pop.
for that very particular period of time in machine learning research. But also, I felt like I really needed to be making my own models and dealing with the subject directly in order to have a really informed opinion about it. And I'm really glad that I made that decision because it's informed so much of the work that I do today. Just even the very basic understanding that
that a model's output is so tied to the training data, the input. I don't know that I would have come to the profundity of that had I not been training my own models. And that's really informed all of the work that I've done since then. So I think sometimes you just have to deal with the technology in order to make informed work around that technology. We're going to come back to the profundity of that because I actually think it's really important. But I want to do two things before we do. One is to play a bit of the full song Swim so people can hear where this ended up.
And so then I want to play something you just released more recently, using, I don't know if you'd call this an updated Spawn, but you're calling it Holly Plus, which is this much more modern voice model trained on your voice, which you had cover Dolly Parton's Jolene. So obviously, there's this unearthly quality.
What am I hearing? What is singing? So that is a voice model trained on my voice. I worked with some researchers in Barcelona, in a studio called Voctro Labs at the time. And Holly Plus was born. And as you can hear, it's leaps and bounds better, you know, higher fidelity than Little Spawn.
So basically, that version of Holly Plus, there are multiple versions. There's a version that can be performed in real time. But this particular version is a score-reading piece of software. So I basically just write out a score with the text written out in phonemes, and then the software spits out a basically pitch-perfect performance of that song. And of course, it helps to have Ryan Norris playing a beautiful human guitar accompaniment.
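Herndon's actual pipeline isn't spelled out here, but the timbre-transfer idea she described earlier, learning the "logic" of one sound and superimposing it on another performance, can be caricatured in a few lines. This is a toy sketch with made-up numbers, not Spawn's or Holly Plus's method: each pitch in a "score" is resynthesized through a fixed harmonic profile that stands in for a learned voice timbre.

```python
import numpy as np

SR = 16000  # sample rate in Hz

def harmonic_tone(f0, harmonics, dur=0.25, sr=SR):
    """Synthesize one note at pitch f0, with timbre set by harmonic amplitudes."""
    t = np.arange(int(dur * sr)) / sr
    return sum(amp * np.sin(2 * np.pi * f0 * (k + 1) * t)
               for k, amp in enumerate(harmonics))

def timbre_transfer(score_f0s, target_harmonics):
    """'Re-sing' a melody through a target timbre: keep the score's pitches,
    impose the target's harmonic profile on every note."""
    return np.concatenate([harmonic_tone(f0, target_harmonics)
                           for f0 in score_f0s])

# A toy "score" (pitches in Hz) and a toy "ensemble voice" timbre
# (relative amplitudes of harmonics 1-4). All values are illustrative.
melody = [220.0, 246.9, 261.6, 220.0]
voice = [1.0, 0.5, 0.25, 0.125]

audio = timbre_transfer(melody, voice)
```

A real system learns the timbre from recordings rather than hard-coding harmonic amplitudes, but the shape of the operation is the same: pitch and timing come from one source, spectral character from another.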
That's a use case that I'm fascinated by, and that I imagine will become more and more common in the future: a model trained on a person that one can use almost autonomously, as if it were that person. You can imagine somebody training a model on all of my podcasts, and then the model generates questions they could ask somebody; or a model trained on all of my columns that can spit out an op-ed.
What is your relationship with that?
Do you see it as an extension of what you can do or do you see it as a kind of partner you can collaborate with or do you see it as just some version of you that makes you scale? Because like you can't take commissions to sing from everybody out in the public, but they can all, you know, go to Holly Plus and get it to sing on their behalf. Like what is your relationship with this, you know, nascent other you or at least other voice of you that now exists in the world? Yeah.
I think I'm probably an outlier in my relationship here, because my practice involved so much vocal processing. So if you listen to Movement or Platform,
The albums before Proto, before I started working with machine learning, I was already taking my voice and kind of mangling it beyond recognition, turning it into a machine itself. So for me to make a model of my voice, that felt like the natural next step in an already very kind of highly mediated process with my voice.
I don't expect everyone to have that relationship. I don't really see the Holly Plus voice as something that replaces me in any way. It's something that I have fun playing with. I can...
attempt to perform things that I wouldn't normally be able to. You know, I did a performance with Maria Arnal in Barcelona. And I mean, that music is so difficult to perform. I could never sing that. She can do all of these amazing melismatic diva runs that I could never dream of, but my voice model could do it. And that was really fun. And it didn't confuse me to think, okay, I can do that now. It was more just fun to hear myself do something that I know that I couldn't
do alone acoustically. So I guess for me, it's maybe like an extension or an augmentation of my own self. So what did Holly Plus add to that cover of Jolene? I mean, you could have just sung a few tracks of harmony and added them above the melody. So what does the AI mean to you in it specifically?
Well, I think that one is perhaps a little personal, because growing up in East Tennessee, Dolly Parton was kind of the patron saint of that region. And the kind of music that I usually perform has very heavily processed vocals and is usually a bit more abstract than a Dolly Parton song. So
It was almost like I wouldn't afford myself that or allow myself that, but I would allow Holly Plus to do it because there was this kind of like level of removal. It's almost like Holly Plus can perform things that I would be too bashful to perform myself. Oh, that's really interesting. The idea that having another version of yourself out there could give you license to try things you wouldn't otherwise try. Yeah.
Yeah, like Jolene. I mean, I love Jolene as a project, but it doesn't have the same ghostliness and quality as the music on Proto, which is why I didn't release it as an album. You know, it's just not as interesting somehow. I guess the other thing, there's a question of meaning here that I've been circling in my own playing around with AI. I spent a bunch of time recently creating sort of AI friends and therapists and, you know, trying to understand like the relational AIs that you can build now.
And on the one hand, I was amazed at technically how good a lot of them were. At the same time, I find I never end up coming back. I find it very hard to make the habits sticky or the relationship sticky. When I sit with my friend or my partner, the fact that they are choosing to be there with me is separate from the things that they're saying. And an experience I'm having with a lot of AI projects is
that the output is pretty good, right? Holly Plus sings really well. Or the therapist friend I made on Kindroid texts in a way that, if you had just shown me the text, I would not know it's not a human being. But the absence of the meaning that another person brings, the fact that I know it's Holly Plus, like, it's a cool project, but I'm not going to keep listening to it. The fact that I know the Kindroid can't not show up to talk to me,
that it's a relationship I control totally, robs the interaction of meaning in a way that makes it hard for me to keep coming back to it. And so as somebody who works a lot with, like, the question of meaning, and sees a lot of these AI efforts happening, how do you think about what imbues them with meaning, and in what cases they end up feeling hollow?
It's really funny. We did a live performance from Proto, I guess, in 2019 in New York. And we had the ensemble on the stage. And afterwards, someone came up to me and they said, I really enjoyed the show, but I don't understand what it has to do with AI. And actually, that was the biggest compliment that I could receive because I wasn't trying to project this kind of like super future, you know, AI high tech concept.
story. I was trying to show all of the human relationships and the human singing that go into training these models. That's something I was really trying to get at with that album: some of the things that the computer can do, some of the coordination that it can do, are remarkable, but it can also free us up to just be more human together, to really focus on the parts that we want to focus on, which is
just enjoying that moment of singing on stage together. I'm also not so interested in necessarily having an AI therapist. That's not what I find interesting or compelling about this space. I'm interested in exploring some of the weirdnesses in how we as a society define different things. That's the kind of stuff that I'm interested in, not having a kind of like AI chat pet.
I've heard you say that with AI, it's the model that's the art, not necessarily the output of the model. Yeah, that's one thing that we're exploring quite a bit. So...
One of the potentials around machine learning is that you're not limited to just a single output. You can create a model of whether that's my own singing voice or whether that's my own image or likeness, and you can allow other people to explore the logic of that model and to prompt through your world. So it's almost like inviting people into your subjectivity or inviting someone into the video game of your art.
So I think it has a lot of potential to be interesting in a kind of collaborative way with your audience. One term that we're often using is protocol art: basically, understanding that any work that's made is a kind of seed for infinite generations. So we're trying to lean into that. For example, if we make a sculpture, as we did in a project called Ready, Wait!,
we also make it available as a package with an embedding and a LoRA and all the kinds of tools that anyone would need to be able to explore that sculpture in latent space. Or, you know, when we made the model of my voice with Holly Plus, we made that publicly available so anyone could make work with it.
So that's the example of protocol art, where really it becomes a collaborative experience between myself and the people who are engaging with my work. And in a way, art's kind of always a little bit like that. It's a conversation between the work that you're making and the viewer or the recipient. But that becomes a little bit more complicated and fun, I think, in an AI world.
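The "embedding and a LoRA" shipped with such a package are standard fine-tuning artifacts: a LoRA (low-rank adaptation) is a small pair of matrices that, added to a frozen base model's weights, steers generation toward a particular style, which is why it can travel as a compact file alongside an artwork. A minimal numpy sketch of just the arithmetic; the dimensions and values are illustrative, not from any real model:

```python
import numpy as np

rng = np.random.default_rng(7)

d, r = 64, 4                        # model width and LoRA rank; r << d
W = rng.normal(size=(d, d))         # frozen base-model weight matrix
# A package shipped with an artwork would include small adapter matrices
# like these (random here; real ones are trained on the work itself):
A = rng.normal(size=(r, d)) * 0.1   # LoRA down-projection
B = rng.normal(size=(d, r)) * 0.1   # LoRA up-projection

def forward(x, scale=1.0):
    """One linear layer: frozen base weights plus the low-rank adapter."""
    return x @ (W + scale * (B @ A)).T

x = rng.normal(size=(1, d))
base = forward(x, scale=0.0)    # adapter off: plain base model
styled = forward(x, scale=1.0)  # adapter on: output steered by the "style"
```

The appeal for distribution is size: the adapter stores `d*r + r*d` numbers instead of `d*d`, so an audience can download a tiny file and explore an artist's "world" on top of a shared base model.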
You wrote something in 2018 that I think is worth exploring, where you said that AI is a deceptive, over-abused term. Collective intelligence is more useful. Why?
Because I really do see it as a kind of aggregate human intelligence. It's trained on all of us, specifically when you look at music, it's trained on human bodies performing very special tasks. And I think it does humans a great disservice to try to remove that from the equation. I think that's why I like to draw a parallel also to choral music, because I see it as a kind of
I want to explore what changes when you emphasize the collectivity of these models, the fact that they are in some ways an aggregate of all of us.
versus the artificiality of them, right? Artificial intelligence, which really emphasizes: no, there's something that somebody has written into software over here. They're, like, unearthly. They're a new kind of thing. And one thing is actually, I think, economic: there's this whole question about who gets compensated, and who's going to make the money off of this, and what all this training data is going to end up doing economically. And it does seem very different to me if you understand these as, on some level,
societal output, something that's built on a kind of commons, as opposed to a tremendous leap and feat of technology that is the individual result of software geniuses working in garages and office parks somewhere. Yeah, I mean, that basically summarizes the work that I've been doing for the last several years. It's kind of like shouting that from the rooftops, because
I think if you see it through that lens, then it becomes something really beautiful and something to be celebrated, and also something that's not entirely new. You know, we've been embarking on collective projects for the entirety of our kind of humanity, to make things that are bigger than ourselves. And so if we can find a way to make that work in the real world with the kind of future of the economy, then, yeah, I think it behooves us to figure that out.
The not entirely new part feels important to me. The degree to which this is all a continuum feels often underplayed in conversations about AI, about the future of work, about humans and machines. But there's also a way in which you see the AI companies using this argument to say that they should be given much more free rein, and a much fuller claim to the profits of these models, because they say, look,
We're not doing anything different than any other artist or anyone ever has. Scientists today work off of the collective body of knowledge of science before them. You know, Holly Herndon is influenced by folk music and choral music and German techno. And everybody is always absorbing what has come before them and mixing it into something new. That's all we're doing. We're not doing something new. We're not committing copyright infringement. So how do you understand that?
The effort to use the collectivity, right, the fact that human beings have always been engaged in collective projects, even though we do give people a lot of individual ownership and authorship over their works. You know, what might be different here, in the scale, the context,
and the nature of what these models are doing? So, okay, I think that there's a middle ground that can work for everyone, that can allow people to experiment and have fun with this technology while also compensating people. So spawning is a neologism that I like to use to kind of describe what's happening here. And it's a 21st century corollary to sampling, but it's really distinctly different. And that difference, I think, is really important.
It's different in what it can do, and also in how it came about. So, what it can do, we've kind of gone into that already: you train a model on the kind of logic of one thing to be able to perform new things through that logic. So it's distinctly different from sampling, which is really a one-to-one reproduction of a sound created by someone else that can then be processed and treated to make something new. But with spawning, you can actually perform as someone else based on information trained about them. So that's distinctly different.
But also in the way that it comes about: with sampling, it's this one-to-one reproduction. With spawning, it's a little bit more of a gray area in terms of intellectual property, because you're not actually making a copy. The machine is ingesting that media, if you want to call it that: looking at it, reading it, listening to it, learning from it. So I kind of land in, I like to call it the sexy middle ground, right,
between people who are all for open use for everything and people who want to have really strict IP lockdown. And so that's one of the reasons why spawning then kind of mutated even further into an organization, Spawning, which is something that I co-founded with three other people, Mat Dryhurst, Patrick Hoepner, and Jordan Meyer, to try to figure out this messy question of, essentially, data manners. How do we...
handle data manners around AI training because what's happening right now isn't working for everyone. Are there experiments that you find exciting or that you've conducted that you found the results of them promising? Yeah, I mean, I think...
Holly Plus was a really fun experiment, because people then actually used my voice, and we were able to, you know, sell some works through that and generate a small profit, enough to be able to continue to build the tools for the community. So that was a fun experiment that I think really worked. And there's one experiment that I'm running right now that I'm really excited about. My partner, Mat Dryhurst, and I have an exhibition at the Serpentine in London, opening
in October. And as part of that, we are recording choirs across the UK. I think there's 16 in total. And they're joining a data trust. And we've hired a data trustee to pilot this idea of governance, where we're trying to work out some of the messy issues around how a data trust might work. And then we'll negotiate with that data trust directly as to how we can use their data in the exhibition.
I think it's a really fun experiment, and also, because it's singing and it's choral music, it's not really sensitive health data. We can really experiment and try out different ways to make this work without dealing with such sensitive information. So I'm really excited to see how people engage with that, and how much people really want to deal with the kind of day-to-day governance of their data. That's also a big question. So you were saying earlier that often the model is the art, but in this case, the governance is the art. You know, in this case,
I think the model and the governance and the protocol around it are all the art. This idea of control is interesting, though. I mean, so it came out a while ago that Facebook, that Meta, had been training its AI on a huge cache of pirated books. And I think my book was in there. My wife's book was in there. Like the books of virtually everybody I know were in there. And so a bunch of authors sued.
And I also felt, some part of me, like I wanted to be paid for my inclusion, but I didn't want to not be included at all. And it reminds me a bit of social media, where at a certain point, whether you wanted to be on social media or not,
it was sort of important that you had something representing you there, right? It could be not your real photo; you could have some control over it. But if you didn't do it, then you had absolutely no control over what you appeared as online. And it probably wasn't plausible to appear as nothing online. So maybe something you didn't want would, you know, be your top Google search result. And here it's going to get even weirder, because there isn't really... you can't have your homepage
in the artificial intelligence model. All you are is training data. And so there's something very strange about this. You know, if before all you were was kind of a profile, which was a very flattened version of you, now you're training data, which is a very warped version of you. And this question of how do you have any control over that data, like if you want to participate, but you want some definition over how you participate,
There's no real obvious avenue towards that. There's none at the moment, but I think that that's coming. I think people will opt in under terms that they feel comfortable with, to be able to shape the way that they appear in this new space. I don't think it's tenable that people have no agency over how they appear in the future of the Internet. That feels idealistic to me. I mean, I feel like we've been living with an Internet for a long time where I would have said,
This level of data theft or use is not tenable. This level of surveillance isn't tenable. This level of flattening, the way it gets us to treat each other on social media. It doesn't feel like this is going to hold.
Like, I am amazed that people are still on X, as hostile as that platform has become to many of them. It's just so impossible to imagine leaving that they will accept something they really feel angry about. They really feel like the way it is run is hostile to them, that it is degraded. But, you know, what are you going to do? I'm amazed at how powerful that what-are-you-going-to-do
impulse is in life. Well, I mean, I totally get that. But what we decided to do was to try to build a universal opt-out standard. And it's actually gaining traction. And there's precedent in the EU AI Act.
Ideally, people would have been asked permission for all training data from the outset, but that's not how things played out. So now we're in a position where we're building tools where people can really easily opt out the data that they don't want to have included in these models. We have
an API where you can install that on your website and easily have everything on your website not be included in crawling. So I do think that there are things that we can do. It requires a little bit of legislation. It requires a little bit of...
diplomacy, but I don't think that we should just throw up our hands and say, okay, it's over, they can just have everything. You know, if we do have a situation where we're able to get the opt-out as a kind of standard, then I think you can start to build an economy around an opt-in.
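The opt-out tooling described above can be pictured like robots.txt for AI training: a site publishes a policy, and a well-mannered crawler checks it before ingesting anything. Here is a hypothetical sketch; the `ai.txt` file name and directive format are assumptions for illustration, not the actual specification of Spawning's API:

```python
# A site-level policy, fetched once per site before training-data collection.
AI_TXT = """\
User-Agent: *
Disallow: /artwork/
Disallow: /voice-models/
Allow: /blog/
"""

def parse_ai_txt(text):
    """Return (directive, path_prefix) rules in file order."""
    rules = []
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if key in ("allow", "disallow"):
            rules.append((key, value))
    return rules

def may_train_on(path, rules):
    """Decide whether a path may be used for training.
    Longest matching prefix wins; the default is allowed."""
    verdict, best = True, -1
    for directive, prefix in rules:
        if path.startswith(prefix) and len(prefix) > best:
            verdict, best = (directive == "allow"), len(prefix)
    return verdict

rules = parse_ai_txt(AI_TXT)
```

The longest-prefix-wins rule mirrors how robots.txt precedence works, which is one reason an opt-out standard can piggyback on conventions crawlers already understand.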
Something that I'm really proud of: we just announced SourcePlus. So I'm not trying to shill here, but I think this is a really important part of this conversation. We put together a data set of all public domain data, and it's huge. And really, people should be training their base models there. And then you can allow people to opt in to fine-tune their models and create an economy around that. If you have a public domain base layer model, then you can actually create an economy around that. But I don't think we should give up.
I definitely agree. I don't think we should give up. For a lot of people, they need to make a living out of the work they're doing. Yes. One thing that I find inspiring about the idea of thinking of it as a collective intelligence is it maybe points a way towards the idea that there are modes of collective ownership or modes of collective compensation. And...
At least in the space of art, when you're thinking about this idea that, you know, you might have your voice out there for anybody to use. I think for a lot of people, that's scary, right? I mean, we're very used to business models that are about nobody can use this thing of mine unless they pay me, right? We have patents, we have copyrights.
What does that spark for you? Like, what if we, you know, what are the ways to do this in a more collective, open source way that you think might work to make it possible for people to live, but also to create? Well, I think first and foremost, it should not be a one size fits all solution. I mean, you know, we're talking about art and that encompasses so many different practices that function economically in so many different ways. That's something that was really devastating, I think, when it came to streaming.
Streaming was really revolutionary and wonderful for a lot of people, but it was really devastating for a lot of other people, because everything had to have the same economic logic as pop music. And a lot of experimental music doesn't follow that per-play valuation logic. A lot of experimental music is about the idea, and you just need access to that idea once. You don't need to listen to it on repeat. And so if access to that idea costs a fraction of a cent, that's going to be really difficult to pay for. You almost need more of a movie model, where you pay a little bit more to gain access to that idea.
I think what's really needed is that people have the ability to create whatever subcultures and whatever kind of economic models work for their subcultures and aren't squeezed into a kind of sausage factory where everything has to follow the same logic. So I know you and your partner are working on this book for this forthcoming exhibition that has, I think, the most triggering possible title to people in my industry, All Media is Training Data. What's the argument there?
Yeah, so this is a book that's a series of commissioned essays and interviews between me and Matt about our approach to AI data over the past 10 years. I do realize that this is kind of triggering for a lot of people, but I think it's something that's worth recognizing. You know, as soon as something becomes media, as soon as something becomes machine legible, it has the potential to be part of a training canon. And I think that we need to think about what we're creating moving forward with that new reality. You know, a lot of the work that we're doing around the exhibition is we're creating
training data deliberately. So we're treating training data as artworks themselves. I'm writing a songbook that a collective of choirs across the UK will all be singing from. And those songs were written specifically to train an AI. All of the songs cover all of the phonemes of the English language, so the AI can get the full scope of the sound of each vocalist.
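The phoneme-coverage idea she describes is easy to make concrete: given a pronunciation lexicon, you can check which phonemes a set of lyrics exercises and which a new song would still need to supply. A toy sketch, where the four-word lexicon is illustrative, a stand-in for a full ARPAbet dictionary like CMUdict, not Herndon's actual songbook:

```python
# Toy pronunciation lexicon: a stand-in for a full ARPAbet dictionary
# such as CMUdict. Words and transcriptions are illustrative only.
LEXICON = {
    "see":   ["S", "IY"],
    "the":   ["DH", "AH"],
    "light": ["L", "AY", "T"],
    "now":   ["N", "AW"],
}

# The inventory a songbook should cover. Here it is just the phonemes in
# the toy lexicon; English has roughly 39 ARPAbet phonemes in full.
INVENTORY = {p for phones in LEXICON.values() for p in phones}

def coverage(lyrics):
    """Return (covered, missing) phoneme sets for a list of lyric words."""
    covered = set()
    for word in lyrics:
        covered.update(LEXICON.get(word, []))
    return covered, INVENTORY - covered

covered, missing = coverage(["see", "the", "light"])
print(sorted(missing))  # -> ['AW', 'N']: what another song must still supply
```

A songbook "covers English" in this sense when the missing set is empty across all its lyrics combined.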
So we're kind of playing with this idea of making deliberate training data kind of, we like to call them mind children that we're sending to the future. I want to talk about what's triggering in it for a minute, because I think when people hear that, they may think media in the sense of the news. And I'm actually least worried about the news because the news is where we're covering new things that happened that are not in the training data.
But media, if you think about it broadly, right, visual media and music and all the other things human beings create, I think what people hear, all media is training data. What they hear is everything we do will be replaceable, right? That the AI is going to learn how to do it and it's going to be able to spit it back at us. And then it doesn't need us anymore. When we become the training data, we're sort of training our replacement, right? Like those sort of very grim ideas.
that would come out of factories before the work was outsourced somewhere, right? Where people are training, you know, the people who are going to replace them at a lower cost. Is that how you see it? If you're training data, does that mean you're replaceable?
Art is a very complex cultural web. It's a conversation. It's something that's performed. It's a dialogue. It's situated in time and place.
We wouldn't confuse a poster of the Mona Lisa for the Mona Lisa. Those are two different things. So I'm not worried about artists being replaced, or about, you know, infinite media
meaning that artists have no role in meaning making anymore. I think that the meaning making becomes all the more important. I do think we have to contend with a future where we do have infinite media, where the single image is perhaps no longer carrying the same weight as it did before. So yeah, there are some things to contend with, but I think that we won't be replaced and I think it'll be weird and wonderful. ♪
There are a bunch of programs that are coming out now that use AI to generate this sort of endless amount of pretty banal music for a purpose. So I have this one I downloaded called Endel. And it's like, do you want music for focus? Do you want it to sleep? Do you want it to... And it's fine. If I heard it on one of those, you know, playlists on Spotify, I wouldn't think much of it.
And I think it points towards this world where I think the view is we're going to know what we want and what we're going to want is a generic version of it. And we're going to be able to get it in kind of vast quantities forever. But you're an artist and you said something in an interview I saw you give about how reality always ends up weirder. It always mutates against what people are expecting of it.
And so I wonder how much you suspect or see the possibility of the sameness that AI makes possible, the kind of endless amount of generic content, leading to some kind of backlash, where people actually get weirder in response: both weirder with these projects, but also, you know, more interested in things created by humans, in the same way that a lot of artisanal food movements got launched by the rise of fast food. I mean, how much do you think about backlash and the desire for differentiation as something that will shape cultures and software here?
Well, there's a lot in there. I mean, the backlash has been huge. I think that AI has certainly joined the ranks of culture wars, especially on Twitter. So I think the backlash is already there. But I think we're also in really early days. So some of the examples that you gave, I feel like they're kind of trying to please everyone and
as we move into a situation where your specific taste profile is being catered to more, I think it will feel less mid and more bespoke. That's one direction where some of this can go. I think a lot of people are really focused on prompting at the moment, because that's how we're interfacing with a lot of models. But in the future, it might look more like, you know, maybe you have a kind of taste profile, where the model understands your tastes and your preferences and the things that you are drawn to, and just kind of automatically generates whatever media would please you. So the production-to-consumption pipeline is collapsed in that moment.
One of the things that I always appreciated as a young person growing up was hearing things that I didn't like and didn't understand. And that was something I always found really difficult with algorithmic recommendation systems, as I just kept getting fed what it already knew that I liked. But, you know, when I was just being exposed to new music as a young person, I really needed to hear things that I didn't like to expand my palate and understanding of what's possible in music.
And so that's one thing: I think you could just kind of have a stagnation of taste if people are constantly being catered to. So I think people will crave something different, or will crave to be challenged. Some people won't, but some people will.
One of the things that occurred to me while I was looking through a lot of your work was that what I enjoyed about it was that you were using the relationship with the generative system differently,
to make yourself and make the work stranger. And that felt refreshing to me because my experience using ChatGPT or Claude or anything really so often is that it makes me more generic and that there's this way in which AI feels like it is this great flattening. It'll give you a kind of lowest common denominator of almost anything, you know, that human beings have done before.
And that the danger of that feels to me like it's a push towards sameness, whereas a lot of your art feels to me like a push towards weirdness and a kind of sense that you can interact with different versions of these systems in a less...
sanded down way, and find something that neither a human nor a machine could create alone. Is that a reasonable read of what you're doing? Is there something there?
Yeah, I think that's largely because I use my own training data. I create training data specifically for this purpose of training models, rather than using something that's just laid out for me. I think you get a lot of kind of mid or averaging data
from these really large public models because that's basically the purpose. It's supposed to kind of be a catch-all, but I'm not interested in the catch-all. I'm interested in this weird kind of vocal expression or I'm interested in this other weird thing. And so that's what I really want to create training data around and really focus on for whatever my model is. So I think people should just get into training their own models. I want to end by going back to a song from Proto.
And it's one of the stranger songs on the album. And I thought maybe we could just sort of talk about what it's doing and people can hear it. So why don't we play a clip of Godmother?
What's happening there? Okay, so yeah, when I delivered that single to 4AD, I was like, here's a single for the next album. They were like, okay, what do we do with this? So this is, I guess, a really early voice model trained on my voice.
So if you compare that to the Jolene song, that's basically how far we've come in the last five years, which I think is just remarkable. The speed is incredible. So I trained Spawn on my voice, my singing voice, and then I fed Spawn stems from a collaborator of mine named Jlin.
And so Spawn is attempting to sing Jlin's stems through my voice. And Jlin's music is very percussive. It's mostly percussion sounds. So it ends up being this kind of almost weird beatboxing kind of thing, because it's trying to make sense of these sounds through my voice.
Well, here, why don't we play a clip of Jlin. This is one of my favorite songs of hers. It's called The Precision of Infinity. Yeah.
And so, yeah, it's not that it's a machine. It's just something that a human being cannot quite do on their own. I mean, there's like a Philip Glass sample in there. It's beautiful.
But I don't know. It's funny when you say that Spawn feels so old, because something I like about it is that, compared to a lot of what's coming out now, its strangeness feels much more modern. It feels truer to how AI feels to me than the much more polished things we're currently hearing or seeing. Yeah.
which it's like this thing has exploded in all of its weirdness and all this effort is being made to make it seem normal. And I think the reason Proto sounded very current to me when I heard it for the first time this year is that in sounding abnormal, it feels more actually of this moment, which feels very strange, even as everybody keeps trying to make it seem not that strange. Well, thank you. I appreciate that. I feel like at the time I was...
This AI conversation has been going on for so long. The hype had already started back then. I feel like so many things that were being marketed as AI,
It was kind of misleading what the AI was doing or how sophisticated things were. So at the time, a lot of people were creating AI scores and then having either humans perform them or having really slick digital instruments perform them. And so it was giving this impression that everything was really slick and polished and finished. And that's why we decided to focus on audio as a material specifically, because you could hear...
how kind of scratchy and weird and unpolished things were at that time. And that's because I wanted to meet the technology where it was.
And that required a whole mixing process with Marta Salogni, who's an amazing mixing engineer in London, to try to get the human bodies in the slick studio to occupy the same space as the kind of crunchy, lo-fi Spawn sounds. But it was really important to me that I wasn't trying to do the whole smoke and mirrors of, this is some glossy future thing, when it wasn't, because I actually found the weirdness in there so much more beautiful.
As somebody who has now been playing around with models for years and working in these more sort of decentralized possibilities: I think if you're outside this and don't have any particular AI software engineering expertise, as I don't, as I think most of my listeners don't, and you see, well, there are models by OpenAI, by Google, by Facebook, it's easy to feel like no individual human being can do this, right? These are companies getting billions of dollars. How are you able to participate in this sort of world of models? Like, how much expertise do you need? How do you figure out what the interesting projects are? Like, if somebody wants to understand this kind of world of homebrew AI, so to speak, how did you start and where do they start?
That's a really good question. I mean, I think the landscape has changed so much since I started. I would say, you know, first thing, you can interact with publicly available models. And once you kind of understand how those are working, then I would just...
Do the really boring work of reading the academic research papers that are tedious. Take your time. Drink a coffee. Watch the YouTube video where they present it at a conference and maybe some people ask questions and that helps to flesh it out. This was our process. It's been really, really kind of messy and boring.
Yeah, we didn't have a lot of handholding, but I think if you're really interested in learning more, the information is out there. You just kind of have to roll up your sleeves and get your hands dirty.
I think that's a nice place to end. So, always our final question: what are three books you'd recommend to the audience?
Okay, so Reza Negarestani wrote a book called Intelligence and Spirit. It's a pretty dense philosophical book about intelligence and spirituality that I think is really great. On a lighter side, Children of Time by Adrian Tchaikovsky
is a really enjoyable AI science fiction about intelligent, genetically modified spiders. One of my favorite books. Yeah, it's so good. So you kind of see the kind of society and technology that a super intelligent spider society would build, which I love.
And then there's a book called Plurality that was led by Glen Weyl and Audrey Tang and a wide community of contributors. I also contributed a small part to this book. It's about the future of collaborative technology and democracy. And it was actually written in an open, collaborative, democratic way, which I think is really interesting. So check it out.
Holly Herndon, thank you very much. Thanks so much. This was really fun.
Thank you.