
Can we talk to animals?

2023/8/16

Unexplainable

People
Aza Raskin
Karen Bakker
Topics
Karen Bakker: Animals possess complex communication abilities that Western science long overlooked. Research on whale song was the breakthrough: it revealed the complexity of animal communication and partly recovered knowledge that has long existed in Indigenous cultures. AI can help us analyze acoustic data beyond the range of human hearing, but animals' worldviews may differ enormously from ours, so AI translation may be incomplete and must be combined with field observation. Moreover, bioacoustic technology, whether or not it ever enables interspecies communication, is a powerful, low-cost, minimally invasive conservation tool that can be used to monitor the effects of climate change on species. Finally, research that uses AI to decode animal communication should be cautious with anthropocentric concepts such as "language." We need to balance technological progress against potential risks, and pause research that could cause harm until ethical frameworks and legal regulations are in place. Aza Raskin: AI can help translate between languages by constructing a language's "shape" (its latent space), even when there is no direct correspondence between the languages. This technique can be applied to decoding animal communication, but an animal's worldview (umwelt) may differ enormously from a human's, so AI translation may have limits. Nevertheless, some shared experiences, such as self-awareness and the pursuit of altered states of consciousness, may exist across species, making interspecies communication possible. The technology also faces risks of future misuse, for example for precision hunting or for fabricating animal signals. We therefore need norms akin to a "Geneva Convention," establishing ethical frameworks and legal regulations before the technology is applied, to avoid potential harms.


This episode is brought to you by Shopify. Forget the frustration of picking commerce platforms when you switch your business to Shopify, the global commerce platform that supercharges your selling wherever you sell. With Shopify, you'll harness the same intuitive features, trusted apps, and powerful analytics used by the world's leading brands. Sign up today for your $1 per month trial period at shopify.com slash tech, all lowercase. That's shopify.com slash tech.

Feel your max with Brooks Running and the all-new Ghost Max 2. They're the shoes you deserve, designed to streamline your stride and help protect your body. Treat yourself to feel-good landings on an ultra-high stack of super-comfy nitrogen-infused cushion that takes the edge off every step, every day. The Brooks Ghost Max 2. You know, technically, they're a form of self-care. Brooks, let's run there. Head to brooksrunning.com to learn more.

It's Unexplainable. I'm Noam Hassenfeld. A couple months ago, I went to the Aspen Ideas Festival in Colorado. It's this festival where tons of scientists, journalists, politicians, teachers, all kinds of people get together, hang out, and present their ideas.

I was there to tape a live conversation for Unexplainable about animal communication. And I got to talk with two brilliant scientists, Karen Bakker and Aza Raskin. Karen's a professor at the University of British Columbia, and she's written a book all about animal communication called The Sounds of Life. And Aza is a co-founder of the Earth Species Project, which is a nonprofit that's trying to decode animal communication using AI.

We had a wide-ranging conversation that started with the basic discovery that our world is filled with way more animal sounds than a lot of people initially thought. And from there, we talked about what those sounds might mean, whether there's some form of language, and whether scientists might someday be able to use AI to actually decipher what animals are saying. We've edited this a bit for clarity and length. Here's our conversation.

So just to start, before we get into how researchers are trying to decode animal sounds today, I think it's worth talking about how they realized they were worth decoding in the first place. So Karen, you have a story in your book about how European and American scientists really started to discover that these sounds were even worth exploring. And you talk about whale song.

So I wonder if you could just set the scene for us a little, like what was the state of research on ocean sounds maybe 100 years or so ago? Yeah, so cast your mind back to about 100 years ago. There's a widespread assumption that only humans possess language and moreover that other species do not possess complex communication capacity.

This is, of course, a blind spot in Western science that we're going to unpack throughout the conversation today. But at the time, one of the most remarkable aspects of Western science was really a kind of sin of omission. There wasn't a lot of work done recording animal sounds, and there wasn't a lot of work done trying to decode them. And that all changed when a small number of researchers, some of them associated with then-classified Navy efforts to listen to underwater ocean sounds, began attempting to decode and categorize the really complex underwater sounds they were hearing in the ocean.

So these scientists made early efforts, mostly to record the pretty profound and amazing sounds that whales make, which are now well known to us but at the time were pretty astounding to Western scientists. You know, the kind of operatic ululations of humpback whales, the very staccato, powerful, eardrum-blasting sounds of sperm whales. There's a whole symphony under the ocean of which Western science was largely unaware.

Their work came to public attention when a renegade scientist named Roger Payne and his brilliant wife, Katie Payne, a classically trained musician, took some classified recordings that Navy scientists had given them and published them as an album. It remains the best-selling nature album of all time, it went platinum, it changed the dynamic around the campaign to end industrial whaling, and it was arguably one of the things that saved many whale species from extinction.

So whales were our re-entry as Western scientists into something Indigenous cultures had long known: that many, many species are capable of complex communication. And from there, many other species have begun essentially to reveal themselves to Western science. But I want to emphasize right at the start, this is long-held human knowledge that somehow we have forgotten that we had forgotten, and we have just begun remembering how to remember.

So, you know, talking about things that we've forgotten, or maybe things that we weren't aware of until very recently: it's not just these sounds under the water, right? It's not just deep sounds that we haven't been able to hear because we didn't have submarines. I wonder if you can tell us a bit about some of these sounds that we can't hear without technology that are around us all the time?

The vast majority of these sounds are inaudible to the naked human ear. They are either above our hearing range, in the high ultrasound, or below our hearing range, in the deep infrasound.

There is an evolutionary reason for this called the acoustic niche hypothesis. So in most ecosystems, what you get is much like radio stations on the radio dial. You'll have different species essentially broadcasting acoustic communication at a specific set of frequencies, a band, and also being able to hear in that same frequency range. We hear pretty much at the frequencies that we are able to vocalize at.
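(To make the radio-dial analogy concrete, here is a small illustrative sketch: two synthetic "species" share one recording but occupy separate frequency bands, and a band-pass filter recovers each niche. All frequencies and amplitudes below are invented for demonstration.)

```python
import numpy as np
from scipy.signal import butter, sosfilt

fs = 96_000                    # sample rate high enough for an ultrasonic band
t = np.arange(0, 1.0, 1 / fs)

# Two invented "species" calling at once in different niches:
# one audible (2 kHz), one ultrasonic (40 kHz, above human hearing).
mix = np.sin(2 * np.pi * 2_000 * t) + 0.5 * np.sin(2 * np.pi * 40_000 * t)

def band_energy(signal, low, high):
    """Energy in the [low, high] Hz band: 'tuning in' to one acoustic niche."""
    sos = butter(4, [low, high], btype="bandpass", fs=fs, output="sos")
    return float(np.sum(sosfilt(sos, signal) ** 2))

print("audible niche (1-3 kHz):     ", band_energy(mix, 1_000, 3_000))
print("ultrasonic niche (35-45 kHz):", band_energy(mix, 35_000, 45_000))
```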

The ability to record beyond the human hearing range is only about 100 years old for ultrasound and infrasound. But what AI does is allow the parsing and categorizing of that data at scale. We reversed a very fundamental constraint of 20th-century biology: we used to have a scarcity of data. Now we have a hyperabundance of data, because of cheap digital recording devices, many, many of them, from the Arctic to the Amazon. And now we have AI that can automate some, not all, of the tracking, categorizing, and parsing. That doesn't mean we can then make the next step to translating, but it gets us a lot further.

So you're talking about using AI to analyze all this new data. Aza, I wonder if you can talk a little bit about the research you've done using AI to map things like the shapes of languages. How does AI help us translate between languages when we might not be able to understand them?

Yeah. So I'm going to tell this story in two parts. The reason we started Earth Species in 2017 was that something fundamental changed: the ability to translate a language without a Rosetta Stone didn't exist in human history up until about 2017. Then that changed. So what changed? AI gained the ability to build shapes that represent languages. For the AI people in the room, these are called latent spaces or embedding spaces. These shapes are really interesting because they turn semantic relationships into geometric relationships.

What does that mean? Think about a language like English. Now imagine a galaxy where every star is a word, words that mean similar things are near each other, and words that share a semantic relationship share a geometric relationship. So king is to man as queen is to woman: in this shape, king is the same distance and direction from man as queen is from woman. You just take king minus man, which gives you a distance and direction; add that to boy and it'll equal prince; add it to girl and it'll equal princess. All the internal relationships of a language are encoded in this shape.
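(As a concrete illustration of that vector arithmetic, here is a minimal sketch using hand-made toy vectors. Real embeddings such as word2vec learn hundreds of dimensions from text; the values below are invented purely for demonstration.)

```python
import numpy as np

# Toy word vectors with hand-made dimensions [royalty, gender, adulthood].
vectors = {
    "man":      np.array([0.0,  1.0,  1.0]),
    "woman":    np.array([0.0, -1.0,  1.0]),
    "boy":      np.array([0.0,  1.0, -1.0]),
    "girl":     np.array([0.0, -1.0, -1.0]),
    "king":     np.array([1.0,  1.0,  1.0]),
    "queen":    np.array([1.0, -1.0,  1.0]),
    "prince":   np.array([1.0,  1.0, -1.0]),
    "princess": np.array([1.0, -1.0, -1.0]),
}

def nearest(vec, exclude=()):
    """Return the word whose vector has the highest cosine similarity to vec."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max((w for w in vectors if w not in exclude),
               key=lambda w: cos(vectors[w], vec))

# "king minus man" is a direction in the space; adding it to other words
# moves them the same way.
royalty = vectors["king"] - vectors["man"]
print(nearest(royalty + vectors["woman"], exclude=("king", "woman")))  # queen
print(nearest(royalty + vectors["boy"],   exclude=("king", "boy")))    # prince
print(nearest(royalty + vectors["girl"],  exclude=("king", "girl")))   # princess
```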

And if you think about it, like, you know, dog has relationship to like man and to wolf and to fur and to yelp and to howl. It sort of fixes it in a point in space in this shape. And then if you solve the massive multidimensional Sudoku puzzle of every concept to every other concept, out pops this rigid structure representing all the internal relationships of a language. The computer doesn't know what anything means. It just knows how they relate. And here was the deep insight from 2017.

Could the shape of two languages possibly be the same? So I'm holding a Portuguese word cloud here. You're holding an English one. And the mathematical relationship between the stars in my galaxy that represent woman, queen, king, and man is more or less the same mathematical, that is, spatial, relationship in your word cloud. Exactly. And if I were to throw up Cree or Inuktitut, there's a little bit less overlap, but more or less, a lot of these concepts are invariant across human languages.

And that is why we are now able to translate so effectively between our different word clouds using AI. That's exactly right. You literally just rotate one shape on top of the other. And even though there are words in one language that don't exist in the other, the point which is dog ends up in the same spot in both. You sort of blur your eyes, and it's the same shape. And that works for English and Spanish, and you're like, cool, those are related languages, obviously. But it also works for Finnish, which is a weird language, and for Aramaic, Urdu, even Esperanto. Every human language roughly fits in this universal human meaning shape.
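(A hedged sketch of that "rotate one shape on top of the other" step: the alignment can be posed as an orthogonal Procrustes problem. The vectors and the known word pairing below are invented stand-ins; the 2017-era unsupervised methods this describes align full embedding spaces, often without any known pairs at all.)

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)
concepts = ["man", "woman", "king", "queen", "dog", "wolf"]

# Stand-in embedding for "language A": 6 concepts in 3 dimensions.
lang_a = rng.normal(size=(6, 3))

# "Language B" has the same relational structure, but its space is
# arbitrarily rotated and slightly noisy, like an independently trained
# embedding of another language.
rotation = np.linalg.qr(rng.normal(size=(3, 3)))[0]
lang_b = lang_a @ rotation + rng.normal(scale=0.01, size=(6, 3))

# Solve for the rotation mapping A onto B, then check that each concept
# lands nearest its counterpart.
R, _ = orthogonal_procrustes(lang_a, lang_b)
aligned = lang_a @ R
for i, word in enumerate(concepts):
    match = concepts[int(np.argmin(np.linalg.norm(lang_b - aligned[i], axis=1)))]
    print(word, "->", match)
```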

And that was the moment when we thought: there's a path through. If we build this shape for animal communication, does it fit somewhere inside the universal human meaning shape? The parts that overlap, we should be able to translate directly into the human experience; the parts that don't overlap, we should be able to see as complexity. And so this gives us the ability to start getting blurry, Polaroid images of things that are beyond the human imagination.

Let's go back one step, to this notion that we can translate between our different word clouds, English and Inuktitut, and then extrapolate that to non-human communication systems. First of all, you have to imagine that scientists now have the ability to create these kinds of latent spaces, or word clouds, for non-human communication regimes. For example, there is now an elephant dictionary with thousands of sounds, and field biologists have painstakingly documented what each of those signals means. African elephants, for example, have a specific signal for honeybee. They're terrified of honeybees, which can get into their trunks and their ears and sting them. And so there's a very specific behavior elephants display when they hear that sound. So, tested through playback experiments, we have this elephant dictionary.

But there are many ifs here, because the assumption that the underlying worldview of an elephant, the felt, lived, embodied experience, the umwelt, as researchers call it, is anything like a human's is one we haven't yet proven. And so my word cloud may not contain any concepts that actually overlap with Aza's, with the exception of a very few. So it may be that AI translation is either a dead end, or that it would only allow us to develop a small subset of translatable concepts, and that there are other rules governing these non-human communication systems that we've yet to figure out. I'll just give a couple of examples.

It may be that animal species have different languages for different times of year, a little bit like, if you're familiar with Indian classical music, the ragas: there's the morning raga, the evening raga. So you have to throw out all of your assumptions about language. Maybe they have different languages for different parts of the world if they're migratory; the Bering Strait may have a different language than the warm waters of Hawaii where you give birth.

So we are really just at the beginning of trying to figure out whether what Aza is trying to do is achievable. My personal bet, and I'd love to hear your view, is that there will be an incomplete translation. We will be able to detect names. We'll be able to detect alarm calls. And we'll be able to detect the labels that are given to features of the environment that are linguistically invariant. But there are many, many more complex concepts for which we're going to have to invent entirely new types of science to begin understanding, and those are going to combine field observations with AI.

Absolutely. I just want to say that all of our work is built on decades of painstaking research done by biologists out in the field, and everything we do is hence in collaboration. But to add a couple of thoughts to what you're saying: one, the way we're describing doing this kind of translation, with rotation, is 2017 AI tech. It's now sort of Stone Age.

There are many other techniques I can talk about; rotation becomes just one tool among many. But why should we expect any overlap at all? I agree that the umwelt of a sperm whale might be completely different: it spends 80% of its life in complete darkness, a kilometer deep in the ocean. Why should we expect there to be any overlap whatsoever? I'll give two examples of why there might be.

And then I'll talk about why I think this goes even beyond language. The first example is the mirror test: how do you know whether another being has self-awareness? One way you might discover that is to paint a dot on them where they are unaware of it, then put a mirror in front of them. If they see the dot and start trying to get it off, that shows they're connecting the image in the mirror with themselves, that they have a self-image. Now, if they don't respond, that doesn't actually tell you anything. Researchers thought for the longest time that elephants couldn't pass the mirror test, but it turns out they were just using mirrors that were too small. A number of species do pass this kind of mirror test. And that means that if they're communicating, they may well be communicating about a rich interiority, a self-awareness, one of the most profound things that we have.

Another example, and I have an incredible video of this in my presentations, is lemurs biting millipedes to get high. They're taking hits off of millipedes. They get super cuddly. They enter these trance-like states. It's sort of like a proto-Burning Man. And dolphins do the same thing: they will intentionally inflate puffer fish to get high off of their toxin, and pass them around in the original puff-puff-pass.

So gorillas and chimpanzees will spin; they'll hang on a vine and spin to get really dizzy. Transcendent states of consciousness appear to be something that we share and desire across many species. So that, too, is a very profound thing that they may well communicate about. So there are some, I think, really interesting areas of overlap.

This conversation we're having, in one sense, can feel like we're just speeding ahead, with only one more hurdle to pass: figuring out this translation, and the AI is almost there. But we're also using a lot of anthropomorphic language, and I understand why we would do that; we don't have better words for those concepts.

I want to clarify that scientists do not use that language. They use very technical terms. So, for example, scientists would not use the term "name". They would say "individual vocal label" or "vocal signature".

Equally, most of the scientists studying the communicative regimes of non-human species would use the term "communication", not "language", because language is defined sufficiently anthropocentrically, in terms of complex combinatorial capacity, symbolic content, syntax, and so on, that it has yet to be proven that other species have "language".

I just want to clarify that although Aza and I, in a public-communication-of-science way, are using these terms, the scientific community is pretty rigorous, perhaps incorrectly so, but nonetheless pretty rigorous, about setting a boundary between humans and non-human species. But one of the things this research may eventually do is create a sufficient weight of evidence that we do indeed say, "Ah, yes, other species have language, and we may need to change our definition of language in order to say so." Or, "Ah, yes, other species do convey symbolic meaning through language, and here's how." We're not there yet. So progressively, I think this science is going to lead us somewhere very, very interesting in terms of asking these fundamental questions on the basis of a huge amount of empirical evidence.

I wonder, we briefly mentioned the umwelt question, what it means to be a bat or a whale or anything that perceives the world completely differently. Could we be communicating something to an animal that it understands in an entirely different way, while it acts the way we expect just because that's how we're interpreting things? Is it even possible to imagine that we could communicate where we both know that we are understanding each other?

I mean, the same problem exists between any two humans. The myth of communication is that it ever happened in the first place. Yeah. So I think the practical, pragmatic, scientific response is playback experiments. That is how these things are tested: we assume this acoustic signal means this.

We can play it back in the field or under lab-controlled conditions and see whether the responses are what we predict. The elephant honeybee alarm call leads to very specific physical behaviours: the group of elephants coming together, dusting themselves.

Now, beyond that, the act of communication is a profound mystery. The ability of any two beings on Earth to believe they can actually understand one another is actually quite magical. So science is one of the ways of approaching some of the great mysteries, and communication is one of them.
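(For a sense of how such playback results get quantified, here is a toy sketch: score how often the predicted behavior follows the alarm call versus a control sound, then test the difference. Every number here is invented; real playback studies involve far more careful design and controls.)

```python
from scipy.stats import fisher_exact

# Hypothetical playback scoring: how often do elephant groups show the
# bee-alarm response (bunching together, dusting) to the alarm call
# versus a control sound? All counts are invented for illustration.
alarm = {"response": 18, "no_response": 2}      # 20 alarm-call playbacks
control = {"response": 3, "no_response": 17}    # 20 control playbacks

table = [[alarm["response"], alarm["no_response"]],
         [control["response"], control["no_response"]]]
odds_ratio, p_value = fisher_exact(table, alternative="greater")
print(f"odds ratio: {odds_ratio:.1f}, p = {p_value:.2g}")
```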

But the reason I think this captures so much public imagination, Aza and I have talked about this in the past, is because this is also a great mystery which has been the subject of much reflection in various mythological and spiritual traditions. And so that is some of the richness of this work. It's also some of the controversy that it inspires in the scientific community. We're going to take a quick break from the conversation here.

When we come back, we're going to ask whether we should be trying to translate animal communication at all.

One issue we haven't touched on is sort of a "should" issue. Yeah. Assuming we can use AI to analyze these language shapes and figure out, in some basic sense, what animals may be saying to each other, maybe we can communicate with them, maybe we can listen to them. Knowing humans, I can imagine there would be some not-so-friendly ways to approach that situation. And I'm wondering what you think about the danger of being able to understand animals and being able to communicate with them.

So I want to give a really specific example of the new responsibility that we as a species are going to have to show up to in the very, very near future. You've all, I'm sure, encountered ChatGPT. You can build chatbots like ChatGPT in Chinese, even if you do not speak Chinese, right? And you've probably also seen all of the deepfake stuff, right?

Now, it is possible, with just three seconds of anyone's voice, to continue speaking in that voice. And what this means is that within months to a small number of years, we will be able to build essentially synthetic whales, synthetic belugas, synthetic tool-using crows, chatbots that can speak in such a way that the animals don't understand they're not speaking to one of their own. Imagine you had a superpower: being able to walk up to somebody whose language you don't understand, cock your ear and listen, notice that this pattern follows that pattern, and start to babble back with those patterns without knowing what you're saying, and the other person responds. Yeah, wow. It's Douglas Adams' Babel fish. It is, except here's the plot twist: you'll be able to communicate before you understand.

And this is actually the case. We're starting our first experiments with zebra finches, likely later this year, doing real-time two-way communication with a captive population, to see whether we can start to cross this communications barrier by being able to speak before we understand. This obviously lets us start to decode much faster. But humpback whale song goes viral, right? Songs sung off the coast of Australia can travel a thousand kilometers, and a humpback song can sometimes be picked up by the world population within a season or two.

And so if we're not careful, if we just create a synthetic whale that starts to sing, especially before we understand what the song means, we could be messing with a wisdom tradition, creating a kind of whale QAnon. We don't know. It's a crazy thing to think about, and I didn't think we were going to get here this quickly, but in the next 12 months to five years, certainly before 2030, we will have the capacity to do real-time, two-way communication, animal to AI, not necessarily animal to human. And before that happens, we need a kind of Geneva Convention for cross-species communication: a prime directive, sets of norms, ways for IRBs to review this work. There are a whole bunch of things we need to set up, and I think you can talk about that too.

Yeah, and I think there are even more nefarious uses. Precision hunting.

Precision fishing, of course. Yeah, poaching. This will no doubt accelerate the cat-and-mouse game between poachers and gamekeepers. There is also the specter of being able to domesticate species that were formerly not domesticable by humans. So we may be able to use this in certain contexts, and this is what my next book is about, for biodiversity conservation goals. At the same time, it could allow bad actors, and keep in mind how big the multi-billion-dollar global illegal wildlife trade is, to further capitalise on their ability to ensnare animals that have so far been out of reach. So the Geneva Convention for multi-species dialogue, long term: great. Prior to that, I think we have a more immediate problem on our hands, given the biodiversity crisis, with respect to nefarious uses of these technologies.

The only saving grace is that the AI may not really be as good as we think. First of all, we're being very, very self-centered here, as usual. We're humans. We're assuming other species actually want to talk to us.

They may be like: boring. Or they may just decide that these sounds, which are gibberish, are to be avoided, rightfully so. Or they may simply be able to detect that a call is not being made by another living member of their species, and avoid it. So my hope is that we're going to reveal ourselves to be slightly stupider than we think.

They're going to reveal themselves to be smarter than we believe, and maybe that'll create a bit more breathing room. But no doubt, longer term, deepfake AI technology creates a whole bunch of risks. And given these risks, do you think this is something that scientists should push forward? I mean, I believe we should have a moratorium. Aza and I don't agree on this point, I think. I think there are certainly thresholds that, if we cross them, we should have a moratorium and we should stop. Absolutely. Here's the thing.

We are mutilating the tree of life, and at some point we are going to cut the branch upon which humanity depends. So we are in the land of Hail Mary passes. And the hope, I think, and really the point, is not to talk to animals; the point is to understand and to listen. And along the path to that, we are creating the technology that solves the fundamental problems we see across all of conservation biology and ethology research. Every biologist we talk to needs to do classification, denoising, and detection of signals in order to understand biodiversity and behavior. And so the tools we build as we head towards decoding are the fundamental tools accelerating conservation biology, to the extent that conservation science accelerates conservation; we're trying to do that at broad scale.
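(To give a flavor of the simplest version of that detection step, here is a hedged sketch of energy-based event detection on synthetic audio. The threshold and the fake "call" are invented; real pipelines use trained classifiers and far more robust noise models.)

```python
import numpy as np

fs = 16_000
t = np.arange(0, 5.0, 1 / fs)

# Synthetic recording: low-level noise plus one brief "call" at 2.0-2.3 s.
audio = 0.02 * np.random.randn(t.size)
call = (t >= 2.0) & (t < 2.3)
audio[call] += 0.5 * np.sin(2 * np.pi * 800 * t[call])

# Slice into 50 ms frames and flag frames whose RMS energy rises well
# above the median, a crude estimate of the noise floor.
frame = int(0.05 * fs)
frames = audio[: (t.size // frame) * frame].reshape(-1, frame)
rms = np.sqrt((frames ** 2).mean(axis=1))
for i in np.flatnonzero(rms > 4 * np.median(rms)):
    print(f"possible call at {i * 0.05:.2f} s")
```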

But then there are these moments when we get shifts in perspective, and that changes everything. We talked about Songs of the Humpback Whale, but think also of when human beings went to the moon and were dosed with that overview effect, seeing ourselves as a pale blue dot suspended in space, spaceship Earth. That's when the EPA came into existence, NOAA was born, the modern environmental movement started, the Clean Air Act was passed, and that was in the Nixon era, right?

So the goal here is this: there are moments in history which superpower movements. There are no silver bullets, but maybe there's silver buckshot. And maybe, if we know this is coming, we can arm every other conservation org out there: Rights of Nature, personhood for non-humans, E.O. Wilson's Half-Earth, much bigger marine protected areas. When this becomes the thing that the entire world sees, and it reaches the top of politicians' priority lists, suddenly I think we can accelerate and be a force multiplier for every other conservation and climate action out there. And that, I think, is the reason why it's worth pursuing.

I wonder, just before we finish, could you say a bit about why you think we should have a moratorium? I will, but I do also want to build on Aza's point. So the climate change and biodiversity crises are intimately interrelated.

And the fundamental challenge of the next 20 or 30 years, as we add a couple billion more humans to the planet, is a sort of Noah's Ark-like challenge. How many species will be around at the end of our lifetimes? And acoustics, regardless of whether we actually achieve interspecies communication, is a powerful tool in the conservationists' toolkit.

Because simply through the use of digital bioacoustics, you have a very low-cost, very effective monitoring regime that is much less invasive than human monitoring. And this is now something that is being set up around the world. I can't go into the technical details for lack of time, but very simply, bio- and ecoacoustic indices allow us to tell, simply by listening, the extent to which climate change is disrupting species, species migrations and movements, and species abundance or disappearance. So we may never achieve interspecies communication, but what we can do, and what all of us interested in environmental work should be doing, is supporting the inclusion of digital acoustics, bio- and ecoacoustics, in conservation work as a low-cost, minimally invasive, very powerful tool.
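(As one illustration of what such an index computes, here is a simplified sketch in the spirit of the Acoustic Complexity Index: it rewards rapid frame-to-frame spectral change, which biophony like birdsong produces and steady noise does not. Published ACI implementations differ in windowing and normalization, so treat this as a toy variant.)

```python
import numpy as np
from scipy.signal import spectrogram

def acoustic_complexity(audio, fs):
    """Sum of normalized frame-to-frame spectral change: fluctuating
    biophony scores high, steady drones and hums score low."""
    _, _, S = spectrogram(audio, fs=fs, nperseg=512)
    change = np.abs(np.diff(S, axis=1)).sum(axis=1)      # change per freq bin
    return float((change / (S.sum(axis=1) + 1e-12)).sum())

fs = 22_050
t = np.arange(0, 10.0, 1 / fs)

# A "lively" soundscape (warbling chirp) versus a steady drone.
inst_freq = 3_000 + 500 * np.sin(2 * np.pi * 3 * t)      # Hz, warbles 3x/sec
chirpy = np.sin(2 * np.pi * np.cumsum(inst_freq) / fs)
drone = np.sin(2 * np.pi * 3_000 * t)

print("warbling soundscape index:", acoustic_complexity(chirpy, fs))
print("steady drone index:       ", acoustic_complexity(drone, fs))
```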

So I hope that's a take-home message for all of you. The question of the moratorium, I think, is really about something humans are not very good at: creating a space between what we think we're capable of doing and actually reaching for that thing. In the tech community and in the scientific community, there is a can-do ethos: I can do that, and so I want to do it. But there is a "should do" question here that I think requires very, very careful consideration. And I see, and here I disagree with you, no compelling reason right now to continue work that could be so damaging to other species, that could lead to precision hunting and poaching, without getting a lot of the ethical frameworks in place. That would mean updating the Convention on International Trade in Endangered Species and a lot of other international environmental regulatory frameworks. AI governance poses a more general problem, right? So my view is that we need to get our house in order on that. And just as human genomics research has hit pause from time to time, with certain no-go areas like cloning, I think we can come up with a set of no-go areas for AI science in this regard that would allow technical progress to still be made without invoking the kinds of risks that we have barely even begun to understand.

And on that, we actually, I think, agree. [LAUGHTER] As we show up to the new responsibility of this power, we have to ask how that power is bound to wisdom, knowing that this technology is coming regardless, because the ability to emulate any signal is already being pushed by market forces in the human domain.

So we need to accelerate as fast as possible all of the ethics and legal updates. It's Pandora's box, right? Every new technology creates a new responsibility. I think there's a small enough set of researchers doing this that we could actually do a better job this time at sorting out the responsibility before we unleash the technology. I completely agree.

Thanks so much to Aza Raskin and Karen Bakker and to the Aspen Institute. If you want to read more about the history here, I recommend Karen's book, The Sounds of Life. It's got the whale story we mentioned, but also some fascinating stories about everything from chatty turtles to bee communication. You can also watch a video of the full discussion we had at aspenideas.org. This episode was produced by Bird Pinkerton. It was edited by Brian Resnick and Meredith Hodnot, who also manages our team.

We had mixing from Cristian Ayala and music from me. Serena Solon checked the facts, and we're so happy to have her on the team. And Mandy Nguyen is not afraid of spiders. If you enjoyed the show, it would bring us a lot of joy if you'd leave a review or just send us your thoughts directly. You can email us at unexplainable@vox.com. We read every email. Unexplainable is part of the Vox Media Podcast Network, and we'll be back next week.