Hello everyone, welcome back to another episode of the AI podcast, where we talk about recent developments in artificial intelligence and how it will evolve in the future. I'm your host, Eli Schaefer. I'm so excited to be here with you today. We have an exciting episode, so let's get into it.
Okay, so today we're going to talk about an AI search engine called Perplexity. If you don't know what Perplexity is, it's pretty much an AI tool that you can ask questions and it will give you the answer.
This is what their hyperlink says. It says, Perplexity is a free AI-powered answer engine that provides accurate, trusted, and real-time answers to any question. That's just a brief summary of what the service is. So Perplexity launched in 2022 and they've
made a very big impact on the AI space, and they've been growing ever since, for the last two years, which is crazy. So what is happening now is kind of insane. This is something that we've seen a little bit with other large language models as well, and it's not really different with Perplexity:
News outlets are accusing them of two things. One is plagiarism, and the second one is web scraping. Essentially, web scraping is when a robot goes through the internet, goes on websites, gets data from those websites, and then puts it in an index, similar to what Google uses so that websites can show up in its results. So that's kind of what web scraping is.
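To make that description concrete, here is a minimal sketch of the "get data from websites and put it in an index" step. It is only an illustration, not how Google or Perplexity actually work: the two pages in `PAGES` are made-up stand-ins for what a robot would download, and `TextExtractor` and `build_index` are hypothetical names chosen for this example.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML page (tags are dropped)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, html in pages.items():
        parser = TextExtractor()
        parser.feed(html)
        parser.close()
        for raw_word in " ".join(parser.chunks).lower().split():
            word = raw_word.strip(".,!?")
            if word:
                index[word].add(url)
    return index


# Assumption: hard-coded pages stand in for real downloads, so nothing is fetched.
PAGES = {
    "https://example.com/a": "<html><body><p>AI search engines answer questions.</p></body></html>",
    "https://example.com/b": "<html><body><p>News outlets publish articles.</p></body></html>",
}

index = build_index(PAGES)
print(sorted(index["articles"]))
```

Looking up a word in `index` returns the pages that mention it, which is roughly the idea behind a search index.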
The reason they are being accused of this is that there's been a little bit of research done where people from different news outlets will give it a link to one of their news articles and ask it to summarize the article, and some of the text it outputs is word for word what the article says. Obviously this is a big deal, especially for these news outlets.
Plagiarism is something that they want to stop and protect against. And also, because Perplexity is a big enough company, it makes sense for them to want to go after it, simply because they could probably make a good amount of money if they could catch it doing something unethical or illegal. So it's a pretty big deal.
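One rough way to check a "word for word" claim like the one described above is to measure how many of a summary's five-word sequences also appear in the article. This is a sketch under stated assumptions, not the method the news outlets used: the sample texts are invented, `verbatim_share` is a hypothetical helper, and punctuation is not handled.

```python
def ngrams(text, n):
    """Return the set of n-word sequences in the text, lowercased."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def verbatim_share(article, summary, n=5):
    """Fraction of the summary's n-word sequences that also appear in the article."""
    summary_grams = ngrams(summary, n)
    if not summary_grams:
        return 0.0
    return len(summary_grams & ngrams(article, n)) / len(summary_grams)


# Assumption: made-up texts, purely for illustration.
article = "the quick brown fox jumps over the lazy dog near the quiet river bank"
copied = "the quick brown fox jumps over the lazy dog"
reworded = "a fast fox leaps across a sleepy dog by the water"

print(verbatim_share(article, copied))
print(verbatim_share(article, reworded))
```

A share close to 1 means the summary is mostly copied text, while a share close to 0 means it was reworded.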
Although web scraping is kind of what they've been accused of, Perplexity's head of business, Dmitry, I hope I'm saying that right, said that summarizing a URL isn't the same thing as crawling.
So this is what he says: crawling is when you're just going around sucking up information and adding it to your index. He noted that Perplexity's IP might show up as a visitor to a website that is otherwise kind of prohibited for robots only when the user puts a URL into the query, which doesn't meet a definition of crawling.
So pretty much what he's saying is, although it could seem like it is web scraping or web crawling (those are the two terms people use), because there's an IP address from that company going to those websites and collecting data, the AI is really just fulfilling a request from a human.
So it's not the same as an automated robot going through and doing this 24/7. It's really just a robot that's fulfilling a request, which is going to a website, getting data from it, and then bringing it back to the user. And then what they're saying is,
Now, it's up to the user whether they use that data. If they use that plagiarized content, then it's kind of on them. It's not really Perplexity's fault. That's the stance they're taking. It's kind of like how, if you search something up in Google, a snippet of text from the actual website will show up under that website's result. That doesn't mean Google is plagiarizing it. It just means Google is showing the user that content, which is from the website.
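The disagreement can be modeled in a few lines. Websites tell robots what is off limits through a robots.txt file, and Python's standard `urllib.robotparser` can read one. In this sketch the `exempt_user_requests` flag stands for the disputed position that a fetch triggered by a user's query is not crawling. Everything here is an assumption for illustration: `ExampleBot`, the robots.txt text, the URL, and `should_fetch` are made up, and this is not Perplexity's actual code.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a made-up robots.txt that blocks one bot and allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""


def load_rules(robots_txt):
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def should_fetch(rules, user_agent, url, user_requested, exempt_user_requests):
    """Decide whether to fetch a URL.

    exempt_user_requests models the disputed policy: when True, a fetch
    triggered by a user's query skips the robots.txt check.
    """
    if user_requested and exempt_user_requests:
        return True
    return rules.can_fetch(user_agent, url)


rules = load_rules(ROBOTS_TXT)
url = "https://news.example.com/story"

# Automated crawling: robots.txt applies, so the blocked bot stays out.
print(should_fetch(rules, "ExampleBot", url, user_requested=False, exempt_user_requests=True))
# User pasted the URL: the outcome depends entirely on which policy you accept.
print(should_fetch(rules, "ExampleBot", url, user_requested=True, exempt_user_requests=True))
print(should_fetch(rules, "ExampleBot", url, user_requested=True, exempt_user_requests=False))
```

The same request is allowed or refused depending only on that one flag, which is essentially the point the news outlets and the company disagree on.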
Really interesting. I'm curious what your thoughts are on this, because this has kind of been a concern for a while. Even with image generation AIs, people have
kind of accused them of using copyrighted material, because they're trained on actual artwork that people have made and then generate images on their own, kind of inspired by that artwork. And text generation is the same thing.
Yeah, I'm curious what your thoughts are. Definitely let me know. Don't forget to subscribe. That's it for today's episode. Thank you so much for tuning in. This has been a really good discussion. I know I've certainly learned a lot about artificial intelligence. I'm going to definitely be a little bit more cautious when I use it. Don't forget to subscribe. I will see you in the next episode.