
#48 AI Case Study: AI's Transformative Role in UX with Jascha Goldermann

2023/12/14

Future of UX | Your Design, Tech and User Experience Podcast | AI Design

People
Jascha Goldermann
Topics
Jascha Goldermann: Booking.com has been exploring AI and machine learning for a long time, for example to improve the search experience. The arrival of large language models brought a revolutionary change to how AI can be used in UX design: these models can process, understand, and generate text and are much better at connecting context, which makes an AI assistant feel more like a natural conversation with a human. For the AI Trip Planner project, Booking.com put together a cross-disciplinary team and used user research, market research, and concept testing to identify where AI could genuinely improve the user experience, for example by understanding the context of a user's needs and giving more relevant answers based on the earlier conversation. During the design process, the team had to balance the AI's autonomy against the designers' control: they used prompt engineering to train the AI to handle a wide range of situations and prepared templated responses for sensitive topics. The whole process required designers and engineers to work closely together and keep iterating on the prompts to optimize the user experience. Jascha Goldermann also shared his view on prompt engineering: he thinks it will become an important skill for designers but is unlikely to become a standalone profession. He recommends that designers pay attention to generative AI tools and their limitations, learn how to use them, and follow the latest developments in the AI field.

Patricia Reiners: As the host, Patricia Reiners guides the interview, asks questions, and summarizes and responds to Jascha Goldermann's points to keep the conversation moving. She shows great interest in how AI is applied in UX design and takes an active part in the discussion, for example by asking Jascha Goldermann about the challenges and lessons learned in the project and about the new skills designers need to pick up to keep up with design in the AI era.


Chapters
Jascha Goldermann discusses his role at Booking.com and the introduction of AI into product design, focusing on the AI Trip Planner project.

Transcript


Hello my friends and welcome back to another episode of the future of UX, your go-to place for forward-thinking designers and tech enthusiasts. I'm Patricia Reiners and today I am super happy to have a guest with me who is truly working at the intersection of UX and AI: Jascha Goldermann. Jascha is a design manager at Booking.com, one of the world's leading digital travel companies.

And I met Jascha at a conference in Berlin last month where he talked about a case study that he was working on at Booking.com: an AI travel agent. And I thought, okay, this is super interesting. I definitely need to invite him to the podcast so that he can share his learnings with all of you.

So here he is now. He's going to share his expertise on integrating AI into product design and will provide insights from one of the case studies that he was working on, the AI Trip Planner for Booking.com. Jascha has been instrumental in shifting the focus from UI to AI and has some great tips on prompt designing and responsible AI use. So

Whether you have worked with AI in the past or not, no worries. This is a beginner-friendly episode, so you won't want to miss this deep dive into the future of design and technology. So enjoy this episode with Jascha Goldermann. Perfect. Okay. So welcome, Jascha, to the podcast, The Future of UX. I'm super excited to have you today. So welcome. Welcome.

Thank you so much for having me. I'm also excited to be here. Amazing. So cool to have you. So before we are diving into the topic of AI or UX for AI products, I would like you to introduce yourself, talk a little bit about your background and what are you doing at the moment? Yeah, for sure. So yeah, my name is Jascha Goldermann.

I am now a design manager at Booking.com. So I manage designers in different teams, mostly working on improving the search experience at Booking, so really at the heart of the product.

And I live in Amsterdam. This is where Booking has its headquarters. Before that, I spent the last 10 years in different UX roles, mostly in Berlin, around half of it in leadership UX roles. I've been both a senior designer and a lead designer. I've worked as a head of design before. And I have a background in graphic and visual design.

And yeah, that's me in a nutshell. Wow, cool. Nice. Quite a journey. Super interesting. And what I also find really interesting about you is that you are not only focusing on the search, but also exploring some future topics, especially AI. This is something that you shared at a talk at the Hatch Conference. And tell us a little bit about the project that you worked on that is focusing on AI.

Yeah. So before I get into that project, I just want to note that

Booking has been experimenting with AI and with machine learning capabilities already for many years. And there are many different teams that are using different machine learning models for all kinds of use cases. And so within search alone, you can imagine that we're also exploring different ways of using machine learning to improve

basically how people find the best accommodation and all elements of their trip. But then at the beginning of this year, when within machine learning, the capabilities of large language models became more widely available, more publicly available,

we had discussion within the company of what does this mean for us? Like we already have experience working with machine learning models and we have our whole machine learning organization and we have our machine learning hub. And now there's this new technology that is available to, yeah, even, you know, basically anyone now that significantly, dramatically enhances the capabilities, at least for certain use cases.

So what large language models do is really enable machine learning models to process, understand, and then generate data. So that could be text. Well, with large language models, it's text because it's built to understand language, essentially, and produce language. And so we were looking into what does this mean for

for us, right? There are lots of use cases that this could unlock. And the one that seemed the most straightforward, let's say, was, okay, what if you could actually talk to our booking assistant that already was machine learning enhanced in the same way how you would talk to any travel agent? So really use these capabilities of large language models to have

this online travel agent feel like a conversation with a natural human being. Now, the difference with large language models and normal machine learning models is not just that it can produce language, but it can also really, it's much better at connecting contexts. So for example, if

it has produced a response and then the user responds to this response. Then it will connect sort of the meaning between those responses and it will follow up with sort of an understanding of that meaning. So it will follow up on previous questions and it can, you know, what we would say as humans, sort of read between the lines even. So there's a lot of information that you don't have to specifically

provide, that will sort of be understood. We're almost using these terms in the way we talk about human conversation, right? Like understanding and knowing and thinking and learning. In reality, it's not doing those things, but it's producing an output that is

very similar to how a human would do those things. And so the idea was, okay, let's build an AI trip planner, we called it. So let's build an AI enhanced virtual travel agent. And

Because we had the support from company leadership, and basically we also had a bit of urgency behind the project, we got together as a tiger team. So basically a group of experts from different crafts: we had some machine learning experts, a lot of developers, designers, researchers, writers. And I was leading that UX work stream up to the point of releasing the first MVP of that AI trip planner.

And so that was different to all the other use cases of machine learning models that I've worked with at Booking in the past or that I've implemented for different purposes at Booking because it's really like, it's a massive step in terms of the capability, let's say, that it provides. It's just a game-changing technology, the large language models and the generative AI in general.

And so, yeah, that was how the AI Trip Planner basically came to life. And can you tell us a little bit about the process of starting a project like that, especially from the UX perspective? This is always very interesting, right? Because you need to keep the user in mind and really need to make sure that you're solving the right problems and not just like putting AI on it because you can and because it's fancy.

you need to make sure that it's really solving problems. So how was your process? I mean, the process is similar to how you would approach any

complex project. So like you said, first, you really need to understand the problem space and what are the sort of goals and frustrations of users. We already had a booking assistant, so we could start with that. We could basically look into what was the value that users were already getting out of it and what was missing. And so our research could focus a lot on

this question: with this additional capability, what is the additional value that we can unlock? How can we improve the experience for our users?

And did you already have research that you looked into or did you do any kind of new research sessions? Both. So we had different types of research as well. So on the one hand, we had research looking into concepts, so kind of concept testing with prototypes that were exploring AI-enabled assistants, improved with large language models.

At the same time, we also had market research. So really looking into where other competitors are going or generally the market is going and what kind of value is being unlocked with this kind of technology in general. So some inspiration from that as well. And then, of course, also new research. So as we got together and started collaborating

creating our new concept, we tried, almost in parallel, to validate these kinds of concepts with users as well. Makes total sense. And what do you think were the main problems that you discovered where you thought, okay, here AI can actually bring value to the table?

That's actually helpful. What were some of the biggest pain points from the user, some of the goals that you came up with to solve with AI? Yeah. So if we use AI in the context of conversational experience like we did, then the main enhancement that you get is in the way how it really applies an understanding, let's say, of the real world to the conversation.

So if you look at the previous booking assistant that was already machine learning enhanced, but not with LLM capabilities, what it would do is that you could talk to it in natural language and it would do its best to identify the right keywords to understand basically what you're trying to achieve. And it was trained on machine learning models for specific topics. So a machine learning model, for example, for pets.

So then if you would ask if this hotel that you were interested in or a hotel that you booked does allow pets, then it would understand that because it had a good understanding of that particular topic. But it was very limited in the sense that it was looking for keywords and it was looking for topics that it was trained on and it didn't really keep the context. So if you ask the same question twice, you'll get the same answer twice. But now with large language models,

it basically has an understanding of a much bigger amount of topics because it's trained on really vast amounts of data that was sort of scraped. And

applies knowledge of different topics to whatever you want to talk about. So if you talk about whether this hotel is pet friendly and it gives you a relevant answer, then later if you ask a different question, it's still going to take the understanding that you're traveling with a pet and it's going to carry that understanding throughout the conversation. It's going to keep that understanding alive.
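
A minimal sketch of how that kind of context carrying can look at the API level, assuming the OpenAI chat completions API as the backing model (the model name, prompts, and example messages are illustrative, not Booking.com's implementation): the whole conversation history is re-sent on every turn, so a detail mentioned earlier, like travelling with a pet, can shape a later answer.

```python
# Minimal sketch: carrying conversational context across turns by
# re-sending the full message history on every request.
# Assumes the OpenAI Python SDK; model name and prompts are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

messages = [
    {"role": "system", "content": "You are a helpful travel assistant."},
]

def ask(user_text: str) -> str:
    """Append the user's turn, call the model, and keep its reply in history."""
    messages.append({"role": "user", "content": user_text})
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=messages,    # the whole history, not just the last turn
    )
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    return reply

print(ask("Is the Hotel Amsterdam Central pet friendly?"))
# Because the earlier turn stays in `messages`, the model can infer
# the traveller has a pet without being told again.
print(ask("What should I pack for a weekend there?"))
```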

There are also limits to that, but that in itself, even if you limit that to a session or something, that in itself is like a massive improvement in how much value a user could get out of this conversation. And in general, conversational experiences, conversational UI, already have a specific

value, let's say, that they're trying to unlock. So some people would rather maybe use their voice, for example, to

to plan their travel if they could instead of interacting with UI elements. And there might also be accessibility reasons for this. So maybe they cannot interact with the visual UI, but they could with the voice UI. So for that case alone, if you build in a system that can understand and then also generate language in the same way how a human would,

obviously you're going to have a much more helpful assistant for that specific use case than if it's just trying to fish for certain keywords, for example. Yeah, definitely. I mean, that sounds like a huge improvement.

I haven't tried the AI Assistant yet, but I mean, that's like a huge upgrade, I feel, when you really have a good conversation with this assistant and really book your travels through it. So, I mean, that sounds amazing. But, you know, I'm thinking about it from the UX perspective.

You went through the whole process of thinking about, yeah, I want to integrate the ChatGPT API. I prototype something and then I have the result in the end that I test. What do you think are some learnings that you got out of the process? Because I assume it's very different than designing a basic chat interface, right?

So what are some of the things that you need to keep in mind when you are a UX designer starting a project like that? Yeah, there are a lot of things. So generally with this technology, you're losing a little bit of the control that we as designers are typically very used to. So when we design an experience, typically we know that we basically define how the experience is supposed to be and we have control over it.

But then when you add generative AI technology to your product, you're letting go a little bit of that control because you cannot

100% plan out how it's going to answer a question, for example, if it's a conversational experience. So what you need to do, or what we had to do in our case, was map out situations and, through prompt engineering, try to train the AI on how to deal with those situations while not being 100%

able to define exactly how every single interaction would be. So we did try to identify which topics are so sensitive that we will basically take over and we'll just show a templated message. So we do have responses that we can guarantee will always be 100% exactly what we want the response to be. But we need to trust

that we're able to identify the situations where we need to basically do that. For other situations, if it's a completely valid problem, for example, about what should I pack or what would you recommend me to see when I travel to certain destinations, we need to sort of train the AI to respond in the way that we want it to respond and then trust, almost in a sense, that it applies that kind of training.
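
A simplified sketch of that kind of guardrail, with hypothetical names and wording (this is not Booking.com's actual implementation): sensitive topics get a fixed, hand-written response, and everything else goes to the prompt-engineered model.

```python
# Illustrative guardrail: route sensitive topics to fixed, pre-approved
# templates and let the prompted LLM handle everything else.
# All names and wording here are hypothetical, not Booking.com's code.

TEMPLATED_RESPONSES = {
    "medical": "I can't give medical advice. For health questions, please consult a professional.",
    "visa_legal": "I can't advise on visa or legal matters, but your embassy's website is a good place to start.",
}

SENSITIVE_KEYWORDS = {
    "medical": ["medication", "diagnosis", "symptoms"],
    "visa_legal": ["visa", "work permit", "asylum"],
}

def classify_topic(message: str) -> str | None:
    """Very rough stand-in for a real classifier: keyword matching per topic."""
    lowered = message.lower()
    for topic, keywords in SENSITIVE_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return topic
    return None

def llm_reply(message: str) -> str:
    """Placeholder for the prompt-engineered LLM call (see the API sketch above)."""
    return f"[LLM-generated answer to: {message}]"

def respond(user_message: str) -> str:
    topic = classify_topic(user_message)
    if topic in TEMPLATED_RESPONSES:
        return TEMPLATED_RESPONSES[topic]   # fixed wording the team fully controls
    return llm_reply(user_message)          # generated wording, shaped only by the prompts

print(respond("Do I need a visa to visit Japan?"))
print(respond("What should I pack for a beach holiday?"))
```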

So the benefit that you have with large language models is that it understands and generates language similar to how a human would, but it also comes with the same sort of risk and responsibility: you have to think about it in the way you would think about training a human. So let's say if you hire a travel agent and they have

no understanding of travel and they have no understanding of how to interact with customers, then you wouldn't just let them work at your company and advise travelers. You would first train them.

And in the same way with prompt engineering, we're training this AI on basically providing a context about who are your customers, essentially your users, who are you, what are your values? What is your goal? What is their goal? All of that with ethical considerations, of course. And, you know, you also have to train it in sort of what kind of language do you want to apply? And some of the learnings that we had there is,

the way how you train the AI for prompt engineering, it makes sense to use the same language that you also want the AI to use. So a mistake that we almost made at the beginning, but that I think could be made generally when we try to work with this technology is that we think we're just providing rules. And typically when you provide rules, you're very concise and precise and to the point, and you're basically saying,

do this, don't do this, but that's not how you would naturally talk to an agent that you're training and that's also not really how you're training this AI. So basically in our prompts, which are basically long, long documents that contain all kinds of context for the AI, you have almost like conversations basically saying,

We want you to refer to yourself as this, and we don't want situations like this to happen, like how you would talk. And that kind of language makes it into the way how the AI also talks. So it's not just a matter of saying, don't use complex language. It's not just a matter of saying, don't be too informal or something like that.
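
As a rough illustration of that point, here is what such a prompt could look like when it is written in the same conversational register you want back from the model, rather than as a terse list of rules. The wording below is invented for this example, not Booking.com's actual prompt.

```python
# Invented example, not Booking.com's actual prompt: a system prompt written
# in the conversational, friendly register the assistant should use itself,
# instead of a terse list of "do this / don't do this" rules.
SYSTEM_PROMPT = """
You are a friendly travel assistant for a large accommodation booking site.
We'd like you to refer to yourself as "your trip planner".
Your users are travellers at every stage of planning a trip; your goal is to
help them find accommodation and plan their stay, and their goal is to feel
confident about their booking.
Please keep your answers short, warm, and practical, the way a helpful travel
agent would talk, and when you're not sure about something, say so openly
rather than guessing.
If a traveller asks about medical, legal, or visa matters, gently point them
to an official source instead of advising them yourself.
"""
```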

Yeah. It's going to try to understand, from the way it's being trained, in which way it's also supposed to then produce data. Makes total sense.

I think sometimes it's really difficult to articulate exactly how you want it to turn out in the end. This is, I think, a big challenge. How did you do that? And also, who was responsible for the training part? Were you also involved in training the model, or other people from the team? So how did that go? Yeah. So that's also something

where we kind of had to find out for ourselves what the right process is there. So from the designers and the writers, we already had sort of mapped out what we think the experience should be for different situations, right? So we basically already had guidance for how we wanted the AI to handle certain situations.

But then it was up to the machine learning engineers to actually translate that, let's say, into prompts. Because the prompts live inside the code, at least for now. I mean, this is probably changing already as we speak. And maybe soon we will have tools where you have more like a visual interface to craft those. But when we worked on this, basically the prompts are living inside the code.

And what we had to figure out for ourselves is that, well, the way those prompts are crafted really matters in terms of how it shapes the experience that the users then have with our AI trip planner. So it's not okay to just have engineers craft those prompts. We need to have designers and engineers side by side, looking at those prompts, tweaking them, because that's really how you define the experience.

And the tricky part, though, is, okay, you have a certain expectation that if you write a prompt in a certain way, it will generate a certain kind of output. But at least for now, at this moment, we don't really yet have the tools to test it on the fly.

So it's really difficult to do any kind of prototyping, let's say. Usually as designers, we create an interactive prototype, and it's basically just a user flow that we recreate for users to interact with. But how do you do that?

When in your prototype, the content would depend on what input it receives and might be different every single time, how do you create such a prototype? You don't have those tools yet. So what we had to do was a bit of a combination of trying to figure out how prompts

would likely translate to the content that is generated by the AI, by testing it, for example, with ChatGPT. So we had some primers and prompts that we were also using on ChatGPT to figure out how this would influence the way the model responds.

And then, of course, we had to get to a sort of very, very early rough beta version as soon as possible. So trying to have something where, you know, the UI is completely untouched yet. It's just really to try to see what happens when we tweak the prompts in this way, what kind of output it would generate. So really getting to a very early rough beta version that you can then test and play around with, and through that sort of learn how to improve it.
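
A minimal sketch of that kind of prompt experimentation, assuming the OpenAI chat API as a stand-in test bed (the prompt texts, test messages, and model name are invented): run the same user messages against two prompt variants and compare the outputs side by side.

```python
# Minimal sketch of comparing prompt variants, assuming the OpenAI Python SDK.
# The prompt texts, test messages, and model name are invented for illustration.
from openai import OpenAI

client = OpenAI()

PROMPT_VARIANTS = {
    "terse_rules": "You are a travel assistant. Be brief. Do not discuss visas.",
    "conversational": (
        "You are a friendly travel assistant. We'd like you to answer the way a "
        "helpful travel agent would, and to point people to official sources for "
        "visa questions rather than advising them yourself."
    ),
}

TEST_MESSAGES = [
    "What should I pack for a week in Lisbon in October?",
    "Do I need a visa to travel from Germany to Japan?",
]

# Run every test message against every prompt variant and print the answers,
# so designers and engineers can review the wording differences together.
for variant_name, system_prompt in PROMPT_VARIANTS.items():
    print(f"--- {variant_name} ---")
    for message in TEST_MESSAGES:
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model name
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": message},
            ],
        )
        print(message, "->", response.choices[0].message.content)
```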

And even when we got to the point where we felt, okay, a lot of it is done and finished, the UI is ready and most of the prompts are already designed, we still found, by testing it and interacting with it ourselves, so many cases of, yeah,

situations where we needed to tweak the prompt in order to handle those situations better. So that's another aspect. There are so many scenarios and you can almost not account for everything. You really need to make sure that you test for all kinds of situations and that you think about all the sensitive topics as well. And yeah, it's 100%. That's a big challenge, I can assume. Because I mean, coming up with scenarios in the first place,

can be a challenge, of course, but then also thinking about all the sub-scenarios that come up, all the edge cases, which are not so much edge cases in most of the cases, right? And you mentioned prompt engineering, and I'm really curious to hear your thoughts about that. There is definitely a big discussion at the moment around the topic of prompt engineering. Is this something that will be very important for us as designers to learn, to really deeply understand how it works?

Will this be maybe like a whole job where you can become a prompt engineer or will this be something that's slightly like fading away because we will have more visual interfaces where we just use our natural language? What is your take on that?

It's a bit difficult to say because it's like a fast evolving field. I know that there are already people working as prompt engineers, but I suspect that we will not have this as, I don't think that we will continue having sort of prompt engineers as specific roles, but rather that prompt engineering itself or prompt design itself is a skill that is more

widely expected for different roles to have. So for UX designers to have an understanding of prompt engineering as well as engineers and yeah, of course, like machine learning scientists and so on. So

We will see job descriptions for sure from companies who are trying to add GenAI capabilities into their product, looking for designers who have this experience. And then maybe one of the sort of skills that they're specifically looking for is how have you worked with prompts and how have you worked alongside machine learning engineers and scientists to craft those prompts and to basically craft that experience. Okay.

So this is leading us into a super interesting area, which is also: what kind of skills do UX designers especially need to learn and embrace at the moment to keep up to date with everything that's going on and still be relevant? Because we are living in a fast-paced world. Things are changing so fast. So from your point of view, what do you think is super important for us at the moment?

I think for now it's just a matter of trying to get a good understanding of the different opportunities but also risks that machine learning models provide, trying to get a better understanding of the sort of field of study that AI is.

Even if, like, now when we talk about AI, typically we talk about generative AI, and often we talk about generative AI in the context of text-based GenAI and large language models. So you don't necessarily need to have an understanding of the vast field of study of artificial intelligence. You can very much focus on generative AI, which I think is the field that has the most

applications, the most use cases in consumer products at the moment. So look into GenAI tools and their capabilities and the different models that are being released. And it's not just the models that we know of. There are also open-source models, and this technology is becoming more widely accessible and affordable. So it won't be long until

startups can essentially say, hey, there's an open source model. We're just going to need to hire an ML engineer who understands how to help us apply the capabilities of that model into our product. And so we're going to see these use cases pop up all around us. So it makes sense for designers to sort of

catch up on, yeah, AI and LLM, machine learning models, how they work, what limitations they have, what different models are there and available. That's like the main skill I would focus on. And how do you do that?

So the good thing is all of that is mostly out in the open. Of course, there are like lots of companies that are working on these models and we don't have access to their latest advancements, but we do have access to, you know, very, very advanced versions of different models and we can use

of course the big ones like ChatGPT and Bard and Bing and so on. Then also, once you start researching these tools, once you start looking for large language models and

topics like prompt engineering, you'll find so many resources and playgrounds and even plugins for your design tools. Like even in Figma or in Miro, you will see plugins that you can use to play around with what kind of content you can generate with these models.

Then there's, of course, news about the topic. So we already have tools all around us that are using these capabilities, like Miro and Grammarly and GitHub and all these tools that we might use anyway already incorporate some of these capabilities. So it makes sense for us to at least play around with them.

And there's news that is about it and podcasts. So I think there is no lack of information or accessibility of this information and there is no lack of access to these models themselves and playgrounds to test these models. Mm-hmm.

What I feel is such a big challenge at the moment, and I think this will become even worse in the future, is to really curate that information. Because there is so much out there, the same with podcasts, the same with books. How do you find the perfect fit for your need? I think this is like a million dollar problem.

that, if a company solved it, would be so nice. You have a problem and then you get the perfect book, or the chapter of a book, that explains that certain problem for you in a perfect way. But at the moment, I think a lot of people, and I can definitely include myself, feel really overwhelmed with all the content that's out there and all the things that are shared.

and really to understand what is relevant at the moment. So do you have any specific podcasts that you're following, maybe any newsletters or anything that you really like to read and really enjoy? Yeah, so I completely understand. I completely feel you on this. The information is almost too much to be able to know where to look and what to consume and what to ignore. And it's also that a lot of that information is just

highly situational advice because the field is evolving and there are lots of people who might present themselves as experts in the field and then write an article about it. But maybe

their advice is already obsolete like three months after, right? So that's also something to sort of be mindful of and be aware of. I'm certainly no expert in the vast field of AI. And I think not many people really are because it's very vast and broad.

So I like to, for myself at least, be sort of selective with the information that I consume about this field. I do find that Ben's Bites newsletter is a good resource. It's basically, on a weekly basis, a short overview of the newest or most important

changes in the field of AI. They provide a brief summary and then links so that you can dive into specific topics more. Then there is the Invisible Machines podcast, which has more deep dives into certain aspects of it. So for me, those resources alone are already good to stay up to date on what's going on. And then you kind of need to pick and choose which topics you want to dive deeper into.

Sounds good. Perfect. Always helpful, I think, also for the listeners to check these resources out. So if people want to learn a little bit more about you, where can they find you? What is your favorite social media platform? How can people reach out to you if they have questions or just want to chat? I'm not really so active on social media, so the best way to chat with me is on LinkedIn.

But if anyone wants to dive into any of those topics, they can also reach out on ADP list. So yeah, you are a mentor there, right? Yes. Amazing. We haven't mentioned that yet. That's so cool. So are you still looking for mentees or are all the spots taken? I think I do have spots generally. They might be like in a couple of weeks time from now, but yeah,

You can also use ADP List to send a message and then maybe find if there is another spot available. And so it is something that I try to

prioritize in a sense, and I try to sort of reserve some time of my week each week. Nice, I love that. It's so important also to give something back and to support the community in a way, to share knowledge. So I think that's awesome, and definitely interesting for some of the listeners. So if that's interesting to you, I will link your ADPList profile

in the description box so you can check it out. Okay, Jascha, thank you so much for sharing all the insights with us. I think it was super interesting, super helpful for us to get an understanding of what it looks like when you design AI products or enhance existing products or problems with AI. So thank you so much for sharing that. And yes, thank you for being here. Well, thank you for having me again. It was lovely having this conversation.

And I hope people could find it useful. They will. Thank you. And bye-bye. Bye. Have a good day. Cool. Okay. Thank you so much. I will...