Tech Overflow
We're Tech Overflow, the podcast that explains tech to smart people. Hosted by Hannah Clayton-Langton and Hugh Williams.
Inside Waymo’s Robotaxis with Nick Pelly
A taxi pulls up with no one in the front seat. Would you get in? We invited Waymo director Nick Pelly to take us from that first uncanny moment to the engineering that makes a driverless ride feel calm, confident and, by the data, far safer than most humans behind the wheel.
We walk through the full autonomy stack in plain English: how cameras, radar and LiDAR fuse into a single view of the world; how perception, prediction and planning work together to thread through double‑parked vans, nudge through gridlock and still behave like a good road citizen. Nick explains why Level 4 autonomy is about design domain as much as capability, why hardware still matters, and how redundancy handles blocked sensors, grime or failures without drama. We dig into machine learning at scale, from training on diverse city data to tens of billions of simulated miles, and how teams tune precision and recall so the car avoids both missed hazards and needless hard braking.
Beyond the ride, we zoom out to the business and the city. Phoenix offered a launchpad to build the marketplace, charging and fleet operations; San Francisco demanded handling a busy city and human‑like judgement; London beckons with dense streets and weather. We explore what happens as adoption grows: fewer parking lots, smoother traffic, motorway platoons, even intersections that need fewer lights when vehicles coordinate. Nick also shares his focus area—reliability and freeway fail‑safes—designing for worst‑case scenarios so the system exits danger gracefully at speed; this episode was recorded a week before Nick and the team announced highway driving!
If you’re curious about autonomous vehicles, safety, AI, urban mobility or just want to know what “robotaxi” really means, this conversation turns buzzwords into something you can picture—and maybe soon, ride. Enjoy the episode, then follow and share the show, and leave a quick review to help us bring you an even bigger season two.
Like, Subscribe, and Follow the Tech Overflow Podcast by visiting this link: https://linktr.ee/Techoverflowpodcast
Hello world and welcome to the Tech Overflow Podcast. I'm Hannah Clayton-Langton.
SPEAKER_01:And I'm Hugh Williams, and we're the podcast that explains technical concepts to really smart people. How are you, Hannah?
SPEAKER_00:I am great. I'm on holiday. I am a little bit disappointed to be back recording remotely, but such is life. What about you? Is the jet lag still very prevalent?
SPEAKER_01:Oh, the jet lag's super prevalent. The Melbourne to London, London to Melbourne in a week thing really turns you upside down. We're like the most jet-set podcast ever, because I think you're in upstate New York. Is that right?
SPEAKER_00:I am indeed in New York, recording from my friend's guest room. So yeah, I don't go very many places without my mic these days.
SPEAKER_01:And then when this trip's over, you're done for the year. Are you gonna stay in London?
SPEAKER_00:Oh no, I think we're going to Paris at some point, but that's just on the train. So it doesn't really count, does it?
SPEAKER_01:One of the cool things about Europe, I mean, you can just say stuff like that: I'm popping over to Paris. That's not a concept for Australians. We don't pop anywhere.
SPEAKER_00:It's a good segue into our transport-based topic for today, which is truly accidental, but I'm gonna sort of stretch it to make it work. So who've we got on today, Hugh?
SPEAKER_01:We have Nick Pelly, who is a director of engineering at Waymo. And for those of you who don't know Waymo, Waymo is a company that's very well known to people who live in Phoenix, Arizona, in the US, and to folks who live in San Francisco, California. It's a company that develops driverless cars, so autonomous vehicles. Vehicles you can actually just hop into: they have no driver, no human supervision, and they'll take you, like a robotaxi, from A to B. And Nick's somebody I knew from when I worked at Google, Hannah.
SPEAKER_00:Yeah, I love this phrase robo taxi, which isn't something that I had heard before we started recording. Um I learned a ton, and I think you guys are going to enjoy it too. So we will get right into it.
SPEAKER_01:Yeah, awesome. Let's get Nick on the show. Nick Pelly, welcome to the Tech Overflow podcast. It's great to see you, my friend. Great to see you here. Thanks for hosting me today.
SPEAKER_00:Yeah, we're really excited to have you here. So today we are here to talk about Waymo and autonomous vehicles, and by pure coincidence, since you agreed to do the podcast, Waymo have announced their London launch, and we have a good deal of UK and London-based listeners. So this is really exciting for us.
SPEAKER_02:Yeah, I'm super excited to see Waymo expanding to more cities; we're operating in five cities in the US. And it's quite well known to those who are in those cities and using the service. But to go international, to go to London (I've got family in London too), I'm really excited to bring this technology to more of the world.
SPEAKER_00:I think it's one of those things that it's really hard to imagine until you experience it. So maybe we kick off with just like bringing it to life. Like, what's the customer experience like of getting in a Waymo? Because I have to say, I've been in an autonomous vehicle, which we might talk about later, but I've not been in a Waymo.
SPEAKER_02:Yeah, there's definitely a variety of vehicles with autonomous features out there. But let me describe what Waymo is. It's a robotaxi experience where you pull out your phone, hail a vehicle, much like you would do with Uber. In fact, in some places we partner with Uber. And the vehicle comes to pick you up, but in this case, the vehicle is empty. It's going to pull over on the side of the road with no one inside. You get a private vehicle, a private experience, and it's going to fully autonomously drive you to your destination. So this is a ride-hailing service. It's what in the industry we describe as level four autonomy. If you'd like, later we can get into some of those different levels. Level four meaning there is no human in the vehicle who is ready to take control of the steering wheel. There's no one in the vehicle who needs a driver's license.
SPEAKER_01:I remember when I was at Google, they had those little uh Waymo bubble cars. It looks like things have evolved a lot since then. So, what's the actual hardware these days? What sort of what sort of a car are we hopping into?
SPEAKER_02:The vehicles that we have today, the base vehicle is a Jaguar I-PACE, but it's been significantly retrofitted by us to add all of the sensors that we need for autonomous driving: cameras, radars, lidar, inertial measurement units, IMUs, so that it can see the world very clearly. We will be introducing other vehicles, several that are in testing at this stage, that we hope to bring to the public soon.
SPEAKER_00:And if I'm to put my product management hat on for a minute, can we start with like the why? So, what is the promise of these driverless cars? Why do we need them other than being able to play music and not having to speak to the Uber driver when we're going from A to B?
SPEAKER_02:The most noticeable thing for the user is you get a private experience, and that's a really big win. But what we think a lot about is the 40,000 road deaths just in the US, 40,000 road deaths per year, that are completely avoidable. And the safety benefits that an autonomous vehicle brings are quite dramatic. We've now driven over a hundred million fully autonomous miles. And we've looked back at that data, and we have between five and ten times fewer accidents than human drivers who are driving in the same geographies. The safety benefit is really quite stark, as well as the convenience that you mentioned.
SPEAKER_00:That's really interesting, because whenever I mention autonomous vehicles to anyone I know, whether it's friends or family, admittedly no one's really in the industry, but their first hesitation always comes down to safety. So maybe later in the episode we can get more into why it's actually safer, and proving itself to be, because it's very counterintuitive, at least if you don't understand the tech behind it and why it might be superior to human drivers.
SPEAKER_02:Yeah, I think that's many people's reactions. But what we observe is even after the first ride, people can appreciate the safety very, very quickly because within a few minutes of driving, you will see how confident the vehicle is and it's seeing things, it doesn't get distracted, right? It's looking 360 degrees all the time. And folks report feeling that confidence even within the first few minutes of riding.
SPEAKER_00:Interesting. So I think I mentioned just before that I've not been in a Waymo, but if we can cast our minds back to like 2015, I went in an autonomous vehicle in Abu Dhabi, in this place called Masdar City. And I remember having quite significant expectations of what a driverless vehicle would be like, because my parents were living out there and they said, oh, you know, when you come and visit, we'll take you in these super cool autonomous vehicles. And when I went and actually rode in one, it was kind of more like a little shuttle, rather than a vehicle that actually was a car without a driver. So is that like the early iteration of the technology? Was this sort of shuttle service? And now has it evolved past that?
SPEAKER_02:Yeah, maybe let me lay out the capabilities a little bit. Um, I think this shuttle service, did it have a driver inside?
SPEAKER_00:There was no driver, no.
SPEAKER_02:Okay. So the SAE has levels of autonomy for vehicles, from...
SPEAKER_00:What's the SAE? Sorry, Nick.
SPEAKER_02:The Society of Automotive Engineers.
SPEAKER_00:Okay.
SPEAKER_02:So what you experienced there was a level four autonomous vehicle, and that's the same as what Waymo produces: level four. Level zero is a manually driven vehicle. Levels one and two are vehicles that have some autonomous features that require constant human supervision, and this is actually very common in any vehicle that you buy today. This is things like cruise control, lane keep assist. They can significantly reduce the load on the driver because many tasks are being assisted by the vehicle, but the driver has to fully supervise and be prepared to take over at any time. Then you have level three.
SPEAKER_01:Nick, sorry, with level two, you've got things like adaptive cruise control and the lane guidance and things. Does Tesla's full self-driving mode kind of fit into level two? So really you've got a human there who's supervising whatever the car's capable of?
SPEAKER_02:Yep, exactly. I'd say Tesla's FSD is a very capable level two. It actually provides pretty significant assistance, but it's still in the level two category. And that's where there's another dimension that I was going to get to in a moment. You've got the levels of autonomy, which is really about how much supervision is needed. And then you've got the operational design domain; the ODD is the terminology we use. For those autonomy features, just how much are they capable of? So the shuttle that you were referencing, Hannah, in Abu Dhabi was level four, but with a very restricted operational design domain. It's probably just doing point-to-point service. I imagine this is an environment that is not dense with people, with other traffic, and it's probably limited to a modest top speed.
SPEAKER_00:Exactly. So if I remember correctly, Masdar City was almost like a campus, and you actually parked your car outside and went in. The autonomous vehicle didn't even look or feel like a car when you rode in it. It was kind of like this little pod. And there were no other cars on the road. There might have been a few people walking around; I actually can't remember, because it was a while ago. But it wasn't replicating the typical driving experience, which is interesting, because it's actually the same level as Waymo, as you've just broken out.
SPEAKER_02:That's right. It's level four, but with a more constrained ODD. And this begins to get to the heart of what is so hard about building an autonomous system. You can get to level four on a constrained ODD by just being very cautious, limiting the top speed. And I'd imagine that vehicle would react pretty quickly to any object in its presence, right? If it sees anything, it could be a pedestrian, it would just come to a stop and not make forward progress until it was quite sure the path is clear. And that works okay in environments that aren't that dense, that are relatively structured. That's not going to work if you put it into the middle of a busy city like San Francisco or London, because it's not going to make forward progress. There's going to be too much activity, too much going on. It can't reason about this environment in a sophisticated enough way to both be safe and to make forward progress in the environment.
SPEAKER_00:Okay. So in order to make forward progress safely, the self-driving system, I guess, has to replicate what a driver would do, or it sounds like maybe even perform those actions better, which involves perceiving what's around it and making decisions as a result of that. So how does that all happen? You mentioned LiDAR, and I'm sure there's a bunch of other stuff. Could you, at a high level, talk the listeners through how that all hangs together?
SPEAKER_02:Yeah, let me give a high-level overview of a typical autonomous vehicle driving stack, and then we can come back and dig into why it's so hard to be both safe and for the vehicle to make progress in complicated situations. So at the high level, you've got all the sensors mentioned earlier: cameras, radars, lidars, and audio; we also have microphones.
SPEAKER_00:Can I ask you a quick question? Sorry to jump in. The difference between a radar and a lidar, could you just talk the listeners through that?
SPEAKER_02:Radars work by sending out radio-frequency pulses and receiving the returns. They are fantastic for looking for objects that are reflective at RF frequencies, which is typically metallic objects, and they give you a very good speed indication; you get this Doppler return. That's why police use radar speed guns. Lidars send out light pulses instead. Lidars are the rotating objects that you see on top of the Waymo vehicles; they really stand out visually. They're very sophisticated pieces of equipment that send out laser pulses and then receive the returns, and they give a very accurate measurement of distance to an object.
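As a back-of-envelope illustration of the physics Nick describes (the constants, helper names, and example numbers here are our own, not Waymo's), a lidar return time gives distance directly, and a radar Doppler shift gives radial speed:

```python
# Toy time-of-flight and Doppler calculations for lidar and radar.
C = 299_792_458.0  # speed of light, m/s

def lidar_distance(round_trip_seconds: float) -> float:
    """A lidar pulse travels out and back, so distance is half the round trip."""
    return C * round_trip_seconds / 2.0

def radar_speed(doppler_shift_hz: float, carrier_hz: float) -> float:
    """Radial speed of a reflector, from the Doppler shift of the radar return."""
    return doppler_shift_hz * C / (2.0 * carrier_hz)

# A return arriving 200 ns later corresponds to a target about 30 m away.
print(round(lidar_distance(200e-9), 2))
# A 77 GHz automotive-band radar seeing a ~5.13 kHz shift: roughly 10 m/s closing speed.
print(round(radar_speed(5130.0, 77e9), 2))
```

The division by two in both formulas reflects the round trip: the pulse travels to the object and back.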
SPEAKER_00:So is there like an interplay there between the lidar knowing the distance and the radar knowing the speed? And then like somewhere in the system, a calculation of how likely a collision is.
SPEAKER_02:Yes, and a simplified version of how this works, and maybe how things worked 10 years ago, is you would use your lidars to figure out that there is some object in front of you. You would then go to the cameras, which are very good, they're very high resolution, and they can help identify what the object is. And then the radar provides the additional: okay, how fast is that object moving? Now, that's a very simplified version of how things work. What happens in the more modern systems is that all these data points are fused together. So you could imagine you have all of this rich information coming from lidars, radars, cameras, and very early in the pipeline, before you've even determined that there is an object, you're actually fusing all the information together: the strengths of radar, of lidar, of camera. Modern AI allows you to really blend those very early in the pipeline. And that's at the perception phase, which is the next stage of the pipeline after the sensors. Perception is consuming this very high-bandwidth data from the cameras, radars, and lidars, and forming a view, a semantic view of the world. What does the roadway look like? What are the vehicles? What are their speeds? What are their directions of travel? Pedestrians, other agents, cyclists, people on scooters, people on bicycles, static objects. So it's building a semantic representation of the world. From perception, we then go to prediction, because you need to then play that representation forwards. Okay, how do we think the vehicles are going to move? How do we think the other agents in the scene, the humans, bicycles, how do we think they're all going to move as time progresses? The next major stage is planning. So, given that view of the world, given our predictions of how things are going to move, what should our vehicle do? And the goal of planning is to generate a trajectory.
Now, internally, you'll generate a whole bunch of candidate trajectories and you'll score them, but ultimately you want to decide on one trajectory to follow. So that is a description of your path through space. Then you go to motion control, which is turning these trajectories into commands to the braking, to the steering, to actually get the vehicle to execute along that path. That's the high-level pipeline. I have very much simplified this, and those in the field will know there's a lot more interplay between these systems than I've described. For example, you can't actually predict the world without knowing what you're going to do, because other agents are going to respond to the path that you plan. So we're seeing a lot more blending of these systems, but I think it's a good high-level way to understand how we go from these sensors, which are basically receiving photons from the world, to actuation of braking and steering.
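The stages Nick walks through, perception feeding prediction feeding planning, can be caricatured in a few lines. Everything below, the agent model, the constant-velocity prediction, the safe-gap scoring rule, is an invented illustration of the shape of the pipeline, not Waymo's actual logic:

```python
# Toy perception -> prediction -> planning loop: score candidate speeds
# against predicted agent positions and pick the fastest safe one.
from dataclasses import dataclass

@dataclass
class Agent:
    x: float      # metres ahead of our vehicle (a perception output)
    speed: float  # m/s, positive = moving away from us

def predict(agent: Agent, horizon_s: float) -> float:
    """Constant-velocity prediction: where will the agent be at the horizon?"""
    return agent.x + agent.speed * horizon_s

def plan(agents: list[Agent], candidate_speeds: list[float], horizon_s: float = 3.0) -> float:
    """Pick the fastest candidate speed that keeps a safe gap at the horizon."""
    safe_gap = 10.0  # metres; an illustrative buffer, not a real parameter
    def is_safe(v: float) -> bool:
        our_x = v * horizon_s
        return all(predict(a, horizon_s) - our_x >= safe_gap for a in agents)
    safe = [v for v in candidate_speeds if is_safe(v)]
    return max(safe) if safe else 0.0  # no safe candidate: stop

# A slower car 30 m ahead: the planner rejects the fastest candidate.
print(plan([Agent(x=30.0, speed=5.0)], candidate_speeds=[0.0, 5.0, 10.0, 15.0]))
```

Note how prediction and planning interlock exactly as described: each candidate trajectory is scored against where the other agents will be, not where they are now. The real interplay Nick mentions (agents reacting to your plan) is absent from this sketch.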
SPEAKER_01:And how much, Nick, of this is machine learned, is models? We've just been talking about that in the last couple of episodes. But how much is sort of handwritten, sort of symbolic reasoning, and how much is models that you learn from data?
SPEAKER_02:Yeah, these days there is a lot of machine learning and artificial intelligence in all these stages of the system, and in blending together these stages of the system. Now, that has evolved significantly over the 10-plus years that Waymo has been working at this problem. It's been really accelerated by advances in AI, and by moving more of these systems to data-driven, machine-learned approaches.
SPEAKER_00:And Nick, would you train the software or the systems behind it differently for like London versus Phoenix? Because Phoenix was this test case city, right? Because it was super optimized for autonomous vehicles, or do you just train it with information about everything that you can?
SPEAKER_02:Yeah, that's a fantastic question. These models do tend to generalize very well over different cities. These artificial intelligence approaches, these machine-learned approaches, tend to generalize well as long as you have generalized data that you're feeding into them. And this is actually one of the big advantages of them over the more legacy approach, which is a lot of handcrafted software. That tends to be a disaster when you want to go from one city to the next city. Because, if anyone's somewhat familiar with programming, one of the basic constructs is an if statement: if this condition, then do this; if that condition, then do that. And that works fine for very simple scenarios. Maybe you could craft something that could work in some simple driving environment that way, but then it doesn't generalize: the moment you see something new, you have to revisit all of those if statements. There's just too many permutations of complexity.
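Nick's if-statement point can be made concrete with a caricature. The rules below are invented for illustration; the point is that every input the authors didn't anticipate falls through to a default, and each new city multiplies the unanticipated cases:

```python
# A caricature of the handcrafted, rule-based approach that learned
# models replaced. Rules and action names are invented for illustration.
def rule_based_action(obstacle: str) -> str:
    if obstacle == "pedestrian":
        return "stop"
    if obstacle == "traffic_cone":
        return "nudge_around"
    if obstacle == "double_parked_van":
        return "overtake"
    # Anything unanticipated falls through to an overly cautious default,
    # and fixing it means revisiting every rule above.
    return "stop_and_wait"

print(rule_based_action("pedestrian"))     # handled by an explicit rule
print(rule_based_action("escaped_llama"))  # unanticipated: falls to the default
```

A learned model, by contrast, is shaped by its training data rather than by an enumeration of cases, which is why generalized data generalizes the behaviour.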
SPEAKER_00:This is an awesome callback, Hugh, to the AI series that we've just run through, because we talked about the limitations of the human laying out all of the different scenarios and parameters that you want to make a decision on. And actually, at some point you just hand it over to the technology and say: you're going to figure this out much better than we could.
SPEAKER_02:Yeah. And you hand it over to these models, right? Where you're fundamentally changing your programming paradigm from I'm going to specify, you know, what to do in every exact circumstance to a machine learned model where it's all about training it with high-quality data sets and then evaluating the outputs of those models in a way that is well correlated with good driving. Okay. So this brings its own challenges. You know, you need significant data. You need to be able to evaluate how well it is driving, but it allows you to generalize, right? And it's what's going to enable us to bring Waymo to cities like London and other places across the world without that much incremental work.
SPEAKER_01:So you've got all this data from all the sensors on the device, and you're recording that, so you've got an enormous amount of training data. But what role do humans play in that loop? Are you labeling outcomes as good driving and bad driving? I mean, how does the human play a role in that case?
SPEAKER_02:Yeah, that's a really good question. We do need humans to do some of the evaluation, and to label examples of good driving, bad driving, or to label examples of a scene representation. If perception outputs a particular detection of a vehicle, you want a human to double-check: was that correct? Or maybe do some of the labeling in the first place to bootstrap the training. Now, what you quickly find is that doesn't scale, where you have humans doing every single eval and labeling task. I mentioned we do 100 million miles of real-world driving; in simulation, we've done tens of billions of miles. So what you end up doing is building a lot of software to do more of this labeling task automatically and more of the evaluation task automatically. Now, humans still play a role to quality-check those algorithms and models that are doing the evaluation. And so what's really interesting here is there's both the models that are running on the vehicle to actually generate a trajectory and drive, and then there's what we're doing off the vehicle, in simulation and also in post-processing of real miles, to evaluate what is good driving, to critique our driving. And there's models involved there as well, you're right. In fact, even more powerful models to critique the drive, because you have the luxury of time; you're no longer latency sensitive. The whole stack is really moving quickly to these machine-learned approaches. Humans are needed to label and to evaluate, but you try and automate that as much as you can as well.
SPEAKER_00:And so is this a fair metaphor? Like when I think about what makes a good driver or someone that I'd feel safe driving with, one of the things is like they have a lot of miles under their belt, they've got lots of experience on the road. And with all of these real driving hours and billions of simulated driving hours, we're basically giving the system more experience driving than like any single human could obviously ever have in their entire life. And that's what makes it safe or safer.
SPEAKER_02:Yes. I believe the average human does what, like one million miles of driving across their lifetime. So we're over a hundred humans' worth of accumulated driving. And then it's also what I mentioned earlier, which is the computers don't get distracted. They never get tired, they never fall asleep, they're never looking at their phones, they're paying attention all the time, and then they have the benefit of this huge accumulation of driving data to train on.
SPEAKER_00:Forgive me as a layperson, and a layperson who doesn't really drive because I live in London. But I feel like my biggest takeaway so far is that I had always thought that with autonomous vehicles, it's the hardware that's really impressive, or it's like a big hardware revolution. And I'm sure there's some very impressive hardware in there, but it's sounding like it's a huge software-driven product, and it's all the smarts in the system that's really the revolution here.
SPEAKER_02:Yeah, it's definitely both. We have some quite brilliant hardware engineers: optical, mechanical, electrical, manufacturing. And what the hardware does is it makes the software problem easier. For example, with LiDAR, you can get a precise distance to a particular object, down to millimeter resolution. With camera, you could get an estimate of distance, but you have to infer it. You'd have to see successive frames of motion, or have multiple camera images that you're doing stereo processing on. You can use techniques to infer, but it's going to come with higher errors, and that makes the software problem more challenging. So hardware has really helped make the software problem more tractable, because this is really a tough engineering challenge. I've been at it for eight years. Waymo, I think, has been at it for 15 years or so now. And the industry has been talking about doing autonomous driving since the 1980s, actually; there were some initial efforts. So it's an incredibly tough challenge. So anytime that you can make things a bit easier in hardware, that's often a good choice. And over time, we expect the software to be able to reduce the constraints on hardware, and then allow us to down-cost the vehicles and simplify things while retaining the safety and performance.
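The contrast Nick draws between measured and inferred depth can be seen in the standard stereo triangulation formula, depth = focal length x baseline / disparity. The camera parameters below are assumed illustrative values, not anything from Waymo's hardware:

```python
# Depth from a stereo camera pair via triangulation: z = f * B / d.
# Small disparity errors produce large depth errors at long range,
# which is why camera-inferred distance is noisier than a lidar return.
def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    return focal_px * baseline_m / disparity_px

f, B = 1000.0, 0.3  # assumed: 1000 px focal length, 30 cm camera baseline

# 10 px of disparity puts the object at 30 m.
print(stereo_depth(f, B, 10.0))
# A single pixel of disparity error at that range shifts the estimate by metres:
print(round(stereo_depth(f, B, 9.0) - stereo_depth(f, B, 10.0), 2))
```

Because depth varies with the reciprocal of disparity, the same one-pixel error that is negligible up close becomes a multi-metre uncertainty at range, exactly the kind of error a direct lidar measurement avoids.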
SPEAKER_00:That makes sense. So it sounds like you must have a huge wealth of types of engineers that work for Waymo, each with their own individual craft. And if everything is optimized to be as high-tech or as cutting edge as it can be, there's a holistic benefit to the product.
SPEAKER_02:We do, and I absolutely love it. I've always been a software engineer who works close to the hardware. I worked on Android for a long time. And I love building something that you can actually touch. These are robots, robotic systems that are extraordinarily complicated. I love visiting the manufacturing lines and seeing the work that we're doing. We have materials engineers, we have tire experts, we have battery experts. And the things you can learn, about how we can optimize charging, for instance, by discharging and recharging at the optimal rates. I love being a software engineer working amongst all of these hardware disciplines.
SPEAKER_00:Yeah, that sounds pretty awesome. I'm really outing myself here as someone who doesn't know much about cars, but if you've got all these sensors and radars and cameras, what happens if the car gets dirty, or a bird poops on it? Sorry, that's a really gross example, but I'm just thinking about it. Does the LiDAR get affected by that? The camera must be, right? So, how does that all get managed in practical terms?
SPEAKER_02:Yeah, these are great questions. We're driving at such scale that this is happening not just every day, but every hour. You can imagine a plastic bag is blown up off the road and covers the aperture of a camera. So these things are very real; they do happen. We have redundancy, is the big answer. Any one sensor can completely fail. And usually the way it fails is not a hardware failure; it's usually environmental, as you suggested. It's bird poo on the aperture. But we have enough redundancy that we can keep driving even with one or more sensors dropping out. And then what we do is adjust to those conditions. So we may drive at a lower speed. If it's severe enough, we may return to base as quickly as possible. In some cases, we'll have to pull over to the side of the road. But we have redundancy to be fault-operational for all these sorts of things. This is part of the challenge of deploying these systems at scale: you need systems that keep driving across a huge range of faults.
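The graceful-degradation policy Nick sketches, keep driving, slow down, return to base, or pull over depending on severity, can be expressed as a tiny state table. The sensor names, thresholds, and mode names below are our own illustration, not Waymo's actual fault logic:

```python
# Toy fault-response policy: map the set of failed sensing modalities
# to a driving mode, degrading gracefully as redundancy is lost.
def degraded_mode(failed_sensors: set[str]) -> str:
    critical = {"lidar", "camera", "radar"}
    failures = failed_sensors & critical
    if not failures:
        return "normal_driving"
    if len(failures) == 1:
        # Redundant modalities still cover the scene: continue, more cautiously.
        return "reduced_speed"
    if len(failures) == 2:
        return "return_to_base"
    return "pull_over"  # insufficient sensing to continue safely

print(degraded_mode(set()))
print(degraded_mode({"camera"}))           # e.g. a bag over one aperture
print(degraded_mode({"camera", "radar"}))
```

The design point is that no single environmental failure maps to "stop in the road"; each loss of coverage buys a proportionate, pre-planned reduction in capability.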
SPEAKER_01:Can I take us back to the choice of Phoenix and the choice of San Francisco? Seems like a good time to talk about the environment. So, what made Phoenix a great test case for Waymo, and then why did you choose San Francisco next?
SPEAKER_02:So the Phoenix choice predated me joining Waymo, but I imagine it was seen as a sweet spot: okay, there are some challenges there, there's a reasonable business there, but it's also achievable. This is back around 2015, 2016, when I imagine these decisions were made, because it's a suburban layout. It doesn't have quite the challenges of a New York or a San Francisco in terms of density of pedestrians and agents. So that was a great place to get a system deployed, and also to build all of the surrounding pieces that you need. This isn't just a self-driving vehicle; there's an application, much like the Uber app, to hail vehicles. There's an operations component to it, to charge the vehicles, clean them. So there are all these other components to building a ride-hailing ecosystem that Phoenix gave us an opportunity to really develop. But then San Francisco was a significant jump, and it was very deliberate to go there next, to pick a city that's significantly more challenging. In fact, it's really a step function above Phoenix. In suburbs, you can kind of drive on rails, and if someone's encroaching on the shoulder, you sort of nudge around them, but for the most part, you're driving within the marked lanes. As soon as you go to a dense city, London, San Francisco, you very frequently need to generate quite creative routes amongst the traffic. You'll encounter double-parked vehicles, and you have to decide: am I going to try and overtake it? What route will I pick to overtake it without causing gridlock? Or you'll be at an intersection which is completely in gridlock, and you have a short green, and you have to decide: am I going to try and push across the intersection to the other side, or am I going to just stick here? You really have to behave a lot more human-like.
So this was a significant challenge and really required bringing a lot more AI and machine learning to the planning side of things.
SPEAKER_00:Are there cities, Nick, that you just can't see Waymo ever wanting to try and tackle? Like I'm thinking Manila or Mumbai, where it's just pretty chaotic and unexpected things happen, and maybe the weather is a bit trickier. What's the view on that?
SPEAKER_02:At this stage, no, I don't think any city is off limits, because of the advances in using AI in planning; we're able to navigate these very complicated scenes. Now, there'd be more training required for a city like Manila. There's just more diversity of objects on the road, a different style of driving that's required. There'd probably be some different evaluation needed to score what is good driving and what is bad driving. But that's an achievable goal now. I was going to just mention weather as well. It's notable that right now the cities we drive in don't have a lot of snow and ice. But we have announced that we are launching in cities that do have winter snow and ice. That is a problem we've known has been on the horizon, and now we have some of the right techniques. Software is very much involved, but when it comes to weather, hardware is also a big part of the solution.
SPEAKER_00:That's pretty awesome. And I guess, without stating the obvious, there's what is technically possible, and it sounds like there's a solve for many different things. But then there's the commercial attraction of a given city: where do you think you'll resonate best with consumers, that sort of thing? And that must play into how you choose which city Waymo launches in next.
SPEAKER_02:Yeah, I mean, fundamentally, we're a business that has spent a fair bit on R&D, and we need to make that money back and make it a viable business, because we can't scale unless it's a viable business. So as a ride-hailing business, you look at the markets where other ride-hailing companies are making a lot of their money, and those are typically great markets for Waymo as well.
SPEAKER_00:Is there anything that's much more difficult to solve for than we as lay people might think? Because I've taken away that safety is actually much easier than the average person off the street would think. But is there anything that's sort of the inverse? Like, you'd think it should be super easy and it turns out to be hard?
SPEAKER_02:Safety is easy if that's all you cared about, because you would just park the vehicle and it wouldn't move, or it would creep forward at a very slow speed and stop if a butterfly was seen at 100 meters. What's hard is to solve for safety in combination with behaving human-like, being a predictable road user, and getting you to your destination in a reasonable time. What you end up doing, as you zoom into the different models involved, is often making trades on what we call the precision-recall curve. Let's take a simple example. Say we're looking at the perception system, and it's trying to make sure it's detecting traffic cones placed in front of it. Precision means that if it thinks it sees a cone, then it's really a cone: it's not a tree, it's not some other object. It's precise that that is definitely a cone. Recall means you didn't miss a cone: if there was a cone in front of you and the perception system fails to detect it, that's what we call a false negative. Trying to tune these systems to produce precision-recall curves where we can pick a good operating point, one that lets us detect what's in front without ever missing anything, and never get confused into thinking there's something there that isn't, which would stop us making forward progress, is extraordinarily hard. The cone example is fairly simple, but there are much more subtle cases: vegetation, debris on the roads, steam coming out of subway vents, where it's very hard, at scale, to get the right trade between making sure you always see something you should stop for and not unnecessarily stopping for something you should proceed through. And when you get the trade wrong, it could be dangerous in both directions.
If you don't see something, you could have a collision. If you think you see something that wasn't really there, you hit the brakes too hard and then you have a tailgater collision. So it's dangerous either way.
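The precision-recall trade-off Nick describes can be made concrete with a toy threshold sweep. This is just an illustrative sketch, not Waymo's evaluation pipeline: the detector scores and labels below are invented, and a real perception stack would evaluate enormous logged datasets.

```python
# Toy illustration of tuning a detector's operating point on a
# precision-recall curve. Scores and labels are made up.

def precision_recall(scores, labels, threshold):
    """Classify each confidence score against a threshold.
    Precision: of everything flagged as a cone, how much was real.
    Recall: of all real cones, how many were flagged."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall

# Hypothetical detector confidences; label 1 = a real cone, 0 = not a cone.
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2]
labels = [1,    1,   0,   1,   1,   0,   0,   1]

# Lowering the threshold raises recall (fewer missed cones) but lowers
# precision (more phantom cones, i.e. needless hard braking). Picking
# the operating point on this curve is the hard trade-off.
for t in (0.85, 0.5, 0.25):
    p, r = precision_recall(scores, labels, t)
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")
```

With this toy data, the strict threshold (0.85) gives perfect precision but misses most cones, while the loose one (0.25) finds nearly everything at the cost of several false alarms; the middle point balances the two, which is exactly the kind of choice Nick says is dangerous to get wrong in either direction.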
SPEAKER_01:And is it fair, Nick, to talk about recall and precision in one way, or does it vary across the systems? Because Google search, obviously, is a high-precision system, right? You run a query and you just want great answers; you don't necessarily want all the answers. I imagine Waymo could be a high-recall system, in that you'd better see all the objects, but if you think a cone is a small tree, is that okay? Or are there different types of curves across the whole system? How does it actually work?
SPEAKER_02:So definitely the optimal places on those curves, and the targets for your level of recall and your level of precision, vary by subsystem. I gave an example on perception, but we have similar curves looking at the output of the planner, for example. And we're obviously far more sensitive to missing a human than we would be to missing a traffic cone. So this is not a single global optimization; this is a huge number of metrics and scenarios that we're having to reason about.
SPEAKER_00:The example that gets cited, or a version of it, when you talk to people off the street about autonomous vehicles is: oh well, what if it has to choose between hitting an old lady and hitting a parent with a child? I think you're going to tell me that that's not how it works, but I know it will be front of mind for listeners.
SPEAKER_02:Yeah, this is the classic framing where there's an implication that there's a moral judgment being made by the programmer, and then a question of, well, who's making that moral judgment? But that's not actually how the system gets built. I described earlier using if statements to build a self-driving system, and maybe in that kind of construction you would have such a determination being made.
SPEAKER_00:Right. Because you could say, if it's small, avoid it, to avoid children. But then, I don't know, you're biased against taller people, which is problematic in different ways, right?
SPEAKER_02:Right, right. So how these systems are actually built is, as I mentioned, with models that are trained on large amounts of data. It's really about evaluating the best outcomes given different data sets. And the artificial example of "do I drive into the old lady or do I drive into the small child?" is a contrived example that doesn't really play out in a data set. What you see in a data set is a rich scene with many other options available, and the evaluation would prioritize avoiding both of them. It's an overly simplified scene that doesn't really reflect the data we train on; there are many other options available for evasive maneuvers.
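One way to picture the difference Nick draws between hand-coded "if" rules and data-driven planning is cost-based trajectory selection: the planner scores many candidate trajectories and picks the lowest-cost one, so avoiding everyone falls out of the scoring rather than an explicit moral rule. The sketch below is entirely invented for illustration: the 1-D scene, the candidate names, and the cost weights are assumptions, not Waymo's planner.

```python
# Toy sketch of cost-based trajectory selection, as opposed to
# hand-written "if" rules. Scene, candidates, and weights are invented.

def trajectory_cost(traj, obstacles, min_clearance=1.0):
    """Lower cost is better. Passing too close to any obstacle is
    heavily penalized; among safe paths, prefer smooth, short ones."""
    cost = 0.0
    for point in traj:
        for obs in obstacles:
            d = abs(point - obs)
            if d < min_clearance:
                cost += 1000.0   # near-collision: effectively rules this path out
            else:
                cost += 1.0 / d  # soft penalty for getting close
    cost += len(traj) * 0.1      # mild preference for shorter paths
    return cost

# 1-D toy scene: two obstacles; candidate trajectories are
# sequences of lateral positions over time.
obstacles = [3.0, 7.0]
candidates = {
    "swerve_left":  [0.0, 1.0, 1.5, 1.0],
    "straight":     [0.0, 3.0, 5.0, 7.0],  # drives through both obstacles
    "swerve_right": [0.0, 5.0, 5.0, 5.0],  # threads between them, closely
}

# The planner simply picks the minimum-cost candidate; no rule ever
# ranks one obstacle's "worth" against another's.
best = min(candidates, key=lambda name: trajectory_cost(candidates[name], obstacles))
print("chosen:", best)
```

In this toy scene the "straight" path incurs huge near-collision penalties and is never chosen; the evasive maneuver that keeps the most clearance wins, which mirrors Nick's point that real scenes offer many options and the evaluation prioritizes avoiding everything.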
SPEAKER_00:Okay, so Nick, before we wrap up the technical section of this episode, I like to ask: is there anything cool or particularly insightful or interesting for the listeners that I've not asked? I wouldn't want to limit us to the question list I came up with before the episode if there's something interesting beyond it.
SPEAKER_02:Yeah, to me this is just one of the most interesting engineering challenges of our time, because of the complexity but also the breadth of engineering involved. I've touched on some of the hardware side of things, but on the software side we have real-time safety-critical systems on the vehicle, as well as rider experience and user interface systems in the vehicle. Then we have the mobile app, and we have a lot happening in the cloud. There's the ride-hailing system, which you can think of as matching demand and supply and running an efficient marketplace, much like other ride-hailing companies do, but then also the simulation systems, the log replay, and the ability to visualize and play back what's happened in the field. There's such a rich ecosystem of tooling and infrastructure off the vehicle as well. I don't know if I've ever worked on a project with such a span of different software systems. It brings every single software discipline together, as well as many different hardware disciplines. For those thinking about next steps in their career, I think autonomous vehicles are a fascinating one. And although we're at this inflection point where the technology is starting to work really well and starting to scale, there's still so much left ahead. This reminds me of working on Android in about 2009, when it was just starting to hockey-stick; we were just starting to see the sales really follow exponential curves. And I remember at the time people on the team thinking: oh, are we done? We've kind of built most of this now, and it's working, but are we done? But no, there's always more to do. There's always more to make it work in more environments, to improve the user experience, to make the system more efficient and bring costs down.
You know, I think we've actually only just begun on this journey, and there's so much more interesting work ahead.
SPEAKER_00:It feels cutting-edge, but then if we think about what Android probably looked like in 2009... oh my goodness, isn't it great that we didn't stop there?
SPEAKER_02:I think the Android team in 2009 was maybe around 200 people, and that team now is in the tens of thousands. So the bulk of the work is actually yet to come. We're moving on from the phase where it's a bit more R&D, a bit more "will this work?", which to me is super exciting. Yeah, I like working on things where there's a chance it won't work. But now we're transitioning from that into: it does work, and now how good can we make it, and how far can we scale it?
SPEAKER_01:This is a very foreign concept, I think, to Australians. I don't think there are too many Australians who have driven in a Waymo or any autonomous vehicle, but I imagine that's not the case in Phoenix, and not the case in San Francisco. Those populations must have really adapted to Waymo. So maybe you can tell us a little bit more about that.
SPEAKER_02:Yeah, I think we're at this really interesting point where, in the cities where we have high density, which is San Francisco and Phoenix, this is a mainstream product. If you talk to people in those cities, they know Waymo, they use Waymo, many people use it very regularly; it's really ingrained in their lives. Yet for the rest of the world it probably seems like a foreign concept, something they read about that doesn't seem that real to them. That is going to change rapidly over the next one, two, and three years as this rolls out to more and more major cities, as we've announced with London. So I think this is such an interesting moment in time, where it's mainstream to a small number of people but will very quickly become well known. I think in five years' time everyone will be looking back and won't remember a time before we had autonomous vehicles, just as the smartphone today is ubiquitous.
SPEAKER_01:Do you think cities change when that happens? I can imagine, in the limit, there are no cars operated by humans, or all cars are driverless. I guess that'll probably never happen completely, but in the limit, does this change congestion, city design, those kinds of things?
SPEAKER_02:Yeah, it absolutely will. Smartphones were able to roll out very quickly because they're easier to manufacture; this is going to take a little longer for all the impacts to play out, because we're talking about city design, we're talking about manufacturing much larger objects, and we're talking about a safety-critical system that has a lot of engagement with regulators as well. But this is the direction. It won't just be ride hailing; it'll be personal car ownership, it'll be trucking, it'll be all forms of transportation over time. You can imagine that some quite high percentage of cities right now is dedicated to vehicles, and especially to parking; I believe it's in the 30 to 40% range if you look at a city by real estate dedicated to parking. With autonomous vehicles you get better utilization of the vehicles, so there's less parking, and when you do need to park them, you can easily move them outside of the city. So what this will mean for how cities are laid out, and for real estate, is quite dramatic; this will be significant. It will also matter for people's lives. I think this will make the world feel smaller, much like the jet plane did, or the automobile did originally; it'll become easier to get from A to B, because you can use that time much more productively and you know you're getting there much more safely. And I'm sure over time we'll see an abundance of options. You could imagine autonomous vehicles that have beds you can sleep in. I don't know what timeline that's on, but that's clearly where we're going: you could work, you could play, you could sleep in these vehicles, and it will just make the world feel smaller. So it's very exciting to be working on something that's going to have this impact over time.
SPEAKER_01:And do you think they'll be more swarm-like too? I mean, obviously we've talked about a car and its hardware and its software and the decisions it makes, but in a world where there are lots of driverless cars, they can all communicate with each other and take advantage of each other's sensors, and there could even be a standard where different companies share data and whatever else. Do you think this becomes more swarm-like in some interesting way as well? Do we need traffic lights?
SPEAKER_02:I mean, lots of things are going to change, right? Ironically, this gets easier as more vehicles are autonomous, because some of the harder challenges are dealing with unpredictable human-driven vehicles. If more of them are autonomous, this becomes easier, and then you can take some of those opportunities. You can imagine platooning the vehicles on highways so that they're far more fuel efficient, because they're just sitting in each other's draft, or not needing traffic lights, because they're communicating with each other as they approach the intersection and, in a sort of beautiful dance, synchronizing their trajectories. Now, this is going to take some time; you need to get to some critical mass where the vast majority of vehicles are autonomous for this to really work. And that's why, as I said earlier, so much of this work is yet to come. We have the basic autonomous system functional, but so many of these benefits we can gain are still ahead of us.
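The "no traffic lights" idea can be sketched as a slot-reservation protocol: each vehicle requests a crossing time, and a coordinator grants the earliest slot that doesn't conflict with slots already handed out. This is purely an invented toy protocol to make the concept concrete; real vehicle-to-vehicle coordination would need conflict maps per approach direction, fault handling, and much more.

```python
# Toy sketch of signal-free intersection coordination via time slots.
# The protocol, timings, and vehicle names are all invented.

def grant_slot(requested_arrival, occupied, crossing_time=2.0):
    """Return the earliest start time >= requested_arrival whose
    [start, start + crossing_time) window overlaps no granted window.
    Appends the granted window to `occupied`."""
    start = requested_arrival
    for (s, e) in sorted(occupied):
        if start + crossing_time <= s:
            break        # our window fits entirely before this one
        if start < e:
            start = e    # conflict: shift our start past it
    occupied.append((start, start + crossing_time))
    return start

granted = []
# Three vehicles approach and request crossing times (seconds).
for vehicle, eta in [("A", 0.0), ("B", 1.0), ("C", 1.5)]:
    t = grant_slot(eta, granted)
    print(f"vehicle {vehicle} crosses at t={t:.1f}s")
```

Vehicles B and C arrive while A's window is still open, so they are shifted to back-to-back later slots: the intersection stays continuously utilized with no vehicle ever waiting at a red light, which is the efficiency gain Nick describes once a critical mass of vehicles can coordinate.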
SPEAKER_01:So Nick, I know we're getting pretty close to time, but we haven't spoken much about you and your role at Waymo. You're a director at Waymo. So what are you personally working on?
SPEAKER_02:Yeah, so I actually bridge both the hardware and software organizations; I report to both the VP of hardware and the VP of software. Where I've spent a lot of my time recently is on our reliability systems, our backup systems: when something goes wrong, making sure that both the hardware and software are fault-operational. This is really important for freeways. We're doing a lot of testing on them, and we hope to bring them to the public very soon. What's hard about the freeway is that you need to make sure, at high speed, that if anything goes wrong, the vehicle behaves safely. This is where the backup systems become really critical, and we have some quite amazing capabilities now: significant parts of the vehicle can fail, whether for a software reason, a hardware reason, or an environmental reason, and the vehicle will continue driving, exit the dangerous environment like the freeway, and get onto a safer one. So yeah, I love working at the blend of hardware and software to make those sorts of responses possible.
SPEAKER_00:It sounds like, Nick, you love to find the most difficult problem ever and spend your time thinking through the best way of solving it.
SPEAKER_02:Yeah, I spend a fair bit of time thinking about what's the worst thing that could happen, what's the worst scenario the vehicle could find itself in, and how we can make sure the vehicle does the right thing in that really bad scenario.
SPEAKER_00:Amazing. That probably brings us to time. I'm very much looking forward to checking in, in one to two years, on whether or not Waymo's taken over the world. I'll certainly be opting in early in London, as soon as I can, to give it a try.
SPEAKER_02:Fantastic. Thanks so much for hosting me, Hugh and Hannah; it's been a pleasure. And yeah, I look forward to bringing Waymo to London and more places.
SPEAKER_01:Yeah, awesome, great to see you, Nick; thanks for your time. So, that was awesome. Nick is clearly super smart, somehow with one foot in hardware and one foot in software, which, from what I understand, is no mean feat, and an awesome build on episodes gone before. Yeah, it was, and I'd say to all of our listeners: go back, if you haven't, and listen to our episodes on AI, apps, and coding, because they're such a fantastic foundation for the conversation we had with Nick today and will really build your understanding of this whole broad ecosystem. But what a great guy. Fantastic.
SPEAKER_00:Yeah, and, I know we talked about London a lot, but Nick, at least as far as I could tell, was teasing a true Waymo takeover. So if you're not in San Francisco or London, Waymo could be coming to a city near you. That is pure conjecture, not officially signed off by anyone at Waymo; it's just my own inkling. But what a great episode to end on: sort of the perfect summation of so many of the things we've talked about, and also a tease for a lot of cool stuff we could cover in season two.
SPEAKER_01:Well, if you enjoyed the Tech Overflow podcast, you can learn more about us at techoverflowpodcast.com, which makes sense as a URL, I guess. We're also available on LinkedIn, where I post more frequently than Hannah does on Instagram and X.
SPEAKER_00:I actually posted to our Instagram story twice this week, but he's not wrong. And leave us a review, recommend us to your friends; we really want to come back for a season two, and that will all be down to you guys. So do us a solid, because I really want to go to Melbourne. See ya. See you, Hannah, bye!