Tech Overflow

You’ve Already Lost Control of Your AI Data

Hannah Clayton-Langton and Hugh Williams Season 3 Episode 4


We compare how we actually use ChatGPT (and Claude) every day and why most people treat LLMs more like a personal helper than a work automation tool. We dig into what happens to your data after you hit Enter, from memories and human review to cross-border storage and training settings. 

We cover several topics:

• Our top real-world use cases for ChatGPT and why they are mostly non-work 
• How ChatGPT memory works and what it can infer about you 
• The "asking, doing, expressing" framework and why “expressing” feels new (it's not something you've ever done with Google)
• What the under-26s' usage stats suggest about adoption and behaviour 
• Where your prompt data can be stored and why multiple jurisdictions can apply 
• Why companies keep multiple copies of data and what that means for control 
• How human-in-the-loop review works and how incredibly rare it is 
• The ChatGPT “improve the model for everyone” toggle and what opting out changes 
• Personality and tone settings in ChatGPT plus the risk of AI-fuelled echo chambers 

If you've liked what you've heard, please, please, please: like, subscribe, leave us a review on Spotify or Apple Podcasts, and share with your friends, family, and anyone you think might be curious about how tech works.

And if you'd like to learn more about the show, you can follow us on our socials. We are on LinkedIn, X, Instagram, TikTok, and YouTube Shorts. And of course, we've got our own website, techoverflowpodcast.com.


Like, Subscribe, and Follow the Tech Overflow Podcast by visiting this link: https://linktr.ee/Techoverflowpodcast

Hugh Williams:

It's not something we're used to doing with our computer. So I think this is kind of new.

Hannah Clayton-Langton:

The whole rhetoric around the power of AI is about how it's changing the way that we work.

Hugh Williams:

Once you type your data in and you press that Enter key, you're setting your data free, and, you know, you've kind of lost control of it.

Hannah Clayton-Langton:

Hello world. Welcome to the Tech Overflow podcast. I'm Hannah Clayton-Langton.

Hugh Williams:

And I'm Hugh Williams, and we are the podcast that explains technology to curious people. How are you, Hannah? How's life?

Hannah Clayton-Langton:

Life's great. I'm excited to revisit our favorite topic of AI today. Um, how are you?

Hugh Williams:

Oh, I'm well. I'm well. I've been in lots of prep for this episode, including my own use of ChatGPT and Claude and, uh, all kinds of things, reading some research papers. It's been a ton of fun, but, um, really pumped to get into this episode. And as you say, great to be back on AI. I mean, we did those two fabulous episodes in season one. If you want to check those out, they're episodes six and seven. And then one of our best-rated episodes was episode two of this season, where we talked about vibe coding and agentic coding. So it's just great to be back on the theme, Hannah.

Hannah Clayton-Langton:

Yeah, love the theme. I find that the AI topics are the ones where I feel like my brain chemistry changes the most, because I learn so much. Today I'm very curious for us to get into how people are using AI. And I guess when I say AI in this context, I really mean LLMs, like I think a lot of people do. So, how are you using it? How should I be using it? Uh, I think we've got some external stats on how other people are using it. And should we be worried about the ways that we are using it?

Hugh Williams:

And as your resident expert, Hannah, I'm really keen to unpack what happens when you type data into one of these tools. Where does your data go? How is your data used? Where is your data stored? How long is it stored for? And maybe give our listeners a little bit of advice around how to think about that data. So I think this will be a fun episode.

Hannah Clayton-Langton:

100%. And our first episode of this season gave me pause for thought around my data in general. I've heard from friends that listened that they felt the same way. So, uh, we're about to make that existential crisis a whole lot worse, guys. Let's talk about LLMs. Well, when it comes to how I'm using AI, I did prepare earlier, which might be a good place to kick us off. So I actually asked ChatGPT how I use ChatGPT. I asked it to summarize it. Um, I don't know if that's allowed, but here we are. And it gave me my top three use cases. So I will walk you through them if you're interested.

Hugh Williams:

Very interesting.

Hannah Clayton-Langton:

Okay, so first one: everyday decision support. That's quite vague and not a huge surprise. It helps me make practical decisions faster and outsource research. Actually, I just bought some new running shoes, so ChatGPT helped me choose them. The second, related theme is health and performance curiosity: exercise physiology, a bit of cold water swimming. If you've listened to earlier episodes, you know that I was curious about cold water swimming.

Hugh Williams:

In London? Seriously?

Hannah Clayton-Langton:

Hugh, in London, in March, I went in for at least 60 seconds in the water. It was really cold.

Hugh Williams:

That's ice-bath territory. Anyway, okay, go, Hannah. Yeah.

Hannah Clayton-Langton:

The third one, which is probably the most flattering, is exploration. So it says here: economic or career topics, language and etymology, and AI concepts. No surprise there. So, top three. I think that's a fairly accurate picture of myself from my trusty in-pocket decision helper, aka ChatGPT.

Hugh Williams:

That's a great set. You should be very, very proud of that third bucket, language and etymology. Look at you go. That's awesome.

Hannah Clayton-Langton:

I did study languages at university, so that tracks. I didn't realize that I was actually still thinking in that way, but I must be. Yeah.

Hugh Williams:

I didn't do the same homework, but I'll tell you what I did do, Hannah, and maybe our listeners can follow along with me. If you've got ChatGPT installed on any device, you can go into the settings. And if you scroll down, you'll see an option called personalization. If you click on that and then you scroll down again, you are going to see an option called memory. If you click on that, and then you click again on manage memories, you'll see a bunch of memories that ChatGPT has from all of the sessions that you've had with ChatGPT. So these are things that it thinks about you. And, um, I scanned these and I came up with my top three. Bit of overlap, I think, Hannah.

Hannah Clayton-Langton:

Okay, and just to be clear, these aren't prompts that it's saved. These are like snippets of information it's gleaned about you as a person.

Hugh Williams:

Yeah, correct. So these are sort of meta-level summaries. So it's a little bit like the prompt that you ran, Hannah, to find out the top three categories. If you like, these are sort of meta reflections on the kinds of things that I ask. Maybe later on in the show we can talk about how these are used behind the scenes in ChatGPT, but, uh, you know, maybe not important for now. But it looks like, if I bump up a level and try and summarize the top three, I would say I spend a lot of time talking to it about music and musicians. Very cool. And a lot of time talking about recipes and nutrition. So, for example, it's figured out that I don't like eggs and that I like a breakfast smoothie with Greek yogurt. Sounds true. So lots of optimization, things about cafe lattes, all kinds of very Melbourne things. And then a good amount of, uh, you know, fitness and running stuff, including lots of memories about parkrun. Because I love a parkrun on a Saturday: how many parkruns I've done and where I've done them, and suggestions for parkruns and all those kinds of things.

Hannah Clayton-Langton:

That sounds pretty accurate, to be honest. So you're using it for everyday queries, it sounds like.

Hugh Williams:

Yeah, yeah, yeah. And it's got quite a few memories in here that I should probably delete. So it says that I'm going to London for three months at the end of June. I think that was two years ago. So I could probably safely delete that. But, you know, there's a few things in here that are, uh, off base, as it were.

Hannah Clayton-Langton:

Okay, so you can delete the memories. And they sort of form and sit there and reinforce the output. So, like, when we were coming to Melbourne, I asked for travel recommendations and it was like, oh, do you want me to recommend some hikes? And that's probably because it's worked out that I like hiking.

Hugh Williams:

Yeah, absolutely. And likely what's happening behind the scenes here is, when you type a prompt into ChatGPT, it's secretly appending these memories, and then these memories are used in the context window of the LLM to produce the answer. So it makes it feel a little bit more like it knows you, I guess, when it's giving you back answers using those memories.
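
That memory-prepending idea can be sketched roughly as follows. The function name and prompt format here are purely illustrative; OpenAI's actual implementation isn't public.

```python
# Rough sketch of memory injection, assuming stored "memories" are plain
# strings prepended to the user's message before it reaches the model.
# The exact format is an invented illustration, not OpenAI's real one.
def build_prompt(memories, user_message):
    memory_block = "\n".join(f"- {m}" for m in memories)
    return (
        "Things you know about this user:\n"
        f"{memory_block}\n\n"
        f"User: {user_message}"
    )

print(build_prompt(["Enjoys hiking", "Lives in London"],
                   "Any recommendations for my trip to Melbourne?"))
```

The model then sees the remembered facts and the new question in the same context window, which is why the answer can feel personalized.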

Hannah Clayton-Langton:

It's pretty cool that you can go in and see them. So I will go and do that. I'm not gonna do that live in case it's something I don't wish to reveal, but I'll go and do it after the episode.

Hugh Williams:

Yeah, be careful with it. Uh, there's definitely some personal stuff in there that I'm not gonna read out on here, Hannah.

Hannah Clayton-Langton:

Yeah: Hannah, you are a hypochondriac. Okay. Um, Hugh, from the research that I did ahead of the show, it doesn't sound like either of us is dissimilar to the norm. So most people are using ChatGPT for the same sort of information queries and not necessarily using it for automation of tasks or coding, which is the use case you hear a lot about. But I'm, like, asking it what running shoes to buy, and so are most people.

Hugh Williams:

Yeah, that's it. Um, there was a paper that came out of OpenAI, I think September last year, sort of a white paper. So not a peer-reviewed piece of research, but a white paper that came out. And, uh, the first thing is that almost 80% of conversations fall into three buckets: practical guidance, seeking information, and writing. And so I think you and I do lots of practical guidance and lots of seeking information, but we probably didn't have writing in our top three, you know, whether it's refining things, emails, improving documents. And they also, interestingly, proposed a new framework for how these tools are used. And it's the asking, doing, expressing framework that they came up with. And what they figured out was that 49% of prompts were asking a question and getting back an answer. Uh, 40% of the prompts were getting it to do something. And then 11%, and this is super interesting, I know you love this one: 11% is people just expressing something. So that's neither asking a question nor asking it to do something; it's just sharing a feeling with ChatGPT.

Hannah Clayton-Langton:

Okay, so to put that into real terms, let's take a cooking use case. Asking is like a simple question. So it might be like, is there cheese in pesto?

Hugh Williams:

Yep.

Hannah Clayton-Langton:

Right.

Hugh Williams:

100%.

Hannah Clayton-Langton:

And then doing would be like, create me a dairy-free pesto recipe.

Hugh Williams:

Yeah.

Hannah Clayton-Langton:

And then expressing would be...

Hugh Williams:

I love the pesto.

Hannah Clayton-Langton:

Yeah, and that's it. I mean, I think the expressing is the thing I'm doing the least. It's kind of hard for me to get my head around. But you said 11% of interactions are that expressing category?

Hugh Williams:

Yeah, exactly. People just saying something to their LLM that's neither a question nor asking it to do something. So, you know, hey, I love the pesto. I love my running shoes.

Hannah Clayton-Langton:

Yeah. I guess maybe pesto and running shoes aren't the best examples of expression, whereas expression might be like, I'm really stressed and overwhelmed. And sometimes people just say that to their LLM in lieu of saying it to anyone else.

Hugh Williams:

And I think this is pretty unique. Like, I think this is not something that we're all doing with Google. It's not something we're used to doing with our computer. So I think this is kind of new.

Hannah Clayton-Langton:

And listeners, just to give you a bit of a behind-the-curtain look into the podcast and Hugh's and my relationship: I'm very fast and loose with how I think about my data. And so Hugh and I had, like, a big debate the other day about what's the big deal with these LLMs. Like, I'm giving all my data away; if there's one thing I took from episode one of this season, it's that I'm giving my data to everyone anyway. And this expressing point is the one where you sort of won me round in our debate, because the thing that isn't in Google or in WhatsApp or in your metadata from Google Maps is this whole, like, emotional piece, which, I don't know if it was expected with these LLMs, but it's certainly taken me by surprise.

Hugh Williams:

In the OpenAI paper, they dug even deeper, Hannah, and they figured out that about 1.9%, so just under 2%, of all prompts are sort of relationship and personal reflection conversations that people are having. And about half a percent is role play and companionship. So people kind of, I guess, trying to have a relationship with their LLM.

Hannah Clayton-Langton:

Okay, well, I have a separate stat that there are two and a half billion prompts a day to ChatGPT alone, from a user base of two to three hundred million users. So what did you say? Like, one and a half, two percent sounds fairly low, but in absolute terms, that is a lot of people, or maybe a smaller number of people using it a lot, for this relational, emotional support, which... that is bizarre.

Hugh Williams:

With those numbers, too, actually: in this September paper from OpenAI last year, the latest number they had was 700 million users per week, which maybe is 300 million a day, so maybe our numbers aren't too far apart. And that graph was going up and to the right and almost vertical. So I think we could probably safely assume now that there are certainly more than a billion users a week using these tools, and that 2.5 billion prompts is probably more like three and a half or four billion today. And heaven knows what it will be by the end of, uh, this year.

Hannah Clayton-Langton:

Okay, wow. So quick follow-up on this whole expressing use case. You've got two daughters who are in their early 20s, so like in the younger generation, very tech savvy. Do you think that they use it? Not necessarily to say they're using it for the expressing case, but do you think that they're using it differently to how you would use it or I would use it? Because actually, between me, you, and your daughters, we span three different generations, right?

Hugh Williams:

Well, first of all, here's the big stat: uh, almost half of messages to ChatGPT, half of the prompts, come from people who are under 26. So that's the biggest difference.

Hannah Clayton-Langton:

Half of those prompts are coming from people under the age of 26. Wow.

Hugh Williams:

Yep, which just shows you that there's very, very heavy usage among students, for example. So lots and lots of people using it for homework, writing, social media posts, messages, emails, if they're sending any. It's open all over their screens, whether it's on the smartphone or the laptop, and it's their copilot through life. Because the other really interesting stat that came out of the paper is that 73% of the usage of ChatGPT is non-work. So I think the myth out there is that it's a work tool, it's a productivity tool, but the reality is in fact that it's three-quarters non-work.

Hannah Clayton-Langton:

That is interesting, because the whole rhetoric around the power and the disruption of AI is about how it's changing the way that we work. And whilst it certainly is, and I think I mentioned in a previous episode, like, my husband's work have now implemented an AI policy where, if you don't use AI every week, they call you up on it. So there's some of that. That feels a bit contrived, though, whereas actually people are reaching for it to solve everyday problems that aren't work-related. I think that certainly fits my pattern of use. I'm definitely using it more for recipes, workouts, general life stuff. I'm using it to explain things to me more than I'm using it to summarize documents. I actually think, when I asked for the summary, the one that I mentioned up front in the episode, it actually called out that you're not using this for automation or summarizing; you're using it, like, more conversationally.

Hugh Williams:

And I guess, look, you know, if half the prompts are coming from people under 26, then I guess it's not all that surprising that it's more non-work use than work use, given that population is likely still studying, doesn't yet have its first full-time job, and so on. So I think those things are correlated, but, you know, it's a personal productivity tool used by young people.

Hannah Clayton-Langton:

Okay, so follow-up question. If it is much more personal queries, and in some cases, like, expression, is it worrying that we're pumping all of this into language models, which presumably they're collecting and using for various different things? Like, is that a concern? Because I care less, if I'm honest, about what company documents I'm putting in. By the way, I am putting those into the enterprise ChatGPT, so licensed with all the right things, but that's less of a concern to me than some of the more personal stuff that I might be working through with ChatGPT.

Hugh Williams:

Yeah, and maybe let's just talk about that a little bit: what happens when you actually do type into ChatGPT, Gemini, whatever it is. So, first thing is, you're in the UK right now, I'm in Australia. There's absolutely no guarantee that the data that you're typing in is actually stored in the UK or Australia, respectively, right? So that's not guaranteed. Quite likely that it is, given, you know, they're both big markets for these companies. But, you know, it could just as easily be stored in Europe in your case, um, or it could be stored on the west coast of the US in my case. So the data is potentially leaving the country and being stored somewhere else, and you have no control over that.

Hannah Clayton-Langton:

Okay, so on that basis, Hugh, thinking about data collection and storage, what data policies or data protection laws is my data in the UK subject to? Is that based on where the data is being stored, or on where it's being collected?

Hugh Williams:

Yeah, both. Both, Hannah. So let's imagine, we'll make up a fictional case, right? Let's imagine that the data that you're typing in is stored in the EU somewhere, and obviously the UK is famously not part of the EU anymore. So, first of all, the company is offering the service in the UK to you, and so they have to comply with UK laws. There are going to be some privacy laws in the UK, and if the company is operating their service in the UK, they have to make their best effort to comply with UK privacy laws, which they are. Right, so that's the first thing. The second thing is, because the data is, fictionally, stored in the EU, they also have to comply with the rules around data that's stored in the EU. So whatever the privacy limitations are, the anonymization rules are, the retention limitations are, they have to be applied to your data, because it's stored in the EU. And it could be subpoenaed in the EU. So if somebody wants to access your data, they could force the company to extract it from their EU data center and give it to them, even though you never knew your data was stored in the EU. So two things are applying at once.

Hannah Clayton-Langton:

It's worth knowing, then, what different products are coming out of which countries, right? Like, I know this came up when DeepSeek, that AI model out of China, launched; that must have been sort of the beginning of the year. And we got some warnings at work about what to use it for, from, like, an IP perspective. But presumably, countries like China, look, we know they have a different approach to data. And so if you're using a model that comes out of China or is related to China, then it will be subject to a different regime on data protection.

Hugh Williams:

Yeah, that's right. That's right. And obviously, you know, different countries all over the world have different approaches to these things. And some might feel familiar, so if you're in the UK, the EU privacy laws will feel pretty familiar, and some might feel a lot less familiar. So let's continue our fictional example here. Let's imagine that you're in the UK, you type a prompt, the data's stored in the EU, but behind the scenes, the folks at OpenAI want to, you know, they want to inspect some prompts, and by chance they inspect your prompt and evaluate it. It could be evaluated in Poland, it could be evaluated in the UAE. Um, it just depends on where they've hired that contingent workforce to evaluate the prompts and provide them with feedback. So your data could simultaneously be subject to the rules of three different places, or potentially even more.

Hannah Clayton-Langton:

Well, I was gonna say, I might be in the US or a different country outside of the EU for work, or, let's say, when I was in Melbourne with you. But I'm a UK user, which it must know; it's processing data, I guess, based on where I'm launching the query; and it could be pitched all around the world for the human-in-the-loop validation if it gets picked out for that. And as we've talked about in previous episodes, that'll be, like, a low-cost workforce that could be out of any of the countries that you just mentioned. So there's a lot of complexity in how it's being handled. And the reality is we don't know, and they can do all of this, and it's lawful, because as long as they're not breaking any laws, they can. You know, I'm handing my data over, and it's up to them to use it, within reason, how they need to or see fit.

Hugh Williams:

Yeah, absolutely. And even with their best efforts, it can be very complex for them to manage these data sets across the world. And that's why their legal folks get paid an enormous amount of money, and why they need very, very smart people working in their companies, because this is very, very complex. And the technology actually makes it even more complex, too. So, you know, back in the old days at the companies I worked at, we used to store three copies of everything. And the reason we did that is because we were storing data on very cheap computers that are very, very unreliable. And so you want to have three copies of all of the data. You want to put those in smart places, just in case you lose one of the copies, because the computers are unreliable. And then, if you lose one of the copies, you furiously make another copy so that you've got three again. And that gives you what's called eleven nines of reliability: 99.999999999%, that's nine nines after the decimal point. So you have almost no chance of losing any data ever if you have three copies. Folks like OpenAI and Anthropic today are well aware of this. And so they are also making sure that they have multiple copies of the data, and they're pretty smart about where they store them. They're more sophisticated than we were back in the day. We used to just make sure that we had them in different parts of the data center, maybe across two data centers. They're really, really smart about different geographies and data centers. And so there's even more complexity. So my hunch is they don't just have three copies of most of the data, they probably have four, five, or six copies, and those copies are deliberately going to be in very different places, in case they lose a whole data center or maybe even a whole region of data centers. So your data is everywhere, Hannah.
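
The arithmetic behind the three-copies rule is simple: you only lose data if every copy fails at once. A back-of-the-envelope sketch, where the per-copy loss probability is invented purely for illustration:

```python
# Back-of-the-envelope durability: data is lost only if every
# independently stored copy is lost at the same time. The 1-in-10,000
# per-copy figure below is a made-up illustration, not any company's
# real number.
def durability(per_copy_loss_probability, copies):
    return 1 - per_copy_loss_probability ** copies

# Three copies, each with a 1-in-10,000 chance of loss:
# the chance of losing all three is (1e-4) ** 3 = 1e-12.
print(durability(1e-4, 3))
```

Adding a fourth or fifth copy in a different geography multiplies that already tiny loss probability down further, which is why the extra copies buy so much safety.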

Hannah Clayton-Langton:

And when you think about the complexity of the different jurisdictions they need to be preparing for, as you say, like, it's really hard, because we're such a global world these days. You're downloading an app, or primarily using an app, in one location, your home location; the app's from a different country; and then you're traveling around, and you're subject to sort of all those jurisdictions and their specific requirements of companies that collect data. So that sounds like an enormous headache, to be honest with you.

Hugh Williams:

Yeah, enormous headache. Enormous headache. And, you know, you've got to remember these companies are sort of startup, growth kinds of companies. So they're gonna be doing their best, but it's also gonna be quite chaotic inside these companies. And so they're definitely gonna be moving fast and breaking things, as they say. And, you know, there's gonna be a lot of tension within the teams, and a lot of tensions about where they launch and what they do. Um, so it's gonna be a very, very complex world when it comes to trying to make sure that they're complying with all of the laws, and making sure that the data's stored where they think it is, and, uh, you know, so on and so forth. But definitely, once you type your data in and you press that Enter key, you know, you're setting your data free across many different jurisdictions, many different data centers right across the world. And, you know, you've kind of lost control of that.

Hannah Clayton-Langton:

One last point on data before I move us along, and I don't think there's a definitive answer to it; it's just an observation of mine. With things like LLMs, no one has more resource or better infrastructure to progress the technology than these companies, right? They're commercial companies, and they're getting tons of funding at the minute. On the flip side, like, government agencies, in particular legal departments of government agencies, are not known for efficiency. You know, they're very thorough, and they might not be super resourced. And there's a weird sort of disparity there, because I just don't see a world in which it all keeps up with the pace of the tech companies. And I don't know whether that's a problem or not, but there's definitely a contrast there.

Hugh Williams:

I think it's super problematic, Hannah, with all due respect to the smart lawyers who work in governments. These are not AI professionals, right? They don't understand the technology in depth; they may not even be listeners to this podcast.

Hannah Clayton-Langton:

Well, but also the most expert people are the ones working in the companies, right? So even when it comes to, maybe we'll get into it in more detail later, but the recent, like, Pentagon versus Anthropic debate, when there was debate around the use case of LLMs in autonomous weapons, and the government wanted the final say on whether or not the LLMs were safe to use. No one knows the LLMs better than the people who built and work with the LLMs, but they're all in the companies running the models. They're not in all of the sort of adjacent medical boards and government agencies. So there's a pace disparity there that's problematic.

Hugh Williams:

And look, it brings me to one thing that I really dislike about Australia. You know, I've spent the last nine years, I think, now back in Australia after my long stint in the US. And one of the things that infuriates me most is that we try and legislate what these companies can do, and stop them innovating. I much prefer the US climate, where the companies are a bit freer to do the things that they want to do. And then, if they do things that seem outside the bounds, then people sue people. And so, you know, somebody will sue somebody else and set a legal precedent that says, that's where the boundary is. And so it's a lot more free and loose in the US, and, you know, it's more of a reaction to what happens. Whereas I feel like in Australia, we'll have the Australian AI policy bill, and we'll debate this thing and release it, and we'll constrain all innovation before the innovation has even happened. And it's incredibly infuriating, Hannah.

Hannah Clayton-Langton:

It's an interesting debate, and one other facet of it that I think about quite a lot is that this AI revolution is taking place under a Trump administration in the US. And I'm wondering, like, what the heck kind of bearing that's gonna have on where we land in the future on all this stuff. Because it's certainly the most unique administration of my lifetime. Yeah, we're having this huge societal and technological shift in the way we live our lives, and the US is normally, like, a real player in shaping how these things unfold. And it's just quite a unique world we find ourselves in on that front.

Hugh Williams:

Yeah, yeah, absolutely. And look, it's a fairly pro-tech administration, I guess, is one thing that you might observe from a distance. Um, so it certainly will be interesting to see how it how it plays out.

Hannah Clayton-Langton:

Okay, so let's get back to the sort of nuts and bolts of it. So we've been putting all this data in; it's being processed under some sort of government jurisdiction, with a bit of complexity there. But what about the actual ways that the data are being used? So if I call back to our episode around, like, whether our phones listen to us, we got talking about, like, metadata and how companies process data. It's obviously super valuable for them, but it's generally processed at an aggregate level, right? Like, it's not likely that someone's going in and reading my prompts with ChatGPT unless it's, like, a human-in-the-loop test. But I assume those are pretty low as a percentage of total prompts.

Hugh Williams:

You've got billions of these prompts per day, Hannah. So it's certainly not the case that any significant fraction of them are ever looked at by humans, but certainly some are. You've got this incredibly small probability that yours is actually looked at by a human. And the reason that somebody at OpenAI or Anthropic or, you know, at Google or Microsoft would look at this prompt is to improve the product, right? So they would open up your prompt and they'd make an assessment. You know, does it look like this user was happy with the answer? Does it look like the LLM successfully answered the question? So just asking simple questions around the performance of the tool. And that's used in, you know, all kinds of different ways as the companies improve the products. But the chances of that happening are very, very small. And you can actually control it. So if you do go into ChatGPT and you go into your settings, you go down to data controls; if you do go into data controls, you'll find there's a little slider for an option that's called improve the model for everyone. And it's turned on by default. And what that means is basically two things, Hannah. The first thing is that your content can be used to train future models. And it also means that humans can inspect your data as part of the quality improvement process. So if you kind of want to opt out of that, you can just turn that toggle off. And then your data won't be used in any future training, and it won't be looked at by individual humans.

Hannah Clayton-Langton:

Do you think that will always be an option? Like, I see some websites where, if you want anything other than the standard cookies, which is a similar sort of personal data point, you have to use the paid-for version if you want to turn that stuff off. Do you think that is where this will eventually head? Which is, like, if you want the privilege of us not using the data for training, you're gonna have to pay a subscription.

Hugh Williams:

No, I don't think so. I think what happens with these advanced options, and remember, I'm well down in the settings here. For most of our curious listeners, this is going to be new information, right? That you can actually go into settings, data controls, and turn off improve the model for everyone. Most of our listeners are gonna go, oh, that's cool, I might go do that. But with these advanced options, nobody ever turns them off, right? So you would find that probably 0.1 of a percent, 0.2 of a percent of users today have actually turned this off. And, uh, even with us releasing a podcast that explains how you can do it, it's really not gonna change how many people access these advanced options. So I just don't think it's an issue for them. As long as the toggle is turned on by default, they'll be happy campers, because almost nobody will actually go into an option and turn it off.

Hannah Clayton-Langton:

And there's also other things that you can toggle. It's related: my husband was telling me that you can change the, like, level of enthusiasm and the tone, or, like, level of sycophancy. So he recently turned his to be, like, meaner, because, well, maybe we should talk about this for a second. There's this whole phenomenon that, like, LLMs can be sycophantic in their nature, which I presume gets people to use them more, because it's nice when someone agrees with you. But I did read that it can be a bit problematic with conspiracy theorists, who just get, like, validation of their view that the world is flat from ChatGPT, which is like, hey, that's a great point, and there's, you know, plenty of reason to believe it's true. So he made his a bit more cynical, but you can make it, like, chatty, you can make it a bit more positive.

Hugh Williams:

Yeah, you can. And I think this is a reaction by the folks at OpenAI to Claude, if you like, because Claude has always been a little bit your friend, a little bit more chatty. And so I think the ChatGPT folks have made it possible to control the sort of style and tone of ChatGPT, to allow people to make it a little bit more Claude-like, or perhaps pick a different personality. But if our listeners want to follow along, you can go into ChatGPT, into the settings, and into Personalization, and you'll see an option called "Base style and tone." You can indeed pick different kinds of personalities. I've got mine set to candid because I just want it to get to the point. But you can choose professional, friendly, candid, efficient, nerdy, and my favorite, cynical.

Hannah Clayton-Langton:

Cynical. That sounds very British. I think the quirky one would be annoying. I might play around with it, because yeah, now that you've alerted me to that, in the app there's "Base style and tone" and these characteristics, like whether it's warm or enthusiastic. Interesting.

Hugh Williams:

Yep, so you can make yours less enthusiastic, less of a cheerleader, Hannah. When it comes back to conspiracy theories, Hannah, you know, we talked about this a lot back in episode seven of season one. We have to just remember that these are probabilistic monkeys, if you like. And so ChatGPT, it's got this huge context window of all these different words and tokens, and then it's generating output that follows on from those, right? So if you're having a long conversation about a conspiracy, about the earth being flat or whatever it is, it's going to generate sentences that follow on nicely from the context of what you've been talking about. So, of course, surprise, surprise, it will agree with you and reinforce the kind of conversation that it's having. I mean, that's just how this technology works.

Hannah Clayton-Langton:

So it's creating even more echo chambers for folks. Yeah.

Hugh Williams:

It's producing plausible output given the context that it has. That's all these LLMs do.

Hannah Clayton-Langton:

Yeah, and it's dressed up so cleverly to feel like you're having a conversation with someone to whom you want to divulge what sounds like deep personal information. But yeah, you're right. This is literally just a probabilistic model giving you the most likely satisfactory response to the prompt, and that is it.

Hugh Williams:

Yeah, and it's incredibly impressive because it's trained on trillions of words. It cost as much as a blockbuster movie to train. It is an impressive probabilistic machine. And so the content that it produces is incredibly relevant to the conversations that you're having or the task that you're doing, but it is a probabilistic text generator. And we've got a really good interview coming up in the season, Hannah, about the future of AI. I'm gonna get into it with our guest, who I don't want to name, no spoilers yet, but we're gonna talk about the future of AI and, you know, is it possible that these machines will ever become intelligent? But they're certainly not intelligent today.

Hannah Clayton-Langton:

Yeah, they're just dressed up to feel really intelligent, which is the secret sauce, right? Suddenly AI has come to the fore, and that's the reason: it feels like we're talking to someone. And I guess that harks back to the point we made, which is most people are reaching for this for majority non-work use cases. It's like a pocket digital life coach, I think I saw it described as. Probably pretty fair. Like, all the things we discussed, whether it's, you know, workout advice or recipes or something more interesting, it's all under that umbrella of life coach: career, you know, sometimes relationship advice.

Hugh Williams:

Yeah, absolutely. Absolutely.

Hannah Clayton-Langton:

Can we talk a little bit more about training? I've actually just opted out of training while we've been talking. Um, is that something we should be worried about, or, inversely, is it something we should be open to? Because I think it says in the app, do you want to improve it for everyone? And you're like, oh, well, I do kind of want to improve it for everyone. But what does that mean in practice?

Hugh Williams:

Yeah, so what that means in practice is that in future runs, if you've opted in, your data can be used in the training process. But I think what our listeners have to remember is that it's mixed in with billions of other prompts typed in per day. And so you are, you know, a drop in an ocean isn't even fair, right? You are less than a drop inside an ocean. And so very literally, what will happen is that you influence, in a very minute way, the style and the probability of a word appearing in a future model. Your text is never going to be reproduced verbatim by a future model. So you might say, well, I really want to protect my privacy, I don't want them to use it in training, and you might choose to turn that off. But I think by leaving it on, all you're doing is becoming part of the masses that influences the probability of a word being generated and perhaps, you know, the probability of a turn of phrase coming out of the LLM. So I think in practice it probably doesn't really matter.

Hannah Clayton-Langton:

That's a bold statement, but we're going to make it on this podcast. It doesn't really matter on an individual level.

Hugh Williams:

No, no, I don't think so. But I would say, you know, because humans can inspect your data, and because you're putting your data in an unknown place, and because your data's moving around the world, you should be a little thoughtful about what data you give these systems. So personally, I would not upload my medical records and have them stored randomly in some country and randomly copied around the world. And I'm not sure that I'd want any probability of a human opening up and having a look at a PDF of my personal medical records. So I think we should all be thinking about what data we're uploading. And if we're not comfortable with that, then just don't upload the data.

Hannah Clayton-Langton:

Yeah, so common sense use. It's probably never gonna get looked at, but there's a small probability that it will. Don't share anything too overtly personal, because you just shouldn't do that anyway.

Hugh Williams:

Yeah, absolutely. So just use common sense. But in practice, you know, your personal details aren't going to appear out of a model in the future for somebody else. You just don't have that much influence on the training process.

Hannah Clayton-Langton:

Fair enough. I do think people will get a kick, though, now that we've tipped everyone off on how to look at their memories. It'll be like their Spotify Wrapped. They'll go into their ChatGPT and see what it thinks of them.

Hugh Williams:

So maybe homework for you, Hannah, is, before our wrap episode, to do a bit more expressing. And you can report back. We'll chat about how our LLM use has changed based on our observations of LLM use. What do you think?

Hannah Clayton-Langton:

Yep, that's totally fair. And on this whole expressing point, like, I wonder if I should be using ChatGPT more as a coach. I don't know whether it's a good thing or a bad thing. It was certainly interesting that that's bubbling up in sort of over 10% of prompts, but it's not one I've explored in depth just yet. Okay, guys, this has been the Tech Overflow Podcast. If you've liked what you've heard, please, please, please like, subscribe, leave us a review on Spotify or Apple Podcasts, and share with your friends, family, anyone who you think might be curious about how tech works.

Hugh Williams:

Yeah, and if you'd like to learn more about the show, you can follow us on our socials. We are on LinkedIn, X, Instagram, TikTok, and YouTube Shorts. And of course, we've got our own website, techoverflowpodcast.com. So we'll see you next time, Hannah. Take care, guys. Take care, bye.