Artificial Intelligence Explained | How Generative AI Will Change Our Lives feat. Microsoft’s Maxime Whaite & Marc Beckman
Marc Beckman: [00:01:00] I'm wondering if you could start today by explaining, perhaps in more everyday terms, not so much on the scientific side, what is generative AI?
Maxime Whaite: It's a really good question. I think it's maybe one of the most essential questions. There is a lot of misinformation, or maybe misunderstanding, about what generative AI is.
To start with the fundamentals: we take very large sets of data, and in those large [00:02:00] sets of data, there are patterns. The model learns those patterns across that mass of data. Then, based off of an input, I trigger the model and it generates an output. What's really important to understand is that it's essentially replicating the patterns it learned from its initial training set and generating a new output based off of them.
So when we think of the most popular generative AI, which would be ChatGPT, right? This is a kind of generative AI, a large language model. It takes in a piece of text and it outputs a piece of text, but it is not an intelligent system. And I think this is where sometimes people are like, oh, there's a little man in the machine, or, you know, it's thinking, it's reasoning, it has logic.
It doesn't. It's like a weather app. It's just [00:03:00] predicting what the next word is going to be, and it's predicting that so accurately that it's mimicking intelligence, but it is not, in fact, reasoning or thinking. And so the really important lesson here is that because it's predicting the next word based off of what it's learned, what goes into the learning, the data set on which it's trained, is massively important, because that data set is everything the model is basing itself on to be able to generate the answer.
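The next-word prediction Maxime describes can be sketched in a few lines. This is a toy illustration only, not how production LLMs work internally (they use neural networks trained on vastly more data), and the tiny corpus here is made up for the example:

```python
from collections import Counter, defaultdict

# Toy illustration of "predict the next word from learned patterns".
# A real large language model uses a neural network over billions of
# examples; this simple bigram counter only captures the core idea.
corpus = "the cat sat on the mat and the cat slept on the mat".split()

# "Training": count which word follows which in the data.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the most frequently observed next word plus a confidence score."""
    counts = following[word]
    total = sum(counts.values())
    best, n = counts.most_common(1)[0]
    return best, n / total

word, confidence = predict_next("on")
print(word, confidence)  # -> the 1.0
```

Note that the model never "knows" anything; it only reproduces the statistics of its training data, which is why the training set matters so much.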
And I'm going to pause there, because here's a headline: ChatGPT is not really just generative AI. Generative AI is one small component in a much larger system. The generative AI engine is surrounded by satellite systems that are putting guardrails [00:04:00] in place, that are permitting it to give certain answers, or maybe not permitting it to give other answers.
And so, when you use tools that have generative AI in them, and that's always a really important caveat, there is never just generative AI in the tool. Always understand that there are other systems at play interacting with that generative AI model. You're never actually accessing that prediction engine, that word-generation engine, directly.
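To make the "satellite systems" idea concrete, here is a minimal sketch of a guardrail wrapper. Everything in it is hypothetical: the model is faked, and the topic filter is a naive keyword check, far simpler than real moderation systems. The point is only that your input and the model's output both pass through other code before anything reaches you:

```python
# Hypothetical guardrail wrapper around a generative model. Names here
# (fake_model, guarded_chat, BLOCKED_TOPICS) are invented for the sketch
# and do not correspond to any real product's API.
BLOCKED_TOPICS = {"weapons", "self-harm"}

def fake_model(prompt):
    # Stand-in for the actual prediction engine.
    return f"Here is a generated answer about: {prompt}"

def guarded_chat(user_input):
    # Input guardrail: refuse disallowed requests before the model sees them.
    if any(topic in user_input.lower() for topic in BLOCKED_TOPICS):
        return "Sorry, I can't help with that."
    draft = fake_model(user_input)
    # Output guardrail: scan the draft before it reaches the user.
    if any(topic in draft.lower() for topic in BLOCKED_TOPICS):
        return "Sorry, I can't share that answer."
    return draft

print(guarded_chat("baking bread"))
print(guarded_chat("how to build weapons"))
```

The user only ever talks to `guarded_chat`, never to `fake_model` directly, which mirrors the point that you never access the raw prediction engine.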
Marc Beckman: So, ChatGPT is not generative AI.
Maxime Whaite: It is not 100 percent generative AI. It is built around GPT-3.5 or GPT-4. That is the nugget; that's the model. But it is way more than that. And see, this is the thing: generative AI has existed for over 10 years. It's an old technology.
What made it grow to 100 million users in three months were all of the surrounding systems. It [00:05:00] was, as we referenced earlier, the user experience, right? This idea that you could be conversational. A generative AI model alone doesn't have a little chat box that you can go and type things in. A generative AI model alone doesn't have, you know how it types out the answer bit by bit? That streaming thing. That's not generative AI. Those are other systems in place that are owning the streaming. To put it in tech-world terms, this is standard software development.
Marc Beckman: So there are different types of generative AI then, right? We're referencing language right now, specifically words, but what else is an example of generative AI?
Maxime Whaite: Right, so we take this principle: large data sets, and there are patterns in all data. You train the model on that data, and then it generates new data based off of that large data set. So text is a very common example, because the internet has billions and billions of words [00:06:00] on it.
It's a really good place to start training models from. But you can train on images, you can train on sound, you can train on video. There's an emerging world around this word, multimodal, that the industry is really interested in: let's not only do text. And everyone's seeing this in the recent ChatGPT enhancements, right?
You can chat with it and it can shoot out images now, right? So they're combining these different kinds of generative AI systems together. That'll be something to watch: what is the next mode? What is the next kind of thing you can produce? Is it going to be charts? There are all sorts of different kinds of media in the world.
Marc Beckman: So, just for everybody to come along, in case they're not sure they understand the term of art, multimodal, you're saying that means it could be language, it could be music, it could be artwork, it could be graphs, correct?
Maxime Whaite: Yeah, [00:07:00] literally anything under the sun and we're still discovering what modes are interesting to include.
Text was one of the most obvious modes to start with because, in a way, that's how we think, right? It's very easy to instruct in text. It's harder to instruct in images, and so on. So text is a foundational aspect of it. Even when you generate an image, you're always typically going to have this instruction of what image you want, or what modification to an existing image you want.
But from there, after we have text, who knows what new kinds of modes we're going to add.
Marc Beckman: So just by way of example, again, to keep the room together with us, you mentioned ChatGPT as it relates to language. Can you mention some generative AI platforms as it relates to artwork and music and other areas where people might be interested in leveraging these tools?
Maxime Whaite: Mm-hmm. So another famous one is DALL-E, if that rings a [00:08:00] bell. Yeah. So that's the idea that you can start generating images. There are a number of platforms, I don't have the specific names, where you can start generating music clips. And one that is very topical is generating people's voices. There are a number of different platforms that do that kind of thing. So yeah, you can speak into it, or put in a recording clip, and then you get someone's voice out of it.
Marc Beckman: So Max, how much should we trust these platforms right now? We see problems, right? Sometimes we hear the word hallucinations, and we'll read things that don't seem accurate and in fact are not accurate. Or I remember seeing an image just two weeks ago of Taylor Swift. I think she had six fingers, and she was kissing one of the coaches. Obviously that was a deepfake. So what is a hallucination, and how much should we trust generative AI at this point in time?
Maxime Whaite: Yeah. If we go back to that core definition, it's predicting the next [00:09:00] word, or predicting the next pixel, let's say. Just like your weather app might say it's raining when it's not raining outside. It's the same thing: the model has a certain confidence score for what might be that next word or that next pixel.
When we look at the heart of it, it's essentially just trying to get the pattern right. To use non-technical language, it's just trying to get the gist of the pattern, to be as close as possible to the patterns that exist in the source material. So hands are very tricky; that's one of the areas that these image-generating models have typically struggled with, because to get a hand to look right, well, it's sort of a rectangle with a bunch of digits sticking off of it, right? So it'll do something like that. But, going back to what I said earlier, it's not a reasoning machine or a logical thing. It's not [00:10:00] counting the digits on the hand to make sure the number is right. Now, the models are improving, so they're starting to add this kind of thing into them, and you'll notice that the image generation ones are getting better and better.
But the basic thing is it's just trying to get it to look sort of right based off of what it's already learned. And because this whole thing is based on prediction technology, a prediction is just that: a prediction can be wrong. That is the essence of prediction. And so when we come to hallucination, which is what we talk about in the text world, it says something that sort of sounds right. It fits in with the general pattern, but in fact the facts are wrong. It's invented, something that is just not true. That is the nature of generative AI.
Marc Beckman: So the artificial intelligence is working the way it should be working, the correct way, but [00:11:00] the output, the outcome, is incorrect?
Maxime Whaite: Yeah, because it's sort of gotten the pattern wrong. Now, we as humans, we don't like that, right? The purpose of using these generative AI systems, typically, like with ChatGPT, is that we want the facts to be correct. Now, that's not always the case, because you could use generative AI to write a novel, right? And if you're writing fantasy or fiction, you don't really care if the facts are correct. And so it really comes down to this: we decide what we want the system to be like, and we shape the technology to get to those outcomes.
So this idea of hallucination is a big problem for the piece of the industry that cares about factuality and accuracy. And that's why I say that ChatGPT is not only generative [00:12:00] AI; there are parts of it that are like watchdog systems.
This is a pattern that we're starting to see elsewhere in the industry. You might have generative AI produce a result, and then you might build a second system that checks the work of that result. It is only in use that we are encountering what the real challenges are, and as we encounter them, we're starting to build and design new kinds of systems and approaches.
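The generate-then-check pattern can be sketched like this. Both components are stand-ins invented for the example (in a real deployment the watchdog might be a second model, or a lookup against trusted sources), but the shape is the idea: one system drafts, another verifies before the user sees anything:

```python
# Minimal sketch of the generate-then-verify "watchdog" pattern.
# VERIFIED_FACTS stands in for a trusted source; generate() stands in
# for a generative model that sometimes hallucinates.
VERIFIED_FACTS = {
    "Paris is the capital of France",
    "Canberra is the capital of Australia",
}

def generate(question):
    # Pretend model. Note it confidently "hallucinates" the second
    # answer: plausible-sounding, pattern-correct, but factually wrong.
    canned = {
        "France": "Paris is the capital of France",
        "Australia": "Sydney is the capital of Australia",
    }
    for key, draft in canned.items():
        if key in question:
            return draft
    return "I don't know."

def verify(claim):
    # Watchdog: check the draft against the trusted source.
    return claim in VERIFIED_FACTS

def answer(question):
    draft = generate(question)
    return draft if verify(draft) else "Unverified; please double-check."

print(answer("What is the capital of France?"))     # passes the check
print(answer("What is the capital of Australia?"))  # hallucination caught
```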
Marc Beckman: So then, practically speaking, I'm sure everybody here is using GPT-4, I assume you are. What I've been doing is trying to check it. I now ask GPT-4: is this information up to date? Can you provide a source? Can you give me a link to this source? But is the artificial intelligence behaving, when it provides those sources, in a way that's 100 percent correct? [00:13:00]
Maxime Whaite: I'm interested to hear that GPT-4, ChatGPT, was able to provide you with links, because this is where we get into the fact that not all generative AI systems are implemented the same. ChatGPT is maybe one of the purest expressions of generative AI, at least as of when I was last actively using it, in that you are sending it instructions, right? Every question, every statement, every input that you send in is an instruction to the model. You are triggering it; you are in control of the output based off of your instructions. It's sort of like a trigger phrase: it responds to your instruction. That output is sourced directly from the knowledge and the facts contained within the model.
Other implementations of generative AI systems leverage something the [00:14:00] industry calls RAG, retrieval-augmented generation, which pairs up a generative AI system with a librarian service. So if I put an instruction in, that instruction doesn't actually go to the generative AI first. It first goes over to search, and it searches the entire internet for whatever is relevant to my question. It gets those results, passes them over to the generative AI, and says, with some additional instructions: answer this user's question based off of the relevant facts found on the internet. And that produces a completely different type of result, and typically grounds the answers more, because you're essentially embedding these facts into the prompt and requiring the model to answer based off of them. So we've seen hallucinations go down a lot in that approach, because the model already has the facts present, so it [00:15:00] doesn't have the opportunity to invent more of them.
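Maxime's "librarian" flow can be sketched end to end. This is a toy: the retrieval step is naive keyword overlap over three hard-coded documents rather than a real search engine, and the final model call is omitted. The point is that the retrieved facts get embedded into the instruction before any generation happens:

```python
# Toy RAG (retrieval-augmented generation) flow: retrieve relevant
# documents first, then build an instruction that grounds the model
# in those facts. All data and names here are invented for the sketch.
DOCUMENTS = [
    "The Eiffel Tower is 330 metres tall.",
    "The Louvre is the world's most visited museum.",
    "Mount Everest is 8849 metres tall.",
]

def retrieve(question, k=1):
    # "Librarian" step: rank documents by shared words with the question.
    words = set(question.lower().split())
    scored = sorted(DOCUMENTS,
                    key=lambda d: len(words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(question):
    # The retrieved facts are embedded into the instruction, so the
    # model is told to answer from them rather than from memory.
    context = "\n".join(retrieve(question))
    return (f"Answer using only these facts:\n{context}\n"
            f"Question: {question}")

print(build_prompt("How tall is the Eiffel Tower?"))
```

A production system would send `build_prompt`'s output to an actual model; here the grounding step is the whole point.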
Marc Beckman: Gotcha. Yeah. Alright, so let's move away from the downside and talk about the positive, because, you know, clearly there are amazing efficiencies, there are ways to close the digital divide, and beyond. So, as it relates to careers specifically, how can generative AI provide an advantage for perhaps everybody in the room and beyond? How can we see not just new job creation, but what can be beneficial to people that are looking for jobs and, as a result, using generative AI?
Maxime Whaite: So, the Dean made a really great statement earlier, which is that in private industry, generative AI is there, right? It's getting put into every single tool. Whether it's in email, to help you draft emails faster. Whether it's going to show up in spreadsheets, so that you can [00:16:00] manipulate data more easily without having to really understand how spreadsheets work. Whether it shows up in a Word document, and you can just instruct it to draft your entire proposal, or whatever work you have to do that day.
It is present. And so the opportunity for everyone out there is: how do you get really good at using that technology? You are the hero of the story, and it's going to make you a faster, stronger hero if you can learn to wield it. If it takes me two hours to write an email, because I'm really thinking over that email, staring at a blank page and trying to lay it out, versus cracking open my generative AI tool and saying, alright, this is sort of what I want the email to look like, and within five seconds I've got a working draft that I can start working with, I'm going to be more productive. I'm going to be faster. I can go to my coffee break earlier, right? That opportunity is up to me. And [00:17:00] employers and people hiring, I think that's what they're genuinely interested in: someone who can be effective and efficient at their work.
Marc Beckman: Yeah, that's interesting. It's actually happening all the way at the governmental level, too. In fact, one of our guests in a few hours is the New York State Assemblyman Clyde Vanel, who used artificial intelligence, specifically generative AI, to help draft a bill. He was the first legislator to draft a bill that way. And he came under tons of scrutiny from both sides for this, but one of his biggest takeaways from that experience was that it allowed him to give more time to his constituency. So it's really interesting how these efficiencies that result from using these tools can really make society function at a higher level. But what kind of new jobs do you think could come out of the fact that we have these new technologies? Every time we go through a phase of introducing new technologies, whether it was when automobiles were brought out, or computing, we're scared at first, but then it seems like more jobs are [00:18:00] actually created. So what kinds of jobs that are specific to generative AI do you predict we'll see in, you know, six months, a year, two years?
Maxime Whaite: Mm-hmm. And indeed, we're already seeing some of those new jobs. It's going to show up, I think, in various ways, depending on what sector you're in. There's a whole sector in the startup world of folks producing generative AI products, brand new startups. In the technology companies that produce the raw parts, that are training the models and building these AI search systems that people can use in their startups, we're seeing new kinds of jobs. For lawyers, for example: copyright and all of this evolving space on policy, I think, is going to be one of the places in the ecosystem where you're going to see a lot of new jobs.
For most of us, like in this room, [00:19:00] people talk a lot about this new title, prompt engineer. At its basis, what does prompt engineer mean? It means that you get really effective at putting in a good input to get the output that you want. And I think if you build upon that, there's going to be a whole new skill set. I don't know if these are necessarily going to be stand-alone jobs; they might be. I know in the educational world there are new classes coming into being, teachers that can coach you on how to put in good inputs, how to be a good prompt engineer to get the right output. But overall, even if there are not new jobs, all of us are going to be acquiring new skill sets.
Marc Beckman: So prompt engineering feels like it's the cornerstone of my every day, every time I want to use AI. I'm like, oh my God, am I doing this the right way? What should I really be putting in here? Is there a place where everybody in the room could learn how to become a better prompt engineer, how to heighten [00:20:00] those skills?
Maxime Whaite: Yeah, there are a number of places, including even ChatGPT, isn't that funny? What is really interesting, though, is that this is where literacy in terms of generative AI matters. Whenever you use a generative AI product, it's really important for you to understand: what is the model? Is it GPT-3.5? Is it GPT-4? Is it LLaMA? Is it one of these other ones out there, Gemini, for example? Because every single model reacts differently to prompts. We know, for example, for GPT-4, that whatever you write last is going to have a stronger emphasis. This is a concept called recency bias. So if you write a long set of instructions, you want to put your most important instruction at the end. We also know that if you want to put bullet points in your instructions, say, I want you to do five things, [00:21:00] and you put hashtags in front of those five different things, it will react to your instructions more consistently. But this is very specific to that one kind of model.
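As a concrete illustration, here is how those two GPT-4 tips, delimiting each instruction and saving the most important one for last, might be wired into a small prompt-building helper. The function and its exact format are made up for this sketch; the heuristics themselves come from trial and error, as Maxime notes, and may not transfer to other models:

```python
# Illustrative prompt builder applying two model-specific heuristics:
# 1) mark each instruction with a "#" delimiter so the model treats
#    them as distinct items;
# 2) put the most important instruction last (recency bias).
def build_prompt(task, instructions, most_important):
    lines = [task, ""]
    # Delimit each instruction as its own item.
    for item in instructions:
        lines.append(f"# {item}")
    # Recency bias: the final instruction tends to carry the most weight.
    lines.append(f"# MOST IMPORTANT: {most_important}")
    return "\n".join(lines)

prompt = build_prompt(
    "Summarize the attached meeting notes.",
    ["Use bullet points", "Keep it under 100 words", "Use plain language"],
    "Do not include any names",
)
print(prompt)
```

The structure, not the wording, is the point: the same instructions in a different order or without delimiters may be followed less consistently.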
And so, number one, those things are discovered by trial and experimentation, because it really is just based off of the output; that's what you see. Those discoveries get shared as people develop some notion that this really does make a difference. But number two, it really does matter what the foundational model is. I can't emphasize this point enough: this dry technical stuff really does matter, because you are going to interact with the system differently depending on what the underlying technology is.
Marc Beckman: It's kind of an interesting topic; let's dive a little bit deeper.
We all talk about the bias that's built into the algorithms, but what happens with people that learn how to prompt engineer better than their peers? Does that give them an advantage of sorts? Perhaps someone in a board meeting can prompt engineer better than a colleague on the board, and because they can work the AI better in that meeting, maybe they have access to more information at a higher level, and can then sway the opinion of that board, or in government, or something like this. What could be the impact at that level?
Maxime Whaite: You know, it's a really good point. If you are better at prompt engineering, you're potentially more efficient, or you're potentially getting the exact result that you would like to see, and maybe spending less time. And so when I say that skill set is so important, it is true. It needs to be democratized.
I think there is a role for schools to play, and I don't know at what level of schooling, but it is important for everyone, if this technology really becomes what the world thinks it will. I think McKinsey recently had a [00:23:00] white paper published estimating something like 40 trillion in productivity value to be added to the world economy. People are making really big predictions about this kind of thing. If it becomes as ubiquitous as everyone thinks it will, then it really does become important that everyone understands how to work with it, in at least a very basic way.
Marc Beckman: So there's more good coming out of generative AI beyond the corporate setting and beyond academics. Like, I think it's helping governments interact with their citizenry, and it's also providing citizens and constituents with the ability to access things that they have not been able to access, or might not even have known about before. So can you talk a little bit about how governments can use generative AI to create maybe a better environment and a better experience for the entire community?
Maxime Whaite: Yeah. There are essentially three trends we see in generative AI where it becomes really useful. [00:24:00] One is this idea of automation, essentially reducing boring tasks; you can just get them done fast and out the door. Two is to be more efficient, and we've talked a lot about that: you can get a draft of an email or a document going. But the third one is access to information. And what's really interesting is this idea of ChatGPT challenging Google search, right? That's fundamentally what that's about. It's not necessarily efficiency or automation; it's about access to information. And it changed the way the information was brought to you. It centralized the entire internet. I could just go ask my specific question and get my version of what I wanted to see, exactly as I needed it.
Governments, at their core, and this is my personal thought on it, are information distribution [00:25:00] organizations. Think about a government: not the person that you elect, but the organization, the system that runs, that needs to be there regardless of who's at the helm. Their job is to get you access to information so that you can get your benefits, you can pay your parking ticket; you need to understand how to work with them. Their interest is to get that information out to you as much as possible. And guess what, no surprise to anyone in this room, that's a challenge for government. When was the last time someone put their hand up and said, wow, the government's so easy to work with? My tax return is so simple, I could do it with my eyes closed, I definitely don't need some sort of tool to help me with that. Right? It's tricky. Government is complicated. It's like a body: there are all different kinds of organs, many different departments, each handling slightly different things. It's tough to navigate. Even government itself has challenges talking to [00:26:00] itself. And so, if we think of something like ChatGPT, which centralized the entire internet, there's this real big interest in government: could we make it easier for our residents, for our communities, to access government? Could we somehow, without necessarily centralizing government, provide this layer on top where it's on your terms?
You go, you ask your question, and the generative AI has access to all the pertinent information from the government and can generate that personalized response to you.
Marc Beckman: So this could be applied in a lot of different ways, not just with regards to paying taxes. It could be great if you're looking for solutions surrounding health care, or helping with your children, or access to other areas of personal finance, and beyond, right?
Maxime Whaite: Exactly. And it can do new things. If I have a question about what I can do for my children, let's say, [00:27:00] it could potentially answer the direct question I had in mind, which is, you know, are there any vouchers to pay for their daycare. But it may also pull some programs from other departments, like the Department of Youth in New York City, which has summer programs for kids to be busy during the summer, let's say.
It could pull from a number of different places and generate this very complete answer across the different silos of government, which is a very exciting possibility.
Marc Beckman: So it's interesting, because it could provide our communities with a broader view of all of the benefits that government brings to them. And yet we might not even know that those benefits are available.
Maxime Whaite: I think it is one of the eternal struggles of government: a lot of times there are some amazing programs that are just not well known. And how would you find out about such a program? You'd have to go to a government website. I mean, hands up in the room: when was the last time someone said, oh [00:28:00] man, I can't wait to go to a New York City government website? No, no hands?
Marc Beckman: What about taking it a little further? These are existing concepts, right? Perhaps we could go to San Diego's website, or Boston's website, or New York City's website. But are local government leaders going to be able to say, perhaps we could create new streams of revenue to benefit our community through the use of generative AI?
Maxime Whaite: I mean, one of the big preoccupations I personally feel that government has is economic development. If you create a thriving and prosperous community, number one, you have a happier population, right? That's your mission as a government. Number two, it increases your tax revenue base. You can then take that tax revenue, put it back into programs, and keep increasing the wealth of your community. There's that great feedback loop. So from a sense of revenue generation, a lot of the focus is [00:29:00] on how we can support small businesses.
In a lot of ways, it's spend money to make money, right? How do we get grants out there more easily? How can we continue to grow the local economy? And so it's not surprising, I don't know if anyone in the room is familiar with it, but chat.nyc.gov was launched by the mayor, first with small business services, around this idea of helping people start businesses.
Marc Beckman: So it's an interesting platform. Just by a show of hands, I'm curious: if you're familiar with this platform that NYC built, can you raise your hand? Yeah. So that goes to underscore your point; most people are not. It's interesting, I actually had a conversation with New York City's Chief Technology Officer, Matt Fraser, and the conversation is on my podcast, which NYU produces, called Some Future Day, and he breaks down the benefits of this platform in a major way. So, if you guys want to check [00:30:00] out and get more information as to what New York City is doing specifically as it relates to artificial intelligence, I would definitely suggest that you check out that episode with Matt Fraser. I think it's really insightful, and we really break it down in a major way.
But what happens... So there's a problem, I think, as it relates to cities, and I know we weren't planning to tackle this issue, but as you're saying it, I'm starting to realize that perhaps there are certain communities within New York City that need to get to this information and these benefits more than other communities. But we still don't have broadband accessible to every community in the city. We still don't have hardware accessible to every community in the city. So is there still going to be a problem with regards to the digital divide and access to this information, if we don't even have broadband in every community?
Maxime Whaite: Yeah, I think there's a fundamental agreement out there, and certainly a lot of discussion. The internet is sort of like water: [00:31:00] it's becoming so foundational that governments are trying to make sure you have access to it; it's not even an if. I think there have even been conversations around at least having a government-provided connection as a minimum.
Marc Beckman: I know that New York City has done this, actually, under this administration. Mayor Adams has done a tremendous job of bringing out broadband. I think they've brought it to over 400,000 families since he's been in office.
Maxime Whaite: On the flip side of that, though, something folks are really interested in is: can you have these models, can you have generative AI, offline? This is back in the tech world, but there's this idea that you can make the model smaller. An LLM, a large language model, is what we call the kind of model behind ChatGPT. It's very big; it's got billions and billions of words in its model and its knowledge. You can shrink [00:32:00] that down and make these things called SLMs, small language models. And the idea is that when we first went around building these models, we said, we're going to grab all the knowledge in the world and build this super general-purpose thing, so that if you put any input in, there's probably some pattern that already existed in that data and it'll probably be able to answer it.
The thinking is starting to change. If we think about us as humans, we don't memorize billions of words. We're really good at finding the information we need, but we don't memorize it. What we do have inside our brains is the ability to reason, to think through problems; we learn the foundations of how the world works. So there's a new kind of model starting to come out, based on a white paper whose idea is, roughly, that all you need is really good textbooks. It's [00:33:00] still generalized, but it essentially learns textbooks on biology, organic chemistry, project management, all of the different disciplines in the world. It's given these building blocks, and the idea is that you then have something approaching more of a reasoning engine, because it has these foundational things. And those models, being much smaller, you could run disconnected, on your cell phone, for example.
Marc Beckman: And that could be a good solution, so that everybody has access to that information.
Maxime Whaite: It's one of the approaches they're looking at, right? The question then becomes: you still need some sort of knowledge. As a human, how do I find knowledge today? Well, I Google it, which relies on the internet, right?
But there's this idea that you could have that knowledge stored locally on your device.

Marc Beckman: Do you think traditional search engines like Google will be gone imminently?
Maxime Whaite: It all depends on how we use it, I think. That's certainly one of the concerns for any search engine: what is the future?
And this is where we go back to the fact that this technology is really [00:34:00] old. Google's search engine already had generative AI. I don't know if you recall, have you ever used Google and seen the summary in larger font at the top? It had a digest before. That was generative AI. But this is where we come back to what makes ChatGPT really interesting: it's not only the generative AI, it's these other things around it, this new form factor, that made people go, wait a second.
So it's almost like search engines have already been doing generative AI. I think the real transformation is: are they going to start looking more like chat? Is it going to become more of a chat experience, and less of a question with a bunch of webpages underneath?

Marc Beckman: So, we covered a little bit of career, we spoke a little bit about cities and communities, but on a personal level, something I'm concerned about, and I think the room will probably share this concern, is my privacy.
So, I think that certain AI platforms, and I don't know if it's generative, but QuillBot, for [00:35:00] example, if I'm correct, recently said that if I put information in, they could keep that information. So what should I be concerned about? If my 15-year-old son is using some of these tools, are they going to be able to access his personal information, anything that he puts into the artificial intelligence?
What types of privacy concerns should we all have with regard to the images we're sharing on these platforms, and beyond?

Maxime Whaite: We should always have privacy concerns. That's the short answer. It goes back to why the dry technical stuff really matters: being literate in at least the building blocks of how these systems work is important, so that when you go to use a product, you understand what it's doing.
So for example, these models do not need to keep being trained. There's maybe this misunderstanding that if I use generative AI, it learns from my interaction with it. [00:36:00] It does not automatically learn. You can have a fixed model, like GPT-3.5, right? It was trained on the internet back in the day. It can be fixed, it can never be trained again, and it can still be useful; I can give it my data, and it will not learn from me.
However, as, let's say, the person who builds the system, I can make a decision to take the conversation history, save it from all of my users, and continue to train the model, or fine-tune it, or set the data aside and train a new model. That is a decision I make as the owner of that generative AI product.
And it will differ depending on the product. Each of those companies makes different decisions about how it handles data. So it is really important to understand, when you're using a generative AI product: is that model fixed? Is [00:37:00] that model learning? What is that model learning from?
Is it my data? How is my data being stored? And how is my data being shared? Because those are all implementation decisions that the people who built those products have complete control over.
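The distinction Whaite is drawing can be sketched in a few lines: inference against a fixed model never changes the model, and whether user conversations get saved for later training is a separate, explicit product decision made by the operator. The class and attribute names below are hypothetical, and the string returned by `respond` is a stand-in for real model inference.

```python
class GenerativeProduct:
    """A frozen model wrapped in a product that may, or may not,
    log conversations for later fine-tuning. (Names are illustrative.)"""

    def __init__(self, save_history: bool):
        # The operator's explicit choice; not a property of the model itself.
        self.save_history = save_history
        self.training_buffer = []  # data set aside for possible future training

    def respond(self, prompt: str) -> str:
        # Inference against a fixed model: the weights never change here.
        reply = f"model output for: {prompt}"  # stand-in for real inference
        if self.save_history:
            # Only retained if the product owner decided to collect it.
            self.training_buffer.append((prompt, reply))
        return reply

private = GenerativeProduct(save_history=False)
private.respond("my secret question")
print(len(private.training_buffer))  # -> 0: nothing retained

logging = GenerativeProduct(save_history=True)
logging.respond("my secret question")
print(len(logging.training_buffer))  # -> 1: set aside for fine-tuning
```

Both products use the very same fixed model; only the wrapper around it decides whether your data is kept, which is exactly why the answers to "is it stored, is it shared" vary product by product.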
Marc Beckman: So which models are accessing our personal data now?
Maxime Whaite: The models don't necessarily access your data. The thing that would get your data into the model is someone deciding to save your data and then start training the model on it. It has to be an active decision.
Marc Beckman: Right. Yeah. So are there any that you could call out that have been doing that? Like, where should we beware?
Maxime Whaite: I would need to double-check my facts. I think at one point, ChatGPT was saving people's conversations and leveraging them in some way, but I would have to check my facts on that.
Marc Beckman: So going back to staying personal, and moving [00:38:00] beyond privacy into the idea of built-in bias: if these LLMs have already had bias put into the algorithm, and then they continue, like, generally speaking, if they're saving people's inputs, there are a lot of people out there who are going to be racist or biased and whatnot.

Can we ever get over that hurdle and have an LLM that isn't filled with bias?
Maxime Whaite: Everything in the world has bias in it. We are never going to escape bias. I think one of the best examples of that is GPT-3.5 or GPT-4, right? These popular large language models. When they're trained, they're trained on data from the internet: Wikipedia, GitHub, all of these different kinds of content. And what's really interesting is that it's not just English that's out on the internet, right?
There are many different languages. So ChatGPT performs [00:39:00] really well in English, because most of the data is in English. But if you start going to the edge cases, like some of the West African languages, languages that are less represented in its data sources, it's not as good; we'd call that lower quality output.
So even from this very simple aspect of language, this is our understanding, right? What's in the data set matters, and if things don't show up in the data set, remember, it's always predicting the next word from the patterns it's already learned. If data doesn't exist, and patterns from that data are not in its training set, it's not going to be able to generate from them. So even in this example of language, there's a bias, and the bias is toward English. [00:40:00] [00:41:00]
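The mechanism behind that language bias can be illustrated with the simplest possible "predict the next word from learned patterns" model: a bigram counter trained on a toy corpus that, like the real web-scale training sets Whaite describes, is mostly English. This is only a conceptual sketch, not how GPT-class models actually work internally, and the corpus is invented for the example.

```python
from collections import Counter, defaultdict

# Toy corpus: heavily skewed toward English, like real web-scale data,
# with only one short French phrase and no Spanish at all.
corpus = (
    "the cat sat on the mat . the dog sat on the rug . "
    "the cat ran . le chat dort ."
).split()

# Learn the patterns: for each word, count which word follows it.
patterns = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    patterns[current][following] += 1

def predict_next(word: str):
    """Return the most frequently observed next word, or None when the
    word never appeared in training: no pattern means nothing to generate."""
    if word not in patterns:
        return None
    return patterns[word].most_common(1)[0][0]

print(predict_next("the"))   # -> cat  (well covered: seen many times)
print(predict_next("chat"))  # -> dort (French: seen exactly once)
print(predict_next("gato"))  # -> None (Spanish: never in the data set)
```

English words get robust, well-supported predictions; the single French phrase yields a brittle one; the Spanish word yields nothing at all, which is the "bias toward English" in miniature.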