"The Data Diva" Talks Privacy Podcast

The Data Diva E159 - Julie Schroeder and Debbie Reynolds

November 21, 2023 Season 4 Episode 159
"The Data Diva" Talks Privacy Podcast
The Data Diva E159 - Julie Schroeder and Debbie Reynolds
Show Notes Transcript

Debbie Reynolds, “The Data Diva,” talks to Julie Schroeder, General Counsel/Chief Legal Officer and AI/ML thought leader. We discuss the misconceptions surrounding AI and the importance of understanding its limitations and risks. We emphasize the need to differentiate between AI and data sets and the importance of understanding the ethics and bias issues related to AI. We also discuss the complexities of data and AI, including the potential for bias, the limitations of the data used, and the need for a comprehensive and integrated approach to data and AI analysis.

Julie shares her career journey, including becoming a tech founder and a General Counsel with expertise in AI and data. She emphasizes the importance of understanding data ownership, data lineage, and risk management in the healthcare industry, and the need to move away from the idea of a company as Santa's workshop with fragmented roles. People like Julie, who can understand and communicate with different groups, are vital to the future of corporations.

We discuss Julie’s extensive expertise in artificial intelligence (AI) and machine learning (ML) and engage in a deep exploration of the intricate relationship between law and Data Privacy in the context of AI and ML technologies. Julie begins by dispelling the common misconception that AI is a recent development, tracing its evolution back to the 1950s and 60s. She emphasizes that AI is not a singular technology but a category encompassing various subfields, including machine learning and natural language processing. Throughout the episode, the significance of understanding data lineage and ownership is underscored, particularly for companies seeking to leverage AI effectively. Julie also highlights the inherent limitations of AI, emphasizing that it cannot replace human judgment in complex and nuanced tasks. Moreover, she acknowledges the persistent challenge of bias in AI systems, shedding light on the difficulty of mitigating bias entirely. The discussion concludes by stressing the importance of comprehensive data collection, as the absence of data can lead to skewed results and erroneous conclusions. This episode offers valuable insights into the legal and ethical dimensions of AI and Data Privacy, encourages a more nuanced understanding of these technologies in today's evolving landscape, and closes with Julie's hopes for Data Privacy in the future.


35:42

SUMMARY KEYWORDS

ai, data, generative, people, company, human, general counsel, understand, privacy, year, data scientists, talk, problem, bias, natural language processing, error rate, nlp, generalist, started, ownership

SPEAKERS

Debbie Reynolds, Julie Schroeder


Debbie Reynolds  00:00

Personal views and opinions expressed by our podcast guests are their own and are not legal advice or official statements by their organizations. Hello, my name is Debbie Reynolds; they call me "The Data Diva". This is "The Data Diva" Talks Privacy podcast, where we discuss Data Privacy issues with industry leaders around the world with information that businesses need to know now. I have a special guest on the show, all the way from the DC metro area, Julie Schroeder. She is a General Counsel, Chief Legal Officer, and Artificial Intelligence and Machine Learning thought leader. Hello.


Julie Schroeder  00:45

Hey, thanks; nice to be here.


Debbie Reynolds  00:48

Yeah, I'm happy to have you on the show. You and I are connected on LinkedIn, where you drop some knowledge bombs. You and I decided that we wanted to chat, and we had so much fun talking that I thought, oh, you should be on the podcast. I really like people who understand the legal side and also really understand the data side of things. I feel like you really understand both, and I think that's why we get along so well. But tell me about your career journey and your interests, not only in law but also in data.


Julie Schroeder  01:32

So I did 15 years as a trial and appellate attorney, with Supreme Court cases. My last Supreme Court case was Philip Morris v. Williams, which set the standard for punitive damages in the United States, and it's still the case law. I was seven months pregnant. The only other women in the court were Sandra Day O'Connor, RBG, and Nina Totenberg. So after that, I was trying to decide what to do next. And someone said, this company is looking for a General Counsel. And I said to him, Brad, do I look like a 40-year-old white man from an M&A department of a big law firm to you? And he said, no, you don't. But they've been looking for a year, and there's no dearth of those, so they're clearly looking for something else. And that is how I ended up with my first General Counsel job. I got it in a Forrest Gump kind of way; you're just there at the right time. They were using natural language processing in healthcare for data, and I didn't have a lot of experience. But in order to, I thought, do the company justice and manage risk and make sure we were getting ownership of the right things, I went through the patent with the data scientists so I could understand where data comes from, where it goes, how it transforms, where they use regexes and why, the tokenization process and that whole thing. Then I went back and realized that the natural language processing with a human in the loop needed data ownership, because we were obviously making our product better with the data that we got from every customer. And that's kind of how it started. Also, you know, health care, so I was doing BAAs, Business Associate Agreements. And I realized that some of the BAA, which is, as you know, a floor, not a ceiling, needed to be changed; I needed to have a right to do this to the data, and I also needed the exemption for when it's infeasible to return or destroy the data. So that's where I started. And I was at that company for seven years before it was bought by 3M. And 3M made me the transition executive, which, for the General Counsel, I'm joking, just never happens; usually, like the CFO, you're asked to leave quickly. But I ended up as the General Counsel of 3M HIS, and if you want to talk data and what's being done with it, that's a cornucopia of different uses. So, I expanded my knowledge. And I do that by just talking to the developers, the data scientists. I can't walk by a whiteboard without asking, what is that? What does it do? How does it work? I guess I'm just a curious sort. I didn't really have a tech background, but I love learning, and I love learning about this. So I continued to add to my knowledge base, and in additional jobs that I took, I made sure that AI, ML, or analytics were a part of what the company was doing, so I could continue to apply and expand that knowledge base with different models. The first was an NLP model with multiple matrix regression; I've also done LDAs, which use information a little bit differently, and I had to learn about that. And, you know, to this day, I work the same way. And I love this topic, because it is so complex, and I like a good puzzle to solve. It's so complex because you can't just say it's a privacy thing, or a security thing, or an intellectual property thing, or a contract law thing, or a third-party license thing; it's all of them, depending on what you're doing with the data in your model. It's all of them at the same time. So, in 2016, and I know this is very odd for a lawyer,
I co-founded an NLP and qual analytics company with a data scientist and someone who had been doing analytics for a long time. That went well; we sold it on March 3, 2020, which was very lucky timing. And I know that's odd. But I got to use not only my tech skills but my business skills and my legal skills all at the same time, because I was doing the business strategy and the selling and the raising of capital, and I hadn't done that before. So I think that when you throw me back out as a General Counsel again, that makes me a better General Counsel. I call myself a generalist, which seems to be a thing now. It was not when I started; in fact, I was told not to do that, because everyone picks their attorneys by subject matter area, and if you're a generalist, you are not going to be able to get a job. I ignored that, because I didn't want to be stuck doing just one thing; again, I liked doing a lot of different things. And so, along with the tech, I'm a generalist across all of the different internal verticals and the problems that can arise. And I would say I have deep subject matter expertise because I've really gotten into it the way I've gotten into AI and data. So, apparently, I did all the wrong things, and here I am before you now.
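
To make the pipeline step Julie traced more concrete, here is a minimal, hypothetical sketch of regex-based tokenization, the kind of transform she describes walking through with the data scientists. The sample note and the pattern are illustrative only, not from any real clinical system.

    import re

    # Toy clinical note (illustrative only).
    text = "Pt c/o neck pain x3 days; denies trauma."

    # A regex splits the note into word, number, and punctuation tokens;
    # downstream steps would map tokens and phrases to codes, with a
    # human in the loop reviewing the output.
    tokens = re.findall(r"[A-Za-z]+|\d+|[^\w\s]", text)
    print(tokens)
    # ['Pt', 'c', '/', 'o', 'neck', 'pain', 'x', '3', 'days', ';', 'denies', 'trauma', '.']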


Debbie Reynolds  07:52

Yeah. Wow, I feel you; I understand you a lot. I think I'm similar in that way. People ask me, why don't you call yourself the Privacy Diva? Because data is the core of what I'm interested in, right? So I can go in any direction; I'm just choosing at this moment to focus on privacy. But I think it's just fascinating. I think people like you are the future of corporations, because you need to move away from the idea that a company is like Santa's workshop, where everyone has their little fragmented thing that they do, and magically, some toy pops out at the end. That's just not the way it works, right? Data is the lifeblood of organizations and runs through all parts of the organization. So if you don't have people who understand that and also know how to talk to people in different groups, you're not going to be a successful organization. So, I think people like you are vital to the future of corporations.


Julie Schroeder  09:00

Thank you; the same to you. In one of our first conversations, I reached out because I said, you call yourself a privacy lawyer, right? You've obviously got expertise in a lot of other things, which, you know, I would love to talk about as far as being the future of corporations. It's quite funny right now, because I get a lot of interest on LinkedIn. But if I were to apply to a position, it would be very hard for a recruiter. Recruiting is a check-box exercise, and I am very confusing to recruiters: she's a lawyer, but she started her own tech company, but, you know, she's a founder. That just doesn't compute. And the really funny thing is, if you use AI to parse resumes, which a lot of companies are doing now with generative AI, I will never be picked because of perplexity. Perplexity, as you know, is about the statistical correlation and probability of things that go together. And, you know, "the dog was on a leash" is an easy one. But a woman who's a Chief Legal Officer, who's a generalist, who knows tech, and who had a tech company does not make sense.
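
To make the perplexity point concrete, here is a minimal sketch with made-up probabilities, assuming a toy language model that scores each next word: a familiar phrase gets low perplexity, while an unusual combination of attributes scores high and looks "improbable" to the parser.

    import math

    # Hypothetical next-word probabilities from a toy language model
    # (the numbers are illustrative, not from any real model).
    familiar = [0.40, 0.30, 0.50, 0.35]   # "the dog was on a leash"
    unusual = [0.05, 0.02, 0.01, 0.04]    # lawyer + founder + technologist resume

    def perplexity(probs):
        # Perplexity is the exponential of the average negative log probability:
        # low means the model finds the sequence likely, high means surprising.
        return math.exp(-sum(math.log(p) for p in probs) / len(probs))

    print(round(perplexity(familiar), 1))  # low: reads as "expected"
    print(round(perplexity(unusual), 1))   # high: reads as "does not make sense"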


Debbie Reynolds  10:29

Right. One thing people don't understand about AI is that it's like a machete; it's not a scalpel. It doesn't understand the nuances that a human would need to assess whether something is accurate.


Julie Schroeder  10:45

I find that ironic.


Debbie Reynolds  10:46

Yeah, yeah, totally. What is it about AI that people don't understand? Well, this is a funny thing, a joke that you and I have talked about. Because of generative AI, now everyone's talking about AI; now everyone's an AI expert. People weren't really thinking about it before all the generative stuff hit the news or hit mainstream media. And so it's kind of funny and weird to see everyone become an AI expert all of a sudden. I hadn't really thought about it before, but what are some of the misconceptions, or maybe the one big misconception, that people need to understand about AI?


Julie Schroeder  11:29

We talked about this a little bit before. So, first of all, AI did not spontaneously generate six to eight weeks ago; I actually just wrote a post on this. Generative AI dates back to the 1950s and 60s, depending on who you're looking at and what they did. Machine learning and natural language processing started in the 1970s at DARPA; the problem, at the time, was that computers were not sophisticated enough to parse the way we needed them to. Also, we had to find a way to deal with human language, because that's why it failed originally: the idioms and the expressions and the way that you use things had to be taught to the math, to the statistical probability correlations, to make sense. We're at an inflection point. Generative AI, in the past year, has made some strides due to being able to tag better and to understand data better, but it's not new. And it's iterative, like any technology is iterative. So that's the first part. The second part is that AI is a category, not a thing. You might be using machine learning, natural language processing, AI that's visual, generative AI; there are a lot of different categories, and they're all being used right now by different companies to great effect. And I think it's about the proper application of the right kind of AI, the right, call it flavor, to your data, and then the goal that you have to decide on when you're game-boarding it and thinking about how to do it. It's the data. Where did the data come from? How are you going to use the data? How are you going to use the data in AI? And I'm separating those intentionally, because people seem to have a notion that data and AI are one thing, and they're not. It's data sets and LLMs that have propelled generative AI in the last year to be as good as it is. And data sets have their own issues; you have to go through the whole thing. You have to go through: where did the data come from? Do I own it? How can I use it? Does a third party have rights? You have to check all of those. Are there contracts where I'm supposed to give that data back if they terminate? If so, how am I going to do that? All of those things are not AI; it's just data. It's big data, and we've all been working with big data for decades. Add AI to that, and you have to ask those questions again, only you're asking them, you know, slightly differently, and you're also asking about security, privacy, intellectual property, contract law, again, in data ownership and use. They overlap on a Venn diagram, but they're two very different things, and I haven't seen a lot of people cotton on to that. I get more of the feeling, and this was the joke that we made, that I made, that people think AI is magical pixie dust that you can just sprinkle on something and it gives you a 10x on your valuation. That's not the case. And it's also not the case that it's fantastic and it's going to save the world and do everything for us. It's also not the opposite case, that it's going to cause the robot apocalypse and everyone's going to be replaced and people aren't going to have jobs again; that is very, very black and white. This is not black-and-white technology. It is gray. All of your answers are in the gray.
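
As a minimal sketch of how the data questions in that list could be tracked separately from any AI question, here is a hypothetical checklist structure; the field names and the example record are illustrative, not any standard schema.

    from dataclasses import dataclass, field

    @dataclass
    class DatasetDiligence:
        # Hypothetical fields mirroring the questions above.
        source: str                           # where did the data come from?
        owned: bool                           # do I own it?
        third_party_rights: list = field(default_factory=list)
        return_on_termination: bool = False   # must it go back if they terminate?
        permitted_uses: list = field(default_factory=list)

        def cleared_for(self, use: str) -> bool:
            # The data questions come first; the AI-specific questions
            # (security, privacy, IP) get asked again on top of this.
            return self.owned or use in self.permitted_uses

    record = DatasetDiligence(
        source="customer uploads under a BAA",
        owned=False,
        third_party_rights=["customer"],
        return_on_termination=True,
        permitted_uses=["product improvement"],
    )
    print(record.cleared_for("model training"))  # False: check the contract first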


Debbie Reynolds  15:40

Wow, I never thought about it that way, and I think that's all true. The best way I can describe it: I think of generative AI as an evolution as opposed to a revolution. To me, it's an evolution in computing. So, like you say, it's not a new thing; it is just the newest version, the newest improvement or iteration of a thing. And when people say things like, oh, generative AI is like the invention of the Internet or the smartphone, I think of it as more like a washing machine. It's going to have a horizontal impact on almost any type of industry if people use it the right way, right? The capabilities it has now are not human; it's not humanoid, it's not sentient, and it's not going to be sentient. But people who understand the technology and what it can do will, I think, definitely be able to go far. You touched on something that I think is very important, that people may not realize when they try to move into digital transformation with AI products, and that is the importance of data lineage. I feel like a lot of companies handle data now as if the data is in a castle: you lock the castle and protect everything inside it, but the company never has to truly consider where the data came from. A lot of it was the assumption that, okay, we have data in our possession, we have a right to the data in our possession, and we can do whatever we want with it. But now, companies are going to have to really check that data lineage, and they have to make sure that the data they have is data they should have. What are your thoughts?


Julie Schroeder  17:42

I've heard people call this many things lately: data lineage, data ownership. It's basically data, and can you use it, and do you own it? It's really quite simple. And yes, a lot of companies have data and have not thought about the data; there's basically data on the floor, and then they have to figure out what they're going to do with it. But the companies that I've been in, because of that extra analytics and machine learning and AI component, have been very different. I've had a very different experience, because data is everything; the data, where it goes, how you use it, and how you power it, is everything. And I also agree with you that what we're looking at is a tool. I liked the washing machine; I agree, and I think it focuses you when you're picking your goal. Your goal has to be something that's more automated; it can't be something that's very complex and nuanced, because that's not going to go well. So if you're talking healthcare, and I've seen people say this is going to revolutionize healthcare and generative AI is going to give the diagnosis to people, I'm sitting here going, no, that's never going to happen. But can it save time and automate tasks? Absolutely. That's the whole point of its creation; it was to help make things faster and automate tasks. And your take on generative AI being an evolution is absolutely correct. This past year, with near-sentient-seeming chatbots, I mean, they're getting quite good, and large datasets, and the ability to tag data better, in addition to computer processing just being much, much faster and parsing the complex better, that's led to where we are now. But it's a tool, just a tool. If you look at a bell curve, right, you've got the people who think the tool can do everything, the people who think the tool should never be used because it's evil, and then you have the middle, which has some ideas of how you could use it. But this whole thing six or eight weeks ago, when people were like, AI is here, we're having AI hackathons, you know, come join us. I was like, what is an AI hackathon? What the heck would that even be? What are you guys doing? So I'm with you on that, and on the sudden appearance of AI expertise on people's resumes lately. I think it's problematic, mainly because I think a lot of the people who are doing the hiring don't know enough about the topic. And so if someone is very, very loud about their expertise, even though they've had no experience, and they're using the right buzzwords over and over and over again, they're going to get selected, and how is anyone going to know that they're not doing what they should be doing? It's going to take a while, and in two years, they'll find out that they've made no progress and still have the same risk profile. That's what I'm really worried about.


Debbie Reynolds

I agree with that. Two things that you said are very spot on, of course. One is, I agree that a lot of people have decided they want to hang their hat on the AI arms race, right? But I feel like even though it may take time to find out that someone may not know what they're talking about, in my view, it's not a fake-it-till-you-make-it thing, because eventually the truth will come out, and you actually have to do things; you can't just talk. There has to be some action behind that. But then also, back to your medical example, I am thoroughly horrified when I see those articles, like, oh, AI is going to cure cancer, and it's going to do diagnosis.
And I'm like, that's not even an appropriate use for the tool, right? You're basically saying we want the tool to take over human judgment, and it can't do that. It's not made to be a way to abdicate your human responsibility. I think my greatest fear was always that people would abdicate their human judgment to AI. That's actually what's happening, and I'm horrified by that. So, what are your thoughts?


Julie Schroeder 23:20

That's how people are talking. But when we talked about the bell curve, those are the "AI will be able to do everything" people on the far left. And my thoughts are, I completely agree with what you're saying; it's never going to be able to approximate human judgment. And if you're going to try, you're not going to use generative AI; you're going to use NLP with a human in the loop, semantic analysis, sentiment analysis; that's as close as you can come, that I'm aware of, for now. And even that makes mistakes. So let's talk about that: there is an error rate; data scientists call them hallucinations. I use "error rate" because it's a little easier to understand. And back in 2007, when no one was using this, and Harry Potter had just come out, we called the mistakes Howlers, because the machine makes mistakes in ways that humans can't understand. So if it's trying to code, like, a neck bone, for example, it's very likely to code for the thigh, and a human would look at that and be like, the whole examination was about the neck; this is horrible, this is terrible technology, it doesn't make any sense, this example is completely off. That is what happens when machines make mistakes. That's what it looks like. It doesn't look like what happens when humans make mistakes, which should be your first clue that it can't replace humans.


Debbie Reynolds  24:47

Yeah, I want to go deep on a data theory with you. This is something that I think is problematic in AI and the way people use it, because some of these systems are programmed to give you answers even if they're not right. So where a human maybe wouldn't be able to give you an answer, the AI will probably give you an answer, even if it's not right. And maybe that's not a problem; say you're shopping for a dress online, and it gives you the wrong dress. You're not harmed by that; you may be annoyed, right? But if you're trying to use these systems to pinpoint a person who robbed a store or committed a crime, first of all, the potential harm to that person is astronomical, and then the accuracy isn't there, because it's trying to give you the best answer, not necessarily the exact answer. What are your thoughts about that?


Julie Schroeder  26:00

I think that all gets back to the error rate. When you're thinking about your goal and how you want to use a particular type of AI, you also have to think about the percentage of the error rate. Are you looking at generative AI, which tends to have a higher one, depending on the models that are out now, which we don't need to go through? Certain natural language processing types have smaller ones. But even if it were less than 2%, you have to ask: am I comfortable with getting 2% or less of these answers wrong? When it's identifying a human being for a crime? No, I'm not comfortable with that. When it's a diagnosis, I am not comfortable with that; as you say, it can't replace human judgment. So one of the things that you have to do, I think, is flip it on its head and say, okay, what judgment does this do, and how does it do it? And what are you looking at in terms of hallucination error rates, and is that acceptable as a risk? In a lot of things, it is acceptable. Where you draw that line, I think, becomes important.
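
To make that risk question concrete, here is a minimal sketch of the kind of threshold check being described; the use cases and numbers are hypothetical, not drawn from any standard or regulation.

    # Hypothetical maximum acceptable error rates per use case.
    MAX_ACCEPTABLE_ERROR = {
        "product_recommendation": 0.05,   # a wrong dress is an annoyance
        "criminal_identification": 0.0,   # a wrong match costs someone their liberty
        "medical_diagnosis": 0.0,         # human judgment required
    }

    def acceptable(use_case: str, observed_error_rate: float) -> bool:
        # Flip it on its head: not "how good is the model?" but
        # "what does a 2% mistake cost in this use case?"
        return observed_error_rate <= MAX_ACCEPTABLE_ERROR[use_case]

    print(acceptable("product_recommendation", 0.02))   # True
    print(acceptable("criminal_identification", 0.02))  # False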


Debbie Reynolds  27:18

Right. If a 2% error rate means two people went to jail, that's not acceptable. Maybe if that 2% means you get the wrong item you looked for on a website, that may be acceptable. So I think what you're saying is true. I also want to talk a little bit about AI audits. AI audits are going to be all the rage now; we're seeing laws being passed requiring some type of due diligence around AI. A lot of this started in Europe, and we're seeing it come over to the US in certain places, maybe not even called audits, but assessments and things like that. One thing that concerns me about data and how people use it is when you're trying to come up with human logic to figure out how to get the best result. There are holes in our logic, because there are blind spots in data that maybe we don't think about, right? We're programmed to try to do our best to find our answer. But sometimes, the best way to find the answer is to ask the opposite question. Have I gone too deep here?


Julie Schroeder  28:34

No, no, no, not at all. I think you're spot on with that piece. People need to understand where the data came from, but also what the data is. Let's talk generative AI, ethics, and bias, which are, you know, surprise, not the same thing; they're separate things. And while they're taking the forefront, they're not the first or second or third thing I would look at. But for your question, they are important. So yes, the data set has an inherent bias. If someone is scraping the web, there will be bias against women; there will be bias against many things. And Google came out fairly early and said that it's introducing unlearning bias, which means a person is going to go in, look at the things that are biased, and fix them; which is to say, a biased person is going to go in and try to solve for bias. That does not work. You cannot do that. There is no way that I know of to counteract bias. You just have to deal with it being there when you use it, and then figure out how you're going to use it in a way where it is not as much of a problem.
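
As a minimal sketch of why the bias "is just there" in scraped data, here is a toy representation check on a hypothetical corpus; it can surface a skew, but any "fix" applied afterward encodes the fixer's own judgment, which is the problem just described.

    from collections import Counter

    # Toy records from a hypothetical web-scraped corpus (illustrative only).
    records = [
        {"group": "women", "label": "executive"},
        {"group": "men", "label": "executive"},
        {"group": "men", "label": "executive"},
        {"group": "men", "label": "executive"},
    ]

    # Measuring the skew is the easy part; "unlearning" it is not.
    counts = Counter(r["group"] for r in records if r["label"] == "executive")
    total = sum(counts.values())
    for group, n in counts.items():
        print(f"{group}: {n / total:.0%} of 'executive' examples")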


Debbie Reynolds  30:00

Wow, that's a deep answer; I love these deep data thoughts. I have another deep question, or something I want to propose to you, which irritates me to no end. Here's the scenario. Let's say a town has a red-light camera program, and they put cameras on the South side of the city but not the North. The statistics that they get from that, they may interpret as there being less crime on the North side than the South side. To me, that's one of the ultimate data problems: an absence of data. You're making an assertion based on an absence of data, and you're not feeding that absence into the equation. What are your thoughts?


Julie Schroeder  30:54

That's an excellent point, and I haven't heard anyone bring it up. But that, again, gets us back to your starting point, which is: what exactly is your data, and what are its limitations? You need to understand those kinds of things, because human beings have a tendency, like you said, for their blind spots to translate into these kinds of things. Like your North side and South side: they only have cameras on one side, and then they draw a conclusion from it; I could see a human being doing that quite easily as well. So, you have to consider the completeness and complexity of the data set you're using for particular types of AI in order to avoid that absence of data skewing the results.
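
As a minimal sketch of the camera example, with made-up numbers: both sides of town have the same underlying violation rate, but only the South side is instrumented, so the recorded counts alone point to the wrong conclusion.

    import random

    random.seed(0)
    TRUE_RATE = 0.1          # assume the same true violation rate on both sides
    DRIVES_PER_SIDE = 1000
    cameras = {"south": True, "north": False}  # only the South side has cameras

    recorded = {side: 0 for side in cameras}
    for side, has_camera in cameras.items():
        for _ in range(DRIVES_PER_SIDE):
            if random.random() < TRUE_RATE and has_camera:
                recorded[side] += 1  # no camera, no record: absent data

    print(recorded)  # e.g. {'south': ~100, 'north': 0}
    # Naive reading: "the North side has less crime." The data set simply
    # never measured the North side; the absence of data skewed the result.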


Debbie Reynolds  31:54

I love this; I could talk to you for hours.


Julie Schroeder  31:59

You've done that before.


Debbie Reynolds  32:00

Yes, yes, this is so much fun. If it were the world according to you, Julie, and we did everything you said, what would be your wish for Data Privacy, whether legal or human behavior? What are your thoughts there?


Julie Schroeder  32:15

I would just love to go back to that bell curve again. Let's just take a holistic approach, please. And also separate the data, we talked about this, from the type of AI that you're using, because each deserves a separate analysis; they overlap on a Venn diagram, but you still need to ask the questions. Oh, and data lineage and data provenance are the two terms that have come up recently, which made me laugh, because I've only heard of data provenance in health care, and that's, you know, going back to what patients were using. But that's now being used instead to talk about data use rights and ownership. I think that's confusing; we need to be using the same methodology, so that's the holistic part, but also the same terminology. A lot of it is being randomly made up at this point in time, and I don't know what it means sometimes, and that is something that could be easily remediated. But with the holistic approach, there's no one thing you need to look for; we talked about use, the type of data, the quality, the complexity, where it came from, ownership rights. And then, if you're going to take that data, so you've figured that part out, now you're going to take that data set or LLM and use some type of AI, you've got to go through the whole thing again. And then you add privacy, security, IP, contract law, third-party licenses, and a bunch of other things; you have to look at all of those. A lot of companies right now are hiring someone to fix this problem, and I don't think they know what to call it, so they're using different titles: we need a product attorney who does this, we need a privacy attorney who does this. And for me, if you have an attorney that is a kind of one-note flavor, you've got a problem, because you're not going to have a holistic approach to what you're doing. But people are catching up. So, I think those things would be the ones that I would like to tackle. And the last one is an understanding of that kind of appropriateness for a task; I'd like there to be some consensus on that.


Debbie Reynolds  35:01

I agree with that wholeheartedly. Wow, I'm so glad we got a chance to do this. I'm sure this episode will be as illuminating for listeners as it was for me. It's fun for me to talk and geek out about data. Thank you.


Julie Schroeder  35:17

Anytime you want to geek out about Data.


Debbie Reynolds  35:21

Thank you so much. I'm sure we'll talk soon. Thank you.


Julie Schroeder  35:24

I enjoyed it. Thank you for having me.