"The Data Diva" Talks Privacy Podcast

The Data Diva E117 - Brendan Sullivan and Debbie Reynolds

January 31, 2023 Season 3 Episode 117

Debbie Reynolds, “The Data Diva,” talks to Brendan Sullivan, CEO at SullivanStrickler LLC. We discuss how he arrived at his current role from his early work on data retrieval, memory, and backup data systems as an electronics and telecommunications engineer, the importance of data backup, and how companies can use this technology to protect themselves from data loss. We also discuss the privacy implications of data backup and the regulations companies need to be aware of in this area. Brendan Sullivan discusses the legacy data problem, which he says came with Windows and UNIX: companies keep data too long and often don't have a good strategy for dealing with data at the end of its lifecycle. The conversation explores the differences between the old way of managing data, which was much more transactional, and the new way, which can be much more efficient. We discuss data in the cloud as a backup, de-duplication, the difference between data that needs backup and data that needs archiving, what data sessions are, the right to be forgotten vs. the right to deletion in privacy laws, the lack of access to legacy data, obsolete data access, air-gapping, chain of custody, the need for audit trails in tracking data access, and his hope for Data Privacy in the future.




51:05

SUMMARY KEYWORDS

data, backup, tape, companies, people, problem, largely, work, create, technology, privacy regulations, backup tapes, software, backed, terms, legacy, regulations, privacy, developed, risk

SPEAKERS

Debbie Reynolds, Brendan Sullivan


Debbie Reynolds  00:00

Personal views and opinions expressed by our podcast guests are their own and are not legal advice or official statements by their organizations. Hello, my name is Debbie Reynolds; they call me "The Data Diva". This is "The Data Diva" Talks Privacy podcast, where we discuss Data Privacy issues with industry leaders around the world with information that businesses need to know now. I have a special guest on the show. Brendan Sullivan is the CEO of SullivanStrickler. Welcome to the show.


Brendan Sullivan  00:38

Hi, Debbie; it's great to be here. Looking forward to it.


Debbie Reynolds  00:42

Well, this is going to be a fun podcast. So when we met on LinkedIn, you called me up, and you ended up inviting me to Atlanta for a roundtable with some of your clients, which was great. You're in the data retrieval space, and I think it's really fascinating because this is something that all companies deal with. But they think, it's in the back room, we don't have to worry about it. What we're seeing, especially in terms of privacy, is that we have laws and regulations that are saying, hey, what are you doing with this data? How long are you keeping this data? Why do you have it? So tell me a little bit about your trajectory into your work in data and to where you are now.


Brendan Sullivan  01:37

So I graduated as an electronics and telecommunications engineer and then moved into a company that developed 64K static RAMs, which is really aging me. That was back in about 1984, '85. And then in '85, I moved to a company that was developing the world's first square tape cartridge with IBM. I didn't work for IBM; we were working with IBM. It was the 3480 tape cartridge, back in '85. So since '85, I haven't really left tape or backup. It's been kind of a lifelong career in that area. Initially, I would say the first 15 to 16 years of that was in development, production, marketing, and selling of backup tape technology: the 3480; we built the first DLT tape cartridges, and I was a member of the team that did that. So a lot of the old technologies, and we also retired the last nine-track tape technology. So it's always been tape and backup. And then around the late 1990s, we were getting squeezed out, if you like, of being an effective tape manufacturer; the technology was getting extremely dense in terms of the recording, which lent itself much more to companies like Sony and Fujifilm to develop, and we needed to reinvent ourselves. I was tasked with part of that: where do we take our technology? So at that point, we moved to services, and I moved to the States in '99. I should say I worked for a company called Anacomp at the time, Anacomp Magnetics; many of the listeners out there might know that Anacomp Magnetics actually got spun off to form eMag Solutions. So it was originally Anacomp. And as we were reinventing ourselves, we had the idea that we would offer recovery services from backup tape, rather than just manufacture backup tapes. We stumbled into doing work on the Enron matter and then on Zubulake v. UBS. 
And that really was the trigger for us to get into more of the legal compliance side of data services. From that moment on, we built technology to be able to restore and produce data from backup environments. In 2011, I left the company after 20-something years and spent a couple of years working for AlixPartners as a legacy data remediation consultant. In 2013, I came back into the backup industry specifically, so to speak, and formed SullivanStrickler. So we're coming up on 10 years old. And we've been building a company that specializes in restoration, production, advisory, and consulting services in and around primarily the backup market. But I would say in the last five or six years, the vast majority of our technical development has been triggered, if you like, by changing regulations and Data Privacy requirements on that data. And that's lent itself to the tools and services that we now offer.


Debbie Reynolds  05:35

Fantastic, and I got a chance to tour your facilities. Very impressive. I would love to talk with you a little bit just about legacy data in general, right? This is a topic I like to talk about because I feel like not enough people talk about legacy data. All companies, unless they're brand new, have some type of legacy data: maybe stuff that has aged out in some way, or data stored by software that people no longer have or use. So tell me about the legacy data problem. To me, it's a huge problem because companies keep data too long, first of all, and then a lot of times they don't have a good strategy for how to deal with data at the end of its lifecycle. Tell me a little bit about that.


Brendan Sullivan  06:37

Yeah, there's a lot to unpick in that question. I would start by saying that, from my perspective, the legacy data problem, so to speak, really came with Windows and UNIX. If you go back 30 years or more, the vast majority of data that was being created and backed up was managed on things like Unisys systems, IBM mainframes, and IBM AS/400s, and particularly with the mainframes, the way that they managed data in the '70s and '80s was a much more transactional type of process. You would create a data set, write it to tape, update that data set, and rewrite it to tape, so there was a much more daily backing up of data. And of course, what that largely meant was that the terminal was dumb, so to speak, in that it didn't have its own storage where this data was created. But what it also meant is that you weren't backing up junk that you didn't have to back up. So, as the PC came to the fore and processing in and around Windows or UNIX environments developed, it became much more economical for software application programs to be run locally, and for the data then to be archived or backed up on a daily or weekly basis with different methodologies, you know, incremental, differential, and full backups. And of course, what this meant was that data was dumped to disk and then backed up to tape, and subsequently backed up to disk, like a NAS or SAN-type environment. But it was all together. And so for the IT administrators of the day, the major challenge with the data being backed up was their backup window, which meant that you got a certain amount of time, at the end of the day or at the end of the week, to back up the company's data from all of these different people who were networked on the system. And the push was, we've got to back it up faster, and we've got to back it up cheaper. 
And so the software companies answered with, okay, we've got a great new way of multiplexing or multithreading, or we'll put a whole string of tape drives together and back up the different portions of data that you've got to back up as fast as we can. Imagine it's like a bucket, a bucket of all sorts of different types of data that gets backed up as quickly as it possibly can. And then that bucket is your media repository, whether it be a tape or disk. And that's great. You've backed it up; you've done it quickly. You've got it safe. But the regulations that have come in, and the litigation that has since shown itself, require the unraveling. And I think the key is in the language: when you talk to people about backup software, you're talking about backup software. Very few, back in the day, called it backup and recovery or backup and restore software. So there was much more focus on backing it up quickly than on retrieving it. And what privacy regulations and other regulations have since proven is that it's the finding of that data and the working off that data that has made it a problem. Consequently, I think there are really two fundamental ways to organize data. You either get it right at the start, where you push data into known locations so that it can be easily identified and therefore retrieved, or you have to index the lot to figure out what you've got after the event. And we've been going through a 20-, 25-year transition where both those sides of the field have been worked on. On one side, records and information management techniques and roles, the ability to categorize and classify data correctly, have dramatically improved as companies have taken this far more seriously. On the other, the ability to index and search, with machine learning and AI-type technologies and search platforms, has made it easier to find. 
So we're solving the problem from two angles, or trying to. But we have a massive bucket of largely unstructured data, that is, legacy data, where we don't know what it is and we don't know where it is. And we do know that regulators think it's important. Companies think it's important. Litigators think it's important. And candidly, that's why SullivanStrickler exists. That's what we're in business for.


Debbie Reynolds  12:10

Very good. I would like to talk about the cloud a bit. Before the cloud, people had data on servers, right? They had these backup routines. And I want to touch on something that you and I had a chat about on LinkedIn; I thought you made a very interesting point, which was that backups were only seen, or should have been seen, as a kind of short-term storage. And now people see them as something different. So I want to talk to you about that. And then also, when people started putting things in the cloud, there were some people who would say, well, since it's in the cloud, I don't need a backup of it. Right? I don't think that's true. Because unless you pay extra for it to be backed up in some way, the way that the cloud environment has historically worked is that it's real time for you. So let's say John is a disgruntled employee; he goes into the cloud and deletes all your stuff. You go back to the cloud vendor, and you're like, well, this happened two days ago, John deleted all this stuff. And they're like, well, it sucks to be you, kind of, right? So tell me about that issue with the cloud. And then also your thought that backups were really, initially, supposed to be short-term storage until you can update them.


Brendan Sullivan  13:34

So it's heavily misunderstood in terms of, you know, what a backup is or what it should be. And I think that's largely been corrected as professional companies, over the years, have recognized how data was being managed inappropriately, and now it is largely being managed much more appropriately. I would define backup as a short-term duplication of data. So if I'm creating data in a company on day one, and there's a disaster of some description, whether a server goes down, or there's a fire or an accident or something like that, I need to get that data back the following day or the following week. Otherwise, I might write that data over. So if I'm creating data on day one, and then I go home, and I come back and create slightly different data the following day, then it makes perfect sense for me to have a backup of that data on day one and then discard it on day two, because I've refreshed it. Now, if I don't do that, and I'm creating 100 gigabytes on day one and 100 gigabytes on day two, then I've got 200 gigabytes when maybe I only need 105 gigabytes, so to speak. So backup needs to be recognized for what it is. It's a short-term duplication of data. Archiving is different. It's data that you have to keep, data that will not change in a short period of time, and data that you might have to keep for seven years or five years, or whatever it is. And so you push it to a similar media repository for safekeeping. And I think what's happened over the last 20, 30 years is that we backed up everything. And the terms archive and backup have not really been distinguished adequately. 
And that has resulted in a huge amount of what I would call junk. When you're creating any data in an organization, you might have data that you need for business, data that you need for regulatory reasons, for compliance and audits, and data that you might have to preserve for litigation. Outside of that, everything else really is junk. And it's definitely in vogue to move a lot of data to the cloud now. It's probably a good archive platform; I don't think it's a good backup platform, and I don't think anybody would argue that it was. Because it's largely been shown that egressing data, pulling it out of the cloud, is extremely expensive and can be slow. Putting it in there, if you're not going to touch it frequently, can be very cost-effective. So I think it makes a lot of sense for companies to put the minimum amount of data possible in the cloud, and data that is actually going to be archived and not looked at on a regular basis. Otherwise, it's going to get expensive, and it's going to cause problems. Various organizations like Gartner have done research into the typical data that gets put into storage repositories, whether archive or backup. Outside of that business, regulatory, or litigation pool, there's the junk, which might be a personal photograph attached to an email that a friend has sent you, or something like that. Do you really want to pay money to keep that in five different instances across your backup repository? Not least of which, there's all sorts of risk data that you don't want out there: data that might have been risk-free to have in your organization on day one, but by day 10, it's high risk; it's a liability. 
So I think pushing data to the cloud needs to be extremely intelligent; records management needs to play a very, very big role in that to avoid what will certainly be expensive, problematic issues going forward when it comes time to retrieve it.


Debbie Reynolds  18:09

Let's talk about duplication or de-duplication, right? Keeping copies of the same thing over and over and over, as you say, gets very expensive. And this has been the way that people did it forever because they felt that was just the safest way. But now, as a result of a lot of current and pending privacy regulations, they're saying you should keep data no longer than is necessary unless you have some other reason to keep it. So that's a whole new world in terms of how people keep stuff. I remember when people were running out of room, let's say they were doing a hot backup on a server, they would say, well, just buy a new server, right? Just throw the backup on there, and that would just keep going on and on. I know companies still have those issues, but talk to me about the challenge that companies create for themselves when they store duplicates and how you help them with that.


Brendan Sullivan  19:15

Yeah, so, again, a lot to comment on there. If you take a typical organization that has a backup or an archive system, it can be very cost-effective to push that data to whatever the media might be, whether it be tape, typically, in the older days, or NAS and SAN and various other network storage applications. The duplication of data is largely initiated by a poorly understood retention schedule, that is, how long you need to keep stuff. And therefore, when you are backing it up, are you backing up the same data on multiple occasions, and where is that data going? We would classify data by type, by category, as sessions. That's largely how backup software manages data. So, for example, if I have a print server, a Microsoft Exchange server, a SharePoint server, an Oracle database, whatever it is, the backup software organizes the data backed up from those servers into sessions. I might back up my Oracle data as one backup session, and I might back up my print server as another. And then all of those sessions can go to the same repository. But they may have different retention schedules; in fact, they almost certainly will. And once they've gone to the same storage repository, managing them to different retention schedules becomes problematic. So if I have data that might become a risk or a legal issue, and it co-populates a storage repository with something that has to be kept for 30 years, then clearly one backup session forces you to keep something that you don't want to keep for legal or risk reasons. So the ability to remediate data by session can be extremely useful, including for regulations driven by Data Privacy, for example. 
So it may be very difficult to take somebody's PII or PHI that's backed up on a storage repository somewhere out of that repository without taking everything else out. But if a whole data class doesn't need to be kept, it can be much easier and a lot more cost-effective to take only that class of data out of the system and maintain the stuff that you have to keep long term. In fact, I would hazard a guess that 80 to 90% of the technical development work that our software engineers do is focused on elegant and efficient ways to unravel data rather than restore any particular data. Because I think it's increasingly the case that companies are recognizing the risk, the potential Data Privacy challenges, and the regulatory problems, and they may be getting it right from day one forward. But they've got this reservoir of potential risk and potential problems, and unraveling it is the challenge. We've developed technology for that. And I think it makes the most sense to unravel the data that is at risk of falling afoul of a Data Privacy regulator by session, or by server, if you like, rather than by the individual custodians themselves.
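
The session idea described in this answer can be sketched in a few lines of Python. This is an illustrative model only, not SullivanStrickler's software or any backup vendor's format: the `BackupSession` class, the `split_by_retention` helper, the source names, and the retention periods are all assumptions made up for the example. It shows two sessions sharing one repository while following very different retention schedules, and how remediating by session separates them.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class BackupSession:
    """One backup session on a shared repository (hypothetical model)."""
    source: str          # illustrative server name, e.g. "oracle"
    backed_up_on: date   # when the session was written
    retention_days: int  # assumed retention schedule for this data class

    def expires_on(self) -> date:
        """Date after which the session no longer has to be kept."""
        return self.backed_up_on + timedelta(days=self.retention_days)


def split_by_retention(sessions, today):
    """Return (keep, remediate): sessions still inside their retention
    period, and sessions whose retention period has passed."""
    keep = [s for s in sessions if s.expires_on() > today]
    remediate = [s for s in sessions if s.expires_on() <= today]
    return keep, remediate


# Two sessions written to the same tape, with very different schedules.
tape = [
    BackupSession("oracle", date(2015, 1, 1), retention_days=30 * 365),
    BackupSession("print-server", date(2015, 1, 1), retention_days=90),
]
keep, remediate = split_by_retention(tape, today=date(2023, 1, 31))
print([s.source for s in keep])       # sessions that must stay
print([s.source for s in remediate])  # sessions eligible for removal
```

Working at the session level means the 30-year data never has to be touched in order to clear out the 90-day data; in practice, the hard part is the unraveling of the media itself, which this sketch does not attempt to model.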


Debbie Reynolds  24:20

I agree with that. That's a tough problem. Because we're moving from a situation where a lot of companies, too many companies say, well, let's just keep everything forever, and then we don't need to think about it. But now you really have to think about it, right? Because now you have all these issues.


Brendan Sullivan  24:39

So I have a question for you, Debbie, if I can.


Debbie Reynolds  24:41

Yes, yes.


Brendan Sullivan  24:42

So, on this topic, like I mentioned, probably 80% of our technical development is about unraveling this data, and we see a lot of the Data Privacy regulations. The one that seems most relevant to legacy data is the right to be forgotten; that's the term that's been largely used with GDPR, although a lot of the privacy regulations have something similar, wherever somebody requests that their data be removed. We see many instances, and sanctions, if you like, where instructions to remove data are put into effect. But we don't see too many instances where companies are forced to respond to those right-to-be-forgotten requests on legacy data. And I'd love to get your take on why you think that is. Is that something where they're going to get sharper teeth as Data Privacy regulations evolve? Is it a lack of knowledge as to how to go about it, or a lack of ability? Where do you see this developing with regard to legacy data?


Debbie Reynolds  26:14

Good question. I think a lot of it is around lack of knowledge and lack of ability, in some ways. In the US, we don't have a right to be forgotten. But in some places, we have a right to deletion. And I think sometimes those rights are time-bound, so companies don't have to go back forever and ever to remove your data. But then also, it's pretty interesting: there are some cases in Europe around Google and companies like that, where they were basically saying, you know, Google, we want you to de-link this data on the Internet. That doesn't erase the data; it just makes it harder for people to find. And this is not a blanket thing altogether, but there have been some cases, especially in legal circles, where the thought was that if something was on backup tapes, it would be too difficult or too expensive to access and deal with. So I think it's more on a case-by-case basis. I don't see anything that says you definitely don't have to go into legacy data or whatever. But if it is easier to access in the future, I'm sure that the people who bring these cases will probably want to do that. I remember when people had stuff on the premises, and lawyers would say, oh, we can't get data from backup tapes because it's not reasonably accessible, right? But that's no longer true, right? With a lot of the technology that you have, some of that stuff is easier to access. It isn't super laborious, and it probably isn't as expensive; as you say, you don't necessarily have to unfurl everything to be able to do that. So as more technology comes to bear, I think the capabilities are there. But the people who are asking for this or thinking about it need to have more education about what is possible. 
So I think the thought had been, oh, it's not reasonably accessible because it's on backup tape or whatever. But we know the technology is more sophisticated than that. And maybe it is accessible, right? Or maybe they need to change the way that they handle their data retention.


Brendan Sullivan  28:56

Yep. I couldn't agree with you more. The costs and the challenges that were evident 15 years ago are not necessarily the same today. So yeah, I couldn't agree with you more.


Debbie Reynolds  29:12

Yeah, yeah. Now, one thing I want to talk to you about is obsolescence. You and your team do something fascinating that I've not seen anyone else do. So I'll give you an example, and then I want you to tell us a little bit about what you do around writing software for obsolete data. Let's say someone has a court order, and they have to keep backup tapes or some type of hardware with data on it for 10 years, 15 years, 20 years; I have actual cases like this, right? The problem with that, and maybe this is very similar to what you were thinking about with legacy data on backup tapes and the right to be forgotten, is that if you have a court order where you have to keep that data for this long period of time, the knowledge about what's on those data systems is being lost, because people are leaving organizations or retiring, and the applications that stored that data are probably no longer available. So if you feel like you're doing your duty by just locking this data in a room for, like, two years, I think that's a problem. Because tapes also degrade, right? There's degradation there, and there's obsolescence there. So tell me about how you help people in those areas. They have data they have to keep for some reason; it may be old, it may be legacy, they may not know the software, and maybe it's on media that is degrading or going to degrade at some point. How do you handle those situations?


Brendan Sullivan  31:09

So in terms of tape degrading, you are absolutely right. It does. There's no doubt about it. I think it's largely an old problem, though. If I was to go back in time 30 years, then I think the vast majority of the risk of aging preventing anybody from getting hold of a file that they wanted was probably in the tape. These days, the risk of anybody not getting a file is more likely to be that the software is no longer available or supported, or that the hardware is no longer easily available or supported. The tape itself has developed. Thirty years ago, iron oxide was the material largely used, certainly with nine-track tape. Then a compound that came out was chromium dioxide, which was used on the 3480 and various other tape technologies. Then metal particle, or metal pigment, was the coating material used. More recently, it's become barium ferrite, and now strontium ferrite on the latest LTO tape technologies. With each of those materials, the particles have gotten harder and harder, and they require a lower and lower signal to read the flux transition, the actual bit of data, the one or the zero, that's recorded. So the data lasts longer on the tape surface and requires less current to read. And because the particles are much harder, the stuff that we used to see, like edge damage to tape or debris being generated, is much, much less. So I would say that if you have purchased a tape technology in the last 10 to 15 years, you're going to be reading that in 30-plus years’ time. It's much, much more solid. Now, in terms of the software, what we do is write routines to enable data to be read not just from tapes but also from disk environments like Data Domain. 
We write routines for all of this backup software that essentially trick the storage media, whether it be tape or disk, into believing it's in its original environment. And here's what that means. Say a backup and recovery software company developed a backup product 15 years ago and stopped supporting it five years later because there was a new, latest, greatest way of backing up data. When we get that storage media into our engineers’ hands, they look at the data at its lowest, rawest level, work out how that data was backed up to that storage media, and then write code that enables that data to be restored in a standard Windows environment. So we can pick up something that might have been created in Unisys, Unix, Sun Solaris, AIX, AS/400, or MVS. All of these environments have thousands of different backup routines that have been created over the last 30 years; we create a repository in our own software that can call upon each one of those emulators, which trick the media into believing it's in that environment, and therefore get hold of the data. What that means, with the case in point that you had, is that if somebody locks it in a room, and that software or hardware is no longer available, we've probably got code in our repository that will read it. And if we haven't, then we'll be able to read the data at a hex level and figure out how to write routines to restore it. That can be extremely effective, not just for restoring the whole file but also for learning about the data itself at a metadata level. Often, you know, nine out of 10 times, you can make decisions on data based on the metadata rather than the content itself. So it can provide a lot of elegant ways to do that, also.


Debbie Reynolds  36:17

Yeah, I thought that was fascinating when I saw you and your team do that. Because that's such a huge problem that people have; they think, oh my God, I don't have the software, I don't have the hardware, I don't know how to access it. How do I get this data? How do I use it or get insights? It's really cool. Well, one thing I want to talk to you about is air gapping. First of all, maybe you can explain to people what that is. And the reason why I'm bringing this up is, you know, maybe I'm dating myself now, right? Back in the day, when I started working in technology environments, I remember we used to have meetings about whether we were going to put something on the Internet. Now it seems like everything's connected to the Internet. And so I think one thing people can do to try to reduce some of their risk is not put everything on the Internet. So give me an idea of, first of all, what air gapping is and why it could be a benefit to people in terms of securing their data.


Brendan Sullivan  37:25

So an air gap is simply a computer system or a piece of data that is not connected to the Internet; it's as simple as that. When any company acquires a backup environment or any kind of computer environment, if it's connected to the Internet, then it is not air-gapped. And I'm a large tape advocate, as you might guess from somebody that's been around the tape industry since 1985. It's certainly not appropriate these days to use tape for backup; there are far more effective backup solutions out there than just using tape. For archives, I think it's still the perfect medium. And the factors that make it a perfect medium are that it lasts a long time, it's portable, and it's secure: it's unhackable. You can't hack a piece of data that's air-gapped, certainly not one that's residing on a piece of tape. So increasingly, when dealing with companies' risk departments, this is what they want to know if you're going to be working on their data. Everybody who has ever done a vendor risk assessment has had these 60-page documents that come from financial services firms or other firms, and there seem to be 20 to 30 pages that say, tell me about your vulnerability testing, tell me about your penetration testing, and all this kind of stuff. We run systems that are air-gapped, and what that means is that a lot of that really is irrelevant, because the data that we work on is static, it's inaccessible from the Internet, and we process it separately. So that is what the air gap is. And that's why it's critical, certainly in our space of servicing clients that are highly regulated, have risk at the top of their minds, and are obviously involved regularly in litigation. They want to know that their vendors and service providers have air-gapped solutions. So even within our data centers, outside of the tape, which is obviously the easiest example of an air-gapped medium, we have air-gapped intranets; within the company, you can't log into them from outside, as the network is internal.


Debbie Reynolds  40:08

That's great. That's great. I love that. I'm glad you guys are doing this; I was very impressed with how you all handle yourselves. Let's talk a little bit about the chain of custody. So a chain of custody is kind of a boring term, I guess. But it's very vital in terms of who has data and what they're doing with it. And having that chain of custody, in terms of knowing who has what will definitely lower your risk profile because then you know what your exposure is. Tell me a little bit about the importance of chain of custody and data.


Brendan Sullivan  40:51

So if you're in computer forensics, and you talk to potential clients about the chain of custody, it largely starts and ends with, you know, here's my laptop, and I'm handing it over to you. And here's the serial number, and here's who I am, and you sign here that you've received it, and I'll give you my signature back when I hand it back, and that's a chain of custody. And then, you know, once that chain of custody has been transferred, you're responsible for that asset, that digital data asset, until it gets handed back. And that, you know, is part of an evidentiary route that is important. But the chain of custody is a much more detailed event, especially when dealing with major corporations and highly regulated industries. And in that, it's not just the vendor or the recipient of the media that needs to be involved, or that is often required to be involved, at a much more extensive level. It's tracking of both the physical asset and the digital assets within that physical asset throughout the whole process. It's audit trails, it's logs that are created at every stage of the processing of data. So the data controllers, so to speak, might ultimately end up being the data processors of data that you receive. It's A, have I got your media, whether it be a laptop, a backup tape, or a disk? B, when did you assign control? C, where was it when it was in your organization? And D, where did the data go once it was in your organization? And then, I think we're up to E or F, what people had access to it? When did they have access to it? When did they pick it up? When did they hand it over? And then finally, how was it modified, if it was modified? So the chain of custody documents that we often see with highly regulated, big businesses can end up being 25-page documents or more, from the time that the asset is picked up to the time that the asset is either destroyed or handed back.
So it can be an extremely extensive piece of tracing of every part of that data and that physical asset that's involved.
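
The tracking questions listed above (who has the asset, when control was assigned, where it was, who accessed it, and whether it was modified) can be sketched as an append-only log in which each entry carries a hash of the previous one, so that a later alteration becomes detectable. This is only a minimal illustration, not a description of any system discussed in this episode: the names CustodyEvent and CustodyLog, the field names, and the sample values are all hypothetical, and a real chain-of-custody process would also involve signatures, physical controls, and much more.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from typing import List

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


@dataclass(frozen=True)
class CustodyEvent:
    """One hypothetical chain-of-custody entry for a physical or digital asset."""
    asset_id: str    # e.g. a tape barcode or laptop serial number
    action: str      # e.g. "received", "moved", "accessed", "modified", "returned"
    actor: str       # who performed the action
    location: str    # where the asset was at the time
    timestamp: str   # ISO 8601 string supplied by the caller
    prev_hash: str   # digest of the previous entry, linking the chain

    def digest(self) -> str:
        # Hash a canonical JSON form of every field, including prev_hash.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


class CustodyLog:
    """Append-only log; each new entry is chained to the one before it."""

    def __init__(self) -> None:
        self._events: List[CustodyEvent] = []
        self._head = GENESIS  # digest of the most recent entry

    def record(self, asset_id: str, action: str, actor: str,
               location: str, timestamp: str) -> CustodyEvent:
        event = CustodyEvent(asset_id, action, actor, location,
                             timestamp, self._head)
        self._events.append(event)
        self._head = event.digest()
        return event

    def history(self, asset_id: str) -> List[CustodyEvent]:
        # Who touched this asset, where, and when, in recorded order.
        return [e for e in self._events if e.asset_id == asset_id]

    def verify(self) -> bool:
        # Recompute the chain; any edited or reordered entry breaks a link.
        expected = GENESIS
        for event in self._events:
            if event.prev_hash != expected:
                return False
            expected = event.digest()
        return expected == self._head


if __name__ == "__main__":
    log = CustodyLog()
    log.record("TAPE-001", "received", "vendor-intake", "loading dock",
               "2023-01-31T09:00:00Z")
    log.record("TAPE-001", "accessed", "analyst-a", "processing lab",
               "2023-01-31T11:30:00Z")
    log.record("TAPE-001", "returned", "vendor-intake", "loading dock",
               "2023-02-02T16:00:00Z")
    for entry in log.history("TAPE-001"):
        print(entry.timestamp, entry.action, entry.actor, entry.location)
    print("chain intact:", log.verify())
```

Chaining the hashes is one common way to make an audit trail tamper-evident. It does not prevent someone from changing an entry, but a change to any earlier entry causes verification to fail.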


Debbie Reynolds  43:50

And I think from a privacy perspective, we're going to see a lot more emphasis on, especially like I said, those tracking logs, like the audit trail about who touches data, because some of these laws, you know, are not just about data breaches, right? So a data breach is one thing, but then we're seeing a lot of companies get in trouble for unauthorized access, right? So it's not necessarily a quote, unquote, breach. But if someone has some type of unauthorized access, there probably is some type of audit trail that, if it exists, tells you either that it happened or it didn't happen, right? So I think the world is getting into a very data-intensive part of life right now. Especially because there's so much onus on maybe the lineage of data, like, you know, who has it, where did it come from, where does it go to, who has access.


Brendan Sullivan  44:54

Yeah, you're right, Debbie, and I might, you know, broadly classify that as just defensibility. Increasingly, you have to have defensible, demonstrably defensible processes. And that means chain of custody, tracking logs, and metadata at every single stage of the process. And that can be the larger part of the actual work that needs to be done.


Debbie Reynolds  45:25

Yeah, I agree with that. So if it were the world according to Brendan and we did everything you said, what would be your wish for either privacy or data handling? What do you want in the future if you had your wish in terms of how people handle data?


Brendan Sullivan  45:43

So I think it's a good question. I think there's some silliness in the world of Data Privacy regulation. You know, right now, it's obviously an evolving area. There's been not just talk, there's been a lot of work towards more Federal Data Privacy regulations in the US, and Europe has largely led. But there's still some real silliness. And I'll elaborate from my perspective on what looks to be a little silly. You know, there's Google in France, for example. You know, if you search for something here, you don't have the authority to view it outside of this country. But yet, the Internet is everywhere. And so how do you apply regulation to a particular country when it's so clearly easy to breach that regulation? You know, if I was to view data in any other country, then I would simply get a VPN in that country, and I would simply go and view it. So there are regulations that appear to me to be created, and huge amounts of money and time go into creating these regulations, when they're so easy to breach and so hard to enforce. And so I think, you know, to translate that to a wish, this is not a country problem. This is a global problem. And I believe that Data Privacy regulations can only truly be solved in the modern world at a global level. And so maybe more time and effort should be spent between countries on creating standards, rather than having bodies that develop various regulations in each individual country. Would that be too much of an ask, do you think? I don't know.


Debbie Reynolds  47:58

That's a great answer. I would love it. As you know, I've talked about that a lot, too, because I feel like we all have the same problems, even though we're in different jurisdictions. So I feel like so many of us have similar problems. And maybe if we can come up with some baseline understanding, you know, about some baseline things, we can actually get more done. So I agree with that. I agree with that.


Brendan Sullivan  48:22

Yeah. And if you kind of follow through from that wish, you know, it shows itself in huge inefficiencies as a result. Again, different countries create things like Safe Harbors. Or yes, you can look at my data, but only if you're in this country, or only if you're in this certain jurisdiction, or maybe I can create a Safe Harbor for this, but not for that. And if there was much more standardized global regulatory oversight, then, you know, those could be dramatically simplified as well, which would really lower costs and make discovery a lot simpler, I think.


Debbie Reynolds  49:07

Yeah, I agree with that. And I'll also add to your wish: I wish that a lot of these regulations were written with how technology works in mind. So with some of the things that are written, it's like, well, that's not how technology works. So you're asking for something that, first of all, either can't be done or can't be done efficiently, or people try to develop something that doesn't exist to answer that question. So I think we're kind of chasing our tail on some of this stuff.


Brendan Sullivan  49:07

Yeah, yeah. 


Debbie Reynolds  49:17

Well, it was great having you on the show. I really enjoyed spending time with you and your team and people in Atlanta, and I'm happy that people get to listen to this episode, because I feel like everybody, all companies, have to have a backup of some sort, right? So these are problems that almost any company can have. So being able to shine a light on this, you know, the legacy part of data or the end-of-life state of data, matters to me. This is where I feel like a lot of companies fall down. So being able to have someone like you who knows a lot about that, I think, is really helpful.


Brendan Sullivan  50:24

Excellent. Thanks so much, Debbie. It's been a pleasure.


Debbie Reynolds  50:27

Thank you. Thank you. Talk to you soon.