Episode Description
What We Discuss with David McCaw:
- 00:00 Intro
- 08:54 What is Data Security in Cloud?
- 19:38 Importance of Data Classification
- 23:37 Data Security – Cloud vs On-Prem
- 27:45 Access Control is not Enough
- 29:46 What controls should we have?
- 35:06 Is DLP harder to manage in Cloud?
- 37:40 Is relying on Cloud Native Controls enough to Secure Data
- 42:09 Structured vs Unstructured Data in Cloud
- 47:28 Data Hygiene Controls
- 55:37 Fun Section
- And much more…
THANKS, David McCaw!
If you want us to answer your questions on one of our upcoming weekly Feedback Friday episodes, drop us a line at ashish@kaizenteq.com.
Resources from This Episode:
- Tools & services discussed during the interview
Ashish Rajan: [00:00:00] Hello, hello, hello, and welcome to another episode of Virtual Coffee with Ashish. I’m presenting this on behalf of Cloud Security Podcast. It’s a new month, and this month we are talking about the trends in cloud security that we haven’t been able to cover throughout the year but that are going to have a huge impact in 2022 and moving forward. To kick it off, I have a really interesting topic that I don’t think we have ever spoken about on the show.
So I’m so glad that I’ve got a friend of mine coming in to talk about this. Hey David, how’s it going?
David McCaw: I’m doing well.
Ashish Rajan: Good, good. I’m so glad you could come in, man. You and I have had so many conversations, and it’s not even funny how often we were like, oh, there is something here. So I’m so glad I could bring you onto the show as well, man.
I’m really excited for this. I wanted to start with you, because I know a lot about you, man, but some of the audience members may not. They may not even know about your celebrity status.
So I would love for you to tell the audience a bit about yourself.
David McCaw: Sure. You and I, of course, crossed paths maybe about five years [00:01:00] ago, when I was the first go-to-market person at a company called RedLock, which was one of the early leaders in the cloud security posture management space.
Obviously, it was acquired by Palo Alto Networks and became Prisma Cloud. But I’ve been in the industry for roughly 20 years, and the path that I’ve followed has been very carefully selected, from the kind of organization that I joined to the space that I wanted to tackle.
When I came into the industry 20 years ago, I was with a company that was doing software testing automation to help people improve software quality as part of the development life cycle. We have some interesting stories there, and Ashish, I shared one with you: during that time, in 1998, I actually sold the first piece of commercial software ever to Sergey Brin when he was in college.
They were working on Google as a prototype. They had some memory leaks that they needed to resolve, and they called and asked if we could provide them with a student discount on some software to find their memory leak. And so we did just that.
And [00:02:00] lo and behold, Google becomes one of the biggest companies on the planet. But I spent a good amount of time there really focused on how you improve the SDLC. How do you create a more predictable SDLC through automation and continuous verification?
We really pioneered some of the early principles that have kind of created what DevOps is today. So while initially the focus there was on code quality, I then joined a company called Coverity, where we started to shift from just focusing on quality to application security, taking AppSec, which was historically kind of an end-stage process, just like testing was an end-stage process, and helping CISOs and leaders at that time turn AppSec into something that would scale by shifting it left in the development process.
So that was kind of the early phase of my career. As new technologies have emerged, particularly cloud, a lot of the principles that were applied in application quality were now also applicable in cloud. [00:03:00] The moment that you could start provisioning infrastructure as code was also the moment that you should start validating that infrastructure as code, the same way that we’ve always validated code historically.
As I mentioned, I was very, very early in the journey of RedLock and building that out as a company. Since then, I’ve had the opportunity to work very closely with the founding team at CloudKnox, a leader in infrastructure entitlement management for cloud. And most recently I’ve joined as a co-founder at Dasera, where I’m very, very excited about what we’re doing next.
Ashish Rajan: That’s pretty awesome. And I wonder how many people think that if Google called anyone now for a student discount, it would be the other way around, like, hey, can you work with us, please? But it’s an amazing story, man. I don’t know, using the word veteran makes you sound pretty old, but you are kind of a veteran of the cybersecurity space, having met the celebrities and being a mini celebrity yourself as well.
And RedLock definitely was one of the first ones to stand out. The [00:04:00] breach at Tesla that was discovered by them definitely got them in the news.
We’re talking about data security, and a lot of people have a very broad understanding of what data security is, so I’m curious to know: what is data security in cloud?
What does that mean for you?
David McCaw: Yeah. Great question. I think to fully contextualize that you need to take a little bit of a step back and think about what cloud has meant to industry as a whole, right? How has cloud enabled digital transformation? How has cloud enabled businesses to start moving faster?
Right. So if you look at this whole push towards digital transformation, it’s really fueled by data. There are a bunch of interesting statistics out there, and I’ll share a few of them for the listeners in case they haven’t heard these. In 2020, it’s estimated that there will be 175 zettabytes of data collected and stored across cloud environments.
Okay. And to put that into context, 175 zettabytes is the equivalent of [00:05:00] all data that’s been collected and stored in the aggregate history of mankind. Wow. Yeah. So if you take our entire history and benchmark that against just what we’re collecting in terms of data this year, it’s even, and that number is growing 30 to 40% year over year.
And if you think about that in terms of our daily lives, right, organizations are now trying to capture every piece of telemetry that they possibly can to assist in automated decision-making, extract as much value as they can, and help their business move at a much faster pace. Right.
So what comes first, right? Is it the cart before the horse? In terms of what is data and what is data security in cloud, cloud enabled the mass collection of data that fuels this digital transformation, and we need to look at these things in harmony rather than individually.
So the data collection is massive because of modern cloud architecture and how easy it is to deploy infrastructure in cloud. You’re also encountering a [00:06:00] challenge where cloud velocity is moving faster than governance is able to keep pace.
The data environment is now decentralized compared to how it used to be before people were in the cloud. Back in the day, you’d open a ticket with a DBA or your IT administrator and ask for a database, and they’d go rack something in your colo. And that was it.
But now people are spinning up Redshift instances and RDS instances and Snowflakes and BigQueries. So now you have this very distributed environment that’s housing much, much, much more data. That creates a challenge in terms of governance and security because of the complexity of that environment.
And then throw into the mix the fact that there are over 160 countries now that have passed privacy legislation around how you can actually use that data and how you need to protect it for consumers. So there’s a lot that goes into, call it, a data governance program, and even more that needs to be considered when you look at that in the same [00:07:00] context as your overall cloud security program. That’s what we’re helping people try to understand.
Ashish Rajan: That really shows the depth of the work that you guys have done in this space as well. For data security, to what you just mentioned, there is privacy, and there is the fact that now, in a cloud world, a lot of people don’t really have the database administrator title.
That used to be a thing. Now, as a developer, I can just create a database whenever I want, because I just need the right template, and I can create a database of any size that I want. That can be really scary as well.
How do I put this? A lot of people think about data security from an oh-my-data-breach perspective. They go, oh, I don’t want to get a data breach. When people talk about cloud and data, they’re really thinking, oh, I just want to prevent a data breach. Now, it sounds like there’s a lot more depth to it, because what you just said is another layer that no one’s talking about. People are talking about, hey, my S3 bucket is open or something else is open.
They are not talking about the fact that there is this silo [00:08:00] between my data privacy team and my data storage. So, is there more to this whole data breach thing than just an S3 bucket being open on the internet?
David McCaw: There’s quite a bit more, and transparently, I’m also learning about this every single day as we’re trying to get to the root of the challenges across industry.
One thing that I didn’t say when I framed the problem is that there are studies that have indicated that more than 50% of security professionals have little to no confidence in their data security posture. And although we’ve been applying methodologies like DLP for many years, the number of breaches has not gone down at all.
In fact, it’s growing. And there’s one other key metric, Ashish, that I’ll share with you that I just thought was egregious considering all of the investment that’s taken place over the years for data protection. Because when you think about it, our job in security is really two things.
First, it’s protecting the resilience of the infrastructure, and second, it’s [00:09:00] protecting the secrets, protecting the data. So if you look at those as the two core things that we’re trying to accomplish in security, we’re doing a pretty bad job on that whole data side. The metric I was going to share is that, according to IBM’s data breach report, it takes on average 287 days to detect and contain a data breach.
220 days is the detection side of it, and 67 days is the containment side. But like I said, with all the investment that we’re making, why does it take so long to detect a breach? I think your question was also, is it just the breach that we care about? Well, if you look at the big picture, as a security professional, we care about protecting against the breach, right?
As a privacy person, we care about adhering to the law, protecting our organization from any potential legislative risk, and enabling our business to grow faster in various regions around the world by complying with the laws that are mandated in order to operate in those parts of the world. And then the data team themselves, your [00:10:00] CIOs, your data scientists,
they care about being able to harness the value of data so they can push the business faster and gain a competitive edge by being able to use data more freely. So if you look at that as driving the business, oftentimes right now security has to introduce itself as friction in that process.
So they’re actually slowing the business down while they’re thinking about how to protect against breaches, and not enabling the business to use data more freely. So when you think about what the right strategy is for data in the cloud, given that the landscape has changed, there are new business drivers, new architectures, and multiple constituents.
I do think we need to think about more than just the breach aspect and think about what the drivers are from each pillar within the business. Is there a way to help them collaborate harmoniously that achieves all of their objectives? That’s what we’re trying to solve.
Ashish Rajan: Oh, my God.
There’s so much to unpack there, because as [00:11:00] you were talking about privacy and the 160 countries, I just had a moment. When most security professionals talk about data privacy, they just tell me, oh, it’s not security, that’s a legal problem. That’s another silo that we’ve created, unfortunately, in our security teams, where data privacy has become more of a legal thing rather than a security thing.
So for GDPR it’s, oh, I need to hire a lawyer in the UK for GDPR. And it’s really interesting that you mentioned the time it takes for someone to detect and contain a breach, because, as you mentioned, that makes sense. If I have to deal with another team to find out, hey, who owns this?
What’s the right thing to do over here? I would never have the information as a security leader to know what the legal requirement is for a particular country in a global organization. So the time to detect a breach is definitely relevant, and the cost of the breach probably as well. I can’t imagine it being a cheap exercise, with the growing problem of zettabytes of data that you’re storing.
I did not even know zettabyte was a term. [00:12:00]
David McCaw: What comes after that as well? I don’t know. I should Google it.
Ashish Rajan: Yeah. I was like, zettabyte? I’m going to Google it for sure later on. I’ve got a comment here from Vineet as well: data classification is also an important part when making strategies for data security. Yeah, a hundred percent. Do you want to add onto that as well?
David McCaw: Absolutely. So there’s a lot to unpack there. What we’ve learned is that data classification is the starting point for any data protection. However, what we’re seeing is that the context around data classification isn’t truly centralized in most organizations today.
So you’ve got the business, who understand the data. They understand the purpose of the data, the sensitivity of the data, and the nuance around it, and they hold all of that context. Then you have security people, who aren’t the experts on the data and who are depending on their security tools to interpret automatically what the context of the data is.
And there’s no tool out there that does an amazing, bang-up job of data classification. It’s impossible. There are too many custom data [00:13:00] types and too many different use cases from business to business, where something might be sensitive for one organization and not sensitive for another.
And then you have the legal team, who understand the implications of law on the data based on the type of data. What we’ve seen happening is that each of these constituents runs their own technology stack to power their core objectives. Data teams have data catalogs to facilitate data sharing. Security teams have data security tools to try to block bad things from happening.
Legal teams have GRC systems where they document their compliance with various laws. The legal team needs to go bother the data guys to understand the context so they can populate their GRC system. The security team needs to bother the data guys to help them configure their security tools. And it’s never a perfect exercise.
It’s never set in stone, and it constantly evolves, so it’s a constant overhead of work. Right. And the fundamental thing that’s broken there is that the [00:14:00] context that is owned by the business isn’t shared and passed through in a continuous fashion, through technology, to the security and privacy stakeholders.
So that creates a lot of overhead, and everybody’s reinventing the wheel, trying to reclassify using whatever tool set they have. That’s where we’ve seen a lot of the breakdown in the processes. So Vineet’s spot on, 100%. The foundation for an effective data security program is effective data classification.
The more granular you can get, the more effective that program can be, but it all boils down to accuracy and context.
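The point about classification context being owned by the business and passed through to security and legal can be sketched roughly as below. This is only an illustration under stated assumptions: the `FieldTag` record, the `CATALOG` entries, the sensitivity labels, and the regulation mappings are all hypothetical and are not taken from any tool mentioned in the conversation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldTag:
    """Classification context recorded once by the business owner (hypothetical schema)."""
    table: str
    column: str
    sensitivity: str          # assumed labels: "public", "internal", "restricted"
    purpose: str              # why the business collects the field
    regulations: tuple = ()   # laws the legal team has mapped to the field


# Hypothetical catalog entries, for illustration only.
CATALOG = [
    FieldTag("customers", "email", "restricted", "account login", ("GDPR",)),
    FieldTag("customers", "ssn", "restricted", "credit check", ("GDPR", "CCPA")),
    FieldTag("orders", "order_total", "internal", "billing"),
    FieldTag("products", "name", "public", "storefront"),
]


def security_view(catalog):
    """Columns a security tool should watch, derived from the shared tags."""
    return sorted(
        f"{tag.table}.{tag.column}" for tag in catalog if tag.sensitivity == "restricted"
    )


def legal_view(catalog):
    """Map each regulation to the columns in scope, using the same tags."""
    scope = {}
    for tag in catalog:
        for regulation in tag.regulations:
            scope.setdefault(regulation, []).append(f"{tag.table}.{tag.column}")
    return {regulation: sorted(columns) for regulation, columns in scope.items()}


print(security_view(CATALOG))
print(legal_view(CATALOG))
```

The design choice being illustrated is that both views are computed from one record, so neither team has to reclassify the data on its own.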
Ashish Rajan: That’s pretty awesome. And I think you touched on something I was going to ask in terms of the scale of data security challenges, and you touched on a really interesting thing, that there are multiple touch points for data at any given point in time in any organization.
So a good one there, by the way, thanks Vineet. I think bazillion is the next term after zettabyte, actually. I don’t know if it’s a proper term, but [00:15:00] everybody’s like, yeah, a bazillion times. So it’d be really interesting to know if it’s actually a term or actually a numerical value.
So I was going to ask: we spoke about data security in the cloud context. We have collected data in traditional environments as well. Is that quite different? What’s the difference between that and what you’re trying to do in cloud? Because it’s not that data security wasn’t a challenge on-premise. We probably still had privacy teams.
We still had legal teams. So what’s the difference for data security in the cloud context?
David McCaw: Yeah. Great question, Ashish. You always ask great questions.
Ashish Rajan: I appreciate that, man.
David McCaw: So, a few things. Number one is that in the traditional world, everything was a little bit more centralized, and it was easier to deploy kind of a moat around the crown jewels, a perimeter around the crown jewels, and try to protect it through infrastructure.
Right. So you’re creating this perimeter around your data, and that was the effect of your security program. In the cloud, you’ve got a few things that have fundamentally changed. One is that there is no more traditional [00:16:00] perimeter, right? So you can’t quite as easily build the same type of insulation around your data that you did in an environment where things were not ephemeral, things were not as dynamic, and not everybody across the organization had the ability to deploy their own infrastructure.
Number two is, once again, this chicken-and-egg thing. Because of cloud, and because the CSPs themselves have made it very cheap to store data, right, because storing data doesn’t cost you the money, it’s when you’re actually moving it around and using it, they’ve encouraged the collection of these large data sets.
Right. And with the advent of data science, et cetera, businesses are seeing more and more reason to gather more data. With that, you have more functions in the business that are now using data, right? It’s not just an application that’s interfacing with the database and using data just to serve the purpose of the application.
It’s really forecasting, machine learning, all of these different models. So you have more and more types of [00:17:00] sensitive data, more people using it, more decentralization, and more legislation. Those are all factors that come into play around why it’s becoming more complex in the cloud.
Now, there’s always a trade-off, because while there’s more complexity in terms of the problem set, the advantage that we have in the cloud is that you can deploy data infrastructure as code. So, if we go back to the theme that I shared with you guys around how I built my career, following application quality into application security, then into infrastructure security via cloud security posture management, then infrastructure entitlement on the CloudKnox side, and now data,
people moving their infrastructure to cloud was the impetus for being able to go validate that infrastructure and understand the context of the entire environment through utilities like logs and APIs. So you can contextualize and get a whole picture of the landscape and all the [00:18:00] interactions with data in a programmatic fashion today, which was much more difficult to achieve in a traditional on-prem world.
So while you have the complexity, you also have the ability to create answers around it that are novel, that are only made possible by the visibility that we can get into the problem.
Ashish Rajan: Interesting, because as you were saying this, a thought came to me about when I talk about data security to someone who’s probably coming from a traditional environment.
One thing that gets thrown at me by people is, hey, I’ve got access control. I’ve totally got this under control. So are you saying access control is not enough?
David McCaw: Access control is not enough. Access control is absolutely needed, right? But it’s not enough, because you can never get it right.
Ashish Rajan: Oh, I totally get it right. I don’t have any more permissions than I need to have, David.
David McCaw: Yeah, exactly. Exactly. And then when you couple that with all the different people who need access in your organization these days, the movement of people between organizations, and the pivots of use cases around data, access control needs to be there.
But with access control, in most [00:19:00] cases, most users are overprivileged. It doesn’t protect you against insider threat. It doesn’t protect you against the event where somebody’s credentials have been compromised. So you want it as a first line of defense.
Absolutely. But then you need to triangulate that with: what is the sensitivity of the data that your users actually have access to? What are the business obligations against that data? How are they actually using the data, and are they using it in line with the business purpose? Is there any indication of compromise, whether their credentials have been stolen or they’re a disgruntled employee trying to take all the important information with them before they leave?
Access control alone is not going to solve any of that. Right. You need to be able to contextualize what the interactions with the data really are and interpret that to figure out if it’s something that’s safe or not.
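The triangulation David describes, combining an access that was already allowed with the data’s sensitivity, its business purpose, and how it is actually being used, might look something like the sketch below. Everything here is an assumption made for illustration: the `DATASETS` metadata, the `TYPICAL_ROWS` baseline, and the ten-times volume threshold are hypothetical and do not come from the interview.

```python
from dataclasses import dataclass


@dataclass
class AccessEvent:
    """One read that access control has already allowed (hypothetical shape)."""
    user: str
    dataset: str
    rows_read: int
    stated_purpose: str


# Hypothetical metadata supplied by the business for each dataset.
DATASETS = {
    "customers": {"sensitivity": "restricted", "allowed_purposes": {"support", "billing"}},
    "web_logs": {"sensitivity": "internal", "allowed_purposes": {"analytics", "support"}},
}

# Hypothetical per-user baseline of rows read in a typical session.
TYPICAL_ROWS = {"alice": 200, "bob": 150}


def review(event, volume_factor=10):
    """Return the reasons an allowed access still deserves a closer look."""
    meta = DATASETS[event.dataset]
    reasons = []
    # Is the data being used in line with its business purpose?
    if event.stated_purpose not in meta["allowed_purposes"]:
        reasons.append("purpose outside business obligations")
    # Is the volume unusual for non-public data (possible compromise or exfiltration)?
    baseline = max(TYPICAL_ROWS.get(event.user, 0), 1)
    if meta["sensitivity"] != "public" and event.rows_read > volume_factor * baseline:
        reasons.append("volume far above baseline")
    return reasons


print(review(AccessEvent("alice", "customers", 180, "support")))
print(review(AccessEvent("bob", "customers", 50000, "support")))
print(review(AccessEvent("alice", "web_logs", 100, "marketing")))
```

In all three cases the access itself was permitted; the checks only add the context that access control on its own does not carry.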
Ashish Rajan: Yup. Yup. Because that makes me think of people listening to this and going, oh, if access control is not enough, then what do people do, I guess?
So maybe a good [00:20:00] next question is: what are some of the controls you’d recommend for people to have a bit more control over their data security in cloud?
David McCaw: Yeah. So from a specific controls perspective, you’re always going to want your access control in place.
You’re always going to want to make sure that anything you can encrypt is encrypted, and you want to be able to hold your own keys. Things like data masking are also very valid and appropriate techniques to apply. But the thing is, every single one of those techniques is exactly that: a tactical solution to one part of the overarching challenge.
Fundamentally, I think where we need to do a better job as an industry is with empathy across the different functions that are all trying to solve a similar problem but today are operating in their own swim lanes. It’s funny that I say, hey, apply empathy as a security approach.
But what I mean by that is, if you’re only looking at [00:21:00] one aspect of the problem without ever taking a step back and looking at the end-to-end flow and the challenge holistically, then you can’t ever iterate and improve on that problem. Ashish, it was less than a week ago that I saw Eric Block, who’s a very well-respected security guy,
so shout out to him, post on LinkedIn an article from Harvard Business Review that I took a look at. The title of the article was something along the lines of: how you define the problem determines whether or not you solve it. The article used Dyson as an example.
It said Dyson built the world’s best vacuum cleaner, and the innovation initially was around the quality of the filtering within the vacuum bag. For many, many years, all of the competition came in and tried to iterate on new innovations on the vacuum bag itself.
Right. Because that’s what they had honed in on as the [00:22:00] problem. And it wasn’t until Dyson rethought the problem, and instead of saying, hey, how do I make a better vacuum bag, said, how do I more effectively separate dirt from air, that they made the quantum leap in innovation and created the industrial cyclone.
So to me, this is so applicable to the space that we’re living in. Everybody’s like, how do I create a better DLP solution? How do I do better access control? And you still need those things. I’m not saying you don’t need them. But what if instead you say, how do I better empower my business to use data safely?
And why can’t I do it today? It’s the lack of context across the entire life cycle. Nobody has visibility into what’s happening with the data and how it’s moving and being interacted with on a day-to-day basis while it’s in the organization. And you’ll never understand that by just deploying a single infrastructure component that only sees one layer, nothing before or after it.
So you have to have [00:23:00] that context, and you have to have the empathy to ask, even if I’m in security: is the work that I’m doing creating friction or empowerment for my data users? Is the work that I’m doing enabling the compliance team to be more confident, or am I just doing my job while they have to do their own thing?
And even if it means we’re all repeating work and being redundant in the functions that we’re doing, I don’t care, because it’s somebody else’s job. I think that needs to change, and that’s how we’re trying to look at it a bit differently.
Ashish Rajan: That’s a deep, deep point there.
Damn, man. As you were saying this, I thought of the one challenge that I come across quite often, which is data sprawl: knowing, at any given point in time, what data was used by whom in a company. No one tracks that. I mean, at least I don’t know who’s tracking it.
I don’t think any security professional out there would know. Whether it’s sensitive or not is a completely different conversation, but I mean how data has sprawled across the entire organization. Unless you’re a startup with probably just [00:24:00] one AWS account, I guess that’s different. But then I think there are a few more layers to it.
We’ve got a few more questions coming in as well, so I’m just going to quickly address them. Hey Tom. So Tom’s asking: is DLP harder to manage in a cloud environment?
David McCaw: DLP is hard to manage everywhere. And we’ll go back to, I think it was Vineet’s comment around data classification. DLP is only going to be as good as the data classification.
Unless you have a process that enables high-fidelity data classification as part of the data provisioning process, the burden of doing that data classification falls onto security guys, who just don’t have the bandwidth to constantly tune those DLP systems and keep them up with the speed of business.
So, Tom, earlier in the conversation we discussed the average time that it takes to detect and contain a data breach, and the average time was 220 days for the detection side. What I didn’t point out was, hey, most of these organizations have DLP, right? But they’re still getting breached. There was another [00:25:00] article,
I think it was in Dark Reading or Security Boulevard, maybe about two months ago, that said over 90% of organizations have suffered a data breach in the last 18 months, and over 60% of organizations have suffered at least three. So when you triangulate all of these statistics, what it tells you is that DLP is deployed in many of these organizations, but nobody’s actually blocking. Nobody’s actually turning it on with a level of confidence, because the fidelity isn’t accurate enough for them to feel comfortable
blocking somebody’s flow during the day-to-day business cycle. So these tools end up getting configured, with somebody who has to sit on top of them and try to manage them. But if you can never get that fidelity to a point where you have confidence in it, then it sits in a passive mode and becomes more of a forensics tool for investigating what happened after the breach, after your data has shown up on the dark web, rather than something that actually signals that you have a problem and prevents it from happening.
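One way to picture the fidelity problem David raises is as a precision gate on a DLP rule: blocking is only switched on once enough reviewed alerts turned out to be real. The sketch below is a hypothetical illustration; the function names and the 0.95 threshold are assumptions, not settings from any DLP product.

```python
def precision(true_positives, false_positives):
    """Share of reviewed alerts that were real problems (0.0 when nothing was reviewed)."""
    total = true_positives + false_positives
    return true_positives / total if total else 0.0


def dlp_mode(true_positives, false_positives, min_precision=0.95):
    """Pick "block" only when the rule's precision meets the assumed threshold."""
    if precision(true_positives, false_positives) >= min_precision:
        return "block"
    # Low fidelity: leave the rule passive so it does not interrupt day-to-day work.
    return "monitor"


print(dlp_mode(true_positives=40, false_positives=60))
print(dlp_mode(true_positives=98, false_positives=2))
```

A rule that stays in "monitor" indefinitely is the passive, forensics-only outcome described above.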
Ashish Rajan: Yep. That’s a resourcing issue, not a DLP issue, right? Because you need to have another person [00:26:00] manning the DLP on an ongoing basis for the alerts that are raised over there. So there is that challenge as well. Hopefully that answers your question, Tom. Thanks for that, man. Really good question.
I’ve got another question here: is relying on cloud native controls enough to secure data?
David McCaw: I’ll turn this one around to you, Ashish. What do you think about that?
Ashish Rajan: Based on our conversation, I think there’s a challenge that is not being solved by cloud native controls. You’re right about access controls; we spoke about that. Yes, you can solve the access problem.
But there’s a lot more context around it, and the data is spread across so many places. First of all, cloud service providers have data stores, and you can control the access to them, but they don’t have the context for, hey, this is the data which is sensitive, and this is the data which is not sensitive.
I do care when Ashish logs in over here, but I don’t care if Ashish logs in over there or does whatever, because it’s public data anyway. That context that we were talking about, I think, cannot be achieved by cloud native controls. Someone has to use open source or some kind of tooling to get context around that whole layer.
The other thing that I always talk about, [00:27:00] and I don’t know how many people talk about it, is the data lifecycle in cloud. It’s something where we say, oh yeah, of course. Say I’m some telecom provider, like Verizon, just to pick a name. I just collect data from people: usernames, addresses, everything.
But what happens to this data throughout the company? No one manages the data life cycle, and there is no cloud native control for that. I’m going to use the cloud service provider terminology: it’s a shared responsibility.
The long and short of it is yes and no. Yes, it helps you do access control, but no, because it doesn’t have the context for what you need to protect from a privacy law perspective or from a data classification or sensitivity perspective. Hopefully that answers your question.
That was a great question, man, or lady, I guess. I don’t know who that person was. Do you want to add anything else?
David McCaw: Yeah, I’ll layer into that. So of course, your traditional managed cloud services like [00:28:00] RDS and S3, these things all have some security capability that’s available within,
assuming you’ve turned those things on and configured them appropriately, right? So you can make sure encryption is turned on in each of those environments. You can make sure that your buckets and your managed cloud instances aren’t public to the internet. You can make sure that you’ve enabled backup and recovery. But once again, in the shared responsibility model, all of those things are optional.
So it’s up to you to make sure that you’ve actually done those things to ensure that the infrastructure itself is protected. Then you get into the question of multi-cloud environments or hybrid environments or things that aren’t managed data services from the cloud provider themselves. You’ve got the Snowflakes of the world and these other data lakes that need to be part of your overall data governance program.
And certainly, they’re not going to have cloud native controls for those unique solutions that are provided by third-party vendors. And then lastly, to Ashish’s point, once you get past the infrastructure layer, the cloud service providers aren’t looking at it at all. They don’t understand your business obligations around the [00:29:00] data, and they don’t understand the sensitivity of it to the organization.
They just know you’ve got a Redshift and, based on however you set it up or however the administrators set it up, they’re assuming that that’s correct. And that’s not always the case.
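The three optional settings David lists (encryption, public exposure, backup and recovery) can be pictured as a simple audit over an inventory of data stores. The following is a minimal sketch, not any provider’s API: the `DataStoreConfig` record, its field names, and the two sample stores are hypothetical, and in practice the values would come from whatever configuration export your cloud provider offers.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DataStoreConfig:
    """Hypothetical snapshot of one managed data store's settings."""
    name: str
    encrypted: bool
    publicly_accessible: bool
    backups_enabled: bool


def find_gaps(store: DataStoreConfig) -> list[str]:
    """Return the optional controls that are not switched on for this store."""
    gaps = []
    if not store.encrypted:
        gaps.append("encryption is off")
    if store.publicly_accessible:
        gaps.append("exposed to the internet")
    if not store.backups_enabled:
        gaps.append("no backup and recovery")
    return gaps


# Made-up inventory for illustration only.
inventory = [
    DataStoreConfig("orders-db", encrypted=True, publicly_accessible=False, backups_enabled=True),
    DataStoreConfig("export-bucket", encrypted=False, publicly_accessible=True, backups_enabled=False),
]

report = {store.name: find_gaps(store) for store in inventory}
for name, gaps in report.items():
    print(name, "->", gaps or "no gaps found")
```

A check like this only covers the infrastructure layer. It says nothing about how sensitive the data inside each store is, which is the gap both speakers point to.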
Ashish Rajan: And I just realized that the person is Chris Glanden. Chris is also a podcaster; he runs the Barcode podcast.
I’d definitely ask you to check it out. He’s got another question as well, around structured versus unstructured data in cloud: how does the protection strategy differ?
David McCaw: Awesome. And first I do want to give a shout out to Chris, because I had the opportunity to meet Chris in the past. He’s got an awesome podcast, and he’s one of the very well-versed data protection experts out there.
So he’s probably another great person to bring into this topic at some point, Ashish. But yeah: structured versus unstructured data in the cloud, how does the protection strategy differ? Great question. We could probably unpack that and spend a lot more than just the time that we have left on this podcast [00:30:00] on that topic alone.
Let me broaden the question a little bit and say, okay, when you look at a cloud environment and you look at data security, how should you be thinking about it? First, you’ve got infrastructure as a service, and you’ve got your structured data stores that are holding the structured data, which oftentimes are the crown jewels.
Then you’ve got your data that’s in your SaaS applications. And third, you’ve got your data that’s in file formats, or unstructured data. That could also be in SaaS applications, could be in cloud storage, and could be floating around the endpoints of your various users. You mentioned this term of data lifecycle.
One of the big things that I’m an advocate of is this: if you can understand the lifecycle from creation to deletion around data, then you can actually enable process in a continuous fashion that instruments that lifecycle and helps you protect the data every step of the way.
So in structured data, there’s definitely a well-defined lifecycle. People create data stores, they configure [00:31:00] them, they load data, they provision the access, they use the data, and then they archive or delete the data. All these things are happening continuously and concurrently across the organization in many different places, but it’s always the same steps.
And if you build the right technology, you can always understand each one of those steps and contextualize it fully. That allows you to create controls and best practices for ensuring all the interactions with structured data are safe at every point in time. Unstructured is a little bit more of the wild, wild west, where things are floating around on different desktops and getting copied and pasted and packaged in different file formats.
So there are solutions out there for that, but the solutions that I’ve seen have been more iterations on some of the traditional DLP type of technologies: trying to create a better mousetrap, versus looking at the problem differently.
And I think on the structured side, we’re able to look at the problem a little bit differently today. Unstructured is still that endpoint type of approach, trying to track and catch the exfiltration. And then there’s the whole question about SaaS, and there’s a hybrid approach
you can take there, [00:32:00] because a lot of SaaS applications actually have structured data stores behind them. So you can pull those in and manage those as part of your data lifecycle, like you would any other structured data. And then there are SaaS applications that have unstructured data within them. You can track that unstructured data via these next generation DLP tools, like Cyberhaven, et cetera, once it’s floating around on the end users’ endpoints, on their various machines.
But there are also some companies that are innovating around creating layers of protection around some of the SaaS providers themselves, like O365 or Google Drive, looking at net effective permissions to content that’s housed in those systems and then doing behavior monitoring around interactions with those systems.
So, I know that was a lot that I just dumped there, Ashish, but the point of it is: think of it as three pillars, your SaaS, your structured, and your unstructured, and make sure you have a strategy for each. But there is no concept of [00:33:00] best of breed that’s going to solve all three. If you try to put a generic, one-size-fits-all solution across all of those things, you’re destined to have a lot of gaps.
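The structured-data lifecycle David walks through (create, configure, load, provision access, use, archive or delete) lends itself to a small model in which each stage triggers its own checks. This is a sketch under stated assumptions: the stage names come from the conversation, but the specific checks attached to each stage are illustrative, not something either speaker prescribes.

```python
from enum import Enum


class Stage(Enum):
    CREATE = "create data store"
    CONFIGURE = "configure"
    LOAD = "load data"
    PROVISION = "provision access"
    USE = "use data"
    RETIRE = "archive or delete"


# Illustrative checks only; the conversation names the stages, not the checks.
CHECKS = {
    Stage.CREATE: ["register the store in the inventory"],
    Stage.CONFIGURE: ["confirm encryption", "confirm it is not public"],
    Stage.LOAD: ["classify newly loaded data"],
    Stage.PROVISION: ["review who was granted access"],
    Stage.USE: ["monitor queries against sensitive data"],
    Stage.RETIRE: ["verify data was archived or deleted"],
}


def checks_for(events):
    """Pair each observed lifecycle event with the checks to run for it."""
    return [(store, stage.value, CHECKS[stage]) for store, stage in events]


# Hypothetical event stream: (data store name, lifecycle stage).
events = [("orders-db", Stage.CREATE), ("orders-db", Stage.LOAD)]
plan = checks_for(events)
for store, stage, checks in plan:
    print(f"{store}: {stage} -> {checks}")
```

Because the same six stages repeat for every data store, one table of checks can be applied continuously across the organization. That regularity is what the unstructured side lacks.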
Ashish Rajan: Hopefully that answers your question, Chris, but that was a great question as well. And I definitely feel Chris is another plus for data security, as well. So Zinet has mentioned taking a holistic approach, right? No one’s security is ever efficient.
That is a hundred percent right. I think that’s what she’s trying to say with that. Thanks for that. I have one other question, from Kapil Bareja. Welcome, man. What do you think about data hygiene controls beneath the iDaas layer? Do they solve the big data store issues? I think I got the question right.
Maybe it’s more that the question is: if you have proper data hygiene on big data stores, does that solve this problem? To some extent. I have an opinion on it, and I can try to answer it. If you want, David, I’ll kick it off and maybe you can finish it up. If I understand correctly, what you are referring to is
data hygiene in terms of data sanitation, having proper access controls and user audit, like who has access to the data store. I’m assuming you are referring to things like that. I think the challenge [00:34:00] that Dave and I were referring to was the broader scale of the data sprawl challenges, and not just the fact that we have a data lifecycle.
We have the different kinds of data classification defined, but the place where things get lost is how many people actually know how many data sources are active at the moment. Consider that now there is no database administrator that you go to for a database; you just spin it up yourself. So I feel like a lot of controls that we used to have earlier don’t really apply anymore. In saying that, yes, access control is still valid
if you have overarching control over it, and it definitely solves the problem at a smaller scale. But if you’re trying to go beyond that, that’s where, at least when we were talking about this, the challenge has become that the data is now available to everyone in the organization. Not that it’s wrong.
Yes, transparency should be there. But what that also meant is that we have kind of lost control over, [00:35:00] hey, can David take out the PII that had been taken from a customer? Yes, he can. But will my DLP pick it up? Most likely not, because he used this obscure copy thing that there is no DLP solution for, and he’s copied it across into our data store.
That’s important out. And I think there’s room for improvement for hygiene controls. That’s kind of where I’m coming from, but I don’t know if you want to add something to that, David.
David McCaw: Yeah. And in fact, what I’ll add to that, rather than just tackling this question directly, is some more scenarios that we’re seeing around data security, challenges that are just concrete examples that people are running into,
regardless of how good their data hygiene is. Number one, we find many places where somebody has spun up a database and it houses sensitive information and nobody even knows it exists. It’s just sitting there. The person who created it is no longer within the organization.
Nobody knows why it’s there or what it’s done, but it’s just there waiting to be compromised. In other scenarios, we see people who have large data lakes, and within those large [00:36:00] data lakes they have multiple schemas. Some of those schemas are meant to be dev or staging environments or public environments and sandbox environments.
And others are meant to be restricted. So they give everybody in the organization access to anything in the public environment, but they want to make sure that there’s no sensitive information in that public environment, because that would be a violation of law and a violation of trust with the organizations we’re working with.
And we almost always discover data that’s sensitive in places where it shouldn’t be. The next thing that we see happening is situations that are effectively impossible to identify through traditional DLP work. So, for example, you have a customer service organization, and they have access to all of your customer data because that’s part of the function that they have in fulfilling their job.
So from a privilege perspective, they’re supposed to be able to look at your customer data. But from a behavior [00:37:00] perspective, should they be running queries on a bunch of customers that they have no business purpose or relationship with for their job function? Or should they be downloading lists of the entire customer roster?
And this is a different question than just, should a person have access or should they not have access. So there’s a lot of nuance within this process. Now, you can actually create controls and monitoring capability to look for these types of scenarios continuously, but the underlying prerequisite for that is understanding all of the context around the data.
And what I mean by that is data classification to the nth degree, not just is this thing, a social security number, but is it a social security number? How are social security numbers supposed to be used by my organization? What applications and what user audience should have access to it? What legal obligations do I have to this?
And if you understand all that context around every piece of [00:38:00] data that you have, then you can monitor all this stuff really accurately. And with.
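The customer service example separates privilege (may this role touch the data at all?) from behavior (is this particular query reasonable for the job?). A minimal sketch of that distinction follows, assuming a hypothetical `DataContext` record that carries the classification context David describes and a hypothetical `QueryEvent` produced by some monitoring layer. The field names, the `bulk_limit` threshold, and the sample events are all illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class DataContext:
    """Context attached to a classified field (all values are illustrative)."""
    classification: str
    allowed_roles: set
    bulk_limit: int  # number of customers one query may reasonably touch


@dataclass
class QueryEvent:
    """Hypothetical record of one query against customer data."""
    user: str
    role: str
    customer_ids: set
    assigned_customers: set = field(default_factory=set)


def review(event, ctx):
    """Return findings for one query; an empty list means nothing unusual."""
    findings = []
    if event.role not in ctx.allowed_roles:
        findings.append("role has no privilege for this data")
    unrelated = event.customer_ids - event.assigned_customers
    if unrelated:
        findings.append(f"{len(unrelated)} customers outside the user's cases")
    if len(event.customer_ids) > ctx.bulk_limit:
        findings.append("bulk download of customer records")
    return findings


ssn_ctx = DataContext("social security number", {"customer_service"}, bulk_limit=50)
normal = QueryEvent("rep1", "customer_service", {101}, assigned_customers={101, 102})
bulk = QueryEvent("rep2", "customer_service", set(range(1000)), assigned_customers={5})

print(review(normal, ssn_ctx))
print(review(bulk, ssn_ctx))
```

Both users hold the same privilege, so an access-control check alone treats them identically. Only the added context (which customers a rep is working with, what volume is normal) distinguishes the second query.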
Ashish Rajan: Awesome. And hopefully that answers your question as well, but I wanted to quickly add something here. Kapil, I did a gig for big data security consulting work some time ago.
And it was really interesting, to your point, David, about data classification, because that project was a silo inside. It was just when big data was the hype. It is still the hype, but back then everyone was saying, oh, we should do big data, we have so much data, and everyone started doing it.
And I think my gig at that point was: how do we do security for this in the cloud? How do we go about securing the data? The first question that came up was, how do we do data classification? So we started working towards this data classification list, right? And you kind of realize, to your point about the nth degree, there are so many layers to data classification.
You initially feel that you are dealing with a social security number, like, oh yeah, that’s totally sensitive for sure. But then we start talking to all these [00:39:00] data scientists who you need to give the data to, and it’s like, oh, but they can’t access it? Okay, so what kind of data can they access? Or, okay, so it’s sensitive, but cannot be accessible by outsiders.
So then you have another layer, right? Okay, so what data can be accessed by whom? You can go on for hours in terms of the different layers you can add to your data classification. That’s just one part. The other one was in terms of having a degree of assets.
We could never figure out, at any given point in time, how many assets there were in which we were actually storing data, because technically a virtual machine can store data as well. It’s not just the S3 bucket or a Redshift or whatever the big data storage is.
And yeah, so much complexity, so many layers, but a hundred percent, great answer. And I think we answered Kapil’s questions as well. That was a great question, Kapil. Thanks for this, man. I’ve got Dr. Charlene K Coon as well. She agrees classification is really important.
Thanks for that, Dr. Charlene. I was going to say, we’ve been having such an interesting conversation, man, and I would love to keep going, but I have limited time with you. [00:40:00] So this is the challenge of having celebrities on the show.
So this is the final part of the podcast, where I ask three fun questions, just so people get to know a bit more about you. The first one: where do you spend most of your time when you’re not working on data security challenges in the cloud?
Today, I’ve been spending a lot of time working on data security challenges in the cloud,
but what are you going to do now?
Often, often you’ve done this. Yeah.
David McCaw: So, I just love technology, and I actually love the way that our industry is evolving. To me, it’s not just individual vendor A, B, or C, right? It’s like, hey, what are we doing as a collective? What are we all marching to in terms of the same goals?
And I really liked the dynamic of working with and creating innovative technology and working with leaders and working with VCs and solving problems and looking at how we create the future. So from a, work and business perspective, that’s what I like spending my time on other than just data security.
From a [00:41:00] personal perspective: family. We’ve been in this pandemic, and we’ve been fortunate enough to find ways to take advantage of it. So when my daughters had remote schooling, it was like, let’s go learn from somewhere else, where I can work virtually and they can school remotely, but we can capture some experience and not feel like we’re cooped up at the same time.
So, finding a lot of time with family. That’s really been one of the big things that we’ve been focused on, especially the last couple of years. It’s been tough.
Oh yeah.
Ashish Rajan: I, I assume it’s funny. Not many people were creative enough to I guess apply some kind of like a creative layer to it.
So I’m glad you were one of them, able to add a creative layer to the whole pandemic thing. Second question: what is something that you’re proud of but that is not on your social media?
David McCaw: That’s not on my social media that I’m proud of?
Really, when I think of the sources of pride, one is around the business. I’m proud of being able to create things that are actually helping people, and also mentoring [00:42:00] up-and-comers in the industry on the go-to-market side, helping them understand what opportunity is really there for them.
Because a lot of people don’t fully understand how to embrace all the opportunity that’s in front of them as individuals. So it’s helping them understand there are no barriers there, and to be able to do more. And number two, the way I was raised was just super big on family.
So for me, what I’m really proud about is being able to share any successes, or the journey that I have, with my children. Fortunately, I still have both of my parents around, my extended family, and my friends. If you can be proud of the people that are closest in your circle, then you’re living a full life, and that’s how I look at it.
Ashish Rajan: That’s very well said, man, very well said. Family and friends are definitely our source of pride, and I’m a hundred percent with you on that one. Last question: what is your favorite cuisine or restaurant that you can share?
David McCaw: Oh my goodness. Well, if you’ve [00:43:00] seen me, especially as I’ve been in go-to-market on the technology side, every year I’ve gained more pounds than I should, because I ate a lot of dinners. I’m a big foodie, Ashish. So it’s never a specific cuisine. It’s just the quality and the richness of the flavor.
When we launched RedLock, I was fortunate enough to come to your part of the world. I was in Melbourne, and there were some really good Asian restaurants that I found in Melbourne, and also a great Greek place. So on that note, what I’ll do is make an offer to the viewers.
So I’ve been fortunate enough to travel through almost every, , all but two states in the United States and, , a fair amount over the world. And as a foodie, I’ve always gone and looked up and research, what are the best places to eat and all of these places. So if anybody ever wants a recommendation anywhere hit me up and I’ll point you to something amazing.
Ashish Rajan: Perfect. I’m definitely going to do that. So which two states have you not traveled to?
David McCaw: So April’s going to kill me for this one, because she’s our VP of engineering and she lives in Mississippi and I’ve never [00:44:00] been there. Never been to Mississippi, and I’ve never been to.
Ashish Rajan: Alaska. Oh yeah. That’s two quarter then. It’s like, I never as good. Although I would love to go there as well. So I definitely know I’m going to get some tips from you, and I’m definitely going to encourage other people to reach out and find out some tips from you, at least maybe for the local states, I guess.
But that was a really awesome conversation. I do really appreciate that you hung out with us and spoke about data security in cloud. Where can people find you if they have any follow-up questions?
David McCaw: Absolutely. You can find me on LinkedIn, David McCaw. And you can certainly reach out to us at Dasera, where we’re working on solving these problems.
My email address is davemccaw@dasera.com. I’d love to talk to you about what you guys are doing in data. And if there’s anything that we didn’t touch on today that we should be thinking about, or vice versa, if we can offer any insights based on the experiences we’re collecting, we absolutely want to be able to do that.
So thank you for the opportunity, Ashish. It’s always a pleasure, and I can’t wait for the next one.
Ashish Rajan: Yeah, same, same. And thanks to everyone who came in as well. It seems like everyone else has [00:45:00] had a great time and found it valuable. Next week I’m coming on with a different topic, but I’m looking forward to having more conversations with you. And maybe I can get Chris as well,
And Chris and David come in and talk about data security. So thanks so much, everyone. I will see you on the next week’s episode, but until then stay safe. Talk soon, peace.