Realities of Cloud Networking in AWS

AWS networking isn’t as simple as it seems, and when you’re dealing with regulated industries like healthcare, the stakes are even higher. In this episode we sit down with Kyler Middleton and Jack W. Harter from Veradigm, who have navigated complex AWS networking challenges while migrating from on-prem data centers to the cloud. We speak about:

  • The real struggles of moving from data centers to AWS
  • Why networking can feel like a black box
  • The anti-pattern that surprisingly worked best
  • How to build secure cloud networks—without losing your sanity
  • The hidden security & compliance challenges in healthcare cloud migration

Questions asked:
00:00 Introduction
01:55 A bit about Kyler and Jack
03:18 Security Challenges in Medical Industry
06:01 Where to start when migrating from data centres to AWS?
07:42 Networking Challenges for Regulated Industries  
11:26 Networking in On-Prem vs Cloud
19:24 Security by Design considerations
29:31 The Terraform pieces  
34:34 Network Firewall in Cloud
39:46 Lessons learnt from the project  
46:21 The Fun Section

Jack Harter: [00:00:00] Goes directly through the S3 endpoint. When it tried to come back, it didn't know to go back through that endpoint. And then we would have asymmetric traffic, which the firewall would drop.

And in turn that required time dealing with AWS saying that. Hold on a minute, what you actually need is, and correct me if I'm going a little bit off the rails here, Kyler, what you actually need is

Ashish Rajan: And the pattern is a new pattern.

Yes, you heard me right. When you're moving from on premise to the cloud, especially AWS cloud in this context, this might be the answer. In this conversation with Kyler and Jack, we spoke about the lessons they learned from moving on premise resources onto the cloud, specifically AWS cloud, the networking challenges, like especially if you think about things like hey, you know what, whatever goes through the firewall.

I want to know if Ashish was the source of that rather than my NAT gateway. A lot of those conversations would be normal conversations on premise, but how they achieved it in AWS is a great story to hear, along with all the [00:01:00] lessons that came with it. If you know someone who's working on moving from on premise to the AWS cloud specifically in a regulated environment, I would definitely share this episode with them. And if you're listening or watching the episode of Cloud Security Podcast for a second or third time, I would really appreciate it if you can drop us a follow or subscribe on YouTube, LinkedIn, Apple, Spotify, or whichever platform you may be listening or watching this on.

I really appreciate all the support you have shown us so far and continue to do so we can keep creating episodes like these. Thank you so much for your support. I hope you enjoy this episode with Kyler and Jack.

Hey everyone. Welcome to Cloud Security Podcast. Today I've got Kyler and Jack. Thank you for coming guys.

Great to be here. I'm excited for this conversation cause especially now I have a firewall expert person who's going to tell me about, who's going to wave the hand at the data centers for why do we need the actual physical firewall, but to set some context, I would love for you guys to share a bit about yourself.

Kyler, if you want to start with a bit of introduction about yourself, you're obviously a returning guest. We'd love to hear about you so that we can introduce you to the audience again and maybe followed by Jack after.

Kyler Middleton: Absolutely. Hey, everybody. I'm Kyler Middleton. I'm based out of Madison, Wisconsin. I have [00:02:00] been on here before, so look it up.

I had some very great earrings that I didn't match today. It's very sad. I have been a systems call center, a database network engineer for many years before I found cloud and DevOps. And I'm now looking at this cool thing called AI that I'm sure you'll be hearing about soon. It's going to be big.

Just really loving it. And I was very excited not to have to do network engineering anymore. And then that's the subject of today's podcast. Cause we did. Jack, over to you.

Jack Harter: Yeah. Hi, my name is Jack Harter. I am based out of Chicago, Illinois, senior-ish DevOps engineer. Got into this field because I took a different route than some do. I initially started actually with a BA in English, but found out that there are no jobs in that field. But eventually wound up going down initially into the IT sector world, and from there thought, these dev people, they seem to have it so easy.

I should figure out how to get on that side of the equation. And went from ops to dev and now we're getting further into the ops field yet again. Got to [00:03:00] learn, worked as a contractor for a couple of years, and then in various spaces in the high frequency trading and medical industries. And now I'm back in medicine, working with Veradigm and having the absolute time of my life working across all the cloud platforms and all the various ways to deploy what we need.

Ashish Rajan: Actually, considering both of you have had experience in the medical industry and being a tech in a medical industry, I imagine it's very different to just how other people see it. And obviously the topic today is primarily going to talk about people who are migrating from data centers to AWS and network architecture and all of that.

Just to give some context for people, I'm curious, what are the unique challenges of the medical space as a tech person? Because a lot of people hear about, oh, I've got Netflix, I've got, I dunno, Capital One. They all hear about all these big stories. What is happening in the financial or the medical sector?

I'm curious, like what are some of the challenges that you may have seen that are unique to your field, when you talk to other people?

Kyler Middleton: It's an interesting contrast, [00:04:00] we have to move fast and break stuff, right? That's tech move fast and break stuff, build things. But also this is a regulated industry.

It's like finance. It's your healthcare data. Which is very private and personal and regulated and so we have to move fast and break stuff, but also don't break anything because it's very like there are lots of fines and that would be sad. But also there's criminal charges that could hit our C level if we mess up bad enough.

You have to be very cautious when you're moving fast to break stuff, and that's a weird experience some days.

Jack Harter: Yeah, the level of security concern you have here cannot be overstated, and one of the interesting elements about working in the medical field is you have some unbelievably smart people that don't always understand computers as well as we would like them to. So you're perfectly adequate with a laser scalpel, but the two-factor authentication is just a bridge too far, I tell you what. So when we're building this stuff out, security needs to be baked in from the get go.

We need to put it on ourselves to understand that, okay, when we deliver this [00:05:00] from soup to nuts, it needs to be rock solid. We can't have this thing throwing itself out midway through or dumping information where it shouldn't be. So we need to take a lot of care, take your time, go slow to go fast.

Ashish Rajan: Oh, the two factor one hit home for me as well. I was trying to do an identity project back in the day for a hospital. And I think the conversation was more around. If you were on the bed, would you want me to put my two factor authentication or treat you? And I'm like, I guess you, I would want you to treat me and not wait for two FA SMS message on your phone.

Obviously. The stakes are very different in the medical field as well, because the people who are our quote unquote customers, they're dealing with situations of life and death as well to Kyler, what you said, it's all our medical data. We obviously want them to have access to it at the right time and all of that.

There's a whole thing around it, but it's technology. Also, at the same time, they're moving. And today we're talking about migrating from data [00:06:00] centers to AWS as well. Kyler, maybe to set the scene, because you did the whole architecture and everything, I'm curious from your perspective, how does one even start doing this? The medical regulated field obviously has its own challenges.

So how does one even start planning to migrate from data centers to AWS?

Kyler Middleton: It's a lot, but thankfully we have cookie cutter stamps from other projects. So Veradigm grows by acquisition, a lot of enterprises do these days. And so we have done this many times before, but this is the first time I got to run it, which is very exciting.

So the migration here was from an on prem data center and the database, the API, the queuing, the front end was all on one server. There was just one and it was just directly on the internet, and they were like, okay, we'll just move it over. And I was like, we could lift and shift that architecture, or we could establish tiered layers of security and a WAF and CloudFront caching. Can we do [00:07:00] that?

And they said, yeah. Okay. Yeah. Let's do that. Let's be as secure as we can. Let's cloudify it. Lift and shift is, a pretty poor prospect for any cloud migration ever. So if you have the time, the budget, and the expertise, you should be making it cloud native wherever you can. So thankfully, we got the ability to do that.

There's a ton of bumps in the road of like, how do the servers talk to each other? How come I can't just put Nginx in front of everything and do reverse proxies? Because we have ALBs. Because we have actual resources that can do that and I don't need an Nginx to configure and monitor. So we just tackled it as well as we could and it's coming along really well.

It's been a ton of fun to delve into it and grow in skills and stuff.

Ashish Rajan: Was there anything specific about regulated industry that comes to mind when people are migrating from data centers? Because I imagine, because it's not that these companies were established yesterday, so they've obviously done this for a long time in data centers.

There's a lot of learned practices, learned processes, and [00:08:00] how things are done and all of that. And I'm going to get to physical firewall for Jack as well in a bit, but there's a lot of physical firewall that we have to go in and physically unplug sometimes or plug into sometimes, in terms of being regulated does that add networking challenges? Maybe Jack, if you want to talk about that as well, man.

Jack Harter: Let's see. Does it add challenges? Yeah. For starters, when we're working with an on premise firewall, you can really easily track things just visually speaking. I actually got started out managing a very small on prem environment and you could always say, okay, here is the demarc, here is the hole that gets drilled into the building, here is where the responsibility for the cable company ends and our responsibility begins and, depending on how good your data center looks, this might be where the jungle starts.

With all that, it's easy to remember that, yeah, okay, everybody who VPNs in, you can have really tight granular control over the kind of stuff that you want to be able to give them access to. When you're [00:09:00] moving out of that into, say, everything's a virtual machine, all of your switches are virtualized, there's a little bit of abstraction pain. You can think, oh, inherently, because I can't see every single cable in this building, we're going to have some difficulty with being able to map out the entire network.

We don't actually know where things are being dropped on, and this is all taking place inside of a black box as far as I'm concerned. Microsoft or Google or Amazon or DigitalOcean or whoever else you're using has full control over your stuff, and we can only take them at their word so far.

There can be a little bit of fear inherent in that, but I like to think that the flexibility and the agility that you gain is more than worth it in terms of a trade off. I remember back in the day when I'd get a ticket from our network guy, living a few countries over, who would say, hey, go into our primary climate controlled data center where you've got the fans blaring at, I don't [00:10:00] remember how many decibels, and slot this Cisco module into this box. And from there, they'd be able to say, okay, yeah, we can read that from over here. That's 100 percent good.

You're free to go. It's a little bit more difficult when you're doing that with somewhere like AWS, where you're going to have to test things out a bunch of times all on your own. You're probably at some point or another have to go back to support maybe two or three times to make sure that you're able to get everything right the way you want it to and just gradually develop it out.

It's equal amounts of ops and software, but you wind up trying to say, okay, let's take this one thing that works. Confirm that it works, make sure that you're 100 percent sure it's doing what you want it to, and that's because you did what you wanted to in the code, and from there build out. It's less so, okay, this looks good because all the cables are plugged in and the lights are turned on. That I can confirm: is that light red or is it green? Because green usually means go based on what I've learned in driver's ed, [00:11:00] and from there we can extrapolate that everyone's happy. And you can see the logs. You're on a firewall, you see the blinky lights, you can read the logs, you can dump to the console. And with a lot of this AWS networking, you can't even see the BGP routes, you can't see a ton of it. You need to open a ticket and tomorrow they'll tell you what the route table looks like.

And that's just not good enough. And I hope AWS and Azure both improve that, because it's not good enough.

Ashish Rajan: Because one would think that if you're trying to challenge the norm with data center with virtual machines and firewall technically, quote unquote, software firewalls, one would think that should be equally quick as well, going down this path. I think one thing that, as you mentioned the networking part, one thing that came to mind was the first time we were having a, at least I was personally having a networking conversation with someone in a data center, the whole idea of there, there were definitely a few myths that had to be busted in terms of how it works in data center versus what it would be like in this new age of [00:12:00] cloud that we're going to move into.

Out of curiosity are there like top three that come to mind for you guys that you had to introduce to the world in a way? And obviously how did that get impacted in how regulated industries do networking?

Kyler Middleton: Things that are different from on prem versus cloud?

Ashish Rajan: Yeah.

Kyler Middleton: Yeah, absolutely.

WAFs exist, which are fantastic. It's hard because if you're purchasing a WAF, you're probably paying about 140,000. I don't know why the security industry has decided that every product should cost 140,000 a year. But they have, and thankfully this fractionalized costing in clouds is really helpful.

But you can put a WAF in front of all of your inbound stuff. You don't have to give a public IP to any service. And get it hacked from someone overseas. It's wonderful. But you just have to learn how that complexity works. There's also just the lack of insight like we talked about. If you're deploying a transit gateway or a VPN gateway or something like that.

Monitoring them and [00:13:00] diagnosing and troubleshooting them can be really hard, because when you start to work with these AWS or Azure network resources, you get a lot of this: trust us, it's cloud. Why are you worried about the network? Go spend some money, go deploy some resources. And I don't want to trust you.

I want to prove it. I want to look at the logs. Network engineers are very much like I want to be on a terminal. I want to type some commands and have it show me the happy green dots. Yeah, and you just can't in, in these cloud networking environments, and that can be really challenging. And that's supplemented because often it does work.

Often the trust does work fine. The network works fine. And this network firewall in particular, this project, it did not work fine. It's built like a Lego block. AWS loves their little Lego blocks that snap together. But the networking got pretty complicated and we had to use this traditional network skill set of: we need a whiteboard.

We need to draw this out. We're using three different route tables and they all have to point to different things and that is not something you typically do within a [00:14:00] VPC. Normally you're just pointing at the NAT gateway and you're done and you just move on with your life. But now we had to understand exactly where all the VPC endpoints worked and what type they were and how we routed to them and associated to them.

And we just had to go way more in depth network wise than we ever have had to do before with networking in the cloud.

Ashish Rajan: Oh, all because the information was not there in terms of logging information that people had to, people normally look out for.

Kyler Middleton: Yeah, it just wouldn't work. And why wouldn't it work? Dunno.

Good luck.

Ashish Rajan: Jack, you have some thoughts as well on this, on the whole network firewall?

Jack Harter: Yeah. Oh, I wanted to hark back to the top three things that change over when you move yourself from on prem into Amazon's DC. Let's say one of the biggest ones, and this is one of the earliest things you're usually taught when you're becoming a DevOps engineer, is the shared responsibility model: the whole idea that you're not going to have that kind of deep ingrained control if you're using anything as a service. At that point, you're [00:15:00] not going to be able to. When I was working in the trading space, from time to time things would go terribly wrong, because at the end of the day, you're trying to leverage the utilities you have to make as much money as humanly possible.

And sometimes that leads to someone in ops having to run into the DC and rip, literally rip the ethernet cable out of the back of a network of lines at some point to make sure things stop doing what they're doing. That you really can't do that as well. You've got to either, pop onto the CLI and say, okay, we're terminating this instance.

It's almost terminated. I'm doing this as fast as I can. And at that point, that's the only real method you have to be able to access this stuff. Another interesting element, pivoting from there, is that the duties really get separated. When I started in this industry, basically just moving from desktop support to more network management stuff, you could really see, okay, this is where the cable plugs in and it goes down into the basement and it leads into a switch. And from [00:16:00] there it's a little bit easier to learn from that level when you can physically see everything. Nowadays, if you want that, you're going to need to work out of your home lab and say, okay, I've got this switch that I bought off work.

And from there, I can determine things and then we can apply those concepts in a more cloud-oriented environment. And from there, your infrastructure naturally changes. When I was building out these individual labs, I was just trying to tool around and see what cool stuff I could build out.

Initially, it's just, double click on an exe that you downloaded and all of a sudden, boom, you've got Plex running on your home server and now you can have your at home Netflix that never crashes. And even when your internet lets out, it's all good now. And at this point, we're starting to pivot to say, okay, that's cool and all.

But how can we just make it a Docker file that we've got everything totally, separated out and movable and motile and, I'm not going to have to go through the same kind of nightmare every time I set up a new workstation here. So all those things are just new challenges that we need to wrap our heads around.

And while I can't say I was there for it, I imagine that when people were [00:17:00] switching over from from IBM Big Iron to microcomputers, they must have been going through similar issues at that time to say, what are you saying? It's, we're not all going to do time sharing. Instead of running off of the centralized computer.

You mean everyone's got one at their desk and, pretty soon we're going to have C suite people doing programming on their own. Still hasn't happened, despite the best plans that we've put into place. Thankfully there's still a little bit of space for the for the mid to senior level developer out there.

But yeah, honestly, who knows how much longer that's going to last. My personal opinion is forever, but none of these things ever end.

Kyler Middleton: You said that there's this combination of, or separation of roles, excuse me, that everyone is siloed. And I think it's the opposite in a problematic way.

And that's the promise of cloud is everyone's enabled to go make changes to everything, both with your IAC, if you're doing like Terraform, but in the console too, you can go and click and restart the databases and database engineers are very uncomfortable with anyone being able to even look [00:18:00] at their databases.

And there's this sort of confusion about who's in charge of what because I can get to so many things and segregating those roles really in cloud is really hard and not what they want you to do. They want you to teach everyone to do everything which means developers and ops and dev ops are all muddled together and it's hard to know who's in charge And what we should be doing that's just a hard problem in the cloud because it's muddled

Ashish Rajan: Yeah, I think it's a good point because quote unquote, DevOps was supposed to be developers and ops people combined.

But if you are in an organization which has been there for a long time, where those two roles have been separated and now you're asking them to work together suddenly, they're like what parts of these are mine and what parts of these are yours? Yes. I don't want the shitty parts.

I want the good parts. What are the good parts over here that I can take? But actually, to bring it back to the whole network architecture and the network firewall: to what you said earlier, Kyler, the traditional approach would have been, we put in a NAT gateway. You guys went down the path of saying, hey, we want a VPC gateway endpoint, we want [00:19:00] network firewall. Was having access to the logs the main driver, or were there other security by design or deep layers of security considerations that you had to think about before you went down the path of, okay, these two are going to be our go-to for this particular project instead of going for a simple NAT gateway?

Kyler Middleton: I can talk about that architecturally and then maybe Jack can talk about that specifically. Yeah, sure. The previous project that I helped out on, I was the junior architect, was in Azure, and if you wanted to switch over from just pure network access to filtered Layer 7 permissiveness, you just changed the route from pointing at Internet to pointing the route at Azure Firewall internal, and it just worked. It works great, accolades to the Azure Firewall team, because it works fantastically and it has its own natting. It's wonderful, and that is not what we found in AWS.

But the impetus for why we wanted it is our security team has set a goal that we should be filtering [00:20:00] all of our egress access outbound at layer seven, at the URL or the TLS SNI header, which is a really good idea. You should be doing that everywhere. That sort of old school layer four filtering just doesn't work in cloud worlds, particularly because eBay doesn't live on a server with an IP. It's behind this enormous global CDN.

And if you're doing multicast, it's different IPs, no matter, if you go to a different Starbucks, you might be hitting a different node. Just filtering at layer four with IPs is not good enough. We need to be filtering URLs for any kind of real provable security. So that's the goal. And I promised when we started this project, it's so easy in Azure.

I bet it's really easy in AWS. It's probably a checkbox. Give me like an hour. And how did that go, Jack? How did our implementation go so far?
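
[Illustration, not part of the conversation] The contrast Kyler draws between layer 4 and layer 7 egress filtering can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not AWS Network Firewall rule syntax: the allowlists, the leading-dot convention for "this domain and its subdomains", and the function names allowed_by_ip and allowed_by_sni are all hypothetical and chosen only for illustration.

```python
import ipaddress

# Hypothetical layer 7 allowlist. In this sketch a leading dot means
# "the domain itself and any subdomain"; no dot means exact match only.
ALLOWED_DOMAINS = [".amazonaws.com", "example.org"]

# Hypothetical layer 4 allowlist of destination networks.
ALLOWED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def allowed_by_ip(dest_ip: str) -> bool:
    """Layer 4 style check: only the destination IP is considered."""
    addr = ipaddress.ip_address(dest_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


def allowed_by_sni(server_name: str) -> bool:
    """Layer 7 style check: the hostname from the TLS SNI header is considered."""
    name = server_name.lower().rstrip(".")
    for rule in ALLOWED_DOMAINS:
        if rule.startswith("."):
            # Match the bare domain or anything ending in ".domain"
            if name == rule[1:] or name.endswith(rule):
                return True
        elif name == rule:
            return True
    return False
```

The point of the sketch is that a site behind a CDN can answer from many changing addresses, so the IP check goes stale, while the hostname check keeps working whichever node answers.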

Jack Harter: We may have been off by a few factors of magnitude in that case. It's taking a little bit of time. Let's start from the get go.

Initially, I was thinking, this is good. Network firewall, all firewalls are network oriented. Then I started to [00:21:00] delve a little bit further into the actual documentation and thought, the easiest thing to do is just going to be to figure out, okay, what can we Terraform out of this thing, and figure out how to get it all assembled, block style.

Got my Duplos in hand right here. So then from there we figured, okay, the firewall appliance itself in turn has a one to one relationship with a policy, and the policy itself can have multiple rule groups inside of it. I don't know why they decided to have that middle layer in between, but it looks interesting.
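
[Illustration, not part of the conversation] The layering Jack describes, one firewall pointing at exactly one policy, and that policy holding several rule groups, can be pictured as a small data model. This is a sketch only: the class names, fields, and the allowed_domains helper are hypothetical and are not the AWS or Terraform resource schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RuleGroup:
    """A named bundle of rules; here just a list of allowed domains."""
    name: str
    allowed_domains: List[str] = field(default_factory=list)


@dataclass
class Policy:
    """The middle layer: one policy can reference many rule groups."""
    name: str
    rule_groups: List[RuleGroup] = field(default_factory=list)

    def allowed_domains(self) -> List[str]:
        # Flatten every rule group, keeping first-seen order without duplicates
        seen: List[str] = []
        for group in self.rule_groups:
            for domain in group.allowed_domains:
                if domain not in seen:
                    seen.append(domain)
        return seen


@dataclass
class Firewall:
    """The firewall itself holds exactly one policy."""
    name: str
    policy: Policy
```

One plausible reading of the middle layer is reuse: the same rule group can be attached to more than one policy, and a policy can be swapped without rebuilding the firewall.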

That's how we're going to work it out. Now, from there, we started to really get stuck in the mud. This is where we reach I don't know if, what can I say, everyone's a Princess Bride fan. This is where we hit the swamps of despair. Only because we began to run into some fascinating issues with traffic not operating in the way that it wants.

I initially thought, okay, great, I'm here, I've run Terraform apply. Now I just need to SSM or SSH into the box that I'm testing this out of, and [00:22:00] everything should be perfectly clear. I've already allowed egress to S3, so I should be able to grab the standard status box and get some HTML returned.

Little did we know, the way that Amazon would prefer that you architect your access to S3 right now is with an internal endpoint that lives inside your VPC and allows faster, cheaper access to wherever you're storing everything in S3. They like it that way, they can bill you more.

No, or less in this case, but that's another story. We found out that when we tried to reach out to S3, it would leave our protected instance, pass through the firewall, and go directly through the S3 endpoint. When it tried to come back, it didn't know to go back through that endpoint. And then we would have asymmetric traffic, which the firewall would drop.

And in turn, that required a little bit of time dealing with AWS, who said, [00:23:00] hold on a minute, what you actually need, and correct me if I'm going a little bit off the rails here, Kyler, what you actually need is the NAT gateway behind the firewall. Excuse me. Let me try to draw a diagram here.

We have the instance. What you should do at this point is: the instance goes to the NAT gateway on its way to the internet, then the firewall, then the internet gateway. This is suboptimal architecture, if only because it will then appear that everything coming out of the subnet, going through the NAT gateway, is all coming from the same location. At that point we can say, okay, something is trying to do a curl request to badactor.net, but we have no idea who it is. We don't know.
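
[Illustration, not part of the conversation] The asymmetric traffic Jack mentions in this answer can be shown with a toy model of a stateful firewall. This is a deliberately simplified sketch, not how AWS Network Firewall is implemented: the StatefulFirewall class, the run_handshake helper, and the addresses are hypothetical, and only the TCP three-way handshake is tracked.

```python
class StatefulFirewall:
    """Toy stateful firewall that tracks only the TCP three-way handshake."""

    # Which handshake packet is expected next, given the last one seen
    NEXT_EXPECTED = {None: "SYN", "SYN": "SYN-ACK", "SYN-ACK": "ACK"}

    def __init__(self):
        self.flows = {}  # flow key -> last handshake packet seen

    def inspect(self, src, dst, flag):
        # Both directions of a connection share one flow entry
        key = frozenset((src, dst))
        last_seen = self.flows.get(key)
        if flag == self.NEXT_EXPECTED.get(last_seen):
            self.flows[key] = flag
            return "pass"
        return "drop"


def run_handshake(return_path_through_firewall):
    """Return the firewall's verdict for each packet it actually sees."""
    fw = StatefulFirewall()
    client, server = "10.0.1.10", "s3-endpoint"  # made-up identifiers
    verdicts = [fw.inspect(client, server, "SYN")]
    if return_path_through_firewall:
        # Symmetric routing: the reply crosses the same firewall
        verdicts.append(fw.inspect(server, client, "SYN-ACK"))
    # With asymmetric routing the firewall never saw the SYN-ACK,
    # so the client's ACK does not match the state it is holding.
    verdicts.append(fw.inspect(client, server, "ACK"))
    return verdicts
```

With the return path going through the firewall, all three packets pass. When the reply takes a different path, the firewall passes the SYN and then drops the ACK, because from its point of view the handshake never progressed.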

Kyler Middleton: Let me chime in there. Please. That is an old school recommendation. When NAT Gateway was first built by AWS, it didn't support natting after the firewall on the way out. They wanted you to hit the NAT gateway first and then hit a firewall and then go to the internet gateway, [00:24:00] which is fine architecturally.

It works, but all the logs from the firewall that's doing deep inspection that we're paying quite a bit of money for would show the source is the NAT gateway. So if you see a bunch of malicious traffic, who's doing it? Oh, it's the NAT gateway. That's who's doing it. Oh, I guess I need to go look at those logs too.

So clearly that's not a good solution. So we said AWS support, please don't make us do that. That sounds really bad security wise. I don't care if it'll work, it sounds like it's a bad choice. And so that's what Jack and I said to support.
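
[Illustration, not part of the conversation] Why the hop order matters for the firewall logs can be sketched as follows. This is an assumption-laden toy: the logged_source function, the hop names, and the IP addresses are hypothetical, and the only behavior modeled is that a NAT hop rewrites the source address before later hops see the packet.

```python
def logged_source(path, instance_ip, nat_ip):
    """Return the source IP a firewall in `path` would record.

    `path` is an ordered list of hop names from the instance outward,
    for example ["nat", "firewall", "igw"].
    """
    source = instance_ip
    for hop in path:
        if hop == "nat":
            # NAT replaces the original source address with its own
            source = nat_ip
        elif hop == "firewall":
            # The firewall logs whatever source it sees at this point
            return source
    raise ValueError("path contains no firewall hop")


# Hypothetical addresses for the example
INSTANCE_IP = "10.0.1.10"
NAT_IP = "10.0.0.5"

nat_first = logged_source(["nat", "firewall", "igw"], INSTANCE_IP, NAT_IP)
firewall_first = logged_source(["firewall", "nat", "igw"], INSTANCE_IP, NAT_IP)
```

In this sketch nat_first is the NAT gateway's address, so every flow in the firewall log looks alike, while firewall_first is the instance's own address, which is the attribution Kyler and Jack were asking support for.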

Ashish Rajan: And what did you guys end up with after?

Jack Harter: Support came back and said, that's our most advised solution.

I don't know what you want. We worked with our TAM and it turns out that they had recently, I think in the past three or so months, started supporting this new model where it's the firewall first, and then the NAT gateway, and then the internet gateway. And this is so challenging to describe on a podcast where I don't have a whiteboard behind me like Charlie Day

in It's Always Sunny with the [00:25:00] memes. Like, I'll put the network diagram in the description as well.

Ashish Rajan: I've seen the diagram. It has the NAT gateway, it has the endpoint, and then it has the virtual machine as you go deeper into the network. But to your point, how would it be different? Or maybe, how is it done on premise?

Maybe let's just give people a perspective for what's a good way of doing it. Obviously now we know why the way it's done in cloud is a bad thing. How is it done on premise normally?

Kyler Middleton: It's a checkbox on your firewall to say, do NAT, and you say, okay, and it does NAT. Well, obviously you can get a little more complicated at the enterprise, but, oh yeah.

Usually firewall boxes do natting themselves, which means you don't have to worry about this sort of tiered, layered, routed complexity. You don't have to worry about asymmetric routing because your firewall and your natting are one box. So all your hosts route to the firewall and your firewall is the internet.

It's routed to the internet directly. And that's not the case here. [00:26:00] They are separate. Natting is a separate box from firewall. And I hate that. I wish they would change that. First of all, it's just two things to manage and two things to pay for, which I suspect is the reason that it exists that way. And I would love if the firewall just had a checkbox that said also NAT.

On the way through, and then you don't need a NAT gateway tier. You just have a firewall, and it does the same stuff as on prem. It NATs on the way through. That's the way the Azure Firewall works too. And it's just really simple. You don't have to be a network engineer. You just have to put it in there and send all your traffic to it.

This is definitely a case of engineer brain taking prominence over, I don't know, project manager brain? What was the way we put it when we discovered this particular Lego style application here? It's how AWS builds stuff. So it's hard to knock them for that.

Exactly. That's their methodology is you build individual services with their two pizza teams. I want pizza every time I bring up this story. And so clearly like the NAT gateway team is different from the firewall team. And so they thought, [00:27:00] NAT solved. Why worry about, don't worry about NAT, just worry about firewall.

And I get that, but the amount of complexity it injects in cloud networks is unfortunate.

Jack Harter: And it's really strange, if only because when you spin this firewall up, for instance, it spins up its own ALB. It needs that in order to be able to communicate properly. And those things are inherent to the firewall.

You can visually see them in the GUI, and you can't delete them. And it doesn't tell you why until you delve into the documentation and find out, Oh, okay, no wonder those are tied directly into something else. When you delete it, it deletes those, when you delete the firewall, it deletes these endpoints seamlessly.

VPC endpoints you mean, right? VPC endpoints, yes, you're correct. Yeah, no.

Ashish Rajan: It deletes the endpoints as well, so each VPC endpoint has an ALB that's attached to it?

Jack Harter: I'm confusing my load balancers with that, but yeah, it creates some redundant points.

Kyler Middleton: I can clarify that. So firewalls exist as a resource that is not in a specific availability zone.

And you [00:28:00] would hope that means they're accessible from every availability zone. You would be wrong. For a firewall to exist and be reachable in an AZ, it has to have a VPC endpoint for the firewall service created in that AZ. So we created a firewall and we have services across several AZs and we had to create a VPC endpoint for that firewall in each AZ.

And then for routing on the way out, the servers have to find the VPC endpoint for the firewall in the same availability zone. So there's two or three, and they're named with a GUID, and you have to use your Terraform logic to look up each GUID to find its availability zone, and then select the same one.

And even with Terraform, that often solves a lot of this complexity, we had to invent some stuff. That we didn't want to invent. We wanted to just hit the checkbox, darn it. But we had to invent some Terraform things to make this happen.
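
Editor's sketch: the lookup Kyler describes (find the firewall endpoint that lives in the same availability zone as each workload subnet, then route to that one) looks roughly like this in Python. This is not the team's Terraform code. The sample data is an assumption modeled on the per-AZ sync state that Network Firewall reports for a firewall, and every ID and subnet name below is made up.

```python
# Sample data shaped like the per-AZ "SyncStates" section of a Network Firewall
# status response (assumed shape; all IDs are hypothetical).
SYNC_STATES = {
    "us-east-1a": {"Attachment": {"SubnetId": "subnet-fw-a", "EndpointId": "vpce-0aaa", "Status": "READY"}},
    "us-east-1b": {"Attachment": {"SubnetId": "subnet-fw-b", "EndpointId": "vpce-0bbb", "Status": "READY"}},
}

# Hypothetical workload subnets and the AZ each one lives in.
WORKLOAD_SUBNETS = {"subnet-app-a": "us-east-1a", "subnet-app-b": "us-east-1b"}


def endpoint_by_az(sync_states):
    """Map each availability zone to the firewall endpoint created in it."""
    return {az: state["Attachment"]["EndpointId"] for az, state in sync_states.items()}


def default_routes(workload_subnets, sync_states):
    """Pick, for every workload subnet, the firewall endpoint in the same AZ."""
    endpoints = endpoint_by_az(sync_states)
    routes = {}
    for subnet_id, az in workload_subnets.items():
        if az not in endpoints:
            # A subnet in an AZ without an endpoint cannot reach the firewall.
            raise ValueError(f"no firewall endpoint in {az} for {subnet_id}")
        routes[subnet_id] = endpoints[az]
    return routes


if __name__ == "__main__":
    for subnet_id, endpoint_id in default_routes(WORKLOAD_SUBNETS, SYNC_STATES).items():
        print(f"{subnet_id}: 0.0.0.0/0 -> {endpoint_id}")
```

Raising an error for a subnet whose AZ has no endpoint is a design choice in this sketch; it surfaces the "you would be wrong" case at plan time instead of as dropped traffic.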

Ashish Rajan: Wow, so you guys are building Terraform modules? Because there is that nuance as well.

Can you use Terraform on premise though? You can't, right? There [00:29:00] are

For firewalls specifically? You can choose Terraform on premise, but as in, I meant more in terms of configuring your firewall, building your firewall, all of that. Can that be done with Terraform?

Jack Harter: I'm pretty sure they have a SonicWall provider.

That being said, that, that gets real far into the weeds that I haven't messed around with yet.

Kyler Middleton: I think Cisco does have a provider now. They're working on it, but primarily you would use something like Ansible to do it. And I built solutions like that before, but it's nowhere near as clear as most of the cloud Terraform stuff.

It's just not as intuitive as that for sure.

Ashish Rajan: Because I feel like, if I was to call this lessons learned from moving from on premise to a cloud firewall, I would think that one of the lessons I'm actually taking away from this is that you also obviously need to know the best practice from an on premise perspective, but also know a bit of Terraform.

'Cause to your point earlier, scaling out: you're not just building one network firewall. You wanna build a pattern that you can repeat again and again, especially from an architecture perspective. And to Jack's point, you don't wanna be sitting there deploying it to every damn [00:30:00] server that you find, which is gonna add this extra component of ALB in there.

And then you're trying to find what the GUID is for each one of them and figure that out. And then, heaven forbid, if there was an incident, you're trying to figure out, where is this GUID that I need to look out for? So Terraform is also a skill that people had to pick up.

I guess Jack, you had to pick it up.

Jack Harter: Yeah, thankfully I have been working on various Terraform levels for about a leap year so far, four or five years at this point, actually came into it from the Ansible perspective thinking. Okay. My relationship with HashiCorp goes back a bit longer than that.

But it was a great opportunity to say, all right, sweet. Now I no longer need to go. Okay. Which EC2 instance was this again? What was it hooked up into? Things just slide together a little more naturally that way. And yeah, but it's also not just knowing, okay.

Terraform apply, Terraform destroy. Neat, we got our home lab set up and wrecked all over again. Kyler's made this awesome script that we call the Bear Hug which [00:31:00] will allow you to take currently extant stuff in a VPC, and then pull it down directly into Terraform code by, via an import, some importation magic.

That's another thing that you need to learn a little bit about by saying, okay, what happens when infrastructure is breaking stuff and moving fast and we haven't been able to get it codified yet, when we don't have the infrastructure as code all set together beyond that.

There's also just flat out getting real down into the weeds and using the Terraform console to figure out, okay, which outputs are created from here and how can I then extract those when I'm trying to create a module here. Yeah, if you need to use in depth logic to be able to say, all right, we need to have a route table pointing to one of these endpoints that this firewall has created.

But it's not necessarily available as a vanilla output when we're assembling it through Terraform. How can we do that? You get a little bit hacky with it. And [00:32:00] honestly that's when you can really get a lot of help from people, from things like Copilot to be able to say, okay, we've got this long string of brackets and interpolated nested stuff.

How do I just do that as a data call? But yeah, having a couple of years in Terraform has been endlessly helpful with this. Just knowing how it's able to link in with all the stuff we use. Thankfully, AWS got a big ol head start in there. But that's constantly changing as well. I remember three, four years ago, building an S3 bucket was you know, whether you were going to keep everything encrypted and the policies are going to attach to it.

Those were all just one-liners that you'd pump in with your S3. Now, those are all independent modules all on their own. They need to be linked in with it. So things have gotten, yeah, the wild area just keeps on expanding and growing faster than you can tame it. So it takes a lot of energy and not even energy, just, the time and the input to be able to say, okay.

This is what we want to do. A lot of suffering. Blood, sweat and tears. [00:33:00] Exactly.

Ashish Rajan: Talk about your project as well. That helped Jack the, is that on a GitHub project or just a personal project that you worked on?

Kyler Middleton: It's just something I just find these like flights of fancy where I think that'd be cool if I built a thing and then a week later, I'm awake from a caffeine haze and I've built the thing.

And so I initially called this the hostile takeover, but this is for environments that have significantly drifted from your infrastructure as code, or maybe you built it with CloudFormation or ARM or something. And you're like, I really would like to manage this with Terraform. It's clearly won the war.

There's use cases for all those, I want Bicep in my thing. Really, Terraform is everywhere for everything. It's a sequence of scripts that uses some open source libraries, and it just generates your code for every existing resource in an AWS environment, and puts it into Terraform in just maybe 10 or 15 minutes.

Ashish Rajan: Wow, across all AWS accounts, or just the one that you pointed at?

Kyler Middleton: Just a single AWS account, yeah. And it's not [00:34:00] clearly formatted yet. I'm working on that to get it to format after it does the imports to like, put it in folders or reference a module instead of directly the resource. None of that polishing is done.

This is very hacky at this point, but for something like what Jack was working on, where you've said, to hell with it, I'm just going to build the firewall myself by hand, and then we'll run the bear hug, which is what we decided was much less aggressive terminology than hostile takeover.

The it's just, it's easier to say okay, cool. I got it working. What should the Terraform look like? Let's do the bear hug and just see what the Terraform spits out. I guess that's what I should have written in the first place.
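
Editor's sketch: the general idea behind a tool like the one described (take resources that already exist and bring them under Terraform) can be illustrated by generating Terraform `import` blocks from a list of known resources. This is not the Bear Hug script, whose internals are not described in the episode. The resource addresses and IDs below are hypothetical.

```python
# Hypothetical inventory of hand-built resources: (resource type, local name, cloud ID).
EXISTING = [
    ("aws_vpc", "main", "vpc-0123456789abcdef0"),
    ("aws_subnet", "firewall_a", "subnet-0123456789abcdef0"),
]


def import_block(resource_type, name, resource_id):
    """Render one Terraform import block for an existing resource."""
    return (
        "import {\n"
        f"  to = {resource_type}.{name}\n"
        f'  id = "{resource_id}"\n'
        "}\n"
    )


def render(resources):
    """Render import blocks for every resource, separated by blank lines."""
    return "\n".join(import_block(*resource) for resource in resources)


if __name__ == "__main__":
    print(render(EXISTING))
```

With blocks like these in a working directory, recent Terraform versions (1.5 and later, to the editor's knowledge) can draft matching configuration with `terraform plan -generate-config-out=generated.tf`, which gives a similar "what should the Terraform look like?" answer for a small set of resources.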

Ashish Rajan: Oh, interesting. And I guess to Jack, your point then, as you were finding that the pattern suggested by AWS was like an anti pattern and you guys have to build the firewall build that with the Terraform configuration so you can scale it across multiple accounts as well. Dev, test, prod, all of that. Were there other challenges that came across the, I guess the project that you are open to share in terms of whether it was the use of the firewall, because you obviously had some [00:35:00] experience on premise as well.

How people describe network firewall, at least the way I remember data centers versus what it would be as a software on a cloud console, was there stark differences in the expectation was the reality?

Jack Harter: Sure. Yeah. As I was mentioning earlier, in terms of, visually and tactilely being able to interpret what your firewall is doing.

Here, there is no demarc. There is no place where AT&T, or whoever your cable provider is, stops their services and your... there is, in some sense, but it gets real hazy, real fast. You've got your internet gateway in AWS, right? That's, essentially your analog for whatever the demarc would be.

This is where, beyond here, this is the open ocean. There be dragons. Now, behind you is, your nice cozy shire. And, you can control everything that goes on back there. And that's, I'd say, what we would interpret the place where, you put your I'm going to continue here with the Lord of the Rings metaphors.

This is where, you want to have your iron gate. Yeah, you want to make sure that [00:36:00] everything behind you is within your control and everything in front is without, and that's okay. There's some raging seas out there and we don't necessarily want to have them, you know, lapping at our shores. So with that in mind, this product itself is designed to, again with a Sauron reference, this is not designed to keep the Kings of the West out, this is designed to keep your orcs in.

You want to make sure that they're not going out and attempting to do stuff that would go against what, whatever you want. And, honestly, with a normal I say normal, they're not, they're pretty much, they're becoming pretty pretty beyond the pale pretty rapidly. With your standard internal firewall, with your standard physical box, you're doing both at the same time.

You're saying you know, bad stuff stays out, good stuff stays in, good stuff doesn't reference bad things out there. And we have full complete control all in this SonicWall and or Fortinet appliance, whatever we're working with here. In this case, you, when you're working [00:37:00] with cloud in general, AWS in particular, you've really got to embrace the defense in depth.

Admittedly, that happens with on prem as well. You've got your climate control, you've got your giant glass box with a key, you've got your person at the front gate, you've got your man trap, whatever you want. In this case, you don't just want to be utilizing network firewall, telling outbound traffic, don't go here.

This is not a safe space. If you wind up there, you're going to be compromised. That part of the woods is bad news. But here you've simultaneously got to use the old fashioned AWS stuff like ACLs and security groups and saying, okay, this subnet only can access the internet through this particular CIDR block.
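
Editor's sketch: the layering Jack describes (security groups and ACLs narrowing what a subnet can reach, with the network firewall's outbound rules on top) can be pictured as a chain of independent checks where any one layer can stop the traffic. This is a deliberately simplified model, not how AWS evaluates traffic internally. The ports, CIDRs, and domain names are hypothetical.

```python
import ipaddress

# Hypothetical policy at each layer.
SG_ALLOWED_PORTS = {443}                       # security group: only HTTPS egress
NACL_DENIED_CIDRS = ["203.0.113.0/24"]         # network ACL: blocked address range
FIREWALL_DENIED_DOMAINS = {"badguy.example"}   # network firewall: blocked domain


def blocked_by(dest_ip, dest_port, domain):
    """Return the name of the first layer that blocks the connection, or None."""
    if dest_port not in SG_ALLOWED_PORTS:
        return "security-group"
    address = ipaddress.ip_address(dest_ip)
    if any(address in ipaddress.ip_network(cidr) for cidr in NACL_DENIED_CIDRS):
        return "network-acl"
    # Match the denied domain itself and any subdomain of it.
    if any(domain == bad or domain.endswith("." + bad) for bad in FIREWALL_DENIED_DOMAINS):
        return "network-firewall"
    return None


if __name__ == "__main__":
    print(blocked_by("198.51.100.7", 22, "example.org"))
    print(blocked_by("203.0.113.9", 443, "example.org"))
    print(blocked_by("198.51.100.7", 443, "api.badguy.example"))
    print(blocked_by("198.51.100.7", 443, "example.org"))
```

The point of the model is that no single layer carries the whole policy: removing any one check still leaves the others in place.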

With the on prem, it was a lot simpler. You could just go ahead and say, this, we've checked all these boxes. We've said that badguy.ccp, or whomever else you don't want your network to be connecting to, is hereby banned. It's as simple as that. In this case, you need to do it on multiple levels to make sure that when you get your product [00:38:00] out, when you have your boxes humming along and optimally making your organization a little bit of revenue that all this stuff is not just safe on one level, but safe on every level. Sometimes you're going to run into your cloud support saying, this is, our recommended architecture. Other stuff is wrong. And then you might come back around after talking to your TAM and say, hey, actually, we've proven out that this theoretically un kosher way of doing things is actually completely fine and, and I got to have a lot of fun discussions, friendly discussions with the brains that be at Amazon to say, listen, check out, you're going to be able to see this the network map when it's all posted on the pod, this is the way we've been able to get it to work.

And honestly, I think a better method than the one that Amazon has presented to us. And at that point they go, Oh never thought about it that way. Fancy that. Maybe you should write a blog about this. I, I've got a, I've got a little rage bait LinkedIn post up [00:39:00] about it already.

I'll see if I can do some slightly more in depth dives into everything that happened to me, and the polite discussions that occurred along the way.

Ashish Rajan: I think, I guess one of the reasons why I find this conversation fascinating also is because I think this is like the third or fourth time we're having a conversation about the, what was quote unquote the anti pattern by AWS is actually the pattern that the customer wants to use and works for them.

And similar reaction to what you were saying, I think the funny, I think they did I think the lady's name is Meg and I think there's an episode on it as well. And she did a talk about this at a conference as well, about how the anti pattern is the right pattern for many organizations.

Which I thought was really interesting. I'll definitely ask you to check that out as well. Maybe from a lessons perspective, Kyler, any thoughts on lessons you've learned on obviously coming from an Azure world into this, what should be a tick box is now like a whole, I don't know how many months project still running.

What were your lessons as [00:40:00] you are walking away from this?

Kyler Middleton: I think it was just reaffirming something that I already knew that everything is foundational knowledge, everything. But particularly here, like you need to know how your network works. We like to think that in an Azure or AWS environment, the network's just magic and it mostly works and that's true.

But when you're adding features like this, that I don't think they have quite the polish that AWS generally puts on it. You need to know how your network works because you need to build your network. It's not AWS magic. It's not Azure magic. It's you building your network in a way that's secure and functional.

And so you need to know how it works. So your networking knowledge, those network engineer brains, if you're worried about coming to cloud and you won't need that skill set anymore, not true. We use that here a lot. We exercise those brains today.

Ashish Rajan: I guess you have to both your point as well, because it's it's a foundational piece that helps you understand.

What your organization needs and how you can achieve it, rather than, Oh, AWS can only do this, so we only do this.

Kyler Middleton: [00:41:00] Yep, absolutely. Your goals stand apart from what AWS can provide and you gotta put the puzzle together in a way that satisfies your needs. And compliance, regulatory stuff, that's on you.

Figure out how to match those puzzles together.

Ashish Rajan: Actually, yeah, going back to shared responsibility that Jack mentioned, when the auditor comes in, they are not going to be Oh, AWS doesn't allow for it. Oh, great. So I'll just pass this check for you. Then I guess,

Kyler Middleton: don't worry about the law. Don't worry about AWS. It's an inconvenience.

So don't worry.

Ashish Rajan: That's an anti pattern by the way, Mr. or Ms. Auditor. This is anti pattern if you haven't noticed, but I guess maybe for people who are trying to start this journey, I think that sounds really good, but I guess in terms of the maturity for this for how people would deploy this and where to start now that you guys have done majority of the project and found this anti pattern that is a great pattern for you guys, is there, I guess the way you approached it, would you change anything about the way you, where you started versus where you landed in terms [00:42:00] of if people are thinking of doing this moving from on premise to AWS, would you just go, Hey, you know what, if your requirement is to have more in depth logs of the firewall, just go straight here, or I guess where I'm coming from is that is there something that you would have done differently in the way you approached it.

Jack Harter: Kyler, you're the architect. You want to spearhead this one?

Kyler Middleton: I wouldn't have promised that it would take an hour. I feel like I would have annotated that a little differently. But no, not really. The architecture that AWS recommended we do, would not have satisfied our security constraints, right?

So no matter how much they say I don't know, this one works for sure, that one might work. We have to do the one that satisfies our needs. And so the architecture we settled on works really well and I wouldn't make any changes at this point. We were gonna get here no matter where we started. 'cause we had to use the tools they gave us to satisfy the constraints. And this does that.

Ashish Rajan: Yeah, fair. And I guess to your point, the network firewall being there is the first example that there clearly are other customers as well asking for the same thing. Otherwise, [00:43:00] there was no need for network firewall to begin with.

Kyler Middleton: Yep, absolutely

Jack Harter: I want to say the nice thing about the way we are able to work with AWS here is, yes, it is very Lego oriented. But the nice thing about Legos is you can take them apart and put them back together any way you want. At the end of the day, we also don't work for AWS. We work for our internal people.

So we don't always have likewise when you're building whatever block set you want to be able to, you're not working for Lego. You don't need this thing to look like the pristine, whatever is on the instruction box. If you want you can make your Death Star a trapezoid.

Who cares? Yeah, it's you bought the blocks. So it's nice to be able to have this kind of flexibility and to be able to say, okay, this is going to provide what we want it to. Is it going to be as pretty as the one on the side of the cardboard box? Not necessarily, that's why we have the experience and the flexibility that has been able to get us to be, the kind of reactive, responsive team that we are. And that [00:44:00] is at the end of the day, a lot nicer than just having a bunch of steel humming boxes around us that we had to take a hundred percent care of. This gives us a lot more ability to do what we want to do and less, less requirements to do what we have to.

Ashish Rajan: Fair. And would you say now that you guys have gone past this, have you guys already started working on phase two, what's the next phase of unplugging or replugging, or areas within AWS that you're finding are the next challenges that you guys are trying to tackle?

Kyler Middleton: From my perspective, no, because it's basically built how we want it to look.

And we're waiting for the business to give us approval to do the next thing and to work at the scale out. Because it works for the size of our deployment, but as we grow, which hopefully we do, all of our businesses hope to grow to the point where their architecture doesn't satisfy their needs then we have to look at scaling out and sorry, Jack, I interrupted.

Jack Harter: Oh, you're totally fine. At this point all I'm doing is adding a little bit of polish onto it. It's already behaving in the [00:45:00] fashion that we wanted to, but when you're working with cloud, there's no end to the kind of logs and backups and additional things you can find out along the way. So right now we're trying to figure out, okay, how do we export these things into a better logging service than just CloudWatch, for example? How can we make the firewall operate on, say, some esoteric level ports? We've already set this thing up to be extremely suspicious of traffic. How do we make sure that we've got all the rules in place that make it work in the way that we want it to?

So there are still just a couple of pain points. It's already doing the stuff that we want it to by and large, we can open and close a route. And now the question just becomes, how do we make it do everything we want it to? We're at 90 some odd percent, the way there we want to get to a, to 99.

We want to get to five nines, I'll say.

Kyler Middleton: Yeah, for some of the non standard ports and protocols, you have to use the Suricata [00:46:00] language and push it with Terraform to create the policies. It's not the GUI or the Terraform resource config, which is... it's a little befuddling for us, and hopefully that is developed.

Maybe by the time you hear this podcast, that will have been resolved, but if not your Suricata knowledge is going to be useful.
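
Editor's sketch: for readers who have not seen the Suricata rule language Kyler mentions, a stateful rule is a single line of text with an action, protocol, source, destination, and options. The helper below only formats such a line. It is an assumption-laden illustration: the function name, port, message, and `sid` are hypothetical, and it does not show how the team pushes rules with Terraform.

```python
# Actions that Suricata-compatible stateful rules commonly use.
VALID_ACTIONS = {"pass", "drop", "reject", "alert"}


def suricata_rule(action, protocol, dest_port, msg, sid):
    """Format one Suricata-style rule for outbound traffic to a given port."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return (
        f"{action} {protocol} $HOME_NET any -> $EXTERNAL_NET {dest_port} "
        f'(msg:"{msg}"; sid:{sid}; rev:1;)'
    )


if __name__ == "__main__":
    # Hypothetical example: allow outbound TCP to a non standard port.
    print(suricata_rule("pass", "tcp", 8443, "Allow outbound 8443", 1000001))
```

A rule string like this is the kind of text that would be handed to a stateful rule group; each rule needs its own unique `sid`.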

Ashish Rajan: Fair. That's most of the technical questions I had. I've got some fun questions for you guys as well. Kyler has some before, but maybe I'll start with Jack first, that way.

He has less time to prepare. Basically, I'll throw you guys in deep end here. The first one being, and maybe I'll do this. I'll ask a question to Jack first and then go to Kyler. Sure. Since she has actually heard the question before. What is the so when you're not trying to solve network firewall challenges what's the most time you, what do you spend most time on, man?

What's your hobbies like?

Jack Harter: Yeah, let's see. I already talked a little bit about my Homelab. It's a little docker setup that I run in a in an older Mac that I keep at home. And through it, build out my build out my film library and try to get all the, old black and white dramas that I can [00:47:00] pull down pop before, they disappear from the internet.

Outside of that, spend lots of time with my cat, Lewis, my wife, Carrie, and my daughter, Eden. And yeah. Try to do my best to keep the rest of the house to be slightly less of the chaotic mess that it's gradually turning into. I'm big on being able to continue to fool around with hardware oriented stuff.

Recently set up a very funky RAID system that sometimes works and sometimes doesn't. And yeah, beyond that, just take it, do my best to take care. I'm a very recently crowned new dad. So the so hey, like any cloud product, you're never a hundred percent ready for it. And there's always going to be lots of documentation that you should have read months and months ago.

Hey, when you're in a freshly minted position, these things are always going to come. Yeah. The things that YouTube video teaches you, and then there's the actual thing. Yeah. Yeah. Yeah. There's, there's Linus Tech Tips. Yeah. You can see all that nice, clean, newly, fresh out of the box stuff with a guy with a big fake smile doing his thing versus, you [00:48:00] actually hurting your hands with a screwdriver and trying to slot something in there alongside some cat hair.

So yeah there's the prettified version. Then there's the grittified version and you're going to find yourself on the grittier end of the spectrum when you get down to it.

Ashish Rajan: Oh, fair. Awesome, man. Thanks for sharing that. What about you, Kyler?

Kyler Middleton: I as a break from work, I do more work and I'm not sure how I do that.

Ashish Rajan: Yeah, your bear project was self answering in that way. You just have high caffeinated moment. You just, I don't know what, I'm just going to make an open source project.

Kyler Middleton: I do. I do a lot of blogging on letsdodevops.com. If you want to come read, you can give me money if you want, but most of it's for free.

I'm building an AI chat bot that uses Bedrock and Lambda. I host a podcast with Ned Bellavance called Day Two Cloud and outside of all of those jobs, I'm volunteering with a social media startup to build a safe place for queer folks to connect and meet up. Because politically, the US has some challenging times ahead, so I'm trying to help as much as I can.

Ashish Rajan: And maybe good to ask you a second question as well. What is something that you're proud of that is not [00:49:00] on your social media?

Kyler Middleton: Oh my daughter, her name is Kennedy and she's wonderful and she's the best thing in my life. This morning she sat me down at my desk and went to get my water and she said, okay, are you good?

And I was like, I'm good. Stay here. You get on your podcast, you do a good job. And I was like, okay. Ashish and Jack are counting on me. I better sit down and do a good job.

Ashish Rajan: Oh, wow. Okay. Kennedy's pretty awesome. I can't wait to meet her. What about you Jack? What's something that you're proud of that's not on your social media?

Jack Harter: Oh man, I also totally missed in the previous chat you can read some of my blog posts at h a r t r dot net. Actually, excuse me, that, that will resolve to nothing. It's blog dot harter dot net. Oh, okay. Which is something I've been able to spin up entirely on its own accord. Little bit crazy.

The analytics don't properly work for it. But in terms of stuff that's off my social media. You, as you may have guessed, I'm a big Lego nut , , so that's one of the things I'd like to spend additional spare time on. Got, yeah the set continues to expand. Got a pretty neat [00:50:00] little Marvel diorama set up between let's see, Loki and Miles Morales.

I'll see if I can post some of that later on,

Ashish Rajan: wow. Your re your real stuff is your stuff is all pretty awesome, by the way, it sounds

Jack Harter: oh, thank you.

Ashish Rajan: Lego is the thing that you've been hiding away from the rest of the world, so maybe time to put that in your blog as well someday.

Jack Harter: Oh, boy. That will be, nothing wrong with that.

Ashish Rajan: Final question as well what's your favorite cuisine or restaurant that you can share? Oh boy,

Jack Harter: Kyler, you want to start on this one or should I?

Ashish Rajan: I'll make it more spicy. If you guys were stuck in an island individually and you only had one dish or cuisine that you could go for, which one would it be?

Oh man,

Jack Harter: I guess for me it would be Home Run Inn pizza. Unfortunately it's a local Chicago thing. It's not it's not an island.

Ashish Rajan: And you want to have one pizza again and again.

Jack Harter: Absolutely. No, this stuff is fantastic. It's not too cheesy, it's not too saucy, it's not too thin. I've been a long time enjoyer of it.

Some of my friends prefer Jacks. That's not so much for me. That's too [00:51:00] thin. Too New York style. I like something that has a little bit of thickness to the depth.

Ashish Rajan: I imagine the comments are gonna go on, oh, I know my favorite Chicago pizza place. But thanks for sharing that man. Kyler, what about yourself?

Kyler Middleton: Definitely paella. It's wonderful. I went to Spain for a couple of weeks and it was just, I wish I could eat it every day forever. It's wonderful. Sometimes there's bones in there. Don't eat the bones. Go a little slow. It's wonderful.

Ashish Rajan: Eat the bones anyway. It makes you stronger. It's called calcium, is what our parents would say.

Yeah, not the chewy kind. But I want it. But that guys this was pretty awesome. Thank you so much for sharing all that as well. I have your blog links as well, but if people wanted to connect with you to talk more about the whole network architecture, I'll leave your LinkedIn links in there as well.

But what other social that you would wanna plug in? For people to connect with you. And maybe Kyler, you want to go first.

Kyler Middleton: The big one is letsdodevops.com. It's a Substack hosted blog where I'm just writing tons of introductory and medium skill level DevOps [00:52:00] stuff. So if you're new to DevOps, or you want to build your skills around Terraform, cloud, and now AI as a Slack bot, check it out.

Most of it's free.

Jack Harter: Yeah, and again, you can find me at blog. h a r t r. net. Look through my demented ramblings and occasional fiction short story, and, enjoy what you find. It's all there for free. I sometimes track it, but usually not.

Ashish Rajan: But I will leave those links in there as well. Thanks guys. Thanks everyone for tuning in as well.

We'll see you next episode. Thank you so much for listening and watching this episode of Cloud Security Podcast. If you've been enjoying content like this, you can find more episodes like these on www.cloudsecuritypodcast.tv. We are also publishing these episodes on social media as well. So you can definitely find these episodes there.

Oh, by the way, just in case you're interested in AI cybersecurity, we also have a sister podcast called AI Cybersecurity Podcast, which may be of interest as well. I'll leave the links in the description for you to check them out. And also for our weekly newsletter, where we do in depth analysis of different topics within cloud security.

Ranging from identity, endpoint, all the way up to what [00:53:00] is a CNAPP or whatever the new acronym that comes out tomorrow. Thank you so much for supporting, listening and watching. I'll see you next time.