Cilium, the eBPF-based project from Isovalent, is a recent graduate in the CNCF family, and with eBPF, networking in the world of cloud and Kubernetes looks very different, with more security capabilities. Cilium has been gaining traction for Kubernetes network security as blind sides have been called out in managed Kubernetes deployments. This episode was recorded at KubeCon + CloudNativeCon North America 2023 with Thomas Graf from Isovalent, who shares what the blind sides are and why eBPF provides better network security capability for Kubernetes deployments of any scale.
Questions asked:
00:00 Introduction
03:42 A bit about Thomas
04:11 Traditional Networking in Kubernetes
06:52 What is Cilium?
07:52 What is eBPF?
08:46 What do people use Cilium for?
11:31 Starting with network security in Kubernetes
13:02 Complexities with Scale
16:02 How do projects graduate?
17:02 The eBPF documentary
17:27 Opensource to Company
18:52 Practitioner to Founder
19:57 Building an open source project
21:13 The Fun Questions!
Ashish Rajan: [00:00:00] Are you looking at doing network security in your Kubernetes environment, specifically in the managed Kubernetes environment? Then you may have looked at things like eBPF. If you haven't, then this is the episode for you. We are talking with Thomas Graf, who's the CTO of a company called Isovalent. They were the original creators of the open source project called Cilium.
Now, Cilium is a very popular open source project in the network security context of Kubernetes. Yes, you might think, why do I need a network security tool when Kubernetes should already have all that? Surprise: apparently it doesn't. So in this episode, we talk about some of the blind sides you may have if you are using managed Kubernetes and the network security capability in Kubernetes.
It can be limited. What are your options? If you are looking at having those initial network security elements, like isolating your assets, using some kind of firewall policy, or maybe thinking about 2FA or other capabilities that you would take as a given in any other product like containers or AWS, Azure, Google Cloud, you'd find that,
yes, some of those work, but in a more granular Kubernetes context, [00:01:00] if you have a managed Kubernetes, you probably need to think about this in a different way. So this is where eBPF comes in, this is where Cilium comes in, and there are other open source projects as well that are using eBPF. But in this specific conversation we have Thomas, who is the original creator of this really popular open source network security project for Kubernetes called Cilium.
He was kind enough to share some time with us and talk about what made them start the Cilium project, why it got popular, and why there are blind sides to Kubernetes network security, specifically in managed Kubernetes. At what point should you think about applying these? And also, what are some of the stages at which you might need this, from a network security perspective, for a Kubernetes environment that you may build?
As mentioned earlier, this is an open source project called Cilium, which you should definitely check out. It's probably one of the most popular ones at the moment in that network security. And this conversation with Thomas Graf was held at KubeCon North America in Chicago, and we got the opportunity to even go into the fact that why would someone extend an open source project and build a business around it as well? What were some of the challenges around it? As someone who has come from a software engineering [00:02:00] background, like Thomas, he was kind enough to share what made him start the open source project and later on start a business called Isovalent behind Cilium as well.
If you know someone who's working on Kubernetes security, specifically networking, then you may want to share this episode with them. I think they'll thank you for this later. If this is your second, third, or maybe even 10th or maybe 50th episode of Cloud Security Podcast that you're listening to, or maybe watching on our YouTube channel,
and you have been finding us valuable, I would really appreciate it if you could take a few moments to drop us a review or rating on your popular podcast platform, like iTunes or Spotify. That is if you're listening to this. If you are watching this on YouTube or LinkedIn, definitely give us a follow or subscribe. It definitely helps us spread the word
and lets other people know as well that we have a community that we would love to welcome them into. We are a growing community of about 50,000 people so far, so we would love to keep growing that and keep spreading the good message of cloud security and how to do it. This was a conversation we had at KubeCon North America, and thank you to everyone who came in and said hello and took pictures with us and took videos with us and was kind enough to come on the LinkedIn videos that I post for my [00:03:00] daily vlogs for conferences that we attend. Now, the next conference we're attending is AWS re:Invent, and if you will be there as well, I would love for you to come and say hello, so reach out if you are available.
If you give me some heads up, I should definitely be able to make some time to meet you at re:Invent. And I look forward to taking more pictures and videos with you folks when I see you at AWS re:Invent. I'm always grateful when you folks give me hugs, say hello and say thank you for all the work we do.
It really means a lot. So thank you to everyone who came and said hello to us at KubeCon. I can't wait to say hello and hug more people at AWS re:Invent as well. If you are there, definitely reach out. This is the episode with Thomas Graf from Isovalent, who talks about Cilium and network security for managed Kubernetes.
I hope you enjoyed this episode and I'll see you in the next one. Peace. Could you tell us a bit about yourself? May we start there?
Thomas Graf: Absolutely. Yeah, I'm Thomas. I started my career in kernel development. Oh wow, okay. So 20 years ago, I was at Red Hat on the kernel team. Working on networking and security, software defined networking, and so on. Working with some of the kernel legends. Stayed in networking. Spent a couple of [00:04:00] years at Cisco during the OpenStack virtualization days. And then launched and founded Isovalent and Cilium.
Ashish Rajan: Oh, nice. And talking about Cilium as well, congratulations on graduating. Thank you. It's a great achievement. How is networking done traditionally in a Kubernetes environment?
Thomas Graf: Yeah. Networking, as expected, was taken over, inherited from the network virtualization age. So in the very beginning of Kubernetes, it was assumed that a container is a little bit like a virtual machine. So whatever was present for network virtualization was just applied to Kubernetes as well, right?
Layer two bridging, very old-school networking.
Ashish Rajan: Oh, wow. So this is basically Kubernetes 1.0, for lack of, obviously this is another version, but when people started looking at networking in Kubernetes for the very first time, they were just trying to apply a traditional concept onto a microservices-first kind of world.
And so what was missing?
Thomas Graf: So I think there were lots of things missing, right? Initially in Kubernetes, it was assumed that it would be mostly running in the cloud, where you have a VPC, a network that's provided by the cloud provider. [00:05:00] Yeah. And that actually lifts a lot of the requirements off the network.
Also, security was essentially declared a day-two problem. Fair enough. Kubernetes did not come out of the box with a firewalling solution. As an example, Kubernetes did specify how network policy, how firewalling rules, should be defined, but it did not mandate that any network plugin would have to actually implement it.
Encryption was a day-two problem. So for a lot of the security aspects around networking, it wasn't declared that they're not important. They were just not part of the initial phase of building up Kubernetes, right?
Ashish Rajan: Oh, and would that be the same? The first version was obviously the self-hosted one, before the cloud service providers adopted this. Was that the same concept in unmanaged or self-hosted Kubernetes as well, networking was missing there as well?
Thomas Graf: Yeah, there's not really a difference between managed and unmanaged in general.
Overall, Kubernetes was initially not really used at really large scale, right? In the first couple of years of Kubernetes, everybody was learning about it and playing with it, [00:06:00] and it was still being built up. In that initial phase, what really mattered was just connecting pods together. You need to have some form of networking to actually get it working.
But scale didn't really matter yet. Security didn't really matter that much yet. There were a couple of early aggressive adopters of Kubernetes that also wanted to run it at scale. And they definitely then drove the need for the next generation of networking tools.
Ashish Rajan: And once the cloud service providers adopted this, did that change the security posture at all?
Or was the first adoption in the cloud service providers as well similar without much security? As long as you can do network.
Thomas Graf: I think the managed services, they did see the need for security a couple of years into the Kubernetes journey. And they do provide network security for the solutions. But it's not as extensive as some of what we've built with Cilium later on.
But they of course do see the need, that network security is absolutely required to have a compelling platform.
Ashish Rajan: Oh. And with Cilium, specifically talking about the Cilium project: for people who are not aware of Cilium and may know networking in Kubernetes when they're starting this, [00:07:00] what's the Cilium project about?
Thomas Graf: Yes. Most of us creating Cilium, we were working on a project called Open vSwitch before. Okay. And Open vSwitch was the defining open source software during the network virtualization age that allowed virtual machines to talk to each other on Linux, to actually provide networking for VMs running on Linux.
And with the rise of Docker and containers, it became clear that Open vSwitch is not going to fit well into that ecosystem because of scale, because of agility of containers, and also with the upcoming age of hybrid and multi cloud, where users will run Kubernetes in a variety of different cloud providers, on prem, and in the public cloud, and so on.
And that was the starting point of Cilium. So we have been working on Open vSwitch and on eBPF, which is this incredibly powerful, lower level technology. And with that technology in the hand, and with seeing a need for a better networking layer, we started building Cilium.
Ashish Rajan: And what is eBPF as well?
Because a lot of people don't even know that word.
Thomas Graf: So eBPF started out as a kernel-level technology. Think about it a little [00:08:00] bit like JavaScript for your kernel. Okay. It means you can load a program. Yep. You can load it into the kernel and get it verified, so the kernel will say this is secure to run, and then you can run it when the kernel performs certain operations, such as receiving a network packet, or the application performing a system call, or you accessing a file.
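To make the load, verify, attach, and run flow described here concrete, the following is a toy Python model and not real eBPF. Every name in it (`Program`, `verify`, `ToyKernel`, the `packet_received` hook) is hypothetical, and the verification step is a deliberate simplification: it only rejects overly long programs and backward jumps, while the real in-kernel verifier checks far more.

```python
# Toy model of the workflow described above: load, verify, attach, run.
# This is NOT the real eBPF API; every name here is hypothetical.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

MAX_INSTRUCTIONS = 16  # arbitrary toy limit, chosen only for illustration


@dataclass
class Program:
    name: str
    # Each instruction is (opcode, argument); a "jump" takes a target index.
    instructions: List[tuple]
    # What the program decides when its hook fires.
    handler: Callable[[dict], str]


def verify(prog: Program) -> bool:
    """Accept only short programs whose jumps go forward, so they terminate."""
    if len(prog.instructions) > MAX_INSTRUCTIONS:
        return False
    for index, (opcode, arg) in enumerate(prog.instructions):
        if opcode == "jump" and arg <= index:
            return False  # a backward jump could loop forever
    return True


@dataclass
class ToyKernel:
    hooks: Dict[str, List[Program]] = field(default_factory=dict)

    def load_and_attach(self, prog: Program, hook: str) -> None:
        """Refuse unverified programs; attach verified ones to a hook."""
        if not verify(prog):
            raise ValueError(f"program {prog.name!r} rejected by verifier")
        self.hooks.setdefault(hook, []).append(prog)

    def fire(self, hook: str, event: dict) -> List[str]:
        """Run every program attached to the hook for this event."""
        return [prog.handler(event) for prog in self.hooks.get(hook, [])]


kernel = ToyKernel()
drop_telnet = Program(
    name="drop_telnet",
    instructions=[("load", "dst_port"), ("jump", 2), ("ret", 0)],
    handler=lambda pkt: "drop" if pkt["dst_port"] == 23 else "pass",
)
kernel.load_and_attach(drop_telnet, "packet_received")
verdicts = kernel.fire("packet_received", {"dst_port": 23})
```

The point of the sketch is the ordering: nothing runs on a hook unless it has passed verification first, which is what lets the kernel be extended without giving up stability.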
And then you can extend the logic of the kernel with your own program. And that revolutionized the entire kernel field again, because at that point the kernel had become super, super hard to change because everybody's running Linux today, right? Yeah. Everybody wants a boring and stable Linux kernel.
Yeah. So everybody's saying, please don't touch the kernel. It needs to be stable and boring. Yeah. And with eBPF you can essentially extend it inside of the kernel without risking the stability of the entire ecosystem, right?
Ashish Rajan: Oh, so it's almost like adding features to an existing boring kernel. And in terms of security capability, because I think we started the conversation by talking about how security probably has not been the number one thing when people started using Kubernetes, what are the [00:09:00] network security capabilities that people should think about?
And obviously Cilium is a popular project, and now it's graduated. What are some of the things that people use Cilium for in the open source community?
Thomas Graf: Yeah. So the initial use case for Cilium was clear: high scalability. And we had users like Datadog, which has been using Kubernetes for a long time.
They've been early adopters of Cilium, and they are running Kubernetes at insane scale. And they really pushed the boundaries. So the first real use case for Cilium was for real power users that really needed to scale. And then security was the second use case that came in. And inside of security, you should really think about it in three pillars.
Pillar number one is network segmentation or firewalling. Yeah. This is called network policy. So who can talk to whom, what pod can talk to what other pod. Okay. And in there, Cilium has brought a new style, a new way of doing this, instead of what traditional solutions would do, which is match on IP addresses.
Yeah. For example, with iptables. Yeah. Cilium has implemented what's called an identity-based mechanism to enforce this. Oh. So we're thinking about [00:10:00] pods not in terms of IPs. We're thinking about them in terms of services with identity, and enforcing segmentation rules based on identity. That's pillar one.
That's segmentation. Yeah. App one cannot talk to app two. Okay. The second pillar is encryption. You need to encrypt in transit. So all the data that is flowing through your network, you need to be able to encrypt. Cilium can do this for you with WireGuard and IPsec. Yeah. Encrypt all the traffic on the wire.
Important for compliance. Of course. Yeah. And then the third pillar is mutual authentication. If you want to go beyond just network segmentation and encryption, you can also validate and authenticate your identity using an mTLS-based handshake. That's the third stage. Not all users will use all three pillars, but often if you want to have, like, security in depth, you want to have all three pillars.
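The difference between matching on IP addresses and matching on identity, as described for pillar one, can be sketched roughly as follows. This is a toy Python illustration under stated assumptions, not Cilium's implementation or API: the `identity_of` helper, the `IdentityPolicy` class, the labels, and the IP addresses are all hypothetical, and identity is simply modelled as the set of a pod's labels.

```python
# Toy comparison of IP-based and identity-based segmentation.
# Hypothetical names throughout; this is not Cilium's implementation or API.
from typing import Dict, FrozenSet, Set, Tuple

Labels = FrozenSet[Tuple[str, str]]


def identity_of(labels: Dict[str, str]) -> Labels:
    """Derive a stable identity from a pod's labels, independent of its IP."""
    return frozenset(labels.items())


class IdentityPolicy:
    def __init__(self) -> None:
        self._allowed: Set[Tuple[Labels, Labels]] = set()

    def allow(self, src: Dict[str, str], dst: Dict[str, str]) -> None:
        self._allowed.add((identity_of(src), identity_of(dst)))

    def permits(self, src: Dict[str, str], dst: Dict[str, str]) -> bool:
        # Default deny: only explicitly allowed identity pairs may talk.
        return (identity_of(src), identity_of(dst)) in self._allowed


frontend = {"app": "frontend"}
backend = {"app": "backend"}
batch = {"app": "batch"}

policy = IdentityPolicy()
policy.allow(frontend, backend)

# An IP-matching rule pins the addresses the pods had when it was written.
ip_rule = {("10.0.1.7", "10.0.2.9")}

# The frontend pod is rescheduled and comes back with a new IP, same labels.
pod_labels_by_ip = {"10.0.3.4": frontend, "10.0.2.9": backend}

ip_rule_still_matches = ("10.0.3.4", "10.0.2.9") in ip_rule
identity_rule_still_matches = policy.permits(
    pod_labels_by_ip["10.0.3.4"], pod_labels_by_ip["10.0.2.9"]
)
```

In this sketch the IP rule silently stops matching once the pod's address changes, while the identity rule keeps working because it never referred to an address in the first place.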
Ashish Rajan: Awesome. And I love how it's using IPsec, which, for people who are probably from a networking background, that's traditional. That's been there forever, since we're talking about Linux. So it's just what you said earlier, it's just adding extensions, so that now you can use [00:11:00] that technology to encrypt your traffic as well instead of just something else.
Okay, fair enough. In terms of organizations adopting it, you mentioned Datadog was the first that scaled it out. And now we might have a mix in the audience, who may have heard about Cilium right now for the first time, or may have been using it for a while. How do you see organizations adopt it? What's the maturity stage from the beginning, and as it goes on?
If I was someone, I'm gonna pretend to be an audience member who's starting Kubernetes today. I've gone ahead, I'm going, Oh, Thomas is saying the right things. Where am I starting if I've not done any network security before?
Thomas Graf: You may actually already be using Cilium without knowing it. Oh, okay.
All cloud providers have adopted Cilium in some shape and form for their managed Kubernetes platforms. If you're a user of AKS, if you create a managed Kubernetes platform on Azure, you're actually using Cilium under the hood. It's called Azure CNI powered by Cilium. If you run a GKE cluster or an Anthos cluster or an EKS Anywhere cluster, you're actually using Cilium.
You may just not see it, because networking is typically invisible. Yeah. So that's the starting point. That's not the full [00:12:00] version of Cilium. So cloud providers typically ship a slightly reduced version of Cilium open source, simply to make it a little bit easier for them to support the whole platform.
So it's not the full feature set, but that's the starting point. And then you can go and actually use the full version of Cilium and upgrade.
Ashish Rajan: And so once people start using Cilium, is that more like the first stage, to what you said, network segmentation? What am I jumping on first as a starting point?
Thomas Graf: So I think the first jump is usually to either security or to observability. Either it's, I need to make my platform compliant and secure, when you jump onto security, encryption, mutual authentication and so on, or you're saying, I'm actually struggling troubleshooting my platform, I want to be able to have really good visibility into the networking layer of the platform, because I do not want to have an incident that lasts two days.
I want to resolve that in a couple of minutes. That's Hubble. Hubble is the observability layer of Cilium. Usually one of the two, observability or network security, is [00:13:00] a starting point for advanced Cilium use.
Ashish Rajan: Interesting. And oh, I love the idea also, because it is actually quite a bit of a problem sometimes. I think when we were catching up earlier, you spoke about this as well, that one of the reasons why a lot of people probably don't look at network security is because there's a blind side in the managed Kubernetes part, where if there is something going wrong, nine out of 10 times, I imagine, people who are not used to Kubernetes have no idea what's going on.
I didn't even know what's going on. Where to start this conversation? What logs do I have? Or what part of the network is what I'm looking at? Like the traditional security approach for an incident is, where is it? Isolate that network, to what you said, the first starting point of isolating it. You don't even know where to start and be like how am I isolating?
What am I doing? But to what you said, Kubernetes doesn't provide that by default, and Cilium comes in there. Now, taking this to the next level, where I think you spoke about going down the observability path or using network security: if I'm a slightly bigger organization, like, I think Datadog, I would say, is enterprise, that will probably come after this.
From a large to medium sized organization, I've started doing network security. I've said encryption [00:14:00] because I need security compliance as well. I've got observability. What other complexities do you see that come in as companies grow as they start using more Cilium?
Thomas Graf: Yeah, step two is clearly something like multi cluster.
So that's actually becoming the norm, that you're not only running one Kubernetes cluster, but you're running multiple clusters. It could be simply for availability, that you run clusters in multiple availability zones or multiple regions. Maybe you have an IoT-type use case where you have some public cloud infrastructure and edge clusters that run somewhere closer to your customer.
That's multi-cluster. So Cilium can connect clusters together very seamlessly and easily with what we call global services. So you can do service discovery across clusters. You can apply policy across clusters. That's definitely a step-two use case. But another, I think very popular, step-two use case is: how do I connect my Kubernetes infrastructure with what I already have on premises, like a database running in the cellar?
So how do I connect that? Because in Kubernetes, you're running containers and pods. They constantly change IP address, and then you have existing firewalls in your infrastructure. [00:15:00] They really struggle if there are constantly changing IPs of pods and containers. How do I do that? So Cilium Egress Gateway is, like, a very popular step two.
That's the capability of taking a bunch of pods with changing IP addresses, which you can scale up and down, and actually mapping that to a stable set of IP addresses that then don't change, to make the life of the firewall easier. That's, like, super common. Overall, I think what we're seeing in general is a lot of demand:
how do I connect my new world of Kubernetes, yeah, to my old world of, let's say, a VMware data center or an on-prem data center running physical networking hardware, or even just a bunch of VMs running in the cloud?
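The egress gateway idea described above, many short-lived pod IPs presented to a legacy firewall as a small stable set of source addresses, can be sketched as a toy Python model. This is an assumption-laden illustration and not Cilium's data path or API: `ToyEgressGateway`, `legacy_firewall_allows`, the `group` field on a packet, and all addresses are hypothetical.

```python
# Toy model of an egress gateway: churning pod IPs, stable source IPs.
# Hypothetical names; this does not reflect Cilium's data path or API.
from typing import Dict, List, Set


class ToyEgressGateway:
    def __init__(self, stable_ips: List[str]) -> None:
        if not stable_ips:
            raise ValueError("at least one stable egress IP is required")
        self._stable_ips = list(stable_ips)
        self._group_to_ip: Dict[str, str] = {}

    def egress_ip_for(self, group: str) -> str:
        """Pin each group of pods to one stable IP the first time it is seen."""
        if group not in self._group_to_ip:
            index = len(self._group_to_ip) % len(self._stable_ips)
            self._group_to_ip[group] = self._stable_ips[index]
        return self._group_to_ip[group]

    def translate(self, packet: Dict[str, str]) -> Dict[str, str]:
        """Return a copy of the packet with its source rewritten (SNAT-style)."""
        return {**packet, "src": self.egress_ip_for(packet["group"])}


def legacy_firewall_allows(packet: Dict[str, str], allowlist: Set[str]) -> bool:
    """A firewall that only understands static source IP allowlists."""
    return packet["src"] in allowlist


gateway = ToyEgressGateway(["192.0.2.10"])
allowlist = {"192.0.2.10"}

# The same workload before and after a reschedule: different pod IPs.
before = {"src": "10.0.1.7", "dst": "203.0.113.5", "group": "orders"}
after_reschedule = {"src": "10.0.4.2", "dst": "203.0.113.5", "group": "orders"}

direct_allowed = legacy_firewall_allows(before, allowlist)
via_gateway_allowed = legacy_firewall_allows(gateway.translate(before), allowlist)
stable_source = gateway.translate(after_reschedule)["src"]
```

The design choice this highlights is that the firewall rule is written once against the stable address, so scaling pods up and down or rescheduling them does not require touching the firewall.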
Ashish Rajan: Yeah. I love the fact that you called it, 'cause I was gonna ask about the on-prem server.
'Cause at a certain point, and this is where most companies find themselves when they do adoption of cloud as well, initially it's just an isolated network by itself. Yep. Managed Kubernetes in AWS, Azure, Google Cloud, doesn't matter, whatever. But sooner or later you have to start talking to actual data that's going to be useful for the application. So that's a really good use case. [00:16:00] And it probably explains why Cilium graduated and all of that. By the way, what's the process for graduation?
What happens during the graduation thing?
Thomas Graf: So it's a pretty long process. Typically, you start as a sandbox project, and then you can graduate to an incubating project. Okay. And then from there, you need to prove that you have very solid production users. So you go through a due diligence process.
Yeah, TOC and so on. And they will do end user interviews and actually talk to your end users. Whether you are actually being used in production, you need to document end user use cases. There's a security audit that is being done. We've actually done two of them, just to be on the safe side. Oh, perfect.
Where you have a third-party company that looks into your source code and does penetration testing. And there's a whole long list, like license scanning and lots of things, and a governance check. Are you running an open governance project? Yeah. Yeah. And so on. Do you have contributors from multiple companies, all of this?
And when you pass that, it essentially gets to a vote. Oh, it's the voting as well. There is a vote, there is a vote. Okay, as usual, there's, like, a lot of governance involved. So the CNCF TOC will [00:17:00] vote on whether your project can graduate.
Ashish Rajan: Oh, wow. Okay. And also now graduated, you guys have a documentary out as well.
Thomas Graf: We have an eBPF documentary coming out actually this Wednesday on November 8th. Oh, wow. Which shows the foundation and creation story of eBPF, the underlying technology for Cilium. This started like in 2014. So that's a couple of years back, right? Yeah. But for a kernel technology to become really mature, it usually takes about 10 years.
Ashish Rajan: Yeah, I'll definitely put the link to the documentary in the show notes as well, so people can go check it out. One more thing I'm curious about is, because you came as a technical software engineer person and now you're obviously running Isovalent, what was it that made you go from, hey, Cilium is good, growing, but I need to have Isovalent as well? Like, at what point do people switch over to, I need something more than what Cilium is offering?
Thomas Graf: Fundamentally, I was very interested to actually start a company. We didn't create Cilium with a strict goal in mind that we will absolutely create a company around it. But it was definitely in the back of our heads. And then we founded it about a year after we had written the first lines of Cilium code.
And I think just [00:18:00] having an open source project on its own, it's great. You can build a very successful open source project without having a company backing it. And there's many examples of that. Yeah. But for Kubernetes and Cloud Native, I think it's also important that many customers are actually looking for a company that can help them, right?
Because with open source projects, it's great technology, but typically you have to then build the platform on your own. You have to figure it out. You have to go through a learning process of what can I do with it? What are the things that you can trip over? And you have to really build it and you have to commit an entire team to then support that open source project.
Because if you have an incident, you're on your own. You have to fix the bug, right? You cannot just call somebody. And from that perspective, Isovalent is helping Cilium become popular because there is the path to let me, let Isovalent help you build the platform faster. I can call somebody if I have an incident or if I need a bug fixed and so on.
And that's also accelerating the open source project itself.
Ashish Rajan: Now that you have the company, what was the hardest challenge to transition from a practitioner to leading the technology? That's a tough question, right? [00:19:00]
Thomas Graf: There's so many challenges along the way for anybody building a company, and you are changing yourself every couple of months, what you have to do, and what the company is asking of you, and what the customers are asking of you.
So I don't think there's like a single challenge. I was definitely asked a ton because I was obviously still writing code the first couple of years. Yeah. I wrote a lot of the Cilium code initially and a question number one I got very frequently is like, When do you think you will stop writing source code?
Oh, yeah. I could never answer that question. Yeah. I don't know, because I like it. And at some point I just woke up and said, it's been more than half a year since I wrote a line of code and I haven't even noticed. Oh. The transition was, like, very smooth, but I literally just stopped writing at some point, and it wasn't like
I would only spend two hours a week at the end. It just dropped off pretty quickly at the end. Yeah, and I think that also explains that the transition is very smooth and very iterative. You don't have to change yourself overnight, but the company will demand just different aspects from you.
Ashish Rajan: And for people who are listening to this [00:20:00] and have a CNCF project or even an open source project, what's your advice for them in terms of what it takes to build their own successful open source project, but also have something which is value driven behind it as well, that can be a potential business? So what do you tell those people who may be,
Thomas Graf: I think in particular, if you're considering to build, like, an open source based business model, I think honesty is absolutely key.
Like, you need to fully commit to open source. It needs to be a pure open source project, and it needs to be company culture goal one or value one that open source is what really matters. Because if you don't, people will notice, and the open source project will struggle. And because the open source project will struggle, you as a company, you as a business will also struggle.
So being very honest and upfront about, this is open source and this is our business model, that's absolutely crucial, because that's how people will commit to using the open source project and not actually have in the back of their mind: most likely they may actually just remove stuff from the OSS, or they will [00:21:00] monetize and then they will ask me for money and so on.
You need to be very clear and upfront about what you're committing to.
Ashish Rajan: That's absolutely key. Awesome. And that's, like, most of the questions I had. I'll probably put a link for Cilium and the documentary and Isovalent in there as well for people to follow up. I had three fun questions for you.
If you are not doing Kubernetes or cloud native, what would you be doing?
Thomas Graf: Probably something in nature. Like, I'm absolutely an outdoors person: hiking, biking, skiing, ski touring. So if you don't find me working on tech, I will probably be outside with family and friends, enjoying myself in nature.
Ashish Rajan: That's awesome.
If you could have a superpower, it could be a cloud native superpower, what would you want to have? That's good.
Thomas Graf: We created a superpower with eBPF. If I wanted to have a superpower, I think often, I'm not sure what superpower I would want.
Ashish Rajan: If you would change something, what would that be? I guess it's like, what's the one thing that maybe you would wish you could just snap a finger and it would change about cloud native as well?
Thomas Graf: Yeah, probably not in cloud native, but I think just thinking a little bit more globally, I would wish I'd have the power to resolve some of the global [00:22:00] political challenges that I think we as a society have to go through. I think that would help a lot of people, and it will also help tech. So maybe that would be the superpower, to actually just find solutions for some of these really complex issues.
Ashish Rajan: Awesome. Last question. What's the best part of being at KubeCon?
Thomas Graf: Oh friends. I just it's been amazing. I think I've missed, I don't think I've missed any KubeCons and it's been just amazing seeing people every single time, getting the feedback from people, talking about something and talking to end users, talking to other contributors.
Sometimes I'm not actually seeing anybody face to face that has been contributing code for ages and for years. So that's my favorite aspect, just hanging out with friends and getting to meet them. Yeah.
Ashish Rajan: Awesome. Thank you for sharing that as well. That's most of the questions we had. Where can people connect with you and find you to know more about Cilium and Isovalent?
Thomas Graf: Absolutely. So Isovalent, you can find on isovalent.com, Cilium on cilium.io, where you will also see a link to the Slack channel. So we have a very popular Slack channel, I think almost 18,000 people now, [00:23:00] and that's always probably the best way to reach me. Yeah. Okay. Cool. I'm very notification driven these days.
So I think popping on Slack and sending me a DM is great. And obviously you can also find me on like the social media channels as well.
Ashish Rajan: Awesome. I'll put that in the show notes as well. But thank you so much for coming in. Awesome. Thanks. So glad we got to have you. Congratulations on Cilium's graduation again. I'll put the link to the documentary as well, so you can actually watch the eBPF documentary.
But thanks so much for coming in, and we'll see you soon. Thanks everyone.