Building Resilient Infrastructure at the Edge with Craig Nunes of Nebulon

Edge infrastructure is susceptible to many of the same security risks as datacenter and cloud, but is often run in less protected environments. This episode of Utilizing Edge features Craig Nunes, Co-Founder and COO of Nebulon, talking to Brian Chambers and Stephen Foskett about the provision of reliable infrastructure services at the edge. Nebulon’s product presents storage to servers in a managed way, monitoring and protecting storage in real time. Edge servers must have a known-good system image to ensure that they are secure, yet this is difficult to achieve in remote devices.

Hosts and Guest:

Stephen Foskett, Organizer of the Tech Field Day Event Series, part of The Futurum Group. Find Stephen’s writing at GestaltIT.com, on Twitter at @SFoskett, or on Mastodon at @[email protected].

Brian Chambers, Technologist and Chief Architect at Chick-fil-A. Connect with Brian on LinkedIn and Twitter. Read his blog on Substack.

Craig Nunes, Co-founder and COO at Nebulon. You can connect with Craig on LinkedIn. Find out more about Nebulon on their website.

Follow the podcast on Twitter at @UtilizingTech, on Mastodon at @[email protected], or watch the video version on the Gestalt IT YouTube channel

Transcript

Stephen Foskett: Welcome to Utilizing Tech, the podcast about emerging technology from Gestalt IT. This season of Utilizing Tech focuses on edge computing, which demands a new approach to compute, storage, networking, and more. I’m your host, Stephen Foskett, organizer of Tech Field Day and publisher of Gestalt IT. Joining me today as my co-host is Mr. Brian Chambers. Thanks for joining me today, Brian.

Brian Chambers: Hey Stephen, good to see you. I’m Brian Chambers. I’m the Chief Architect at Chick-fil-A, where we do a lot of things with the edge, and you can also find me at brianchambers.substack.com, where I write the Chamber of Tech Secrets.

Stephen: So as we’ve talked about in previous episodes, the edge is an interesting area because it’s not like we’re doing anything completely novel there, but everything that we’ve done in the datacenter, in the cloud, needs to be rethought when we’ve got remote, you know, maybe not secured, many, many, many locations. And one of the things that we need to rethink, I think, is basically the security aspect, and specifically making sure that, you know, the servers that are located out there are actually booting up correctly, booting up the right software, running the right things, and dealing with issues that are going to arise, which may include ransomware in the future.

Brian: Yeah, security is one of those things that I don’t think we’ve actually spent a lot of time talking about so far on our episodes this season, but it’s definitely, you know, an absolutely critical feature of any edge stack that anybody’s going to run to get business value and, as you mentioned, the ransomware thing is a hot topic lately. One thing I think is interesting about the edge is with cloud and data center, we generally got to a point where we could take physical security for granted. You could kind of assume you’re in a safe facility with good security controls and if there was some sort of intrusion or something, you know, it would get handled pretty quickly. Edge, maybe not so much. Maybe a little bit of a different environment out there, and the physical side, you know, in addition to the many different copies, many different points of potential ingress, creates a whole very large, very interesting, you know, and possibly attractive attack surface for the bad guys, so it’s an interesting topic to dig into today.

Stephen: Yeah, absolutely. I would think that, you know, I mean we’ve heard a little bit I think about ransomware hitting industrial IoT and so on. I think it’s only a matter of time before it’s going to come down hard at retail, manufacturing, you know, it’s just such, like you say, such an attack surface and there’s just so much possibility to compromise these systems. And of course, you know, when we think about security, we have to think about confidentiality, integrity, and availability. Availability is one I think that is really going to be a challenge as well because, you know, in many cases we’re dealing with systems that may not be homogeneous, they may be running, well, whatever they’re running out there and, you know, you really need a better understanding of the server. And so that’s why today on the podcast we are joined by an old friend of mine, Mr. Craig Nunes, co-founder and COO at Nebulon. Thanks for joining us.

Craig Nunes: My pleasure to be here Stephen, super fun. Yeah, look forward to it.

Stephen: So Craig, tell us a little bit I guess, what is Nebulon, and then we’re going to get into sort of how this fits into the edge and what this means for security.

Craig: Yeah, Nebulon is a company that, basically we turn your favorite industry standard server into efficient cyber resilient application infrastructure and we do that by mirroring an approach that we’ve observed by the hyperscalers effectively taking you know something like, think of AWS and Nitro, that approach to offload a lot of infrastructure services off of your server CPUs and run it on this controller in your server and in doing that you gain great efficiency. It gives you some wonderful capabilities around security. You get quite a bit of flexibility in terms of the, you know, operating environment you’re running and it also lends itself well to cloud-based management in a lot of ways both security operational efficiency and the like. So you know we’ve kind of taken this different approach to, you know, and until we came to the party, really only something you do in the cloud. We believe there’s great value for that architecture for enterprise and MSP data centers.

Stephen: So you know you’ve got this product that basically presents storage to servers, and how does that connect with this whole idea of resilience and security? It’s just a SAN, right?

Craig: Yeah, no. SAN, uh, I don’t know if I’ve said that word in many years. For the edge especially, you know, most have evolved to a hyper-converged concept out there, a server-based approach, and, you know, what we do is we are for sure presenting data services but we’re presenting cyber services, we’re providing remote control of the server itself. We’re really providing a whole suite of infrastructure services and all those infrastructure services are running on this card in your server that you would use instead of your favorite storage controller. So this goes into the server, connects to your SSDs and is controlled remotely by the cloud. What this lends itself to do in the context of security is, you know, we are deduplicating, compressing, encrypting all data, you know, as it goes through. We’re in a perfect position to ascertain whether or not application data is somehow being encrypted in real time as it’s happening, and we’ve got capability both with the card and in the cloud, you know, to kind of keep tabs volume by volume on what’s normal and what is, you know, a big change as it relates to encryption, and we’re really on the lookout for cryptographic ransomware. And if we spot a pattern of that, after a couple of minutes we’ll alert someone to that fact, and with technology that also is associated with this secure zone in the server anchored by this card, we will recover you back to before ransomware hit in just four minutes. So it’s a very cool story you can only really get to if you take a certain approach, and so we’ve taken it for that reason.
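
To make the detection idea a little more concrete: one common signal that data is being encrypted as it is written is a sudden rise in byte-level entropy. The Python sketch below is only an illustration of that general technique under stated assumptions. It is not Nebulon's implementation, and the `VolumeMonitor` class, the window size, and the thresholds are all hypothetical.

```python
import math
from collections import Counter, deque


def shannon_entropy(block: bytes) -> float:
    """Return the Shannon entropy of a block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


class VolumeMonitor:
    """Tracks recent write entropy for one volume (hypothetical helper)."""

    def __init__(self, window: int = 100, threshold: float = 7.5, ratio: float = 0.8):
        # Assumed tuning values; a real system would learn a per-volume baseline.
        self.recent = deque(maxlen=window)
        self.threshold = threshold
        self.ratio = ratio

    def observe(self, block: bytes) -> bool:
        """Record one written block; return True if the window looks suspicious."""
        self.recent.append(shannon_entropy(block) >= self.threshold)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough history yet
        return sum(self.recent) / len(self.recent) >= self.ratio
```

Entropy alone is a weak signal, because compressed or already-encrypted application data also looks random. That is presumably why the conversation stresses tracking what is normal volume by volume rather than applying one fixed cutoff.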

Brian: That’s really interesting. So maybe for those who are newer to the edge or maybe not quite as steeped in the security realms, maybe a weird question, but like can you scare us a little bit? Like what could happen if you don’t have this sort of solution or if you’ve got stories about what you have seen happen but put a little fear into the people who are new to thinking about this or who maybe think it’s an edge environment, maybe we don’t allow Ingress into our environment. Like what could go wrong and how could it go wrong?

Craig: Yeah first of all if you have a network, you know if ransomware cracks in somehow it can go anywhere right anywhere that is attached to a network and you know if you believe the you know the statistics out there that are you know sometimes they’re a little more headline oriented. But you know if you believe your favorite security analyst three out of every four businesses over the next two years will experience a ransomware attack and one out of ten of those will be effective, will get inside, and if it gets inside the average time to recover is on the order of a few weeks, okay. We talked to folks every day about this topic and, you know, I’m blown away by the number of people who’ve told me we never got that data back. We gave up trying after three months. And so you know the issue is you know really not if you will experience an attack but when and when it comes to all those stats you know, what do they say about statistics? There’s lies, there’s damn lies, and then there’s statistics. They are all meant to be you know a guide to your actions. You can change what those mean to you, right? Technology is all about changing your fate and so the point is given we’re in a world where ransomware threatens your business operation in every day more creative ways, we need to think in bigger ways about how we protect the server and storage infrastructure that’s serving our applications and there’s a lot done at the network level, for sure some really powerful stuff. But when you think about what’s inside my server right now that will protect me, there’s not a lot out there today, but you know we think with you know some of the technologies that you know to be honest are running in the public cloud right now, there’s a lot more that can be done around detection and recovery.

Stephen: Yeah I think that’s the real key here is that it’s hard to know anytime exactly what’s really running on a server and it’s especially hard to know when those servers are not you know local or not something that you can see, not something that you have seen you know? What’s to say, who’s to say that the operating system image wasn’t compromised? Who’s to say that there isn’t something running, you know, kind of, I don’t know, below the virtualization layer or you know another virtual machine that’s stealing credit card numbers or something like that. I don’t want to be all scare mongering but you really kind of don’t know what’s running on these servers. You only know what you can touch remotely because they’re remote and yet you know they could be running almost anything. I mean honestly it could be something innocuous. Somebody could be browsing the web on the server that’s sitting in the restaurant or it could be something a lot more serious, you know, maybe they browse to the wrong website and now they’ve got a key logger running on that infrastructure and they don’t know about it or ransomware as you said or just maybe it’s just corrupted. Maybe it’s just, you know, somebody kicked it real hard, poured something nasty on it and it’s just a little flaky. That wouldn’t be all that weird either in retail environments. I know a lot of people basically just it’s not working, I’m going to unplug it, plug it back in again. That’s not a great way to end up with a known good server so I think that to me is really what you’re talking about here is that you want to have these things running a known good image. How do you know it’s good except by controlling the hardware that’s showing the image to the server, that’s letting the server you know and so to the layman I think what the product is kind of like a super duper cloud connected hard drive that you just know is right?

Craig: One hundred percent. Yeah, in fact, so a lot of us are familiar with, you know, machine images that we will take and spin up in our cloud-based instances. What you can do with that technology on-premises is create a machine image that is, you know, your known good certified image, your operating system or hypervisor, cluster software, application, the patch set that you’ve tested, and lock that down and put that into your secure enclave so that, you know, whenever you reboot that server, instead of rebooting what was running, which may have an errant patch, you know, applied or might have dormant ransomware that somehow made its way in, the next time you reboot, you’ll reboot from that known good image. And having that across your server fleet, should something bad happen, and maybe it’s not ransomware, maybe it’s just, you know, a bad patch that’s taken out a ton of your infrastructure, it’s as simple as a reboot to recover that infrastructure to, you know, what has been tested and validated, you know, by your engineering team centrally. It’s all about, you know, good hygiene and keeping that, you know, the thing that’s running your server infrastructure at every edge location.
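
The "known good image" idea can be illustrated with a simple integrity check: compare the digest of the image about to be booted against a digest recorded when the image was certified. The sketch below is a generic illustration, not how any particular product does it. The function names are hypothetical, and it assumes the expected digest is stored somewhere the running operating system cannot alter.

```python
import hashlib
import hmac


def image_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def is_known_good(path: str, expected_hex: str) -> bool:
    """Check an image against the digest recorded at certification time."""
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(image_digest(path), expected_hex.lower())
```

A check like this only tells you whether the image has changed. Rolling back to the certified image on reboot, as described above, still needs a protected copy held outside the application domain.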

Stephen: Yeah that’s got to be kind of an interesting, you know, kind of a compelling idea that, you know, it’s one more way to know that things are right and we’ve heard some companies talk about hardware root of trust and trusted platform modules and things like that. How is this any different from that? Why is that not good enough?

Craig: I think the thing that I’ve learned about security technology is, you know, it’s an “and”, you know, it’s that and the other, it’s never this or that. And you know, so for example, you guys were talking about physical security at the edge, you know, having everything always encrypted so, you know, when a drive fails and is taken away by a subcontractor for disposal but somehow that winds up on eBay with all your data on it, it is entirely encrypted. No one is ever going to get into it. The controller that anchors this secure enclave, it’s really critical that controller has a, you know, hardware root of trust, because if somebody were to walk into a store and take that and try to then use that to crack into your organization, that root of trust is going to prevent them from being able to access and provide the secrets necessary to crack in. And so it’s, you know, encryption and hardware root of trust and, you know, this immutable boot image and, you know, detection and recovery in case, you know, all of this stuff, you know, goes sideways and somebody does get inside.

Stephen: Brian, I want to ask you a question. I mean, you know, we’re talking about this, you run a massive edge environment. I assume that ransomware is on your radar. I assume that integrity of the system image is on your radar. You know, is it, and what’s your thought on it?

Brian: Yeah I mean it’s definitely both when we think about you know cloud environments you know as well as the edge, it’s something that you know I think we definitely have top of mind. I think hopefully everybody does these days. Yeah it’s an interesting set of challenges like I think you know we talked about security being an and or another way to say that is like layers right, like you’ve got your, you know, your software supply chain you know side of things, you’ve got your images that we just talked about, all the factors that we just went through here definitely all come into play, you know, together in a solution. Another one like we think about layers too like, I guess it’s not a security layer but like not allowing anything inbound to our stores would be one thing that we get a lot of benefit of. Now you’ve got physical attack vectors and things like that you’d have cloud resources that are compromised, that create vectors so no matter what you do you’re going to run infrastructure, you’re going to have risk of security breaches and ransomware incidents and things like that. So absolutely a top of mind thing and I think we think about it by putting those layers in place and doing as many as we can reasonably do, you know, and be able to manage. Got to lean on partners in a lot of cases to help us with some of those kinds of things, but yeah try and try and build a stack of layers that ultimately gives us a good, you know, hopefully a secure surface or at least, you know, makes it pretty challenging and time consuming to get any value from it should you find a way to compromise it.

Stephen: You know what you’re describing too, Craig, I mean you’re talking hardware here, right? Isn’t this going to add to the cost of the system? Because as Brian’s saying, it’s all about making informed compromises because you have to make sure that the environment makes sense. You have to make sure that the bill of materials makes sense and that you’re provisioning things. Sounds like a big complicated piece of hardware or something. I mean, isn’t it yet another thing that you need to buy everywhere?

Craig: So you know the real magic is all about software, right, and the whole point of this, you know, when it’s spun up in the cloud, was about efficiency, and, you know, the notion that you can move processing of something from, you know, expensive x86 processors to low-cost Arm processors is, you know, always considered to be a good thing. The knock-on benefits of, you know, some of the security attributes are a big deal. The way to think about the approach that we’ve taken is, look, every server you’ve got is going to have some kind of storage controller in there, and so this is just a replacement for that. It’s not an additional card, it’s a replacement for it, and, you know, the software that you’re effectively running, you’re going to use that instead of, you know, your hyper-converged path. So you’re going to, you know, avoid the expense of hyper-converged infrastructure, you’re going to get all of your server cores back to do what you need to do and right-size your edge accordingly, and you’re going to get these capabilities that are just built right into the application infrastructure. You’re not going to have to write another check to another company for a layer of security. It’s something that, you know, all server infrastructure should just come wired for, and that’s kind of the idea. You know, in larger companies, there’s two people at the party, Security Ops and IT, and what we’re trying to do is sort of equip the IT team, who, you know, sadly in most cases are stuck with the recovery, right, and equip them with, you know, some of the tools that, you know, typically have been sourced on the security ops side, so that they’re bringing to the party equipment that is, you know, better protected, so they don’t wind up having to own a nasty recovery across, you know, a whole bunch of sites. Which is the other thing, you know, we can talk about at some point: how do you enable recovery, you know, push-button simply, across a bunch of sites? But clearly, you know, IT owns that, and if you can enable them to avoid that, that’s a good day’s work.

Brian: I’d love to come back to the recovery thing next because I think that’s a great topic. Real quick, if somebody is thinking about using a solution like yours or doing something like this in their edge fleet, when is the right time to be thinking about that? So you know we already have an existing fleet that’s, you know, across all of our stores. A lot of other organizations do as well. Is this something that you see as a potential retrofit? Is this something people need to think about when they get to their next refresh? When’s the right time to be thinking about this in your design?

Craig: Yeah, for sure. I mean we talk to companies who roll through their infrastructure in different ways. For sure when you’re thinking about, you know, modernizing, optimizing your edge sites, that’s clearly a time and place for this. But let’s say you’re between that. When ransomware hits an organization’s IT, ransomware changes sort of data protection as we know it, because data protection has really been about the stuff above the operating system, your user data. It’s never really been about, you know, allowing you to roll back your operating system. Why would you back up your operating system, right? So, you know, the typical approach with backup software isn’t as useful as we would like it, and your recovery cluster, which is, you know, standing by in case something bad happened to your data, when ransomware happens, not only is data encrypted but your operating systems are all encrypted, including the operating systems in your recovery cluster. And so, you know, for those with a large edge environment, they’ve probably got, you know, staged recovery clusters in different locations, probably under the control of IT. That is a great place to, you know, to start, because getting your recovery cluster back up fast when you’ve lost an edge to ransomware is critical, and if, you know, across the network that cluster has been encrypted, you will spend a fair bit of time rebuilding those servers to get back DHCP, DNS, if you’re, you know, a Linux environment, PXE boot, whatever, and your backups, days before you can then start recovering what, you know, is producing revenue in your production infrastructure. Start there, because you can do that at any point along the way, and then of course as you are thinking about, you know, more broadly how you modernize your edge to handle these modern threats of ransomware, that absolutely is a good time to look at, you know, the alternatives out there.
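
The ordering Craig describes (bring the recovery cluster back first, then core services such as DHCP, DNS, and PXE boot, then backups, then production) is essentially a dependency graph. The sketch below shows one way a team might write such a runbook down and derive a valid order from it. The step names and dependencies are hypothetical examples chosen for illustration, not a recommended plan.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each step lists the steps that must finish first.
RECOVERY_DEPENDENCIES = {
    "recovery-cluster-os": set(),
    "dhcp": {"recovery-cluster-os"},
    "dns": {"recovery-cluster-os"},
    "pxe-boot": {"dhcp", "dns"},
    "backup-service": {"recovery-cluster-os", "dns"},
    "edge-site-restore": {"pxe-boot", "backup-service"},
}


def recovery_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return one valid order in which the recovery steps can be executed."""
    # static_order() raises graphlib.CycleError if the plan contains a cycle.
    return list(TopologicalSorter(dependencies).static_order())
```

Writing the plan as data makes it easy to rehearse and to spot circular dependencies before an incident rather than during one.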

Stephen: Yeah and I guess what are the alternatives? What other ways would people have if they don’t want to do a you know, like a hardware software, you know, solution? What are they going to do to protect themselves from, you know, and to ensure that they have a known good infrastructure?

Craig: So there’s, you know, I think a lot of criteria that, you know, you’d want to think about, and most of it is widely available, but you know you gotta check it. So let’s take the example of HCI, a common use case at the edge, and you know just about all hyper-converged infrastructure provides for encryption at rest, right, you should do that. One of the problems is sometimes people will turn it off because it might have a bit of an overhead on those systems. You don’t know necessarily how it’s been set up and provisioned. So you want to think about, you know, how can I maintain encryption that’s always on? Do I have hardware root of trust? Do I have any way to handle that, you know, complete encryption of my infrastructure, and how can I separate my applications from the infrastructure services that are actually going to provide recovery? And investigate those. The challenge with hyper-converged, and I’ll tell you right now, hyper-converged is great for a lot of things, but it isn’t, you know, the be-all and end-all for every, you know, challenge that affects IT. And one of the challenges for hyper-converged as it relates to this world of ransomware we live in today is the application domain, you know, your operating system, software application binary, your tools. That lives in the same domain as your data services, your network services, your lights-out management, and so when that stuff gets encrypted, so does everything else. And so if you can find a way to create two separate domains, application from infrastructure, that’s going to ultimately protect you, because application is the target. That’s what the hackers are after, and they’re going after, you know, whether it’s the application, the operating system, server firmware, or whatever.
If they get in, you have a, you know, protected environment, your infrastructure domain, your secure enclave, where you can now reach out and roll that back, you know, to pre-ransomware attack. And you know, that kind of technology is possible today with, you know, the advent of modern DPU technology, which I won’t get too into, but that gives you a way to inject, you know, a secure enclave like you find in many consumer devices today, Intel SGX, Nitro enclaves, you know. They’re out there. You know, think about how you incorporate that in your next infrastructure design. It’ll pay off.

Stephen: And that’s really what it’s all about, right, is you’re trying to take the idea of how cloud servers, real, you know, hyperscalers work. You know, you mentioned Nitro and Amazon AWS. You’re taking that technology and basically moving it out of the cloud and into the hands of the world in a way.

Craig: Yeah, for the software that you know and love today, for the data that has to stay on-prem, you know, here is an approach proven by, you know, the folks who run the most efficient and secure data centers on the planet, now available, you know, for your data center, for your edge sites, and that’s, you know, that’s the approach we’re taking.

Stephen: So this has been a really, you know, it’s been an interesting and thought-provoking conversation because we really haven’t seen any way to basically ensure that the operating systems, the software that’s running on these edge platforms, is known good and secure until now. I guess, Craig, what’s your message to somebody listening to this who is an architect or, you know, working at the edge, who maybe hadn’t considered this technology or this problem? What’s your message?

Craig: Yeah, so, I mean, we know that the edge, it presents probably the most brutal IT frontier on the planet, and traditionally, you know, the architects of modern edge deployments are thinking about cost, they’re thinking about how to avoid the operational headaches that can occur, you know. The seed I would plant with every one of those folks is, you know, put security at the apex, because, you know, if not this year, you know, sometime in your future, your organization is going to feel an attack, and the more you have, you know, built these layers of security from the server and storage on up, the safer your organization is going to be and the more likely you’re going to be able to navigate through that. And even if you have all those things in place, think about the recovery. You know, think about the steps, practice the steps, rehearse it with the team, so, you know, when that time comes you can recover your site infrastructure in just a few minutes. It’s possible today. Make sure you’ve got that technology at your disposal going forward.

Brian: That’s really good. I’m going to cheat and I have two. I thought it was really interesting to think about the security side as it relates to, you know, knowing what’s actually running on any given machine. It sounds so simple, like, what’s running on there? But it’s a problem that I think many people don’t have a great solution for, especially when you think about a very distributed footprint, you know, that’s often remote sites, you know, poor connections, etc. So I thought that was an interesting insight. And then overall I think what I find interesting, as we’ve talked to many different organizations over the course of the season here, is just this continuing convergence of capabilities that have maybe been born in the cloud making their way into some sort of solution that can exist at the edge, and we’ve seen it, you know, in all kinds of places, containerization and/or WebAssembly, or whatever, you know, we’ve seen it in just a host of different places. So it’s interesting to see something that maybe is more on the sophisticated side of security also starting to be a possibility in edge environments as well, which I just think speaks to the fact that this is probably a computing paradigm that’s here to stay for a while and it’s getting more robust and more mature, you know, really by the month. So it’s been interesting to learn a bit more about that and to see how that’s happening in the industry. How about you, Stephen?

Stephen: Yeah, I’ll take it from the other direction, Brian. Think about portable devices, think about mobile devices, you know, all of these things, as Craig mentioned, they all have a secure enclave, they all have known good operating system images, they all have the ability to basically alternate, you know, recover the image in the event of a bad upgrade, you know, protect the image from security and intrusion. It’s the same nowadays more and more with laptops, for example, you know, both Windows and Mac have, you know, trusted platforms and secure modules in them that try to authenticate the operating system. This technology is coming from the other direction as well. It seems almost like the only place it doesn’t exist is, well I guess, in the enterprise and at the edge, and so it’s really not a surprise that these are the same kind of challenges that you’d face with mobile devices. I mean, think about it. You’re a manufacturer of mobile devices, you know you’ve got to have that thing. It’s got to be bootable to a known good image or else you’re going to have a bunch of duds, a bunch of door stops out there all over the place. It’s kind of the same thing with edge because, you know, you can’t get out there. The system just has to work and it makes sense to have this stuff. So yeah, from the cloud coming in and from the mobile device coming in, it makes sense to bring it to the edge as well. Well, thank you so much, Craig, for joining us today on Utilizing Tech. As we wrap up, where can people connect with you and where can they continue this conversation and learn more?

Craig: Sure, yeah for sure. You know, head on over to www.nebulon.com, we’ve got a lot of ways to interact. You don’t have to have a conversation with us right away. You can you know test drive the UI and you know, get a little tour of things, plenty of resources there and when you’re ready just let us know. We’re happy to hook you up with one of our technologists and tell you more.

Stephen: Great. Brian, what’s new with you?

Brian: Oh man, all kinds of stuff. I had a wedding a couple weekends ago so that’s a big deal. It’s new so excited about that, and then a lot of the same so. People can still find me in all the same places. I still have the Chamber of Tech Secrets blogs going on a weekly basis at BrianChambers.substack.com and out there on the socials, LinkedIn, Twitter, etc. so those are the places to find me.

Stephen: Great, yeah, I do recommend checking out the Chamber of Tech Secrets. Great blog, very thoughtful I have to say. Every week there’s a new thought. I love it. And as for me, you’ll find me on Gestalt IT where I’m writing. You’ll also find me every Wednesday on our tech news show, the Gestalt IT Rundown. You’ll find that in your podcast app or on YouTube if you look for it. And of course every week here on Utilizing Tech. So thanks for listening to Utilizing Edge, part of the Utilizing Tech podcast series. If you enjoyed the discussion, please do subscribe. We would love to hear from you as well: give us a rating, give us a review. This podcast is brought to you by GestaltIT.com, your home for IT coverage from across the enterprise. For show notes and more episodes, head over to our dedicated website, UtilizingTech.com, or find us on Twitter and Mastodon at Utilizing Tech. Thanks for listening and we will see you next week.