JB Baker Explains Computational Storage Innovation at ScaleFlux

Revolutionizing Data Storage and Efficiency with ScaleFlux’s JB Baker

Episode Overview

Episode Topic

Dive into the transformative world of data storage and fintech innovation with JB Baker, VP of Marketing and Product Management at ScaleFlux. In this episode, we explore how ScaleFlux is revolutionizing the data pipeline, optimizing everything from data creation to actionable business intelligence. With the explosion of data driven by AI-powered financial applications and real-time analytics, ScaleFlux is delivering cutting-edge solutions that are smarter, faster, and more sustainable. Learn how computational storage and advanced SSD technology are not only transforming IT infrastructure for financial institutions but also addressing critical challenges like power efficiency, scalability, and sustainability.

We also explore the growing significance of NVMe SSDs and how they improve enterprise performance while reducing electronic waste. JB shares how ScaleFlux’s unique approach combines inline compression and data reduction to enhance real-time data processing. This episode highlights the critical role of sustainable innovation, offering solutions for businesses managing the ever-increasing demands of AI and big data.

Lessons You’ll Learn

Gain invaluable insights into navigating today’s data-driven business landscape. JB explains how computational storage supports AI workloads, enabling financial services to process data more efficiently while addressing power constraints and latency issues. Learn how strategies like write reduction technology improve data compression, scalability, and IT infrastructure efficiency.

Discover why sustainability matters in tech, as JB discusses solutions to reduce electronic waste through more durable storage components. Understand how NVMe SSDs are redefining industry standards, helping organizations achieve more with fewer resources. From fraud detection to high-frequency trading, this episode reveals how smarter storage enables businesses to deliver faster, better results in a competitive environment.

About Our Guest

JB Baker is an industry leader in data storage, with over 25 years of experience driving innovation at companies like Intel, Seagate, and ScaleFlux. As the VP of Marketing and Product Management, JB's expertise lies in bridging technical solutions with business goals to create impactful strategies. He holds a degree in psychology from Harvard University and an MBA specializing in marketing and operations.

At ScaleFlux, JB leads efforts to reshape the data storage landscape, focusing on computational storage, power efficiency, and sustainability. His deep understanding of NVMe SSD technology and inline compression has positioned ScaleFlux as a pioneer in addressing the unique challenges of data-intensive applications. JB’s leadership and vision continue to drive ScaleFlux’s mission to deliver scalable, high-performance storage solutions tailored to the evolving demands of AI and big data.

Topics Covered

This episode unpacks the critical role of computational storage in supporting data-heavy applications like AI and financial services. JB explains how ScaleFlux’s solutions optimize the data pipeline to deliver faster processing, better scalability, and improved efficiency. Learn why NVMe SSDs are essential for modern IT infrastructure and how they outperform traditional storage technologies like SATA and SAS.

We also dive into sustainability, exploring ScaleFlux’s innovative approach to reducing electronic waste by extending the lifespan of storage components. JB highlights the importance of real-time data analysis in applications like fraud detection, high-frequency trading, and AI workloads. Discover how ScaleFlux empowers organizations to handle exponential data growth while meeting the demands of today’s dynamic business environment.

Our Guest: JB Baker

JB Baker is a distinguished technology business leader with over 25 years of experience driving top- and bottom-line growth through innovative products in enterprise and data center storage. He began his career at Intel in 2000, managing the i960 and XScale I/O processors, which laid the foundation for his expertise in data storage technologies. In 2008, JB transitioned to LSI, where he led the definition and launch of the LSI Nytro PCIe flash products, playing a pivotal role in scaling the new product line with Tier 1 OEMs, hyperscalers, and financial industry clients.

Following Seagate’s acquisition of LSI’s flash assets in 2014, JB’s role expanded to encompass the entire SSD product portfolio at Seagate. His strategic leadership resulted in record levels of SSD revenue for the company, underscoring his ability to navigate and influence the rapidly evolving storage industry. In 2018, JB joined ScaleFlux as Vice President of Marketing and Product Management, where he currently spearheads product planning and marketing initiatives. At ScaleFlux, he focuses on expanding the capabilities and market adoption of computational storage, aiming to redefine IT infrastructure by enhancing performance, efficiency, and scalability.

JB’s academic credentials include a Bachelor of Arts from Harvard University and an MBA from Cornell University’s Johnson School. Beyond his professional endeavors, he is an endurance sports enthusiast, participating in Spartan Races and triathlons. JB often draws parallels between the discipline required in endurance sports and professional life, emphasizing the importance of preparation, adaptability, and perseverance. His multifaceted experiences and insights make him a thought leader in both technology and leadership spheres. 

Episode Transcript

JB Baker: From a ScaleFlux perspective, we contribute on both of those core metrics that I just mentioned. We look at how we can be more efficient in using power to deliver the data to the processor, and also make sure the processors are not starved for data, so they're not sitting there burning cycles and burning energy waiting for data. Because GPUs today are power hungry. The new Blackwell is 1,600 watts or more, just in one processor. And just because it's quote-unquote idle, it's still burning power. Right? So you want to make sure those cycles are being used to produce results and do work instead of just burn power.

Kevin Rosenquist: Hey there. Welcome to Pay Pod, where we bring you conversations with the trailblazers shaping the future of payments and fintech. My name is Kevin Rosenquist, and thanks for listening. JB Baker is the VP of Marketing and Product Management at ScaleFlux, a company redefining data storage solutions for the modern age. With the explosion of data in financial services and the power demands of AI-driven applications, ScaleFlux is pioneering new ways to make storage faster, smarter, and more energy efficient. In this episode, we're unpacking how things like computational storage and cutting-edge SSD technology can transform IT infrastructure for banks, financial institutions, and beyond. We also talk a decent amount about sustainability and where that's headed. So joining me now, JB Baker. I'm always fascinated by how people get to where they are. You got a bachelor's in psychology from Harvard, a fairly good school from what I've heard. Did you always intend to go into business, or did you have another plan initially?

JB Baker: No, I was always a more technical-minded, science-minded person. I got into a catch-22 in college where I ended up not being able to major in physics like I'd planned. And then social science was something new and interesting, but it still always had a technology bent. In my mindset I love the data, but being able to connect how people operate with some of that technology was how I ended up moving into the tech world.

Kevin Rosenquist: Do you find that you use that education, the psychology, a lot in your business dealings?

JB Baker: Yes. I mean, it's been, you know, too many years since I actually studied that stuff. But the concepts definitely come into play as you're trying to understand what the customer's issues are and where the technology is headed, and just helping in the general management of the team and people, driving collaboration, understanding people's motivations, and leveraging those soft skills to help us as we work on the hard definitions of what needs to be in the product and how it's going to be used.

Kevin Rosenquist: And then you end up getting an MBA. So you still didn't really get into that technical area. Did you focus more on business as a general rule, with just sort of a natural aptitude for tech?

JB Baker: Yeah. I mean, in B-school I was more on the marketing and operations side. The finance side was something that, with all the math, came pretty easily. But I just enjoyed doing the marketing and operations aspect.

Kevin Rosenquist: Cool. Well, let's move on to ScaleFlux. Tell me what the company mission is and how its technology stands out in the data storage landscape.

JB Baker: Sure. I mean, at ScaleFlux, we are innovating in the data pipeline. What that means is we refer to the data pipeline as everything from the creation of the data, to its temporary storage, to its movement to the processors to turn it into information that's usable, actionable business intelligence, and then pushing it back out to be stored for use later. Having that data pipeline extremely streamlined and efficient is crucial to getting the most you can out of your power budget, out of your GPUs, out of your processors, your entire IT infrastructure. And so that's our focus. Our team comes from a history of storage technology and memory technology, so we're working on ways to improve how storage and memory interact with the processors and move the data.

Kevin Rosenquist: How would you define sustainability in tech, and how does ScaleFlux help address environmental concerns? Because you kind of touched on it a little bit there.

JB Baker: Yeah, that's a tough one because there are so many aspects of sustainability. From a ScaleFlux perspective, we contribute on both of those core metrics that I just mentioned. We look at how we can be more efficient in using power to deliver the data to the processor, and also make sure the processors are not starved for data, so they're not sitting there burning cycles and burning energy waiting for data. Because GPUs today are power hungry. The new Blackwell is 1,600 watts or more, just in one processor. And just because it's quote-unquote idle, it's still burning power, right? So you want to make sure those cycles are being used to produce results and do work instead of just burn power. So that's the one side. The other side, on the materials, is that by designing a more efficient and effective storage component or memory component, we allow people to get more endurance out of those components, so maybe it lasts for six or seven years instead of three or four or five, to reduce how much electronic waste is generated. That's probably the biggest thing. The next thing is that through that efficiency in those components, you can use fewer components to deliver the same infrastructure. We've got examples of customers that, by using our drives instead of ordinary NVMe SSDs from some of the leading vendors out there, are able to do the same amount of work with 40 or 50% fewer drives, and fewer servers and fewer processors and fewer networking components. So from that regard, you've doubled the value of every electronic component you're putting into your infrastructure.

Kevin Rosenquist: I was going to ask about that. For anyone listening, he said NVMe SSD, that's non-volatile memory express solid state drive. And so that improves both data efficiency and power usage, correct?

JB Baker: Yeah. It's, you know...

Kevin Rosenquist: You have your inline compression, right? I think I forgot to mention that too. Yeah.

JB Baker: Yeah. And so, you know, it's probably been about a decade that SSDs, solid state drives, have really become ubiquitous in the data center and enterprise environments. Before that...

Kevin Rosenquist: In my office, I don't want anything that's not a solid state drive anymore.

JB Baker: Yeah, I mean, before that it was all hard drives, spinning media. And those are great for total capacity and cost per bit stored, but they are not great for power efficiency, in terms of how much data you get served per watt, storage per watt. And then also just the performance, the speed.

 Kevin Rosenquist: Yeah.

JB Baker: You know, you mentioned that in your office you don't want anything that's not an SSD. I remember booting up my laptop back at Intel when it had a hard drive in it. Press boot, go down, get a coffee, get breakfast, come back, and maybe it's finished booting. And now, with your SSDs, it's done in a couple of seconds, right?

Kevin Rosenquist: Yeah, totally. And with, I mean, doing video editing or anything intense like that, it's just such a better system. It just works so much more efficiently.

JB Baker: Yeah. And then NVMe was the newer protocol that has really become the protocol for enterprise and data center drives. Originally it was SATA, Serial ATA, and there was also SAS, Serial Attached SCSI. But those protocols were defined with hard drives in mind originally. NVMe was a streamlined protocol designed specifically for flash and operation over PCIe. So that is the standard at this point.

Kevin Rosenquist: And can you talk a little bit about how computational storage helps meet the specific data demands of AI-driven financial services, since that's a big talking point in the industry for a lot of our listeners?

JB Baker: Sure. So, in terms of computational storage, the term itself got overloaded in the industry, with all of these grand expectations around being able to push programs down and do things right at the drives. We glommed on to that term, but we made it simple. And the simple part is we're doing that inline compression and decompression of the data in the drive. And this connects to how it's more efficient for financial services. A lot of the applications run there have data that can be compressed. Like, we look at packet capture for streaming market data. That was one that we got into initially with them. And it's the same data set across all the financial institutions, and it's always about 2:1, 2.1:1, or 2.2:1 compressible. So when we use a hardware engine directly in the controller of the SSD to compress the data, it actually accelerates the writes and accelerates the performance of the drives on both reads and writes. We kind of talk about it as, hey, if you write less to the drive, you can do more.

JB Baker: And so to make it easier to comprehend, we've started talking about it as write reduction technology. Because, again, with compression people typically think, when I compress my data, it slows things down.

Kevin Rosenquist: Or you're going to lose something, or it's going to make it lower quality or whatever.

JB Baker: Yeah, and we definitely can't have that. This has to be lossless compression. This is not an audio codec where you can afford to lose some fidelity. With enterprise and data center data, you have to return what was originally given to you.
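
Editor's note: here's a minimal sketch of the two properties JB describes, using Python's zlib on made-up, record-style data. It's an editorial illustration of inline compression's effect, not ScaleFlux's implementation; the sample records and the roughly 2:1 ratio are assumptions for demonstration.

```python
import zlib

# Toy stand-in for captured market data: repetitive, structured records
# like these typically compress at roughly 2:1 or better.
records = b"".join(
    b"2024-11-01T09:30:%02d.123|TICK|BID|227.1%d|100\n" % (i % 60, i % 10)
    for i in range(10_000)
)

compressed = zlib.compress(records)

# Lossless is mandatory: the round trip must return exactly what was written.
assert zlib.decompress(compressed) == records

print(f"original:   {len(records):>8} bytes")
print(f"compressed: {len(compressed):>8} bytes")
print(f"ratio:      {len(records) / len(compressed):.2f}:1")
```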

Kevin Rosenquist: Absolutely, yeah. How does ScaleFlux enable scalability for organizations dealing with perhaps exponential data growth, especially these days?

JB Baker: Yeah. With that data compression, we also enable people to store more data per gigabyte of flash that they purchased. So let's say you buy a 3.84-terabyte drive, one of the standard capacities. Ordinarily you're only going to store maybe 3 or 3.2 terabytes of data to it. What we hear from a lot of folks is that they end up, quote-unquote, short-stroking those drives like they did with hard drives, to maintain performance and consistency. But when there's even just a little bit of compression in the data, even just 20% compressibility, it allows you to store more data, or to use more of the original capacity, without suffering any performance penalty. I don't want to go too deep into all of the interactions within the drives, but basically, as SSDs fill up, it gets harder and harder for the controller to move data around. Think about a storage unit that you've got 80% full, and then you bring in a couch. It's a lot harder to fit that couch in when your storage unit is almost full than when it was empty. The same thing happens with the drives. Or, for those who are old enough to remember having a hard drive, when you had to defrag your drive.
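
Editor's note: the back-of-envelope arithmetic behind that point, using the illustrative numbers above (a 3.84 TB drive and an assumed 20% compressibility):

```python
# Illustrative numbers only: a nominally "full" drive keeps physical
# headroom once inline compression shrinks what actually lands on flash.
raw_capacity_tb = 3.84   # advertised capacity
user_data_tb    = 3.84   # user writes the drive completely full
compressibility = 0.20   # 20% of the written bytes compress away

physical_used_tb = user_data_tb * (1 - compressibility)
headroom_tb      = raw_capacity_tb - physical_used_tb

print(f"physical flash used: {physical_used_tb:.2f} TB")
print(f"headroom recovered:  {headroom_tb:.2f} TB "
      f"({headroom_tb / raw_capacity_tb:.0%} of the drive)")
```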

 Kevin Rosenquist: Yeah, definitely.

JB Baker: Right. It's because you still had enough space to put that new information, but it wasn't contiguous, and that slowed down the operation of the drive. So you had to...

Kevin Rosenquist: They'd give you the little visual. Remember the little visual where they'd show you everything moving into the little blocks? You're like, look, it's working.

JB Baker: So the same thing happens in SSDs. As they get used and data gets written and deleted, it creates these holes. And the controller has to do what's called garbage collection: pull the good data out and put it into a new place, so it can truly erase and empty the space to write new data. Because when you delete your data from the drive, it doesn't actually erase the cells, right? It just says, that data? I don't care about that anymore. I'll mark it for future erasure.
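
Editor's note: a toy model of the garbage-collection step JB describes, greatly simplified for intuition; real controllers work at flash-block and page granularity with far more bookkeeping.

```python
# A "block" holds pages; deletes only mark pages stale (None here).
# To reclaim space, the controller copies the still-valid pages elsewhere,
# then erases the whole block. That copying is extra internal write traffic
# (write amplification), which is why full drives slow down.

def garbage_collect(block):
    """Return (valid pages to relocate, freshly erased block)."""
    valid = [page for page in block if page is not None]
    return valid, [None] * len(block)

block = ["A", None, "B", None, None, "C", None, None]  # None = stale page
relocated, block = garbage_collect(block)
print("relocated live pages:", relocated)      # ['A', 'B', 'C']
print("block erased, pages free:", len(block))
```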

Kevin Rosenquist: Are there specific financial applications where something like computational storage offers particular value?

JB Baker: Yeah, we've seen it in a few different places. One is high-frequency trading, where they really care about that absolute lowest latency, and the latency savings we get in those mixed read-write workloads is valuable in that space. I mentioned the packet capture where they're streaming market data, and there are a lot of writes, and then they're going to do analysis on that data to determine what trades they want to make. So again, being able to write very fast and then retrieve the data very fast is super critical for those. And then, going outside of the pure Wall Street-type applications, we have customers that are doing electronic payments, and fraud detection is tremendous there. They're using very high-performance databases, again where latency is super critical, because as your consumer is buying their latest and greatest thing off of Instagram or Amazon or wherever, you need to check and confirm that it's a valid transaction, and otherwise you can lose the transaction. So we need those fraud detection activities to happen very fast, and the latency and performance of these drives helps those databases achieve more transactions per second. Even just recently I was having a discussion with someone, and they're like, yeah, I've got my infrastructure and I can handle a million transactions per second with this cluster. But we've got Black Friday coming up, right? And there are going to be burst workloads. So having infrastructure that can handle a burst in the workloads without spiking in latency is super critical. You need to be able to handle more transactions per cluster without hurting your latency, and that's where our solutions have really shone for customers.
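
Editor's note: a rough illustration of why bursts spike latency. In a textbook M/M/1 queueing model (an editorial simplification, not a claim about any particular database or drive), mean response time is 1 / (service rate − arrival rate), so latency explodes as load approaches capacity:

```python
# Mean response time in an M/M/1 queue: W = 1 / (mu - lambda).
CLUSTER_CAPACITY = 1_000_000  # transactions/sec the cluster can sustain

for load in (0.50, 0.90, 0.99):
    arrivals = load * CLUSTER_CAPACITY          # offered load, txn/sec
    latency_us = 1e6 / (CLUSTER_CAPACITY - arrivals)  # seconds -> microseconds
    print(f"load {load:.0%}: mean latency ~ {latency_us:.0f} us")
```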

Kevin Rosenquist: How about real-time data analysis? Is that something computational storage can help with?

JB Baker: Yeah, absolutely. Again, all of that comes back to data being read from and written to the drives. Once you go beyond the scope of what can fit into the DRAM, as soon as you go to the drives, the latency of the drives becomes important, and how they handle those mixed workloads is super critical. And what we see on drives, and this is just across SSDs, is that as they fill up, I mentioned this before, they start to slow down. We call this the write cliff, where you get a couple hours into writing to a drive and suddenly the write performance drops by 60 or 70, even 90%. In the enterprise space we talk about sustained performance, so people are really looking at how a drive performs after they've filled it. They still perform really well on reads, but only if it's 100% reads. As soon as you start mixing in some writes, pushing data back out to the drives, they really sink down toward that pure-write performance. I wish we could show you a graph; it'd be great. But when we're doing that compression and we keep that free space in the drive, we maintain that fresh-out-of-box, super high performance even as the drive is full. So what we see is that after several hours, when other drives have hit the write cliff, yeah, we decline a little bit, but in terms of transactions per second that the drives can handle, we're looking at 2 to 3x what a similar-capacity drive from another vendor can handle.
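
Editor's note: one rough way to see the write cliff in numbers. In a crude first-order model of greedy garbage collection (an editorial approximation, not ScaleFlux's data), write amplification grows roughly as 1 / (1 − utilization), and compression that lowers the physical fill level keeps it in check:

```python
def write_amplification(utilization: float) -> float:
    # Rough worst-case approximation: each GC pass must copy more live
    # data as the flash fills, so internal writes balloon near 100%.
    return 1.0 / (1.0 - utilization)

for logical_fill in (0.50, 0.80, 0.90, 0.96):
    wa_raw = write_amplification(logical_fill)
    # With 20% compressibility, the physical flash is only 80% as full.
    wa_compressed = write_amplification(logical_fill * 0.8)
    print(f"fill {logical_fill:.0%}: WA ~ {wa_raw:5.1f} raw, "
          f"~ {wa_compressed:4.1f} with 20% compression")
```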

Kevin Rosenquist: What trends in data-heavy applications are driving your product innovation, whether it's AI, Internet of Things? What really motivates your development?

JB Baker: That's a good question. So yes, AI is definitely all the rage at this point, and that is driving multiple avenues of the product development, particularly around capacity density, wear, and also power efficiency, with those GPUs being so power hungry. And what was the statistic, that a ChatGPT request takes ten times the power of an ordinary Google search?

Kevin Rosenquist: Yeah, I saw something about that too.

JB Baker: Yeah. So you think about that: somebody might be doing the same thing as they were before, but now it's taking ten times the power. Where are you going to get that power from? We've got many customers, we talked about the financial sector, talking to customers there, and their data centers are space constrained and power constrained, right? You need to get more work done, but it's not like you can suddenly pipe more megawatts of power into that data center or create more floor space in a space-constrained area. So we're driving for improved power efficiency, generation over generation, in our controllers, as well as determining how we can contribute to making the system more efficient. I mentioned earlier being able to reduce how many servers they needed to deliver the same amount of work. It's not just that our device was better than a similar device; we helped them save in other areas too. And then security is another super hot topic. I mentioned a little bit before the show that I'm not a cybersecurity expert, but security measures matter: making sure that you can not only secure the data at rest with encryption, but also adopt the new industry standards and technologies that are coming along to make sure there's no malicious actor getting into your drives. I would encourage people to go look at what's called Caliptra, a new open source method built around a hardware root of trust, so you can make sure you've got the right firmware and the component is attested even before boot-up. It just cuts down on the vectors of attack. And then, also on security, it comes back to sustainability. Traditionally, over the last decade or so, the hyperscalers and those who hold your personal data, when a system hits end of life and they want to decommission a server and the drives in it, they shred everything. So that's producing a ton of electronic waste, and a ton is a vast understatement.

 Kevin Rosenquist: Yeah. I was going to say it’s many, many tons.

JB Baker: Yes, yes. So now we're working with other folks in the industry on making sure that you can thoroughly and securely erase all of the data from those drives, so the drives don't have to be shredded and put into the electronic waste pile, and can instead be reused if they still have life, or recycled and put back into the infrastructure.

Kevin Rosenquist: As the data demands continue to grow across industries, the more we get into AI, and as quantum computing comes along, if it does in any sort of grand way, how do you envision ScaleFlux's role as that happens?

JB Baker: I think it just expands from where we are today. Our initial products have been on the SSD front, driving for that improved performance, improved efficiency, improved capacity density, improved sustainability, from the component and contributing to the system. We have recently announced expansion into the memory space. A technology called Compute Express Link, or CXL, is a new method for expanding how much memory can be attached. That becomes important because the processors, particularly GPUs for AI, get starved for a big enough data set with fast access. And so what you end up doing is installing more servers and more GPUs, not necessarily to get more processor capability, but just to get access to more DRAM, because you can only attach so much DRAM per processor with its direct link. With CXL, we attach memory via a different method, over the PCIe bus. It's slower than traditional DRAM in terms of access, you've got extra latency going a little further from the processor, but it allows for reuse of older DRAM technology, and it allows you to attach more memory and avoid putting unnecessary components into the system to achieve the goal you were going after originally. Did that make sense?

Kevin Rosenquist: I think so, yeah. I'm trying to keep up with the tech. Where do you see the data demands going? It seems like it's just going to get out of control. Can we keep up?

JB Baker: I mean, every day you can find a new report on how many more exabytes or zettabytes of data are going to be generated each year. I've been in this industry, particularly SSDs, for 15 years now, and even 12, 15 years ago we were talking about the data deluge and how it was just exponentially growing. That has not slowed down, and as we expand into more and more of the IoT, and now with AI, that data deluge is not going to slow. One thing on AI that the Microsoft team mentioned recently in one of their presentations at the Open Compute Summit was that the raw data for AI isn't different from raw data for other analytics applications. The difference is that AI applications not only consume the data, they generate new data.

JB Baker: And so they are exacerbating the data generation trend. Every time it might slow down, we find a new way to accelerate the generation of data.

Kevin Rosenquist: That's very true. Do you think quantum computing is going to be widely used? Will it help the problem? Will it hurt the problem?

JB Baker: I've been looking at that a little bit. I don't know a ton about quantum computing. The thing that has me thinking is, how do we make encryption safe against quantum computing, what's called quantum-safe encryption, and how are we going to deliver that in an efficient way? Because my understanding is that it currently takes a lot of CPU power and processing capability. So that is going to be a huge challenge as quantum computing becomes more of a reality.

Kevin Rosenquist: Interesting to see where it all goes, that's for sure. Well, JB Baker, the company is ScaleFlux. Thanks so much for being here. I really appreciate your time.

JB Baker: Thanks. I appreciate the opportunity to talk to you and your audience.