Krishna Rangasayee of SiMa.ai on the Use Cases and Technology Trends for AI at the Edge


It’s an incredibly exciting time for us at SiMa.ai – we recently moved into production of our Gen 1 MLSoC chips, and announced additional funding from VentureTech Alliance and Navin Chaddha of Mayfield Fund that brings us to $200M raised in total. In the wake of these milestones, our CEO Krishna Rangasayee sat down with Cambrian AI Research’s Karl Freund for a Q&A to discuss the emergence of AI at the edge and SiMa’s plans to accelerate AI with our MLSoC design and software. The full interview is below, with light edits for length and clarity.

Karl: So, you know, since the beginning of compute, the pendulum keeps swinging back and forth. It was all centralized on mainframes, and then Unix servers came along and it became decentralized. What’s really driving the interest in AI at the edge versus the cloud from a business standpoint?

Krishna: For the last 10-15 years, AI has clearly been the big driver of the computing trend. AI has been the application that’s driven compute. It’s really been a cloud story for the last 10-15 years, and it continues to be. But there are many applications in the physical world that cannot rely entirely on a cloud-based experience.

A few things get in the way of that, and I’ve seen an increasingly pronounced transition into a hybrid structure, where things that belong in the cloud will continue to remain in the cloud, but there is increasingly an opportunity, or a need, for things to remain at the edge. And in my mind, three things drive that.

One is throughput and latency. Not every physical application can afford the latency [that comes with the cloud]. Take an automotive application, where decisions need to be made that are critical to safety. Throughput and latency are mission critical for the passengers or the driver, right? In a robotics application with a human-machine interface, if the latency gets in the way of the flow, it’s not helpful at all. So that’s one element – throughput and latency – that really guides the choice of edge compute versus cloud compute.

Another consideration, which I think is increasingly becoming complicated, particularly with AI and ML, is privacy and security. I think we have, for right or wrong, gotten very comfortable with storing all of our personal information on the cloud, and we really assume that it’s a great place to do that. But now, in part from the popularity of ChatGPT, there is a heightened sensitivity to privacy and security and people are looking at, “Can I get the benefit of AI and ML, without having to publicly share my information on the cloud? Would I be able to really do localized processing, where the creator of the data can do the compute and the analysis where the data resides, versus really throwing it around?” And this is true in medical, true in smart vision applications, retail applications and more.

The third element is obviously cost. The cloud is not a cheap proposition – it’s a pretty expensive one for many, many different customers. And then throw in the scale, where AI and ML is going to be embedded in every single device on the planet. So today, at the edge, microprocessors and microcontrollers make up $40 billion in annual consumption. A huge number. And all of that is essentially classic compute. Within a few decades, 99% of that is going to move from classic compute to ML. So concerns on throughput, latency, privacy, security, and cost will shift a meaningful portion of compute onto the edge.

Karl: Five years from now, how much of that shift do you think will have occurred?

Krishna: I would say at least 10 to 20%. The edge market moves much slower than the cloud market, and it’s tens of thousands of customers versus a few giants. So you’re really boiling the ocean in terms of market adoption. But that’s a very big number, if you really look at the scale. And in my mind, one day, the edge is probably going to be bigger than the cloud. That journey is ahead of us, and the next decade will drive a lot of architectural innovation, and people building purpose-built platforms for the edge. So far the journey has been the cloud, and there’s kind of been an AI and cloud story. But there’s an emergence of an AI and edge story that I think is interesting.

Karl: Other companies are going after this as well. Qualcomm, for example, and Apple with their silicon design. What is it that really differentiates SiMa, especially as the market evolves?

Krishna: One of the key things that we want to do goes back to my time at Xilinx. It was a fantastic training ground for me in the embedded edge market – tens of thousands of customers, solving problems at the edge. And that learning really shaped my thinking, from a market perspective, about the nuances of what we would need to do once we got to where we are today. I should clarify, before I give you a direct answer to your question: we are not focused on the cloud. We’re also not focused on the smartphone market. It’s a very, very tough market. And we didn’t want to be a startup focused on either of those two massive markets.

The middle, the everything in between, that’s really what we’re focused on. Robotics, smart retail, smart vision systems, medical, and now automotive are some of our focuses. So what we are doing is different. I think the biggest thing that I learned was that ML is really not a product, it’s a capability. I see company after company build an ML accelerator and assume that that’s a product.

We are in the business of solving application problems or system problems. ML is a fantastic toolkit, because you get better TCO, better performance, better power, all the benefits. But ML to us is really a toolkit we want to provide to our customers. So our offering is a machine learning SoC, which is very different. What we’re providing is a heterogeneous compute platform. Some things should remain in classic compute, and ML is not a cure-all for everything. ML is really wonderful for solving a few problems. So what we are offering is a machine learning SoC experience, and we’re kind of creating a new product category with that. So that’s one large differentiation: what we do. The second one is, we think of ourselves more as a software company building silicon than a hardware company. ML is fundamentally a software experience.

Karl: After all, Nvidia has more than twice as many software engineers as hardware engineers.

Krishna: And I give Nvidia a lot of credit for recognizing that this is really a software experience. Part of the reason why they’re commercially this successful is their software experience is phenomenal. From our perspective, I think we have from day one really focused on software. And though we’re a young company, many of our customers really, really admire us for our software experience. No doubt, from our perspective, the Nvidia experience is more suited for the cloud. We are purpose built for the edge. So we not only provide an ML experience via software that’s world class, we also provide an SoC experience along with it. You can do end to end applications on our product. And that’s a more complicated story than just building an ML model.

So to us, ML is a necessary thing, and we’ve paid a lot of attention to it. Even at this early stage, all of our customers validate that our software is way ahead of anybody else’s. And that’s another large competitive advantage. I think hiding behind silicon advantages is pretty hard – software is a more defensible strategy. The third thing is frames per second per watt, which is really, at the end of the day, the large technical driver of TCO, better experience, better performance and better power – our strategic advantages. We are way ahead of anybody else when it comes to end-to-end application performance. So those would be the three pillars on which we have differentiated. And I have no doubt you’re going to get a lot of new participants and big companies now jumping into the AI story, particularly at the edge, because that’s the next growth vector in my mind. But we obviously have our strengths, and we plan to continue to fortify them.

Karl: Interesting. So when I first heard about what you guys are doing, I was very interested in your SoC approach. Can you talk about the software stack that’s required? Because you really provide an entire platform for the application, as opposed to what many people would say: “Here’s the AI portion I can accelerate, paired with whatever CPU you want.” How does that affect the software development environment?

Krishna: I would say, in some ways, it wasn’t rocket science, what we needed to go do. And I’ll start with the norm: the $40 billion that’s consumed on an annual basis today is a mature market. People have been shipping SoCs for 30-40 years, so it’s not a new experience. But ML is the new entrant, the new variable. And many of the incumbents really don’t know how to enable an ML experience, while we obviously have the benefit of a purpose-built platform.

What we decided to do was really innovate where it matters, and leverage open source wherever we can. So our philosophy has really been an open source platform from day one. Everything is on the ARM processor subsystem, which is a huge ecosystem, a huge software code base, and a lot of work has been done phenomenally well, first led by ARM, and now by a lot of developers around the planet. So we said, let’s just take advantage of the software experience they’re going to provide. Similarly on computer vision, we partnered with Synopsys for their computer vision pipeline. We leverage their software footprint for what we need.

From an ML perspective, we took a unique approach. Since we have tens of thousands of potential customers that we one day want to service, our software front end has to be capable of taking on any computer vision problem with ease. We cannot be myopically focused on one or two customers. So we leverage the TVM infrastructure and its virtual machine infrastructure. And today, we are the first company able to support any ML framework, any sensor, any resolution, and really create a very wide funnel in what we can support.

The question is, how do we bring all of this together from a holistic perspective? We have a very unique Docker-container-based approach, where you can do ML model development and application development, and deploy it all in a single Docker package. I may have taken it to heart that we really, really need to solve for the customer experience: they should push a button, and they should get a result. And everything in the middle just happens. My very first job when I started the company was really that push button – what happens in the middle? The less people know, the better, right? There are no kudos in exposing layer-by-layer optimization. Unlike the cloud, many of our customers are challenged to find internal ML talent and ML experience. So that’s really at the core of what we’ve decided to do. And one day, I think we’ll be well known for our software approach.

Karl: Interesting. So what are the key challenges in doing this? Obviously, you’re picking up some world class technology from Synopsys, you’re picking up TVM, you’re picking up ARM, and you’re integrating that into an SoC. It’s kind of a best of breed approach. What challenges do you see remaining that you believe SiMa is well positioned to solve?

Krishna: It’s a great question. I think, when we started, the journey was really hard, because we’re bringing so many disparate facets and elements of technology into our company, right? We need ML competency. We need compiler competency. We need SoC competency. We need firmware competency. It’s not easy bringing this all together. Now, obviously, we’re not 10,000 people, we’re only 150 people. So finding these amazing software engineers who are willing to do all of this in a constrained environment has really been a challenge.

I think in hindsight, we have some of the best software people in the industry. And it’s an interesting team that actually does not want me to add more people to it, because they don’t want a dilution of all the coolness that they get, right? So it’s only 75-80 software engineers that we have, but they’ve done an incredibly good job and continue to do a really good job. Looking forward, I think scalability is a big issue – and I think it’s the industry’s largest issue. You could get away with hand-optimizing a few things for one or two customers, but this is tens of thousands of customers you need to support – you need to provide a software platform that scales and works for everybody. We have problems all the way from datasets to creating models to accuracy to getting the performance and power we need.

Once you’re done with the ML story, then it’s about getting the application to where the customer gets the end to end, frames per second per watt experience they desire. So our journey is not over. I would say we have created a world class foundation and platform, and scalability and ensuring that we continue to raise the ease-of-use paradigm will remain our largest challenge.

Karl: How do you view the explosion of large language models? They’re not particularly well-suited for running on small devices, right? You’re very focused on image processing markets – that’s your strength. Then you’ve got the Synopsys IP, and some pretty impressive performance and power results. Are you worried that large language models will just obviate the need for image processing?

Krishna: I would say two or three things. One is, I absolutely believe that if you take a step back, generative AI is really going to touch every facet of the planet. It’s obviously making a big impact in the cloud today. But I do believe that elements of generative AI and large language models are just here to stay. And I also think that the journey hasn’t ended yet, it’s still starting. So change is going to be a constant thing in where we’re going. But what I think we haven’t publicly shared, and we are excited about, is that I think the generative AI application use cases are now beginning to evolve at the edge as well. So I don’t think it’s one versus the other.

I think you’re going to see a coexistence of classic CNN computer vision, along with LLM elements, generative AI elements, and vision transformer elements. Because in reality, these models are obviously very large, but when they bring in great accuracy, they bring greater benefits. And people are going to be very creative in solving for cost, performance, and power while enabling these capabilities. I would say stay tuned and stay close with us for the exciting things we’ll be talking about very soon, and what we’ll be doing in this space.

Karl: Krishna, it sounds very enticing. I can’t wait to see what you guys do. Can you talk about what we should expect from SiMa in the next six to twelve months?

Krishna: So Gen 1, we just recently announced we are now in full production. And it’s really a wonderful milestone for our company. Now it’s all about customer acquisition and scaling for revenue. So that’s sort of our Gen 1 focus. The early adopters of the technology have been in robotics, industrial automation and smart vision, smart retail applications, and also, surprisingly, in the government sector. I would have never placed the government sector as a lead adopter of AI and ML, but we are really seeing them lean in. And we expect to see revenue from these various markets over 2023 and 2024.

Gen 2, we are going to be rolling out towards the second half of the year. And in my mind, I think we are now raising the bar on what we could be solving for as a company. Gen 1, we took computer vision as a problem statement that will remain a core element of where we are. And now we’re extending our roadmap to include generative AI elements and other exciting portions of where the market is going.

The other focus for us is getting into automotive. So, just to take a step back, I think it’s commercial success on Gen 1 that’s top of mind for us. We’re really excited about the number of customers that we engage with – 50+ customers globally – and really excited about how they see us. It’s been a long time coming for a new architecture to come into this market space, so they’re pretty excited. And with Gen 2, no doubt we’re bringing in new capability, but we’ll also continue to improve our software story every month. So that’s really exciting. And the benefit of solving some problems is you get new ones. So that’ll keep us busy and focused on really doing a good job.

Karl: Well, congratulations on your silicon going to general availability. I’m sure your sales team is happy to have 50 customers – that’s a pretty large set of customers for your first SoC. So I wish you all the luck and success with them. And I really appreciate your time today, and look forward to doing this again soon.

Krishna: Absolutely. Look forward to it. Thank you again and really appreciate your time as well.