OUR TECHNOLOGY

Machine Learning. Effortless.

So you need to add ML to your product. Simple, right? Wrong. ML solutions weren’t purpose-built for the embedded edge: everything you find is adapted to it, from mobile phone chips to graphics processors. It’s like fitting a square peg into a round hole. At SiMa.ai, we simplify the process of adding ML to your products by delivering the first software-centric, purpose-built MLSoC™ platform with push-button performance, enabling effortless ML deployment and scaling at the embedded edge so you can get your products to market faster.

SiMa.ai’s Machine Learning System-on-Chip (MLSoC)

The MLSoC was purpose-built with edge ML applications in mind. It provides a unique combination of industry-standard interfaces and specialized computer vision and machine learning acceleration processors, hosted by an on-chip ARM application processor complex running Linux, to deliver a complete solution for edge ML applications. SiMa.ai’s MLSoC is a true embedded edge ML platform. Standalone ML accelerators lack this integration and must be paired with a PC or a server; that sounds simple, but the resulting combination has the software complexity, power, and footprint of a server, not an embedded edge device.

SiMa.ai’s Purpose-built Silicon for Machine Learning
Machine Learning System on Chip (MLSoC)

SiMa.ai previously published its first peer-reviewed results in MLPerf™ Inference Edge 3.0 last April and demonstrated the best power efficiency for ML inferencing of all the silicon providers who submitted benchmark results. It was the first time a start-up outperformed the large established vendors in the power efficiency category for the configuration most used at the edge: the single-stream (batch-1) test, which processes a single stream of data and measures how power-efficiently the entire solution performs inferencing using the well-understood ResNet-50 network.

SiMa.ai is proud to announce that in the MLPerf™ 3.1 Inference Edge results published September 11, 2023, we have not only extended our lead in the single-stream (batch-1) configuration, we have also improved our efficiency in the multi-stream (batch-8) configuration, typical of a device that concentrates edge ML processing for eight cameras into a single unit. Even in the offline configuration (batch-24), typically used to evaluate data center or large edge-server ML solutions, SiMa’s MLSoC ranked in the top three against chips specifically designed for the data center, with their high memory bandwidth and power dissipation!

Below we show how we stack up against popular competing edge products using these results. They show an advantageous position in the closed edge inference power category, indicative of how well SiMa.ai’s purpose-built edge MLSoC compares against deep sub-micron silicon.

MLSoC is an industry-leading heterogeneous compute architecture on silicon, purpose-built for machine learning at the edge. It supports the complete processing pipeline from the sensor to the application layer for efficient real-time ML computing at the edge, and includes a rich set of standard interface peripherals for connecting multiple cameras, LiDAR, radar, audio, ultrasound, and other sensors for on-chip ML processing.

Interfaces: The MLSoC natively supports up to four ports of Gigabit Ethernet and PCIe 4.0 with up to eight lanes, plus SPI, I2C, and GPIO interfaces. Memory is provided through LPDDR4/4x interfaces, with SDIO/eMMC support. The device interfaces can be extended with companion ICs to support MIPI CSI-2, GMSL, and other popular serial interfaces for camera subsystems.

Video Decompression/Compression: The MLSoC silicon contains dedicated hardware processing elements that do not require CPU assistance to decode up to 8 HD video streams and encode up to 4 HD video streams. Video resolutions of up to 4K are natively supported.

Computer Vision Processing: The MLSoC silicon contains four high-performance programmable vector signal processors for computer vision and ML model pre- and post-processing, accelerating complete computer vision pipelines.

Machine Learning Array: The MLSoC silicon features SiMa.ai’s patented machine learning accelerator (MLA) compute array, delivering 50 tera-operations per second (TOPS) of 8-bit integer performance, with 25 MB of memory integrated into the compute array to minimize data movement and execution stalls. The MLA is efficiently scheduled by the Palette software’s integrated model compiler to achieve near hand-coded performance with push-button compilation.
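As a rough back-of-the-envelope illustration of what a 50 TOPS figure can mean for throughput (not SiMa.ai’s methodology), the sketch below relates peak integer throughput to a compute-bound fps ceiling. The per-inference op count and utilization factor are illustrative assumptions, not measured values.

```python
# Compute-bound throughput ceiling from peak TOPS (illustrative only).
# ResNet-50 needs roughly 4 billion ops per 224x224 inference; the
# utilization factor below is a hypothetical placeholder.

PEAK_TOPS = 50            # MLSoC peak INT8 throughput, from the spec above
OPS_PER_INFERENCE = 4e9   # approximate ops per ResNet-50 inference
UTILIZATION = 0.5         # assumed fraction of peak actually sustained

def max_fps(peak_tops, ops_per_inference, utilization):
    """Upper bound on inferences/second if compute is the only bottleneck."""
    return peak_tops * 1e12 * utilization / ops_per_inference

print(f"{max_fps(PEAK_TOPS, OPS_PER_INFERENCE, UTILIZATION):,.0f} fps")
```

Real throughput also depends on memory bandwidth and scheduling, which is why the on-array 25 MB of memory matters: it keeps tensors close to the compute elements instead of spilling to external DRAM.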

Quad ARM Processor Subsystem: The MLSoC silicon integrates a complete quad-core ARM A65 subsystem with integrated multi-level cache and coherent memory, connected to all of the on-chip subsystems by a Network-on-Chip (NoC).

Secure Boot: The silicon features a dedicated processor subsystem for secure boot and security management for secure end-point control.

Device Support: The MLSoC device is available today in an FCBGA package in commercial and industrial temperature grades, with device qualification and screening specifications provided. Additional screening and packaging options are available on request.

Board Support: The MLSoC device is available on development and production board offerings from SiMa.ai.

This high level of performance and programmability provides the flexibility to support any ML model or application pipeline with compiled results. The integration delivers power efficiency for the entire pipeline that rivals competitive offerings in the industry. The silicon is complemented by SiMa.ai’s effortless Palette software programming environment, which provides a push-button programming experience for rapid iteration and updates to the ML model, pipeline, and application. With Palette, developers get push-button compilation, building, and deployment of software images for execution on the MLSoC platform.

MLSoC Inferencing fps/watt Leadership: Demonstrated on the gold standard for ML benchmarking, MLPerf™

To demonstrate the actual performance and power advantages that the MLSoC provides to developers of edge ML devices, SiMa.ai submitted benchmark results, using our off-the-shelf M.2 module and compilation with our Palette software, that were peer reviewed, accepted, and published through the MLCommons benchmarking program, MLPerf™. MLPerf™ benchmarking rounds are held twice a year to compare the performance and power efficiency of a wide class of semiconductor products performing AI/ML training and inferencing. The goal is to provide a truly independent, peer-reviewed measurement so that the results can speak for themselves, with competitors given the chance to investigate and challenge any results prior to publication.

Caption: FPS/W are derived from the MLPerf Power Metric for Inference Closed Edge Power results 1.1-111, 2.1-0096, 3.0-0081, 3.0-0104, 3.1-0117 and 3.1-0131. For results before 3.0, the power metrics were re-calculated with more significant digits from the published logs. The MLPerf name and logo are trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use is strictly prohibited. See www.mlcommons.org for more information.
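For readers unfamiliar with the metric, an FPS/W figure like those in the chart above is simply measured throughput divided by average system power during the run. The sketch below shows the arithmetic; the run numbers are hypothetical, not MLPerf data.

```python
# How an FPS/W figure is derived from benchmark-style measurements.
# The numbers in the example run are hypothetical, not MLPerf results.

def fps_per_watt(samples, elapsed_s, avg_power_w):
    """Throughput (samples/second) divided by average system power (watts)."""
    fps = samples / elapsed_s
    return fps / avg_power_w

# Hypothetical run: 60,000 samples in 60 s at an average draw of 10 W.
print(round(fps_per_watt(60_000, 60.0, 10.0), 1))  # → 100.0
```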


Caption: FPS/W are derived from the MLPerf Power Metric for Inference Closed Edge Power results 3.1-0117, 3.1-0119 and 3.1-0132. The MLPerf name and logo are trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use is strictly prohibited. See www.mlcommons.org for more information.

MLPerf™ Inference Edge v3.1

Closed ResNet-v1.5 single-stream, multi-stream, and offline results retrieved 11 September 2023 from https://mlcommons.org/en/inference-edge-31/. Comparing FPS/W derived from 3.1-0117, 3.1-0119 and 3.1-0132. Results verified by MLCommons Association. The MLPerf™ name and logo are trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use strictly prohibited. See www.mlcommons.org for more information.

Summary
  • SiMa.ai is the MLSoC leader in the MLPerf power efficiency category (closed power) for the edge.
  • This peer-reviewed result identifies SiMa.ai as a leader in efficient ML at the edge.
  • Performing ML at the edge has significant power-efficiency advantages over performing ML in the datacenter, with TCO up to 3x better than cloud-based solutions.
  • The power consumed to transmit and store data in the datacenter can exceed the power needed to perform the ML calculations in the cloud.
  • This has a multiplicative effect on edge ML power efficiency ratings.
  • Performing ML at the edge in a system-on-chip like SiMa’s MLSoC has significant power-efficiency advantages over performing ML in an x86 edge server equipped with ML accelerator chips, with TCO up to 2x better than these x86-plus-accelerator solutions.
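To make the TCO bullets above concrete, the sketch below compares only the energy component of TCO for a low-power edge device versus an x86 server with accelerators. Every number (power draws, electricity price) is a hypothetical placeholder for illustration, not a SiMa.ai measurement.

```python
# Toy comparison of the energy component of TCO: a low-power edge SoC
# versus an x86 edge server with ML accelerators. All figures below are
# hypothetical placeholders, not measured values.

def annual_energy_cost(avg_power_w, price_per_kwh=0.15, hours=8760):
    """Cost of running a device 24/7 for one year at the given draw."""
    return avg_power_w / 1000 * hours * price_per_kwh

edge_soc   = annual_energy_cost(10)    # hypothetical edge MLSoC draw (W)
x86_server = annual_energy_cost(300)   # hypothetical x86 + accelerator draw (W)

print(f"edge: ${edge_soc:.2f}/yr  server: ${x86_server:.2f}/yr")
```

Full TCO also includes hardware, cooling, rack space, and (for cloud offload) network transport and storage, which is why the bullets note that moving and storing data can cost more power than the inference itself.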

Run any computer vision application, any network, any model, any framework, any sensor, any resolution.

10x better than alternatives.

Push-button results.

Is Your ML Green? SiMa.ai leads the market in power efficient ML inferencing!

The power consumption of AI/ML systems is predicted to account for a significant share of the power generated worldwide. Reducing the power required for inferencing is a major goal for organizations, because that power is a significant operating cost. By deploying SiMa’s MLSoC at the edge, an organization can save significantly on total cost of ownership (TCO) compared with cloud or edge-based x86 server platforms. Deploying SiMa’s MLSoC also provides power-saving advantages over competing edge ML silicon providers, as demonstrated in head-to-head MLPerf benchmarking in the Closed Power division.

What we focus on

Whether you’re building smarter drones, more intelligent robots, or autonomous systems, SiMa.ai can help get you there.

10x better than alternatives

Amazing power efficiency. Blazing-fast processing. Get immediate results with 10x better performance per watt than the competition, enabling real-time ML decision making while meeting performance, space, power, and cooling requirements. The SiMa.ai MLSoC™ provides the best performance per watt because it’s purpose-built for the embedded edge market, not adapted to it.

HERE’S HOW WE DO IT

To address customer problems and achieve best-in-class performance per watt, we knew a software-centric machine learning solution that works in conjunction with a new innovative hardware architecture would be required.

Our radically new architecture leverages a combination of hardware and software to precisely schedule all computation and data movement ahead of time, including internal and external memory to minimize wait times.
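A toy model illustrates why scheduling data movement ahead of time pays off: when the next tile’s memory transfer overlaps the current tile’s compute (classic double buffering), transfer latency is largely hidden. The tile counts and timings below are hypothetical, chosen only to show the effect.

```python
# Why ahead-of-time scheduling of data movement matters: overlapping
# transfers with compute (double buffering) hides memory latency.
# Tile counts and per-tile times below are hypothetical.

def serial_time(tiles, load_ms, compute_ms):
    """No overlap: each tile waits for its data before computing."""
    return tiles * (load_ms + compute_ms)

def pipelined_time(tiles, load_ms, compute_ms):
    """Double-buffered: the next tile's load overlaps the current compute."""
    return load_ms + tiles * max(load_ms, compute_ms)

print(serial_time(8, 2.0, 3.0))     # 40.0 ms total
print(pipelined_time(8, 2.0, 3.0))  # 26.0 ms total
```

Because the entire schedule is decided at compile time, there is no runtime arbitration to introduce the wait times the paragraph above describes.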

We designed only the essential hardware blocks required for deep learning operations and put all necessary intelligence in software, while including hardware interfaces to support a wide variety of sensors.

Any computer vision application

Whether you’re building smarter drones, intelligent robots, or autonomous systems, the SiMa.ai platform allows you to quickly and easily run any neural network model from any framework on any sensor with any resolution for any computer vision application. Run your applications, as is, right now on our MLSoC™.

HERE’S HOW WE DO IT

Our MLSoC Software Development Kit (SDK) is designed to run any of your computer vision applications seamlessly for rapid, easy deployment.

Our ML compiler front end leverages the open-source Tensor Virtual Machine (TVM) framework, and thus supports the industry’s widest range of ML models and ML frameworks for computer vision.

Bring your own model (BYOM) or choose one of our many pre-built and optimized models. SiMa.ai’s software tools will allow you to prototype, optimize, and deploy your ML model in three easy steps.

Push-button results

The ML industry is silicon-centric. Porting an ML application to a new platform is hard and time-consuming, and performance gains are uncertain. At SiMa.ai, we co-designed the software and hardware from day one. As a result, our industry-leading software suite gets you results in a matter of minutes with a simple push of a button and without the need to hand-optimize your application, saving you months of development time.

HERE’S HOW WE DO IT

We listened to our customers and it was clear that we had to make the software experience push-button. Our innovative software front-end automatically partitions and schedules your entire application across all of the MLSoC™ compute subsystems.

Customers can leverage our software APIs to generate highly optimized MLSoC code blocks that are automatically scheduled on these subsystems. We created a suite of specialized and generalized optimization and scheduling algorithms for our back-end compiler. These algorithms automatically convert your ML network into highly optimized assembly code that runs on the Machine Learning Accelerator (MLA). No manual intervention needed for improved performance.
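A simplified sketch of what automatic partitioning can look like: assign each stage of a vision pipeline to the subsystem suited to its operator type. The mapping table and pipeline below are hypothetical illustrations, not the Palette compiler’s actual logic.

```python
# Toy partitioner: map each stage of a linear vision pipeline to a
# compute subsystem by operator type. The mapping and pipeline are
# hypothetical; the real compiler's partitioning is far more involved.

SUBSYSTEM_FOR = {
    "decode": "video_decoder",  # dedicated codec block
    "resize": "cv_processor",   # vector signal processors
    "conv":   "mla",            # machine learning accelerator array
    "argmax": "arm_a65",        # application processor cores
}

def partition(pipeline):
    """Return (op, subsystem) assignments; unknown ops fall back to ARM."""
    return [(op, SUBSYSTEM_FOR.get(op, "arm_a65")) for op in pipeline]

for op, unit in partition(["decode", "resize", "conv", "argmax"]):
    print(f"{op:7s} -> {unit}")
```

The point of the real system is that this assignment, plus the scheduling of data movement between subsystems, happens automatically at compile time rather than through hand-written glue code.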

Your entire application can be deployed to the SiMa.ai MLSoC with a push of a button. Build and deploy your application in minutes in a truly effortless experience.