Forlinx FAI-ARA240-M M.2 AI Accelerator Delivers 40 eTOPS

Forlinx Embedded launches an M.2 AI module with NXP Ara240, 16GB RAM, and PCIe Gen4, enabling high-throughput edge AI and Llama2-7B inference at 14 tokens/s.

AI / ML Embedded and Edge May 6, 2026 by Sayantan Nandy

Forlinx FAI-ARA240-M AI accelerator

Forlinx Embedded has launched the FAI-ARA240-M, an M.2 AI accelerator built around the NXP Ara240 DNPU. The module is rated at up to 40 eTOPS and carries up to 16GB of LPDDR4, which lets it hold larger models and sustain high-throughput workloads. Forlinx quotes 14 tokens/s for Llama2-7B over PCIe Gen4 x4 and pitches the module as a way to bring server-grade intelligence to NXP-based industrial and embedded systems.

The module carries a dedicated NPU that offloads inference workloads too heavy for the embedded host processor. It uses the standard M.2 2280 form factor with an M-Key connector, so it can be added to existing platforms over PCIe without changes to the base hardware. PCIe Gen4 x4 and USB 3.2 Gen1 interfaces link the host processor and the accelerator.
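To put the two host interfaces in perspective, the short Python sketch below estimates their theoretical bandwidth from the published signaling rates and line encodings (16 GT/s per lane with 128b/130b for PCIe Gen4, 5 Gbit/s with 8b/10b for USB 3.2 Gen1). The function and variable names are illustrative, and the figures are upper bounds rather than measurements of this module; protocol overhead reduces real throughput.

```python
# Theoretical link bandwidth for the two host interfaces named in the article.
# Inputs are the standard signaling rates and line encodings, not Forlinx data;
# real-world throughput is lower because of protocol overhead.

def link_bandwidth_gbytes(rate_gtps: float, payload_bits: int,
                          total_bits: int, lanes: int = 1) -> float:
    """Return bandwidth in GB/s (decimal) per direction after line encoding."""
    return rate_gtps * payload_bits / total_bits * lanes / 8

# PCIe Gen4: 16 GT/s per lane, 128b/130b encoding, four lanes.
pcie_gen4_x4 = link_bandwidth_gbytes(16.0, 128, 130, lanes=4)

# USB 3.2 Gen1: 5 Gbit/s, 8b/10b encoding, single lane.
usb32_gen1 = link_bandwidth_gbytes(5.0, 8, 10)

print(f"PCIe Gen4 x4: {pcie_gen4_x4:.2f} GB/s per direction")
print(f"USB 3.2 Gen1: {usb32_gen1:.2f} GB/s")
```

On these numbers the PCIe link offers roughly 16 times the bandwidth of the USB link, which fits the article's description of PCIe as the way the module is added to existing platforms.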

FAI-ARA240-M Key Specifications:

Processor: NXP Ara240 DNPU Edge AI processor
AI Performance: Up to 40 eTOPS
Memory Options: 8GB or 16GB LPDDR4
Form Factor: M.2 2280 (M-Key standard)
Interfaces: PCIe Gen4 x4, USB 3.2 Gen1
Dimensions: 22mm x 80mm

Earlier, we covered a similar AI accelerator, the Radxa AICore AX-M1 Edge, with 8GB of LPDDR4X RAM and 8K Ultra HD video support; it is worth checking out.

FAI-ARA240-M Ara240 AI Acceleration Card

FAI-ARA240-M cooling system

The FAI-ARA240-M works with host platforms based on NXP i.MX8M Plus and i.MX95 processors. On the software side, it supports TensorFlow, PyTorch, and ONNX, along with tools for deploying, quantizing, and optimizing models. The platform handles several data types, including INT4, INT8, and mixed-precision formats, and it runs common AI architectures such as CNNs, Transformers, and generative AI models, covering both basic vision algorithms and more advanced generative AI applications.

According to Forlinx, complex models load quickly, which suits the module to a wide range of industrial use cases and makes it straightforward to integrate. An optimized thermal design is said to keep temperatures stable under heavy industrial workloads, preventing overheating and performance loss.
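The INT4 and INT8 support mentioned above matters mainly because of on-module memory. As a rough illustration, the Python sketch below estimates the weight footprint of a nominal 7-billion-parameter model at different precisions and compares it with the 8GB and 16GB memory options. The parameter count is a round-number assumption, the estimate covers weights only (no KV cache, activations, or runtime overhead), and the article does not state which precision was used for the quoted Llama2-7B figure.

```python
# Back-of-the-envelope weight footprint for a "7B" model at several precisions.
# Assumptions (not from Forlinx): a nominal 7e9 parameters, weights only,
# decimal gigabytes, and no allowance for KV cache or runtime buffers.

PARAMS = 7_000_000_000
BITS_PER_WEIGHT = {"INT4": 4, "INT8": 8, "FP16": 16}
MODULE_OPTIONS_GB = (8, 16)  # memory options listed in the specifications

def weight_footprint_gb(params: int, bits: int) -> float:
    """Return the storage needed for the weights alone, in decimal GB."""
    return params * bits / 8 / 1e9

for name, bits in BITS_PER_WEIGHT.items():
    size = weight_footprint_gb(PARAMS, bits)
    fits = ", ".join(
        f"{cap}GB: {'fits' if size < cap else 'too large'}"
        for cap in MODULE_OPTIONS_GB
    )
    print(f"{name}: {size:.1f} GB of weights ({fits})")
```

Under these assumptions, the weights need about 3.5 GB at INT4 and 7 GB at INT8, against 14 GB at FP16, so lower-precision formats leave considerably more room for everything else an inference runtime needs.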

Forlinx says the FAI-ARA240-M AI accelerator is available to order but has not disclosed pricing.

Images used courtesy of Forlinx.