Taking Advantage of Embedded FPGA (eFPGA)
By Geoff Tate, CEO of Flex Logix, Inc.
Whether you are designing an SoC, MCU or other chip, the one common heartache is “freezing RTL.” Up until that point, it’s no problem making a change or update, but once it’s frozen, the chip design is “locked in.” A change after that point could require a new spin that is not only costly, but can also significantly delay the chip development schedule.
Now imagine what it would be like to have no deadline to freeze RTL. What chip designer would not want that? The good news is this is now possible using embedded FPGA (eFPGA). With eFPGA, designers have the flexibility to make changes at any point in the chip development process, even in the customers’ systems. While this is beneficial to any chip design team, it is especially beneficial for applications such as data centers, networking, deep learning, artificial intelligence, aerospace and defense.
What is eFPGA?
Many people think that eFPGA is the same as the traditional FPGAs offered by companies such as Xilinx and Altera. This is not the case at all. While the technology is similar, eFPGA requires no SERDES or PHYs because on-chip signaling is very fast. Density is also very similar, although some eFPGA platforms are much better than others, so designers need to do their homework and shop around for the best platform. The real difference is the users. FPGA chips are used primarily by systems companies, some of them in high volume. eFPGAs are used primarily by chip companies that need to integrate a small amount of FPGA-like flexibility into their chips.
An FPGA combines an array of programmable/reconfigurable logic blocks in a programmable interconnect fabric. In an FPGA chip, the outer rim of the chip consists of a combination of GPIO, SERDES and specialized PHYs such as DDR3/4. In advanced FPGAs, the I/O ring is roughly 1/4 of the chip and the “fabric” is roughly 3/4 of the chip. The “fabric” itself is mostly interconnect in today’s FPGA chips where 20-25% of the fabric area is programmable logic and 75-80% is programmable interconnect.
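The proportions above can be turned into whole-chip figures with a little arithmetic. The following is a minimal sketch that uses only the fractions quoted in this article; the function name and the dictionary keys are illustrative, and real devices will vary.

```python
def fpga_area_breakdown(io_ring_fraction=0.25, logic_share_of_fabric=(0.20, 0.25)):
    """Estimate how the area of an FPGA chip divides up.

    io_ring_fraction: share of the chip taken by the I/O ring (about 1/4 per the article).
    logic_share_of_fabric: (low, high) share of the fabric that is programmable logic.
    Returns fractions of the whole chip; logic and interconnect are (low, high) ranges.
    """
    fabric = 1.0 - io_ring_fraction
    low, high = logic_share_of_fabric
    return {
        "io_ring": io_ring_fraction,
        "fabric": fabric,
        # Logic is 20-25% of the fabric, so scale by the fabric's share of the chip.
        "logic": (fabric * low, fabric * high),
        # Interconnect is whatever remains of the fabric (75-80% of it).
        "interconnect": (fabric * (1.0 - high), fabric * (1.0 - low)),
    }


if __name__ == "__main__":
    for name, value in fpga_area_breakdown().items():
        if isinstance(value, tuple):
            print(f"{name}: {value[0]:.1%} to {value[1]:.1%} of the chip")
        else:
            print(f"{name}: {value:.1%} of the chip")
```

Under these assumptions, programmable logic works out to roughly 15 to 19 percent of the whole chip, with programmable interconnect taking roughly 56 to 60 percent.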
An eFPGA is an FPGA fabric without the surrounding ring of GPIO, SERDES and PHYs. Instead, an eFPGA connects to the rest of the chip using standard digital signaling, enabling very wide, very fast on-chip interconnects.
How can it be used?
There is a wide range of applications ideal for eFPGA, from very large networking chips down to small MCU/IoT chips. In 40nm, with applications such as MCU/IoT, the emphasis is on power, so eFPGA companies optimize their products to have more power management modes, low-voltage state retention and other features. In 28/16nm applications, the emphasis is on performance, so an eFPGA such as the Flex Logix® EFLX® is optimized for that. The highest performance requirement is typically where EFLX operates in the control path or data path and needs to clock at the frequency of the surrounding hardwired RTL ASIC. In this case, customers typically use EFLX in blocks of 1,000 or fewer LUTs implementing fast control logic with 1 or 2 LUT stages between flops. I/O requirements tend to be very large, especially on inputs. A relatively lower performance requirement is I/O control, such as in an MCU or IoT chip, where eFPGA can enable local processing of I/Os to reduce overall system power by not having to activate the MPU, or where it can implement additional serial I/O functions as needed. An intermediate application is where the eFPGA is a block of reconfigurable RTL on a processor bus.
Enables Arrays in any Size or Configuration
One key advantage of eFPGA is that customers can design chips with an array of whatever size or configuration they require. For example, a customer designing in 16nm might require only a few hundred LUTs of programmable logic for fast reconfigurable control logic running at ~1GHz. In contrast, another customer in the same process may want 50K-100K LUTs for a datacenter processor accelerator. Flex Logix achieves this by using tileable building blocks. First, four EFLX IP cores are designed. Each IP core is a stand-alone FPGA, but the cores can also be arrayed to offer about 75 EFLX arrays in total, from 100 LUTs to 122.5K LUTs, with any mix of logic/DSP.
Each EFLX IP core has an extra top-layer of interconnect which allows one core to connect automatically to surrounding neighbors to make a large array up to NxN.
The EFLX-100 can be arrayed up to 5×5, or 3,000 LUTs (there are actually 120 LUTs in an EFLX-100).
The EFLX-2.5K takes over at 2,500 LUTs and can be arrayed up to 122.5K LUTs.
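The array sizes quoted above follow directly from multiplying the core count by the LUTs per core. The sketch below is illustrative only: the LUT counts come from this article, the 7×7 limit for the EFLX-2.5K is inferred from 122,500 / 2,500 = 49 cores, support for rectangular (non-square) arrays is an assumption, and the function and table names are hypothetical rather than part of any Flex Logix tool.

```python
# Figures from the article; max_side for EFLX-2.5K is inferred (122,500 / 2,500 = 49 = 7 x 7).
CORES = {
    "EFLX-100": {"luts_per_core": 120, "max_side": 5},
    "EFLX-2.5K": {"luts_per_core": 2500, "max_side": 7},
}


def array_luts(core, rows, cols):
    """Return the LUT count of a rows x cols array of the given core."""
    spec = CORES[core]
    if not (1 <= rows <= spec["max_side"] and 1 <= cols <= spec["max_side"]):
        raise ValueError(f"{core} arrays are limited to {spec['max_side']} cores per side")
    return rows * cols * spec["luts_per_core"]


def smallest_array(core, target_luts):
    """Find the smallest array of `core` with at least `target_luts` LUTs.

    Returns (rows, cols, luts), or None if even the largest array is too small.
    Assumes rectangular arrays are allowed, which the article does not state.
    """
    spec = CORES[core]
    best = None
    for rows in range(1, spec["max_side"] + 1):
        for cols in range(rows, spec["max_side"] + 1):
            luts = rows * cols * spec["luts_per_core"]
            if luts >= target_luts and (best is None or luts < best[2]):
                best = (rows, cols, luts)
    return best


if __name__ == "__main__":
    print(array_luts("EFLX-100", 5, 5))          # largest EFLX-100 array
    print(array_luts("EFLX-2.5K", 7, 7))         # largest EFLX-2.5K array
    print(smallest_array("EFLX-2.5K", 50_000))   # low end of the accelerator example
```

For instance, the 50K-LUT accelerator mentioned earlier would need 20 EFLX-2.5K cores under these assumptions, which fits in a 4×5 arrangement.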
An array can be all-logic or all-DSP or any mix of the two types of cores.
It is also possible to embed large amounts of RAM in the embedded array. Flex Logix does this by using standard RAM compilers to generate any kind of RAM that the customer requests (single port, dual port; ECC/parity/none; as much as wanted) and positions the RAM between the cores. The RAM is part of a single EFLX array.
Using the above approach allows a few IP cores to generate an almost limitless variety of embedded FPGA arrays to suit any customer requirement.
eFPGA is changing the way chips are designed, providing a level of flexibility and reprogrammability that never existed before. Many leaders are already using eFPGA, including DARPA, Sandia, Harvard, SiFive, and the HiPer Consortium, and many more are in design or evaluation. Once chip designers enjoy the flexibility that eFPGA offers, they will never want to go back to the old process of being locked in to their RTL.