For as long as there have been FPGAs, there have been people who wanted to use them for computing. The traditional model of FPGA-based compute is to hard code an algorithm and then load the full bitstream once for each program.

The problem comes when loading a new algorithm. Even with partial reconfiguration (PR), the result is like the tortoise and the hare. The FPGA can rip through whatever algorithm is loaded like a hare. By the time a new bitstream has loaded, however, the tortoise has won in most cases.

By comparison, CPUs and GPUs continue to outshine FPGAs in the arena of changing behavior with their ability to simply execute a different program.

Dark configuration

FPGAs have what I call the “dark configuration” problem. Only so many configuration bits can be set at a time, or the voltage rails are pulled down and configuration bits are lost. As FPGAs get larger, they require more configuration. These days, a large FPGA may require over a billion configuration bits.

Even partial reconfiguration cannot save the day. The goal now becomes to try to change the behavior of the FPGA without changing the bitstream that configures the FPGA.

The NoISA processor

The motivation for the NoISA processor is the observation that current instruction set architecture (ISA)-based processor systems use a fixed ALU, a fixed register file, and a fixed hardware controller. These three units divide the data and control planes into small “chunks” called “instructions.” The fixed hardware controller implements the instruction decoder of a fixed ISA and the data orchestration of the program.

By comparison, a NoISA (“No ISA”) processor is based on the Hotstate machine plus some HDL. In turn, the Hotstate machine is an advanced C-programmable, runtime-loadable, microcoded algorithmic state machine that implements the same functionality as the fixed hardware controller of a processor, for any arbitrary hardware architecture.

The Hotstate machine has advanced abilities, including a stack for functions, as many timers as needed, a switch offset table, a large lookup table for input bits, and an interrupt that responds in one clock cycle.

The state outputs are qualified by a corresponding mask bit and captured if appropriate. The mask bit and state bits are used to create the new state output (state[i] = new_state[i] if mask[i] = true else old_state[i]). The number of possible state outputs at any one address during run time is 2^(n-m), where n is the total number of states and m is the number of states used in that line of code. As a result, the Hotstate can express more complexity than a standard state machine (a traditional algorithmic state machine has one state output vector per address).
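The masked update described above can be sketched in a few lines of plain C. This is only a software model, assuming the state vector fits in 32 bits; the names hotstate_next and hotstate_output_count are hypothetical and are not part of the Hotstate toolchain.

```c
#include <stdint.h>

/* Hypothetical model of the masked state update (not Hotstate source code).
   Bit i of the result takes new_state[i] where mask[i] is 1
   and keeps old_state[i] where mask[i] is 0. */
static uint32_t hotstate_next(uint32_t old_state, uint32_t new_state, uint32_t mask)
{
    return (new_state & mask) | (old_state & ~mask);
}

/* Number of distinct output vectors one line of code can produce at run time:
   the m masked bits are fixed by that line, while the other n - m bits
   carry over whatever values they held before, giving 2^(n-m) combinations.
   Returns 0 for arguments this simple model cannot represent. */
static uint64_t hotstate_output_count(unsigned n, unsigned m)
{
    return (m <= n && n - m < 64) ? (1ULL << (n - m)) : 0;
}
```

For example, with an old state of 0xA5, new state bits of 0x0F, and a mask of 0x3C, only bits 2 through 5 are overwritten and the result is 0x8D. With n = 8 states and m = 4 of them written by a given line, that line can yield 16 different output vectors depending on what ran before it.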

NoISA vs. ISA-based processors

The NoISA processor is smaller, faster, and lower power than an ISA-based processor. The programmer has a lower level of access with a NoISA processor since it’s built around runtime loadable microcode. The NoISA processor does not yet support an operating system, but that can be an advantage with respect to the speed with which the processor can handle real-time events. 

The size of the Hotstate machine depends on the size of the code. The compiler reads in the code and passes parameters to the Hotstate module. This is different from high-level synthesis (HLS), which compiles down to hardware. Those designs can’t change their behavior unless a new configuration file is loaded. I feel HLS jumped the shark by not using NoISA principles. HLS tools are stagnating because they went down an evolutionary dead end. Adopting the NoISA philosophy can change that.

To create a NoISA processor, implement the data flow design in an HDL, then run all the mux selects, FIFO controls, overflows, underflows, and any other control pins into the Hotstate machine. Now, a little C program runs the architecture. Reloading the Hotstate machine at runtime ensures more software can fit into the hardware (see the tutorial and videos on our website to see how easy it is to create and use the Hotstate machine).
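To make the division of labor concrete, here is a minimal plain-C model of the idea: status pins flow from the datapath into the controller, and control pins flow back out. Everything in it is an assumption for illustration (the pin names, the two-FIFO datapath, and the control_step function); it is not Hotstate syntax and does not reflect the actual compiler input format.

```c
#include <stdbool.h>

/* Hypothetical status pins driven by the HDL datapath into the controller. */
typedef struct {
    bool in_fifo_empty;
    bool out_fifo_full;
} status_t;

/* Hypothetical control pins driven by the controller into the datapath. */
typedef struct {
    bool in_fifo_rd;
    bool out_fifo_wr;
    bool mux_sel;
} control_t;

/* One controller step: when the input FIFO has data and the output FIFO has
   room, assert both strobes and toggle the mux select; otherwise deassert
   the strobes and hold the mux select at its previous value. */
static control_t control_step(status_t s, control_t prev)
{
    control_t c = { false, false, prev.mux_sel };
    if (!s.in_fifo_empty && !s.out_fifo_full) {
        c.in_fifo_rd = true;
        c.out_fifo_wr = true;
        c.mux_sel = !prev.mux_sel;
    }
    return c;
}
```

The point of the sketch is that the datapath never changes: swapping in a different control_step changes what the hardware does, which is the role the runtime-loadable microcode plays in the scheme described above.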

Save time and money

The NoISA processor is a system that can be used with any hardware architecture. It’s the ultimate “Napkin” design system. Draw out the data flow architecture and then run all status and control into the Hotstate machine. Designs get done quicker using this method than with traditional design methods. It’s different enough to be useful but not so different as to be hard to use. Also, the Hotstate machine is portable among FPGA vendors that support SystemVerilog.

Use cases

There are many use cases for the NoISA processor, some of which are as follows:

• Use a NoISA processor when a softcore CPU takes up too much area or runs too slowly.

• The NoISA processor uses less energy than a softcore CPU, so use it on the IoT Edge where power matters.

• NoISA processors make great controllers. Use them to quickly create small controllers and C-programmable state machines.

• NoISA processors are small and fast and the perfect choice for systolic arrays.

• NoISA processors are great for changing the behavior of an FPGA without changing the FPGA itself.

• NoISA processors change their behavior by reloading powerful microcode instead of little instructions.

It’s time for a change

It’s hard trying new things. It stretches the mind and changes the way one perceives the world. But it’s worth the effort. ISA-based processors were invented when computers had vacuum tubes and hand-wound memory cells. Can anyone believe that if von Neumann had had a billion transistors instead of a thousand tubes, he would have come up with the ISA processor-based computer?

Try the NoISA processor. It will be well worth your time. Remember: “Don’t be a slave to the ISA.”

www.hotwright.com