Design digital circuits in C. Simulate really fast with a regular compiler.

suarezvictor/CflexHDL


CflexHDL

Design digital circuits in C. Simulate really fast with a regular compiler!

image

Q: So can it run algorithms without a CPU?
A: Yes, the algorithm gets implemented as hardware, with gates interconnected to match the logic of the C code. Complex algorithms are possible, such as the graphics rendering shown in the demos.
Q: Is the simulation that fast?
A: Well, the fastest logic simulator is Verilator. In tests that converted existing logic cores written in Verilog or Migen to CflexHDL (using a provided automatic tool), the speed gains over the same cores simulated with Verilator were 2.5X to 5X, and up to 10X in some cases (only a few tests, but this held in all cases so far). See DEMOS.md or this video.

TL;DR

See the CflexHDL slides

Quickstart

Install the minimal dependencies: GCC, Python's libclang, Sylefeb's Silice and Verilator. The SDL library is needed for graphical simulations. For synthesis on actual hardware, install Vivado, Yosys+NextPNR or Quartus, plus OpenFPGALoader (the Arty A7 board and the Terasic DE0-Nano are currently supported).

Led glow demo

$ cd demos/led_glow && make
This should print the simulation results.

$ make load
This synthesizes with the default toolchain and loads the bitstream. The first time it runs, the parser needs to be compiled, which takes some extra time (just once).

Synth options:
$ make BOARD=de0nano load
Overrides the default board (the Arty).
$ make XILINXTOOLCHAIN=yosys+nextpnr load
Overrides the default toolchain for the Arty (Vivado).

Graphical demo

$ cd demos/vga && make
This should open a window that renders graphics at high FPS (on your PC) and prints the FPS on closing.

image

$ make verilator
This should open the same window, but at a lower FPS.

$ make load
This synthesizes and loads the bitstream on the Arty board with a VGA PMOD on JB-JC. You should be able to see your PC and the FPGA both running at 60 FPS*, side by side (the blurring is proof that the image was moving when the picture was taken):

image

See it in action! https://youtu.be/TqV9wUDEG2o



To use the yosys+nextpnr toolchain on the Digilent Arty board, pass XILINXTOOLCHAIN=yosys+nextpnr to make.
For the DE0-Nano board, use $ make BOARD=de0nano bitstream load. The board outputs DVI signals at LVDS levels (use simple capacitor coupling).

*To limit the FPS, set vsync to true when calling fb_init in simulator_main.cpp

Advanced examples

Hardware accelerated 2D Inverse Discrete Cosine Transform

At the core of JPEG image encoding is the 2D Discrete Cosine Transform (DCT). It is applied to 8x8 luma/chroma pixel blocks and followed by lossy quantization and entropy coding.

In this example, a hardware-accelerated 2D Inverse DCT was implemented. It is a single C function that can run on the CPU, and the unmodified code can be automatically translated to Verilog, achieving much faster operation (10X).

It was integrated and tested with two JPEG decoding libraries, Ultraembedded core_jpeg and JPEGDEC (NLnet funded), producing this result:

image

See here for more details.
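
For orientation, here is a minimal reference sketch of what a 2D inverse DCT on one 8x8 block computes. It is plain C written for illustration, not the function from this repository: the names idct8x8_ref and build_cos_table are hypothetical, it uses the direct (slow) textbook formula rather than a fast factorization, and it builds its cosine table by recurrence so that no math library is needed.

```c
#include <assert.h>

#define BLOCK 8

/* Fills table[k] = cos(k*pi/16) for k = 0..31 using the angle-addition
   recurrence, starting from cos(0) = 1 and sin(0) = 0. */
static void build_cos_table(float table[32])
{
    const float c1 = 0.98078528f; /* cos(pi/16) */
    const float s1 = 0.19509032f; /* sin(pi/16) */
    float c = 1.0f, s = 0.0f;
    for (int k = 0; k < 32; ++k) {
        table[k] = c;
        float next_c = c * c1 - s * s1;
        s = s * c1 + c * s1;
        c = next_c;
    }
}

/* Hypothetical reference 2D IDCT: 'in' holds 64 coefficients and 'out'
   receives 64 samples, both in row-major order (index = row * 8 + column). */
void idct8x8_ref(const float *in, float *out)
{
    float cos_tab[32];
    assert(in && out);
    build_cos_table(cos_tab);

    for (int y = 0; y < BLOCK; ++y) {
        for (int x = 0; x < BLOCK; ++x) {
            float acc = 0.0f;
            for (int v = 0; v < BLOCK; ++v) {
                for (int u = 0; u < BLOCK; ++u) {
                    /* Normalization: 1/sqrt(2) for the zero-frequency terms */
                    float cu = (u == 0) ? 0.70710678f : 1.0f;
                    float cv = (v == 0) ? 0.70710678f : 1.0f;
                    /* cos((2x+1)*u*pi/16): the table period is 32 steps */
                    acc += cu * cv * in[v * BLOCK + u]
                         * cos_tab[((2 * x + 1) * u) % 32]
                         * cos_tab[((2 * y + 1) * v) % 32];
                }
            }
            out[y * BLOCK + x] = 0.25f * acc;
        }
    }
}
```

With this normalization, a block whose only nonzero coefficient is a DC value of 8 decodes to a flat block of ones, which is a handy sanity check.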

Floating point types support

Examples were written using floating point values, including a "shader" (computation of the pixel color corresponding to each coordinate and frame number). Instead of running on a GPU as usual, it gets hardwired as a logic circuit from its C source.

image

See here for additional details.
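
To make the idea of a "shader" concrete, the following sketch shows a floating point function that maps a pixel coordinate and a frame number to a color. It is an illustrative assumption, not one of the examples from this repository: the name shade_pixel, its parameters and the 0x00RRGGBB packing are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Clamps v to [0, 1] and converts it to an 8-bit channel value. */
static uint8_t to_u8(float v)
{
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    return (uint8_t)(v * 255.0f + 0.5f);
}

/* Hypothetical shader: red follows x, green follows y, and blue pulses
   over time as a triangle wave with a period of 128 frames.
   Returns the color packed as 0x00RRGGBB. */
uint32_t shade_pixel(int x, int y, int frame, int width, int height)
{
    assert(width > 0 && height > 0);

    float u = (float)x / (float)width;   /* horizontal position, 0..1 */
    float v = (float)y / (float)height;  /* vertical position, 0..1 */

    int phase = frame & 127;
    float t = (float)(phase < 64 ? phase : 128 - phase) / 64.0f;

    return ((uint32_t)to_u8(u) << 16) | ((uint32_t)to_u8(v) << 8) | (uint32_t)to_u8(t);
}
```

A simulator loop would call such a function once per pixel per frame; in hardware, the same arithmetic becomes a circuit that produces one color per coordinate.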

Pointer syntax support for accessing memory buses

This feature makes it even easier to write hardware accelerators that access memory, do processing, and put results in another buffer. It is worth noting that few "C to Verilog" tools are able to achieve that, since they usually restrict access to specific on-chip memory (BRAM) or, if not, you have to resort to huge and/or proprietary toolchains.

As an example of this CflexHDL capability, an existing graphics accelerator (a line drawing core with alpha blending capability) was modified to use plain C pointer syntax for accessing the buses, instead of predefined macros from the library. See the resulting code in line32a.cc and its diff.

This way, a long sequence of bus access "instructions", like a bus read, is now a one-liner:

image

Pointer arithmetic is the same as in the C language, that is, it is "sizeof" aware: pointer addresses are incremented in multiples of the pointee size, and the same is supported for pointer subtraction (which yields the number of pointee elements in between). This is used in the line core to advance to the next lines or columns, optionally backwards.
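
The "sizeof" aware behavior can be checked in ordinary C on a PC. The sketch below is standard C only and does not use any CflexHDL API; the helper names step_pixel and byte_distance are hypothetical and exist just to show element-sized steps and element-counted differences.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Distance between two pixel pointers measured in bytes rather than
   in elements. Both pointers must refer to the same buffer. */
size_t byte_distance(const uint32_t *from, const uint32_t *to)
{
    assert(to >= from);
    return (size_t)((const char *)to - (const char *)from);
}

/* Moves a pixel pointer by dx columns and dy rows. 'stride' is the row
   length in pixels; negative dx or dy steps backwards. */
uint32_t *step_pixel(uint32_t *p, ptrdiff_t stride, ptrdiff_t dx, ptrdiff_t dy)
{
    return p + dy * stride + dx;
}
```

For a buffer with 16 pixels per row, stepping 3 columns and 2 rows from the start gives a pointer difference of 35 elements, while the byte distance is 35 * sizeof(uint32_t); stepping by the negated amounts returns to the start.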

The pointer feature is used for an "Image blitting" demo, which uses the line drawing accelerator as a form of 2D DMA engine to copy an image from a rectangular block of memory to another place (the framebuffer), optionally processing colors:

image
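
As a software-only illustration of the blitting idea, this sketch copies a rectangle between two buffers using plain pointer syntax, with an optional per-pixel color step. It is not the code of the line drawing accelerator: blit_rect, color_fn and invert_rgb are hypothetical names, and strides are assumed to be given in pixels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Optional per-pixel color processing step. */
typedef uint32_t (*color_fn)(uint32_t);

/* Copies a w x h rectangle from src to dst. Each buffer has its own row
   length (stride, in pixels). If 'process' is not NULL, it is applied to
   every pixel on the way. */
void blit_rect(const uint32_t *src, size_t src_stride,
               uint32_t *dst, size_t dst_stride,
               size_t w, size_t h, color_fn process)
{
    assert(w <= src_stride && w <= dst_stride);
    for (size_t row = 0; row < h; ++row) {
        const uint32_t *s = src + row * src_stride; /* start of source row */
        uint32_t *d = dst + row * dst_stride;       /* start of target row */
        for (size_t col = 0; col < w; ++col) {
            uint32_t px = *s++;                     /* read, then advance */
            *d++ = process ? process(px) : px;      /* write, then advance */
        }
    }
}

/* Example color step: inverts the RGB channels of a 0x00RRGGBB pixel. */
uint32_t invert_rgb(uint32_t px)
{
    return px ^ 0x00FFFFFFu;
}
```

Passing a pointer such as fb + y * stride + x as the destination places the image at position (x, y) in the framebuffer, which is where the element-sized pointer arithmetic pays off.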

The implementation of the feature is in silice_generator.py, where the pointer dereferencing is detected by the parser and translated to bus access signaling.

Benchmarks

See the DEMOS page
