The Spade Book

By Frans Skarman, with contributions from the community

Spade is a Rust-inspired hardware description language.

To get started, learn how to install Spade and set up your editor.

If you are more interested in a reference of all constructs in the language, see the language reference.

Also, if you have any questions or would like to discuss Spade with others, feel free to join the Discord community or Matrix channel.

Spade is a work-in-progress language, and so is this documentation. Writing documentation is hard because it is difficult to know which details are obvious and which need more explanation. Your feedback is invaluable for making the documentation better, so if you notice anything that is unclear or needs more explanation, please reach out via a GitLab issue or on Discord.

Installation

Before installing locally, there is a "playground" available at ▶️ play.spade-lang.org which you can use to play around with the language. The first few chapters of the book use that, so if you want to follow along with the tutorial, you can skip this chapter until prompted to install Spade locally.

At the moment, Spade works best on Linux systems, but macOS also works quite well with only a few minor issues. Windows is not supported for now, although it should be usable through WSL.

There are a few ways to start using Spade:

  1. Manually

    In order to install Spade manually, you need the Rust toolchain, specifically cargo. If you don't have it installed, you can install it with https://rustup.rs/. Simply run the command there, make sure its binaries are in your PATH, and then run rustup toolchain install stable

    Unless you have specific needs, you should install the Spade compiler via its build tool Swim. Swim then manages the compiler version on a per-project basis. To install Swim, run

    cargo install --git https://gitlab.com/spade-lang/swim
    
  2. With a package manager

    If you are on Arch Linux, you can install the swim-git package from the AUR: https://aur.archlinux.org/packages/swim-git

  3. Using Docker

    The Spade Docker image, which works on macOS as well, has all the necessary tooling and environment preconfigured.

    For example, here's how you would start an interactive shell where commands like swim are available:

    docker run -it --rm ghcr.io/ethanuppal/spade-docker:latest
    

    Make sure you have the Docker (or podman) daemon running in the background. Do note that the image only supports x86_64 and arm64.

Running swim init hello_world should now create a new Swim project.

Synthesis Tools and Simulators

Spade compiles to Verilog code which is simulated and synthesised (compiled to hardware) by other tools — in particular, cocotb for simulation and yosys+nextpnr for synthesis.

Automated install

The easiest way to install those tools is via Swim by running:

swim install-tools

which downloads https://github.com/YosysHQ/oss-cad-suite-build into ~/.local/share/swim. If it is installed, swim will always use the cad-suite tools instead of system tools.

NOTE: If you need to uninstall those tools, remove ~/.local/share/swim/bin

Manual install

You can also install the tools manually. Refer to the individual installation instructions in that case. The tools you need are:

If you're just starting out, you probably don't need all of these. Start off by simulating your designs using cocotb and icarus, then you can move on to testing in hardware using yosys and nextpnr.

If your simulations are too slow, you can try verilator.

Next steps

Now, move on to setting up your editor to work with Spade.

Editor Setup

There are a variety of third-party plugins integrating Spade in different editors.

Vim

If you use Neovim, you can use spade.nvim, which is maintained by Ethan at that GitHub repository. This plugin sets up syntax highlighting and LSP automatically along with some other quality-of-life features.

Otherwise, you can use https://gitlab.com/spade-lang/spade-vim, following the instructions at that repository for manual setup.

Vscode

Emacs

Other Editors

Made a plugin for your favorite editor? Submit a merge request to add it to this list!

Blinky

The traditional program to start learning any language is "hello, world!". However, printing a string in hardware is a complex task, so the "hello, world!" in hardware is usually blinking an LED.

This chapter has two versions:

  • Blinky (for hardware people) for people who have some experience with digital hardware and want to learn Spade. This version focuses on the syntax of the language and makes comparisons to Verilog and VHDL, but assumes some familiarity with things like registers.
  • Blinky (for software people) for people who are used to software development but are new to hardware. This version puts less emphasis on the syntax of the language, and more on the basic hardware it is describing.

Blinky (for software people)

This chapter will show the very basics of Spade and is aimed at people who are familiar with software development but are new to hardware. If you come here with some experience in hardware design with VHDL or Verilog, the Blinky (for hardware people) chapter is probably more useful.

Before blinking an LED, we can try turning on or off an LED. We can do this as

entity blinky() -> bool {
    true
}

To Rust users, this will likely feel very familiar. To those familiar with other languages, the last value at the end of a block is "returned", so this is an entity which returns true. If we connect the output signal of this to an LED, it would turn on. If you're curious, you can try it ▶️ on the playground

This isn't particularly interesting though, so let's do something more interesting. Blinking an LED is the typical "Hello, World" in hardware, but even that requires some complexity so we will build up to it. Let's first start by making the LED turn off while we hold down a button, which first requires taking a btn as an input:

entity blinky(btn: bool) -> bool {

and then changing the output to !btn

entity blinky(btn: bool) -> bool {
    !btn
}

If you ▶️ try this, you can see that if you press the button, the LED turns off, and if you release it, it will turn on again. Here we're just simulating the resulting hardware, but if we connected this up to real hardware, it would also work!

If you think about this for a while, you may start wondering when this gets "evaluated". In software, this "function" would be called once, giving it the value of the button and generating a single result. But this somehow reacts to inputs! While Spade, and many HDLs for that matter, may look like software, it is important to note that we are not describing instructions for some processor to execute; we are describing hardware to be built on a chip. The code we wrote says "connect input btn to an inverter, whose output in turn should be connected to the output of the module", which we externally connect to an LED.

If we want to approximate the behaviour from a software perspective, we can view the programming model of Spade either as continuously re-evaluating every value in the design, or as re-evaluating every value when the values it depends on change.

At this point, we can start thinking about actually making an LED blink. In software we'd probably accomplish this by writing something along the lines of

from time import sleep

def main():
  led_on = False
  while True:
    led_on = not led_on
    set_led(led_on)  # set_led stands in for whatever drives the LED
    sleep(0.5)

However, because we are describing hardware, not software, we can't really "loop". Every expression we write will correspond to some physical block of hardware, rather than instructions that get executed.

A Detour Over Software

Before talking about how we would make an LED blink in hardware and Spade, it is helpful to talk about how we might write a software function to "blink" an LED if we can't have loops inside our function. Remember that we can view our execution model as constantly re-evaluating our function to get its new values, roughly

def blinky():
  return True

while True:
  print(blinky())

On the surface, it might seem very difficult to make this thing blink, but it becomes possible if we have some way to maintain state between calls of the function. In software, we can achieve this by using a global variable for the state of the LED

LED_ON = False
def blinky():
  global LED_ON
  LED_ON = not LED_ON
  return LED_ON


while True:
  print(blinky())

If we run this program, we'll now get alternating True and False

True
False
True
False
...

There is a problem with this though: our value is "blinking" far too fast for us to see. If this were hardware, the LED would just look dim as opposed to clearly switching between on and off, so we need to regulate it somehow. A quick way to do this would be to just call our function less often, for example, twice per second. As we'll see, this is something we can kind of do in hardware, so let's try it!

import time

while True:
  start = time.time()
  print(blinky())
  end = time.time()
  # We want each iteration to take 0.5 seconds
  # so we get a blinking frequency of 1 Hz.
  # To avoid drifting if `blinky` ends up taking
  # a long time, we'll compute how long the evaluation
  # took and subtract that from the period
  time.sleep(0.5 - (end - start))

That works, but has a major problem: now we cannot do anything more often than twice per second, so if our program was to do more things than blinking an LED, we're probably screwed. To solve this, we can reduce the sleep time to something shorter, but which we can still manage without having end - start become larger than the period. Being conservative, we'll aim for a frequency of 1 kHz.

import time

while True:
  start = time.time()
  print(blinky())
  end = time.time()
  time.sleep(0.001 - (end - start))

If we just run our blinky now, we're back to it blinking faster than we can see, so we'll need to adjust it to keep track of how long it has been running and toggle the LED accordingly

COUNTER = 0
def blinky():
  global COUNTER
  if COUNTER == 1000:
    COUNTER = 0
  else:
    COUNTER = COUNTER + 1
  # The LED should be on in the second half of the counter interval
  return COUNTER > 500

import time

while True:
  start = time.time()
  print(blinky())
  end = time.time()
  time.sleep(0.001 - (end - start))

Back To Hardware

At this point, you have got a sense of a (pretty cursed) programming model that approximates hardware pretty well, so we can get back to writing hardware.

Almost all primitive hardware blocks are pure (or combinatorial, as it is known in hardware): they take their inputs and produce an output. This includes arithmetic operators, comparators, logic gates and "if expressions" (multiplexers). Using these to build up any form of state, like our counter, would be very difficult. Luckily, there is a special kind of hardware unit called a flip-flop which can remember a single bit value. These come in several flavours, and by far the most common is the D flip-flop, which has a signature that is roughly

entity dff(clk: clock, new_value: bool) -> bool

Its behaviour when the clock signal (clk) is unchanged is to simply remember its current value. Flip flops become much more interesting when we start toggling the clock. Whenever the clk signal changes from 0 to 1, it will replace its currently stored value with the value that is on its new_value input.
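To make the edge-triggered behaviour concrete, here is a small Python model in the same spirit as the software detour above. It is only a sketch: the DFlipFlop class, its update method and the initial argument are names made up for this illustration, not something Spade provides.

```python
class DFlipFlop:
    """Behavioural model of a D flip-flop (illustration only)."""

    def __init__(self, initial=False):
        self.value = initial     # the single stored bit
        self._prev_clk = False   # last clock level seen, used to detect edges

    def update(self, clk, new_value):
        # Only a 0 -> 1 transition of clk captures the input.
        if clk and not self._prev_clk:
            self.value = new_value
        self._prev_clk = clk
        return self.value


dff = DFlipFlop()
# (clk, new_value) pairs: the input is only sampled on rising edges
inputs = [(False, True), (True, True), (True, False), (False, False), (True, False)]
print([dff.update(clk, d) for clk, d in inputs])
# prints [False, True, True, True, False]
```

Note how changing new_value while the clock stays high (the third pair) has no effect on the stored value.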

Hardware is often shown graphically, and a dff is usually drawn like this:

[Figure: schematic symbol of a D flip-flop (src/dff_schematic.svg)]

Using this, we can build our initial very fast blinking circuit like this:

entity blinky_dff(clk: clock) -> bool {
    decl led_on;
    let led_on = inst dff(clk, !led_on);
    led_on
}

Don't worry too much about the syntax here. We define led_on as a dff whose new value is !led_on. When the clk goes from 0 to 1, the dff will take the value that is on its input (!led_on) and set it as its internal value, which makes the LED blink. This might be easier to understand graphically:

A graphical representation of a circuit that toggles an LED on and off

We can also visualize the value of the signals in the circuit over time, which looks roughly like

As soon as the clock switches from 0 to 1, the value of led_on switches to new_value. This in turn makes the output of the inverter change to the inverse which is now the "new new_value". Then nothing happens until the clock toggles again at which point the cycle repeats.

At this point, you should be wondering what the initial state of the register is as right now it only depends on itself. While it is possible to specify initial values in registers in FPGAs, that's not possible when building dedicated hardware, so the proper approach is to use a third input to the DFF that we left out for now: the reset. It takes a bool which tells the flip flop to reset its current value if 1, and a value to reset to. Again, looking at the signature, this would be roughly

entity dff(clk: clock, rst_trigger: bool, initial_value: bool, new_value: bool) -> bool

When rst_trigger is true, the internal value of the dff will get set to initial_value.
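The effect of the reset can be sketched in the Python model too. This is a simplification under stated assumptions: dff_step and run_blinky_dff are hypothetical helper names, the model only looks at the signals once per rising clock edge, and the power-on value is picked arbitrarily to show that the reset makes it irrelevant.

```python
def dff_step(rst_trigger, initial_value, new_value):
    """Value stored by the flip-flop after one rising clock edge."""
    # The reset takes priority over the regular data input.
    return initial_value if rst_trigger else new_value


def run_blinky_dff(rst_per_cycle):
    """Simulate the toggling LED circuit, one list entry per clock edge."""
    led_on = True  # arbitrary power-on value; we cannot rely on it
    trace = []
    for rst in rst_per_cycle:
        led_on = dff_step(rst, False, not led_on)
        trace.append(led_on)
    return trace


print(run_blinky_dff([True, False, False, False, True, False]))
# prints [False, True, False, True, False, True]
```

Whenever the reset entry is True, the LED state goes back to False regardless of what it was before, and the toggling then restarts from a known value.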

Visualized as signal values over time, this looks like:

The clk and rst_trigger signals are typically fed to our hardware externally. The clock is, as you may expect from reading clock signal specifications on hardware, quite fast. Not quite the 3-5 GHz that you may expect from a top of the line processor, but usually between 10 and 500 MHz in FPGAs. This means that we need to pull the same trick we did in our software model to make the blinking visible: maintain a counter of the current time and use that to derive if the LED should be on or not.

Our counter needs to be quite big to count on human time scales with a 10 MHz clock, so building a counter from individual bools with explicit dffs for each of them is infeasible. Therefore, we almost always use "registers" for our state. These are just banks of dffs with a shared clock and reset.
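To get a feeling for "quite big", the following Python sketch computes how many flip-flops a counter needs in order to count up to a given number of clock cycles. The helper name bits_needed is made up for this example.

```python
def bits_needed(max_count):
    """Smallest number of bits that can hold every value from 0 to max_count."""
    return max(1, max_count.bit_length())


# One second at 10 MHz is 10 million clock cycles
print(bits_needed(10_000_000))   # prints 24
# One second at 100 MHz is 100 million clock cycles
print(bits_needed(100_000_000))  # prints 27
```

So even at the slow end of FPGA clock speeds, counting to one second takes a couple of dozen flip-flops, which is why we want to describe them as a single register.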

Additionally, using our dff entity isn't super ergonomic since it requires that decl keyword, so Spade has dedicated syntax for registers. It looks like this

reg(clk) value: uint<8> reset(rst: reset_value) = new_value;

which, admittedly, is quite dense syntax. It helps to break it down into pieces though:

  • reg(clk) specifies that this is a register that is clocked by clk.1
  • value is the name of the variable that will hold the current register value
  • : uint<8> specifies the type of the register, in this case an 8 bit unsigned value. In most cases, the type of variables can be inferred, so this can be left out
  • reset(rst: reset_value) says that the register should be set back to reset_value when rst is true. If the register does not depend on itself, it can be omitted
  • = new_value gives the value that the register takes on at the next rising edge of clk

Blinky, Finally

We finally have all the background we need to drumroll 🥁 blink an LED! The code to do so looks like this

entity blinky(clk: clock, rst: bool) -> bool {
    let duration = 100_000_000;
    reg(clk) count: uint<28> reset(rst: 0) = if count == duration {
        0
    } else {
        trunc(count + 1)
    };

    count > duration / 2
}

Looking at the Python code we wrote before, we can see some similarities. Our global COUNTER has been replaced with a reg. reg has a special scoping rule that allows it to depend on its own value, unlike normal let bindings which are used to define other values. The new value of the register is given in terms of its current value: if it is duration, it is set to 0, otherwise it is set to count + 1.

trunc is needed since Spade guards against overflow and underflow by extending signals when they have the potential to overflow. count + 1 can require one more bit than count, so you need to explicitly convert the value back down to 28 bits. trunc is short for "truncate", which is the hardware way of saying "throwing away bits".
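If it helps, the register update can be mirrored in Python, one call per clock cycle. This is a rough model, not how the Spade compiler works: next_count and simulate are hypothetical names, trunc is modelled as masking away the upper bits, and a tiny duration is used so that the pattern is visible.

```python
def trunc(value, width):
    """Throw away all bits above the lowest `width` bits."""
    return value & ((1 << width) - 1)


def next_count(count, rst, duration, width=28):
    """Value the count register takes on at the next clock edge."""
    if rst or count == duration:
        return 0
    return trunc(count + 1, width)


def simulate(duration, cycles):
    """Return the LED output for each clock cycle, starting from count = 0."""
    count = 0
    led = []
    for _ in range(cycles):
        led.append(count > duration // 2)
        count = next_count(count, False, duration)
    return led


print([int(v) for v in simulate(4, 10)])
# prints [0, 0, 0, 1, 1, 0, 0, 0, 1, 1]
```

With the real duration, count never gets close to the 28 bit limit, so trunc never discards a set bit here. It is there to make the narrowing explicit.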

Those unfamiliar with Rust or other functional languages may be a bit surprised that the if isn't written as

if count == duration {
  count = 0
} else {
  count = trunc(count + 1)
}

This is because Spade is expression based: conditionals return values instead of having side effects. In hardware, we can't really re-assign a value conditionally; the input to the "new value" field of the register is a single signal, so all variables in Spade are immutable.

If you are used to C or C++, you can view if expressions as better ternary operators (cond ? on_true : on_false), and Python users may view them as the on_true if cond else on_false construct.

Play around

At this point it might be fun to play a little bit with the language. You could try modifying the code to:

  • Add an additional input to the entity called btn which can be used to pause the counter
  • Use btn to invert the blink pattern

You can try the code directly in your browser at ▶️ play.spade-lang.org

1

Most of the time when starting out you'll just have one clock, but as you build bigger systems, you'll eventually need multiple clocks.

2

Technically, there is a whole family, but in practice we almost always use registers built from D flip-flops.

Blinky (for hardware people)

This chapter will show the very basics of Spade and is aimed at people who are already familiar with basic digital hardware and want to learn the language. If you come here as a software developer, the Blinky (for software people) chapter is probably more approachable.

A blinky circuit in Spade is written as

entity blinky(clk: clock, rst: bool) -> bool {
    let duration = 100_000_000;
    reg(clk) count: uint<28> reset(rst: 0) = if count == duration {
        0
    } else {
        trunc(count + 1)
    };

    count > duration / 2
}

The first line defines a "unit" 1 called blinky which takes a clock and a reset signal and returns (->) a bool which will be true when the blinking LED should be on. This highlights an important difference between Spade and traditional HDLs: most2 units in Spade take a number of input signals and produce an output signal instead of operating on a set of input or output ports. In general, Spade units are much more "linear" than their VHDL and Verilog counterparts: variables can only be read after their definition (unless pre-declared using decl) and units do not mix inputs with outputs.

The first line in the body of the entity uses let to define a new variable called duration whose value is the number of clock cycles in a blink period; here we assume a 100 MHz clock. Spade is a statically typed language, so duration will have a fixed type known at compile time. However, the compiler uses type inference to infer the types of variables where possible. In this case, the duration variable is compared to count on the next line, which forces its type to be the same as count, i.e. uint<28>, and the compiler will ensure that the value fits in the inferred type's range. If needed, the type of a variable can be specified explicitly using let duration: uint<28> = ....

The next few lines are a reg statement which is used to declare a register. The syntax for these can be hard to take in at first, but it helps to break it up into pieces:

  • reg(clk) specifies which clock is used to clock this register
  • count is the name of the variable which will hold the register value
  • : uint<28> specifies the type of the register. Normally this can be omitted but in this case the compiler is unable to infer the size without it since count only refers to itself and duration.
  • reset(rst: 0) says that the register should be reset back to 0 whenever rst is asserted. At the moment, this is always done using an asynchronous reset.

Finally, the statement is ended with an = sign followed by an expression that gives the new value of the register as a "function" of its previous value. Here, the register is set back to 0 if it has reached the duration, otherwise it is incremented by 1. A significant difference between Spade and most other HDLs here is that its semantics are not "imperative". We do not write

if count == duration {
  count = 0
} else {
  count = trunc(count + 1)
}

which is conceptually hard to map to hardware. Instead, the if construct returns a value which is assigned to the register's new value. This is much closer to the multiplexers that will be generated here than the imperative description is, and prevents bugs if one, for example, forgets to give count a value in the else branch.

The trunc function call in the else branch is another effect of Spade's type system. The type system is designed to prevent accidental destruction of information. Since a + 1 can require one more bit than a itself, the type of count + 1 is uint<28+1>, which cannot be implicitly converted to a uint<28>. The trunc function explicitly truncates the result back to fit in the register's value.

The final line count > duration / 2 is what sets the output of the unit. Whenever count is greater than half the duration of the counter, its output will be true. The final expression in a unit is its return value which may feel unfamiliar at first, but eventually feels quite natural, especially when combined with other block-based constructs. For example, the same thing is true in if-expressions. The 0 and trunc(count + 1) are the final expressions in the blocks, and therefore their "return" values.

A note on division: You may question the use of / in the above example since division is usually a very expensive operation in hardware. However, divisions by powers of two are cheap, so Spade explicitly allows those. If the code was changed to / 3, you would get a compiler error telling you about the performance implication and telling you to explicitly use combinational division if you are OK with the performance.

error: Division can only be performed on powers of two
   ┌─ src/blinky.spade:10:24
   │
10 │     count > duration / 3
   │                        ^ Division by non-power-of-two value
   │
   = help: Non-power-of-two division is generally slow and should usually be done over multiple cycles.
   = If you are sure you want to divide by 3, use `std::ops::comb_div`
   │
10 │     count > duration `std::ops::comb_div` 3
   │                      ~~~~~~~~~~~~~~~~~~~~

Play around

If you want to play around with the language at this point, you can try to modify the code to do some of these things:

  • Add an additional input to the entity called btn which can be used to pause the counter
  • Use btn to invert the blink pattern

You can try the code directly in your browser at ▶️ play.spade-lang.org

1

A "unit" in Spade is similar to entity in VHDL and module in Verilog.

2

The input -> output flow is not always well suited to hardware, in those cases, ports may be used.

Common Language Constructs

This chapter goes through common constructs that most languages have such as variables, expressions, basic types, and conditionals. The focus is on how they work in Spade and how that is different from other languages.

Basic Expressions and Primitive Types

Expressions are the fundamental building block of Spade code. Anything with a value is an expression - from an integer literal like 5 to arithmetic operations like + all the way up to blocks which at the end of the day consist of several sub-expressions.

Integers and booleans

Like most languages, Spade has a few primitive types that basic operations are applied to. The most common primitive types in Spade are bool, int and uint which are booleans, signed integers, and unsigned integers respectively. When building custom hardware, we are not restricted to integers of a few fixed sizes like 8, 16 and 32 bits, so both int and uint take a generic parameter that specifies its size. For example uint<8> or int<10>.

Sometimes you will also encounter an error talking about Number. This is a special type which the compiler uses until it can figure out if a number is signed or unsigned. This will become more relevant later when we talk about type inference.

Operators

Spade's operators are generally the same as any C-like language both in terms of which operators are available and their precedence.

Arithmetic

To start off, Spade naturally has operators for arithmetic +, -, *. These prevent overflow by extending the output to guarantee that the result fits. For addition and subtraction this means that the output is one bit larger than the input and the input operands have to be the same size. For multiplication, the output size is the sum of the input sizes.

It is often necessary to change the number of bits to accommodate this. The sext function sign extends signed integers, the zext function zero extends unsigned integers, and the trunc function truncates (removes bits) both signed and unsigned integers.
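The width rules and the three conversion functions can be illustrated with a Python model that works on raw bit patterns. The function names mirror the Spade ones, but the explicit from_width and to_width arguments are an artefact of this sketch, since Python integers have no fixed size; add_width and mul_width are made-up helpers.

```python
def mask(width):
    """Bit pattern with the lowest `width` bits set."""
    return (1 << width) - 1


def add_width(n):
    return n + 1   # addition and subtraction grow by one bit


def mul_width(n, m):
    return n + m   # multiplication needs the sum of both widths


def trunc(bits, to_width):
    return bits & mask(to_width)       # drop the upper bits


def zext(bits, from_width, to_width):
    return bits & mask(from_width)     # new upper bits are all 0


def sext(bits, from_width, to_width):
    bits &= mask(from_width)
    if bits >> (from_width - 1):       # sign bit set: fill new bits with 1
        bits |= mask(to_width) & ~mask(from_width)
    return bits


print(bin(sext(0b10110, 5, 8)))   # prints 0b11110110
print(bin(zext(0b10110, 5, 8)))   # prints 0b10110
print(255 + 255 <= mask(add_width(8)))     # prints True
print(255 * 255 <= mask(mul_width(8, 8)))  # prints True
```

The last two lines check the worst case for 8 bit unsigned operands: the largest sum fits in 9 bits and the largest product fits in 16 bits.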

Logic

Spade supports logic not (!), and (&&), or (||), and xor (^^), as well as the corresponding bitwise operators (~, &, |, and ^). However, Spade does not allow implicit casts between integers and bool, so using a bitwise operator on a bool or a logic operator on an integer is not possible.

Comparison

The comparison operators (==, !=, >, <, >=, <=) work as you would expect.1

1

with one small caveat, they can only be used on integers for now.

Shifts

Spade supports logic left and right shifts (<<) and (>>) as well as arithmetic right shifts (>>>).2

Arithmetic right shifts may be unfamiliar, so here is a short explanation of what they do: When you right shift a value, the most significant bits need to be filled in. With a logic shift, they are filled with 0. A consequence of this is that the sign of the shifted value flips if it is negative. Arithmetic right shift instead fills the most significant bits with copies of the most significant bit (the sign bit) of the input, which preserves the sign. For example

  • +10 in binary 0b01010 arithmetic shifted right by 2 becomes 0b00010

  • -10 in binary 0b10110 arithmetic shifted right by 2 becomes 0b11101

2

Arithmetic left shift is the same operation as logic left shift.
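The difference between the two right shifts can be checked with a small Python model on 5 bit patterns. The function names and the explicit width argument are assumptions of this sketch, since Python integers are unbounded.

```python
def logic_shift_right(bits, amount, width):
    """Right shift that fills the vacated top bits with 0."""
    return (bits & ((1 << width) - 1)) >> amount


def arithmetic_shift_right(bits, amount, width):
    """Right shift that fills the vacated top bits with the sign bit."""
    mask = (1 << width) - 1
    bits &= mask
    shifted = bits >> amount
    if bits >> (width - 1):               # sign bit is set
        shifted |= mask & ~(mask >> amount)
    return shifted


print(format(arithmetic_shift_right(0b01010, 2, 5), "05b"))  # prints 00010
print(format(arithmetic_shift_right(0b10110, 2, 5), "05b"))  # prints 11101
print(format(logic_shift_right(0b10110, 2, 5), "05b"))       # prints 00101
```

Read as 5 bit signed numbers, the arithmetic shift of 0b10110 stays negative, while the logic shift turns it into a positive value.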

Division and Modulo

Spade also has division and modulo operators, but because division and modulo by non-powers of two are more expensive to implement than the other arithmetic operations, the / and % operators can only be used to divide by powers of two. If you absolutely need division by another value, use the std::ops::comb_div function, which the compiler helpfully informs you about.

error: Division can only be performed on powers of two
   ┌─ src/blinky.spade:10:24
   │
10 │     count > duration / 3
   │                        ^ Division by non-power-of-two value
   │
   = help: Non-power-of-two division is generally slow and should usually be done over multiple cycles.
   = If you are sure you want to divide by 3, use `std::ops::comb_div`
   │
10 │     count > duration `std::ops::comb_div` 3
   │                      ~~~~~~~~~~~~~~~~~~~~
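The reason powers of two are special is that, for unsigned values, dividing by 2^k is just dropping the k lowest bits, and the remainder is just those k bits. The Python sketch below shows this. The helper names are invented for the example, and it only covers unsigned values.

```python
def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def div_pow2(x, divisor):
    """Unsigned division by a power of two, implemented as a right shift."""
    if not is_power_of_two(divisor):
        raise ValueError("divisor must be a power of two")
    return x >> (divisor.bit_length() - 1)


def mod_pow2(x, divisor):
    """Unsigned remainder by a power of two, implemented as a bit mask."""
    if not is_power_of_two(divisor):
        raise ValueError("divisor must be a power of two")
    return x & (divisor - 1)


print(div_pow2(100_000_000, 2))  # prints 50000000
print(mod_pow2(13, 4))           # prints 1
```

In hardware, a shift or mask by a constant amounts to selecting a subset of the wires, which is why these cases are cheap while a general divider is not.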

Integer Type Conversion

As mentioned previously, to cast a number to a lower number of bits, the trunc function is used, while sext and zext are used to add bits to signed and unsigned integers respectively. In order to convert between signed and unsigned types, the .to_int() and .to_uint() methods can be used.

Numbers

Numbers can be written in decimal without a prefix, in hexadecimal with a 0x prefix, and in binary with a 0b prefix. You can also use _ in numbers to split up groups to make them more readable. For example

  • 1_000_000 for big numbers
  • 0b1100_0101 for grouping binary digits
  • 0xff00_1234 for grouping hexadecimal digits

You can also add a uN or iN suffix to numbers to specify their sign and size. For example, 10u8 is an 8 bit unsigned value and 123i13 is a 13 bit signed value.

Integer literals without a suffix do not have a size on their own, and unlike Verilog and VHDL, in which integer literals are limited to 32 bits by default, Spade allows arbitrarily large integers 3. The compiler also guarantees that the value will be representable by the type it is used as. For example,

let x: uint<8> = 512;

will result in a compilation error.
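The check the compiler performs here corresponds to a simple range test. As a sketch in Python (the function names are made up, and the ranges assume the usual two's complement representation for signed values):

```python
def uint_range(width):
    """Smallest and largest value of an unsigned integer with `width` bits."""
    return 0, (1 << width) - 1


def int_range(width):
    """Smallest and largest value of a signed integer with `width` bits."""
    return -(1 << (width - 1)), (1 << (width - 1)) - 1


def fits(value, width, signed=False):
    low, high = int_range(width) if signed else uint_range(width)
    return low <= value <= high


print(uint_range(8))                # prints (0, 255)
print(fits(512, 8))                 # prints False
print(fits(123, 13, signed=True))   # prints True
```

512 needs 10 bits as an unsigned value, so it cannot be stored in a uint<8>.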

3

Technically, there are implementation limits that will cause problems if you try to create an integer literal with more than \(2^{32}\) bits 😉

Booleans

Boolean literals are as you would expect: true and false

Tuples and Arrays

Like many languages, Spade supports compound types in the form of arrays and tuples. Arrays are used when you want several values of the same type to process together, tuples are used when you want to group values of different types into one group.

Arrays are written as a list of values enclosed in [], for example [1, 5, 3, x, y]. You can also create arrays of N copies of the same value using [value; N]. For example, an array of 10 zeros is [0; 10].

To access individual elements, use array[x] where x is an unsigned int. You can also use array[N:M] to access sub-arrays. These are inclusive on the left and exclusive on the right, so [0, 1, 2, 3, 4, 5][1:5] results in [1, 2, 3, 4]. Range indices must be constant values while individual element indices can be runtime values.
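Python list slices happen to use the same convention, inclusive on the left and exclusive on the right, so the example can be checked directly. Keep in mind that this only illustrates the indexing convention; unlike in Spade, Python slice bounds do not have to be constants.

```python
array = [0, 1, 2, 3, 4, 5]
sub = array[1:5]   # elements at index 1, 2, 3 and 4

print(sub)         # prints [1, 2, 3, 4]
print(len(sub))    # prints 4, i.e. 5 - 1 elements
```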

Tuples are written as comma-separated values enclosed in (). For example (10, x, false).

Tuple elements can be accessed using the # operator, for example (10, x, false)#0 is 10. Most of the time, accessing tuples through pattern matching (destructuring) is more convenient. We will talk more about pattern matching later, but for now you can write

let (x, y, z) = some_tuple;

which will make x take on the value of the first element, y the second and z the third.

Spicy Sxpressions

The expressions discussed in the previous sections should feel familiar to hardware developers and software developers alike, but Spade also has a few expressions that are more unusual. Rust users can probably skip ahead, since these expressions are basically the same as Rust. For everyone else, let's talk about the more spicy 🌶️ expressions in Spade:

If expressions

"Control flow" in Spade is handled a little bit differently than what you may be used to, unless you're coming from a Rust or functional programming background. In most languages you use an if statement to conditionally execute code when a condition holds. For example, an absolute value operation could be written as

def abs(x):
  result = x
  if x < 0:
    result = -x
  return result

However, in hardware, there is no way to "conditionally execute" a block of code. Hardware can only compute all branches, and select the corresponding output at the end, typically using a multiplexer.

In order to reflect this, Spade is expression based and if expressions select values rather than conditionally executing branches. The above example would be written as

fn abs(x: int<16>) -> int<16> {
  if x < 0 {
    -x
  } else {
    x
  }
}

where the output of the function is the result of the if expression, i.e. -x if x is negative, and x otherwise.
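One way to picture what the hardware does is the following Python sketch, where both branch values are always computed and a mux function picks one of them. The names mux and abs_hw are made up for this illustration, and the sketch ignores bit widths.

```python
def mux(select, on_true, on_false):
    """Model of a two-input multiplexer: pick one of two existing values."""
    return on_true if select else on_false


def abs_hw(x):
    # Both "branches" exist as hardware and are always computed...
    negated = -x
    unchanged = x
    # ...and the comparison only decides which result is passed on.
    return mux(x < 0, negated, unchanged)


print(abs_hw(-5))  # prints 5
print(abs_hw(7))   # prints 7
```

Unlike the Python if statement shown earlier, nothing is skipped here: the negation is performed for every input, and the condition only controls the selection.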

Conditionals being expressions means you can do some interesting things with them, for example, you can use them as parts of arithmetic:

let result = x + if add_one {1} else {0};

This particular example is strange and probably ill-advised, but this sort of technique can come in handy.

Blocks

The other unusual expression Spade has is the block, which we've seen some examples of already. The abs function above has 3 blocks, but you may not have thought of them as blocks.

A block is written as {} which contains a list of statements (variables, assertions etc.), and an optional final expression as the value of the block itself.

For example,

let result = {
  let sum = x + y;
  sum * z
};

This is effectively the same as writing let result = (x + y) * z but it allows you to break things into variables that are local to the block. This may seem strange at first, but hopefully makes more sense when you find out that these blocks are the bodies of both functions and if-expressions. For example, you can define variables inside the body of if-expressions

let result = if op1 {
  let sum = x + y;
  sum * z
} else {
  x + z
};

Variables

We have seen some variables already, so this section will primarily be used to clarify a few things about them.

First, variables can be defined using let, for example

let x = 0;

Types

Spade is a strongly and statically typed language, which means that every expression has a fixed and static type, and that almost all casts are explicit; the compiler will not automatically convert a bool to an int, for example. Unlike languages such as C, C++ or Java though, Spade uses type inference to infer the type of a variable based on its definition and use. For example, in the above example, x doesn't have a fully known type: it is a numeric value, but the exact number of bits is not known yet. However, if x is used later in a way that constrains its type, the compiler will infer it to that specific type:

fn takes_uint8(a: uint<8>) // ...

takes_uint8(x);

Again, Spade is statically typed, so conflicting types are not allowed:

fn takes_int16(a: int<16>) // ...

takes_uint8(x);
takes_int16(x); // Type mismatch. `x` was uint<8> previously but is now int<16>

In some cases, the compiler is unable to infer the type of a variable. In such cases, you can specify the type manually using : type after the variable name. For example:

let x: uint<8> = 0;

Scoping rules

Unlike most HDLs, Spade has more software-like scoping rules in the sense that variables are only visible below their definition. For example, this code would fail to compile

let x = y; // y used before its declaration
let y = 0;

This helps prevent combinational loops 1, and makes code easier to read as it forces its structure to be ordered "topologically", with values that depend on other values being defined after those values.

decl

In some cases however, a hardware design requires feedback. For example, two registers which depend on each other's value. In this case, Spade has a special decl keyword which pre-declares a variable for later use.

decl y;
reg(clk) x = y;
reg(clk) y = x;

Generally, decl should be used sparingly. Unless you really know what you are doing, make sure to have a register in every "dependency loop", otherwise you will end up with combinational loops 1.

1

A combinational loop is a value which depends on itself without any registers to break the dependency loop. In almost all cases, this will result in an undefined value.

Block scopes

Also like software, variables declared in a block as discussed in the previous section are local to that block and any sub-blocks.

let sub_result = {
  let x = true;

  {
    let a = !x; // Allowed, the use is in a deeper nesting than the definition
  }

};
let b = !x; // Disallowed, `x` is only visible inside the block it was declared

Variables are immutable

It is never possible to give a variable a new value. For example, as discussed in the previous chapter, you cannot write

let x = 0;
if cond {
  x = 1;
}

and you instead have to define x as the result of an if expression:

let x = if cond {
    1
} else {
    0
};

Immutability by default is common in many modern software languages, but most allow opting out of it. Rust has the mut keyword, in JavaScript you can declare a variable with let instead of const, and in C-style languages you just don't declare a variable as const. However, Spade has no such feature: all variables are immutable and there is no way around that.

At this point, you may be asking if it is even possible to write anything useful with no mutable variables, or your mind may be wandering back to the initial blinky example where the value of our counter changed constantly. These two thoughts are related, and the thing that ties them together is that the value of a variable is not immutable: it can change as the inputs to the circuit change, but the subcircuit that a variable refers to is fixed forever.

As an example, in the following code

let sum = a + b;

the value of sum changes as a and b change, but sum really refers to a set of physical wires in the chip that we are compiling to -- the output of an adder that has a and b as inputs.

Units

The basic building blocks of a Spade project are units. A unit takes a set of input signals, "processes" them, and usually produces a resulting output signal. We already saw an example of a unit in the blinky chapter, but here we will go into them in a bit more detail.

The basic syntax is the same for all kinds of units. A definition starts with fn, entity or pipeline depending on the "flavor" of the unit, which we will talk about soon, followed by the name of the unit. The unit inputs are specified inside () with each argument on the form name: type. The output of the unit is specified after the parameter list as -> type, and finally the body of the unit is specified.

As an example, the blinky from the previous chapter has the following definition

entity blinky(clk: clock, rst: bool) -> bool

which means it

  • is an entity
  • called blinky
  • which takes 2 inputs: clk with type clock and rst with type bool
  • returns a bool

At this point you are probably wondering why we keep calling them "units" when they are defined as entity. The reason for this is that units come in three "flavors": function, entity and pipeline. While they all take inputs and produce outputs, their semantics are somewhat different

  • Entities are the most general units, but as we will see, they also come with the fewest guarantees. If you need registers but don't want to use a pipeline, you should use an entity.

  • Functions are a special case of entities which don't allow registers or instantiation of non-functions. This means that they cannot contain any state, which in hardware terms means they are combinational, and in software terms means they are pure. While any function can be written as an entity, it is good practice to use functions whenever possible as it tells readers of the code that the unit is non-stateful.

  • Pipelines are a special unit which, as the name implies, is used when building pipelines. You will learn more about these in a later chapter.

In general, you should prefer to use function and pipeline where possible, and only resort to entity in cases where you both need state, and when the hardware you are building is not pipeline-like, for example our blinky module.

Instantiating Units

Units are not very useful if they cannot be instantiated. Functions are instantiated using the same syntax as function calls in C-like software languages: function_name(parameter1, parameter2).

Entities, on the other hand, need the inst keyword before the instantiation, for example inst entity_name(parameter1, parameter2). This is done to alert you as a writer of the code, and future readers of the code, that the unit you are instantiating can have underlying state. If you do not see inst, you know that the unit is a function and therefore pure, which allows you to make more assumptions about the behaviour of your circuit without having to read through the source code of what you are instantiating.

Finally, when instantiating pipelines, you specify the pipeline depth after the inst, for example inst(10). This will be described in more detail later.

Passing arguments

Of course, most functions need their arguments specified, and there are two ways to pass arguments to units in Spade: by position or by name.

Positional arguments work like they do in most languages: the first value passed is matched with the first argument, the second with the second and so on. It is the syntax we have seen so far.

Named arguments have a $ sign before the argument list and allow you to specify the name of each argument along with the value it should receive as arg: value.

As an example, if we want to instantiate the following entity

entity some_entity(x: uint<8>, y: uint<8>) -> uint<8> // ...

with x=10 and y=15 we can do so with positional arguments as

inst some_entity(10, 15)

or using named arguments

inst some_entity$(x: 10, y: 15)
// or
inst some_entity$(y: 15, x: 10)

In many cases when specifying arguments by name, you have a variable where you want to do your instantiation that has the same name as the argument you want to pass it to. You could of course specify arg: arg, but Spade also allows you to use a short-hand syntax and only specify arg in this case.

Continuing with our example, yet another way to instantiate the entity is therefore:

let x = 10;
let y = 15;
inst some_entity$(x, y)

You can even mix and match shorthand names with long names, which is especially useful if you have signals with common names such as clk and rst:

entity do_something(clk: clock, rst: bool) -> uint<8> {
    let x = 10;
    inst takes_clk_rst$(clk, rst, x, y: trunc(x + 5))
}

However, note that you cannot mix positional and non-positional arguments!

Which style to use depends on your application and code, you should strive for the variant that gives the most readable code. Sometimes that means you pass arguments by position because the order is obvious while other times, you opt to pass arguments by name because your unit takes too many signals to keep track of their positions.

For Software People: Instantiation vs calling

Instantiation is similar in behaviour to "calling" in software terms, but because we are building hardware, we cannot simply "transfer control flow" to another function. Instead, we copy the hardware inside the function to our "chip" and connect its inputs and outputs as appropriate.

As an example, if we define the following functions

fn add(a: uint<16>, b: uint<16>) -> uint<16> {
    trunc(a + b)
}

fn mul(a: uint<16>, b: uint<16>) -> uint<16> {
    trunc(a * b)
}

fn sel(a: uint<16>, b: uint<16>, cond: bool) -> uint<16> {
    if cond {a} else {b}
}

which generate the following hardware

The hardware generated by the above code

and then use them as part of a bigger function:

fn mul_or_add(a: uint<16>, b: uint<16>, multiply: bool) -> uint<16> {
    sel(add(a, b), mul(a, b), multiply)
}

it generates this hardware:

The hardware generated by the above code

This is important to keep in mind, as a very important metric for resource usage in hardware is the area of the chip being used. In software, an expensive function that is only used very rarely is relatively cheap, since the time taken for the program to run is the main cost. However, in hardware, as soon as a unit is instantiated, you pay the cost upfront, regardless of whether it is used millions of times per second or just once over the lifetime of the chip.

In addition, it is important to keep in mind how much area each function and operator uses. In the graphics drawn now, the multiplier looks as big as the adder, but in practice, the size of the adder grows as \(O(n)\) in the number of bits, while the multiplier grows as \(O(n^2)\). In FPGAs, things are even trickier as they have built in multipliers. While you have spare multipliers, they are free in terms of other resources, but they themselves are finite. The resource usage of different units is generally something you will learn over time.
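To get a feeling for the difference in growth, the sketch below counts cells in two textbook circuits: a ripple-carry adder and an unsigned array multiplier. It is written in Python rather than Spade, the function names are made up for this illustration, and the formulas are assumptions taken from the textbook structures. Real synthesis tools and FPGA DSP blocks will give different absolute numbers.

```python
def ripple_carry_adder_cells(n: int) -> int:
    """A textbook n-bit ripple-carry adder uses one full adder per bit."""
    return n


def array_multiplier_cells(n: int) -> int:
    """Rough cell count of a textbook n x n unsigned array multiplier."""
    and_gates = n * n          # one AND gate per partial product bit
    adder_cells = n * (n - 1)  # half and full adders that sum the partial products
    return and_gates + adder_cells


for n in (4, 8, 16, 32):
    print(n, ripple_carry_adder_cells(n), array_multiplier_cells(n))
```

Doubling the width doubles the adder, but roughly quadruples the multiplier. For 16 bits this model gives 16 cells for the adder and 496 for the multiplier.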

Naming conventions

While not strictly required, unit names are usually written using snake_case, and so are variable names. User defined types use PascalCase while constant values use SCREAMING_SNAKE_CASE.

Exercises

Modify the blinky code from the previous chapter to do the following

  • Break the check for count > (duration / 2) into a function
    • Call with named arguments
    • And positional arguments
  • Break the counter logic out into its own unit
    • Should it be an entity or function?

Here is a link to the code on the ▶️ playground


Brief intro to generic parameters

We will discuss the type system in more detail later, but you will most likely come across a few generic functions before then, so here is a quick introduction.

In the functions we have seen so far, the type of the arguments has been specified explicitly, for example, sel in the example above takes two uint<16> and a bool. However, this is quite restrictive, we may want sel to operate on other sized integers, or other types entirely. There is nothing in that function that requires 16-bit unsigned integers.

We can redefine sel to make the values it selects "generic" as follows:

fn sel<T>(a: T, b: T, cond: bool) -> T {
    if cond {a} else {b}
}

which defines a new local type T that can be substituted for any other type in the implementation, as long as that same type is used everywhere T is.

We can now instantiate sel with different types

let x_16: uint<16> = 10;
let y_16 = 10u16; // You can specify the type of integers using `u<size>` or `i<size>`
let max_16: uint<16> = sel(x_16, y_16, x_16 > y_16);

let (x_32, y_32) = (0, 0);
let selected: int<32> = sel(x_32, y_32, select_x);

In some cases, type inference is unable to infer the generic parameters of an instance, which you can resolve by specifying them using the "turbofish"1 syntax (::<>). Like function arguments, type parameters can be specified positionally or by name using ::<> or ::$<>:

// We don't have enough information about what type the integers have here, we'll
// get a compiler error
let selected = sel(10, 20, select_10);

// Turbofish solves that
let selected = sel::<uint<8>>(10, 20, select_10);
let selected = sel::$<T: uint<8>>(10, 20, select_10);

Pipelines

Pipelines are an important construct in most hardware designs, and one of the key unique features of Spade is its native support for pipelining. Like the blinky chapter, this one has a section for people who are already familiar with hardware design and how pipelines work, and a section for software developers which introduces what pipelines are in addition to how they are expressed in Spade.

Pipelines (for software people)

This section is yet to be written. For now, see Blinky (for hardware people).

Pipelines (for hardware people)

Pipelining is traditionally a tedious and error prone process. Designers need to ensure that all signals are in sync by manually inserting pipeline registers and more importantly, ensure that the correct registers are used for the correct expression. The problem is made even worse when the depth of a pipeline needs to change for some reason. Then the developer has to ensure that all register references are updated accordingly throughout the design.

Spade natively includes a pipelining construct that ensures that pipelines without feedback are correct by construction and which makes it significantly easier to write and reason about pipelines with feedback.

A basic pipeline

Let's look at a basic example of a pipeline which computes the product or the sum of two numbers depending on an Op signal:

enum Op {
    Add,
    Mul
}

pipeline(1) compute(clk: clock, op: Op, x: int<18>, y: int<18>) -> int<36> {
    let sum = x + y;
    let prod = x * y;
  reg;
    match op {
      Op::Add => sext(sum), // Sign extend to match mul
      Op::Mul => prod,
    }
}

The head of a pipeline looks similar to the entity and fn definitions that we saw before but includes a number in parentheses. This number is the depth of the pipeline, i.e. the number of register stages it contains, which is the same as its latency from input to output. While the compiler could in theory infer this number from the body, it always has to be specified since it is a very important part of the public "API" of the pipeline. Without reading the body of the pipeline, you know how many clock cycles you have to wait between input and output.

The first two lines of the body of the pipeline are somewhat uninteresting: they compute a sum and a product and store them in corresponding variables.

The next line reg; is another pipeline specific construct. It is used to add a new stage to the pipeline which is done by creating a new pipelining register for every variable above the reg; statement, and re-mapping any references to those variables to the pipelined version below the reg; statement.

The final match statement selects whether to use the "sum" or "product" value depending on the op variable. Crucially, because this is a pipeline, the compiler ensures that the three variables are delayed the same amount, so there will be no interleaving of op from the previous cycle with the sum and prod from the current cycle.
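The timing described above can be sketched as a cycle-by-cycle software model. The code is Python, not Spade, the name simulate_compute is made up for this illustration, and None stands in for the unknown value that the pipeline registers hold before the first clock edge. It is a sketch of the intended behaviour, not of what the compiler emits.

```python
from typing import Iterable, List, Optional, Tuple

Inputs = Tuple[str, int, int]  # (op, x, y), with op being "Add" or "Mul"


def simulate_compute(inputs: Iterable[Inputs]) -> List[Optional[int]]:
    """Return the pipeline output observed in each clock cycle."""
    # Contents of the pipeline registers created by `reg;`.
    stage_regs: Optional[Tuple[str, int, int]] = None
    outputs: List[Optional[int]] = []
    for op, x, y in inputs:
        # Below `reg;`: select using the registered op, sum and prod.
        if stage_regs is None:
            outputs.append(None)
        else:
            reg_op, reg_sum, reg_prod = stage_regs
            outputs.append(reg_sum if reg_op == "Add" else reg_prod)
        # Above `reg;`: compute both results, then clock them into the
        # pipeline registers together with op.
        stage_regs = (op, x + y, x * y)
    return outputs


print(simulate_compute([("Add", 2, 3), ("Mul", 2, 3), ("Add", 10, 1)]))
# prints [None, 5, 6]
```

Because op is stored in the same register stage as sum and prod, each output is selected by the op that arrived together with its operands, one cycle after they were presented.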

All this means that the resulting hardware looks like this:

Nested Pipelines

Spade of course also supports nested pipelines, let's extend the example above to showcase how that is done.

pipeline(1) mul(clk: clock, x: int<18>, y: int<18>) -> int<36> {
    let result = x * y;
  reg;
    result
}

pipeline(1) compute(clk: clock, op: Op, x: int<18>, y: int<18>) -> int<36> {
    let sum = x + y;
    let prod = inst(1) mul(clk, x, y);
  reg;
    match op {
      Op::Add => sext(sum), // Sign extend to match mul
      Op::Mul => prod,
    }
}

Here, the multiplier from the previous example has been broken out into its own sub-pipeline with its own internal register. Since the compiler is aware of this, it will ensure that the signals are still in sync, in this case by not inserting an extra register for the prod signal.

Spade also requires you to specify the depth of pipelines when instantiating them. This is done in order to make sure that when you change the depth of a pipeline, you also make sure that that change does not affect the behaviour where that pipeline is instantiated.

Compiler guarantees

If you synthesize the previous example on a typical FPGA, you may realize that we are not using the multipliers in the DSP blocks as efficiently as we could - they have built in optional pipelining registers that allow us to raise the \(f_{max}\). This means we could get higher performance from our design by adding 2 more regs to our mul pipeline. Traditionally, this would require updating a bunch of code, but with Spade, all we have to do is make the change to mul:

pipeline(1) mul(clk: clock, x: int<18>, y: int<18>) -> int<36> {
    let result = x * y;
  reg;
  reg;
  reg;
    result
}

The astute reader will notice that the latency of this pipeline is now wrong, oh no 😱. Luckily, even if you didn't notice this problem, the compiler did:

error: Pipeline depth mismatch. Expected 1 got 3
   ┌─ src/pipelines_hw.spade:40:1
   │
40 │ ╭ pipeline(1) mul(clk: clock, x: int<18>, y: int<18>) -> int<36> {
   │            - Type 1 inferred here
41 │ │     let result = x * y;
42 │ │   reg;
43 │ │   reg;
44 │ │   reg;
45 │ │     result
46 │ │ }
   │ ╰─^ Found 3 stages in this pipeline
   │
   = note: Expected: 3
                Got: 1

Error: aborting due to previous error

Let's update the code accordingly, and while we're at it change the repeated reg; to reg*3; which is a shorthand for the same thing:

pipeline(3) mul(clk: clock, x: int<18>, y: int<18>) -> int<36> {
    let result = x * y;
  reg * 3;
    result
}

Now mul looks correct, but if we look at the bigger picture we're not out of the weeds yet. Our compute pipeline as currently described is now this abomination which will have a very different output than before:

Luckily, the compiler once again has our back here. If we compile the new code

pipeline(3) mul(clk: clock, x: int<18>, y: int<18>) -> int<36> {
    let result = x * y;
  reg * 3;
    result
}

pipeline(1) compute(clk: clock, op: Op, x: int<18>, y: int<18>) -> int<36> {
    let sum = x + y;
    let prod = inst(1) mul(clk, x, y);
  reg;
    match op {
      Op::Add => sext(sum), // Sign extend to match mul
      Op::Mul => prod,
    }
}

we get the following error:

error: Pipeline depth mismatch
   ┌─ src/pipelines_hw.spade:61:21
   │
53 │ pipeline(3) mul(clk: clock, x: int<18>, y: int<18>) -> int<36> {
   │          - swim_test_project::pipelines_hw::m3::mul has depth 3
   ·
61 │     let prod = inst(1) mul(clk, x, y);
   │                     ^ Expected depth 3, got 1
   │
   = note: Expected: 3
                Got: 1

This means we have to update the inst(1) to inst(3) to match the definition of mul which gives us yet one more compiler error

error: Use of swim_test_project::pipelines_hw::m3::prod before it is ready
   ┌─ src/pipelines_hw.spade:65:18
   │
65 │       Op::Mul => prod,
   │                  ^^^^ Is unavailable for another 2 stages
   │
   = note: Requesting swim_test_project::pipelines_hw::m3::prod from stage 1
   = note: But it will not be available until stage 3

This error is saying that there aren't enough pipeline registers between our definition of prod and its use, which is the error we were seeing graphically before. We'll update our compute pipeline accordingly which finally gives

pipeline(3) mul(clk: clock, x: int<18>, y: int<18>) -> int<36> {
    let result = x * y;
  reg * 3;
    result
}

pipeline(3) compute(clk: clock, op: Op, x: int<18>, y: int<18>) -> int<36> {
    let sum = x + y;
    let prod = inst(3) mul(clk, x, y);
  reg * 3;
    match op {
      Op::Add => sext(sum), // Sign extend to match mul
      Op::Mul => prod,
    }
}

At this point, the compiler is happy, and we should be too because the hardware correctly uses the DSP blocks giving faster performance, and its output is still the same as before (though of course, the latency has changed).

Fearless Refactoring

At this point it is worth taking a step back and analyzing what happened. We started out with a pipeline that computed a correct value, but that was not implemented as efficiently as it could have been. To fix this, we made a minimal change to the mul pipeline to more efficiently use the DSP blocks. Then, by running the compiler and mindlessly addressing the things it complained about, we updated the rest of our code to reflect this change. Once the compiler stopped complaining, our code still has the correct output but runs faster!

If our code is used elsewhere in the project, or by someone else in another project, the compiler would start complaining there until all the issues are fixed.

This is something that happens in several places in Spade, the type system being another notable example. You make a small localized change, then the compiler tells you every place you need to change to reflect that change in order to get back to hardware that still works correctly. Essentially, you can refactor code without having to think about the consequences.

Feedback

The pipelines discussed so far are useful if you're building a compute pipeline where you have no dependence between values. However, this is not always the case. A notable example of this is processors which are often pipelined but where values certainly are not independent. In this case, the guaranteed correctness when adding or removing registers is no longer possible, but being able to reason about pipelines structurally as individual stages rather than a soup of control registers mixed with pipeline registers is still very helpful.

For cases like this, Spade has support for "stage references", where you can refer to values from previous or future stages using stage(...).

As an example, to write a pipeline that computes the sum of a window "around the current" value, we can write

pipeline(2) window(clk: clock, x: int<16>) -> int<18> {
    reg;
    reg;
        x + stage(-1).x + sext(stage(-2).x)
}

where we use relative stage references to refer to x from the stage above, and from 2 stages above. The corresponding hardware looks like this:

As you can see, negative references refer to stages "above" the current stage while positive references refer to stages "below". Since stages "above" have gone through fewer registers, they are values from the "future" while positive references are values "from the past".
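The behaviour of the window pipeline can be sketched as a small software model. It is written in Python rather than Spade, the name simulate_window is made up for this illustration, and it assumes the reading given above: in the last stage, x has passed two registers, stage(-1).x has passed one, and stage(-2).x is the undelayed input. None stands in for the unknown register contents during the first cycles.

```python
from collections import deque
from typing import Deque, Iterable, List, Optional


def simulate_window(xs: Iterable[int]) -> List[Optional[int]]:
    """Return the output of the window pipeline in each clock cycle."""
    # After appending the current input, the deque holds, in order:
    #   history[0]: x in the last stage (delayed two cycles)
    #   history[1]: stage(-1).x (delayed one cycle)
    #   history[2]: stage(-2).x (the undelayed input)
    history: Deque[Optional[int]] = deque([None, None, None], maxlen=3)
    outputs: List[Optional[int]] = []
    for x in xs:
        history.append(x)  # the oldest value falls out on the left
        if None in history:
            outputs.append(None)  # registers not yet filled with known values
        else:
            outputs.append(sum(history))
    return outputs


print(simulate_window([1, 2, 3, 4, 5]))  # prints [None, None, 6, 9, 12]
```

Each output is the sum of the three most recent inputs, which is the "window" around the value that is currently in the last stage.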

You can also use labels ('label) to refer to stages, for example, if you wanted to refer to a variable without delay you can define the first stage as 'first and then refer to variables from that stage using stage(first).

pipeline(2) without_delay(clk: clock, x: int<16>) -> int<16> {
        'first
    reg;
    reg;
        stage(first).x
}

Dynamic pipelines

Spade has experimental support for stalling of pipelines as documented in the language reference section. However, make sure you follow the note at the top of that page to avoid unexpected bugs.

Simulation and Testing

At this point, we're going to move away from the playground and install Spade locally so we can run a "real" flow. If you haven't already, go back and read the installation chapter to install Swim, Spade and a simulator. You can skip synthesis tools for now.

Because it is time-consuming and difficult to debug hardware, most hardware projects use simulation to speed up the development process and ease debugging.

Tests are not written in Spade itself, instead they are written in Python using cocotb or C++ in Verilator. Cocotb is easier to set up and nicer to use but can be quite slow.

If you haven't already, install the tools by following the installation instructions

Cocotb

Any Python files in the test directory will be run with cocotb. The first line must be a comment on the form

# top = <path to unit under test>

where the path is relative to your project. For example, if you have a unit called top in main.spade, this will be main::top.

Tests are asynchronous functions annotated with @cocotb.test(). They take a single input, which is the design under test.

When working with Spade, you generally want to be able to use Spade values rather than pure bit vectors. To do so, import SpadeExt from spade and instantiate the SpadeExt class, passing the dut to its constructor.

You can then access the inputs of your unit using .i.<input name> and the output using .o. If the output of the unit is a struct, you can refer to individual fields using .o.<field name>

As an example, consider this unit which computes a+b and a*b with a latency of one cycle:

struct Output {
    sum: int<9>,
    product: int<16>
}

pipeline(1) add_mul(clk: clock, a: int<8>, b: int<8>) -> Output {
        let result = Output$(
            sum: a+b,
            product: a*b
        );
    reg;
        result
}

A test bench for this module looks like this (this assumes that the Spade code is in src/cocotb_sample.spade. If this is not the case, adjust the # top=cocotb_sample::add_mul part to reflect your module name):

#top = cocotb_sample::add_mul

import cocotb
from spade import SpadeExt
from cocotb.clock import Clock
from cocotb.triggers import FallingEdge

@cocotb.test()
async def test(dut):
    s = SpadeExt(dut) # Wrap the dut in the Spade wrapper

    # To access unmangled signals as cocotb values (without the spade wrapping) use
    # <signal_name>_i
    # For cocotb functions like the clock generator, we need a cocotb value
    clk = dut.clk_i

    await cocotb.start(Clock(
        clk,
        period=10,
        units='ns'
    ).start())

    await FallingEdge(clk)

    s.i.a = "2"
    s.i.b = "3"
    await FallingEdge(clk)
    s.o.sum.assert_eq("5")
    s.o.product.assert_eq("6")

    s.i.a = "3"
    s.i.b = "2"
    await FallingEdge(clk)
    s.o.sum.assert_eq("5")
    s.o.product.assert_eq("6")

    s.i.a = "0"
    s.i.b = "1"
    await FallingEdge(clk)
    s.o.sum.assert_eq("1")
    s.o.product.assert_eq("0")

You can then run the test using swim test or swim t. If you want to run all tests in a specific file, run swim test <pattern>. All files which contain the pattern will be run, for example, swim test abc will run tests in both abc.py and cdeabc.py.

You can also run individual tests using -t <test name>, though here the name has to be exactly the name of the test, not a pattern.

For more information on the cocotb API, refer to its documentation.

Viewing the waveform

One cool thing about HDL simulation is that it tracks how all signals change over time, allowing you to see exactly how your circuit behaves. These traces are called waveforms, and are stored in a file which you can view with a waveform viewer.

Once all tests are run, Swim will print the result of each test, along with a .vcd file in which the waveform is stored:

...

ok   test/cocotb_sample.py 0/1 failed
 🭼 test ok [build/cocotb_sample_test/cocotb_sample.vcd]

For waveform viewers, we recommend https://surfer-project.org/ which was developed specifically for the Spade project, but you can also use GtkWave if you are already familiar with it.

Once the waveform viewer is installed, you can simply run surfer build/cocotb_sample_test/cocotb_sample.vcd or gtkwave build/cocotb_sample_test/cocotb_sample.vcd.

If you have a relatively modern terminal, Swim also supports clickable links for opening wave files for each test, but it needs an initial automated setup. Just run swim setup-links. Now Swim will print two clickable links, one for surfer and one for gtkwave:

ok   test/cocotb_sample.py 0/1 failed
🭼 test ok [build/cocotb_sample_test/cocotb_sample.vcd ([🏄] [🌊])]

Surfer also supports translation from the raw bit patterns to Spade types which makes debugging much easier. By far the easiest way to get this is with the clickable links mentioned above, but you can also run Surfer with this manually by running

surfer <path-to-vcd> --spade-state build/state.ron --spade-top path::to::top::module

Tips and Tricks

For now, &mut wires cannot be read by the Spade API in cocotb. Similarly, tests cannot be run on generic units. Therefore, it is often a good idea to define a wrapper function, typically called a harness, around units which use advanced features.

Verilator

C++ source files with the .cpp extension in the test directory will be simulated using Verilator. Like cocotb, these files consist of a set of test cases, but the test cases are defined using macros rather than attributes.

The Spade path to the top module is specified using a comment, i.e.

// top = path::to::module

where the path is relative to the current project.

After that, the Verilog name of the top module must be specified using a #define. Unless you know the details of how Verilator name mangling works, you almost certainly want to specify #[no_mangle(all)] on your unit under test.

The last thing you need to do before defining your tests is to include <verilator_util.hpp>.

After all your tests have been defined, end the test file with MAIN, which defines a main function that is compatible with Swim.

// top=main::main

#define TOP main
#include <verilator_util.hpp>


TEST_CASE(it_works, {
    // Your test code here
    return 0;
})

MAIN

Accessing inputs and outputs

Like the cocotb API, there is a wrapper around Spade types to allow easier interactions with your design.

As an example, consider testing the following code

struct SubStruct {
    b: int<10>,
    c: int<5>,
}

struct SampleOutput {
    a: Option<int<20>>,
    sub: SubStruct
}

#[no_mangle]
fn sample(a: Option<int<20>>, b: int<10>, c: int<5>) -> SampleOutput {
    SampleOutput(a, SubStruct(b, c))
}

In the TEST_CASE macro, you have access to two variables: dut and s. dut is the raw Verilator interface around your module. s is the Spade wrapper, which has a field i for its inputs and o for its output.

You can set the value of inputs to your module with s.i->input_name = "<Spade expression>", for example:

    s.i->a = "Some(5)";
    s.i->b = "10";
    s.i->c = "5";

Similarly, you can compare the output to a Spade expression using s.o == "<Spade expression>"

If your unit under test returns a struct, you can also access its fields and sub-fields as fields on the output struct, for example s.o->field->subfield == "<Spade expression>".

Finally, to assert that an output value is what you expect, you can use the ASSERT_EQ macro which takes s.o or a subfield, and compares it against a Spade expression. The advantage of using this macro over a C++ assert is that you get a diff print, both with the Spade value and the underlying bits.

For example, tests of the fields in our example look like:

    ASSERT_EQ(s.o, "SampleOutput$(a: Some(5), sub: SubStruct$(b: 10, c: 5))");
    ASSERT_EQ(s.o->a, "Some(5)");
    ASSERT_EQ(s.o->sub, "SubStruct$(b: 10, c: 5)");
    ASSERT_EQ(s.o->sub->b, "10");
    ASSERT_EQ(s.o->sub->c, "5");

Clock generation

Clocks need to be ticked manually in Verilator. The Spade clock type does not allow direct assignment, so the clock needs to be accessed via the Verilator dut. Spade mangles input names as <name>_i, so if you want to set clk, you would set dut->clk_i. Alternatively, you can mark the clock input with #[no_mangle].

The following code will tick the clock once

    dut->clk_i = 1;
    ctx->timeInc(1);
    dut->eval();
    dut->clk_i = 0;
    ctx->timeInc(1);
    dut->eval();

Since this is so common, it is helpful to define a macro for it:

#define TICK \
    dut->clk_i = 1; \
    ctx->timeInc(1); \
    dut->eval(); \
    dut->clk_i = 0; \
    ctx->timeInc(1); \
    dut->eval();

which can then be used like this:

    s.i->a = "5";
    s.i->b = "10";
    TICK;
    ASSERT_EQ(s.o, "15");

Alternative test directory

If desired, you can change the name of the test directory by specifying the new name in swim.toml as follows

[simulation]
testbench_dir = "not/test"

Unless you have good reason to do this, it is better to leave the default directory.

Ports and wires

If you prefer documentation in video form there is a talk available on this topic.

Note that the syntax of &mut has changed to inv & since that talk

Units in Spade, unlike in most HDLs, are similar to functions in programming languages in the sense that they receive a set of values, and their output is another set of values. For example, a function that adds 2 numbers is written as

fn add(x: uint<8>, y: uint<8>) -> uint<9> {
  x + y
}

This makes sense for a lot of hardware where there is a clear flow of values from inputs to outputs, but this is not always the case. Wires and ports are language features that help deal with these cases.

To understand wires and ports, it helps to look at a motivating example. If you're building a project consisting of 2 modules that communicate with each other via some other module, such as a memory, you want your hardware to look something like this:

Image showing two pipelines interconnected via a memory. The connections are made from stages in the middle of the pipelines

Without using ports, you'd have to write the signature of this hierarchy as

pipeline(1) mem(clk: clock, addr1: uint<16>, addr2: uint<16>) -> (T, T)
pipeline(4) mod1(clk: clock, inputs: I, data: T) -> (uint<16>, O)
pipeline(3) mod2(clk: clock, inputs: I, data: T) -> (uint<16>, O)

entity top(clk: clock) {
  decl memout1, memout2;
  let (addr1, mod1_out) = inst(4) mod1(clk, I(), memout1);
  let (addr2, mod2_out) = inst(3) mod2(clk, I(), memout2);
  let (memout1, memout2) = inst(1) mem(clk, addr1, addr2);
}

Writing it like this is tedious, and more importantly, error-prone as there is no way to communicate which signals correspond to each other. One might assume that the left output of the memory result is the data corresponding to address 1, but there is nothing to enforce this.

In addition, the pipelines internally have to prevent the addresses and returned data from being pipelined:

pipeline(4) mod1(clk: clock, inputs: I, mem_out: T) -> (uint<16>, O) {
        'start
    reg;
        // ...
    reg;
        'mem_read
        let mem_addr = inst mem_ctrl();
    reg;
        let result = inst compute(stage(start).mem_out);
    reg;
        (stage(mem_read).mem_addr, result)
}

This is another pain point and more importantly a source of errors. Graphically, the structure is more like the following which is as hard to follow as the code that describes it:

Image showing two pipelines interconnected via a memory when delays have to be accounted for manually.

Wires

The solution to the pipelining problem is a new type called a wire, denoted by &. Wires, unlike values, are not delayed in pipelines and can intuitively be viewed as representing physical wires connecting modules rather than values to be computed on.

To "read" the value of a wire, the * operator is used and to turn a value into a wire, & is used.

With this change, the pipeline example can be rewritten as

pipeline(4) mod1(clk: clock, inputs: I, mem_out: &T) -> (&uint<16>, &O) {
    reg;
        // ...
    reg;
        let mem_addr = &inst mem_ctrl();
    reg;
        let result = inst compute(*mem_out);
    reg;
        (mem_addr, &result)
}

For now, it is not possible to return a compound type containing both wires and values, which is why the output of the module was changed to &O.
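To make the difference between values and wires concrete, here is a small Python model of a 4-stage pipeline. It is only a software analogy, not Spade and not how the compiler works internally, and the name `simulate` is made up for this sketch. A value that enters the pipeline only becomes visible at the end after passing through every `reg`, while a wire is read directly in whatever stage uses it.

```python
from collections import deque


def simulate(depth, inputs):
    """Model a pipeline with `depth` registers.

    Returns (delayed, undelayed): per clock cycle, what the last stage sees
    for a pipelined value and for a wire carrying the same input.
    """
    # One slot per `reg`; None marks a stage that holds no data yet.
    regs = deque([None] * depth, maxlen=depth)
    delayed, undelayed = [], []
    for x in inputs:
        # The last stage sees the value that entered `depth` cycles ago...
        delayed.append(regs[-1])
        # ...while a wire is visible in the same cycle in every stage.
        undelayed.append(x)
        # Clock edge: every register takes the value of the previous stage.
        regs.appendleft(x)
    return delayed, undelayed


delayed, undelayed = simulate(4, [10, 11, 12, 13, 14, 15])
print(delayed)    # [None, None, None, None, 10, 11]
print(undelayed)  # [10, 11, 12, 13, 14, 15]
```

This is why `stage(...)` references were needed in the version without wires: the pipelined copy of `mem_out` lags behind the signal that the memory is actually driving.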

Inverted wires

There is still at least one big problem with the current structure: returning addresses as outputs and taking values as inputs is problematic as there is no clear link between input and output, and the return value of a unit ends up being a mix of both control signals like addresses, and values computed by the unit.

The solution to this problem is inverted wires, denoted inv &. These wires flow the opposite way to the normal flow of values. A unit which accepts an inverted wire as an input is able to set the value of that wire. A unit which returns an inverted wire is able to read the value that was set by the "other end"

Inverted wires are created using the port expression which returns (T, inv T)

let (read_side, write_side) = port;

The set statement is used to set the value of an inverted wire. For example

set adder_out = a + b;

Rewriting the pipeline once again using inverted wires results in

pipeline(4) mod1(clk: clock, inputs: I, mem_addr: inv &uint<16>, mem_out: &T) -> O {
    reg;
        // ...
    reg;
        set mem_addr = inst mem_ctrl();
    reg;
        let result = inst compute(*mem_out);
    reg;
        result
}

The code can be made even neater by grouping all the memory signals together into a tuple:

pipeline(4) mod1(clk: clock, inputs: I, mem: (inv &uint<16>, &T)) -> O {
    reg;
        // ...
    reg;
        set mem#0 = inst mem_ctrl();
    reg;
        let result = inst compute(*mem#1);
    reg;
        result
}

Wires are passed around as if they were values, so our memory can now return all its signals, both inputs and outputs. As an example, to convert from a memory that does not use ports to one that does, we can write:

// A mockup memory which takes 2 addresses and returns two values.
pipeline(1) fake_memory(clk: clock, addrs: [uint<16>; 2]) -> [T;2]

pipeline(1) mem(clk: clock) -> ((inv &uint<16>, &T), (inv &uint<16>, &T)) {
        let (addr1_read, addr1) = port;
        let (addr2_read, addr2) = port;
        let [out1, out2] = inst(1) fake_memory(clk, [*addr1_read, *addr2_read]);
    reg;
        ((addr1, &out1), (addr2, &out2))
}

This finally allows us to write a neat top module for our running example:

entity top(clk: clock) {
    let (m1, m2) = inst(1) mem(clk);
    let out1 = inst(4) mod1(clk, I(), m1);
    let out2 = inst(3) mod2(clk, I(), m2);
    // ...
}

Ports

It is often desirable to define structs of related wires, for example the wires we've used in the memory interface. We can wrap them all in tuples like we did above, but it is often preferable to give things names with structs. To put wires in structs, we need to define them as struct port, which tells the compiler that the struct is of port kind. Port is a broader concept than just struct port: wires, their inversions, compound types of wires like tuples, and even clocks are all ports, as opposed to the values discussed previously. Most of the time, what is and what is not a port is unimportant, but ports have two important properties:

  • Ports are not pipelined.
  • Generic arguments cannot be ports.

We can define a struct port for our memory example as

struct port MemoryPort<T> {
    addr: inv &uint<16>,
    // A practical memory will usually also have a write value:
    write: inv &Option<T>,
    read: &T,
}

inv for real

The inv type is not only used to invert wires, it can be used to invert whole ports. Effectively this flips the direction of all wires in the port. This is very useful if there is no "owner" of a particular port as is the case with the memory example. We could tweak our memory example to use an inverted port by making the memory module also accept the port as an (inverted) input.

pipeline(1) mem<T>(clk: clock, p1: inv MemoryPort<T>, p2: inv MemoryPort<T>) {
        let [out1, out2] = inst(1) fake_memory(clk, [*p1.addr, *p2.addr]);
    reg;
        set p1.read = out1;
        set p2.read = out2;
}

entity top(clk: clock) {
    let (m1, m1_inv) = port;
    let (m2, m2_inv) = port;
    let _ = inst(1) mem::<uint<32>>(clk, m1_inv, m2_inv);
    let out1 = inst(4) mod1(clk, I(), m1);
    let out2 = inst(3) mod2(clk, I(), m2);
    // ...
}

Inverted wires must be set

It is important that a circuit which uses inverted wires has a well-defined value for all wires. In practice this means that an inverted wire must be assigned exactly once, which is enforced by the compiler.

Concretely, if you create an inv & wire, or receive one as an argument, you must either set its value or hand it off to a sub-unit you instantiate.

For example, if we make an error while writing the top module in our running example and accidentally pass m1 to both mod1 and mod2

entity top(clk: clock) {
    let (m1, m2) = inst(1) mem(clk);
    let out1 = inst(4) mod1(clk, I(), m1);
    let out2 = inst(3) mod2(clk, I(), m1);
                                   // ^^ Should be  m2
}

We get a compilation error:

error: Use of consumed resource
    ┌─ src/wires.spade:234:39
    │
3   │     let out1 = inst(4) mod1(clk, I(), m1);
    │                                       -- Previously used here
4   │     let out2 = inst(3) mod2(clk, I(), m1);
    │                                       ^^ Use of consumed resource

Similarly, if we don't give m2 a value by removing the last line, we get another error

error: swim_test_project::wires::m9::m2.addr is unused
    ┌─ src/wires.spade:231:10
    │
231 │     let (m2, m2_inv) = port;
    │          ^^ swim_test_project::wires::m9::m2.addr is unused
    │
    = note: swim_test_project::wires::m9::m2.addr is a inv & value which must be set
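The "exactly once" rule can be pictured in ordinary software as a write-once cell. The following Python sketch is only an analogy: the class `InvWire` and its methods are invented here, and the Spade compiler enforces the rule at compile time rather than at run time.

```python
class InvWire:
    """Run-time analogy of an `inv &` wire: it must be set exactly once."""

    def __init__(self, name):
        self.name = name
        self._set = False
        self.value = None

    def set(self, value):
        # A second `set` corresponds to the "Use of consumed resource" error.
        if self._set:
            raise RuntimeError(f"{self.name}: use of consumed resource")
        self._set = True
        self.value = value

    def check(self):
        # Never setting the wire corresponds to the "is unused" error.
        if not self._set:
            raise RuntimeError(f"{self.name} is unused")
        return self.value


addr = InvWire("m1.addr")
addr.set(0x10)
print(addr.check())  # 16
```

Calling `set` a second time, or calling `check` on a wire that was never set, raises an error in this model, mirroring the two compiler errors shown above.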

Conditional assignment

Since Spade is expression based, setting the value of an inv & wire inside an if branch is not supported. For example, you may be tempted to write a multiplexer as

entity mux<T>(sel: bool, on_false: T, on_true: T, out: inv &T) {
  if sel {
    set out = on_true;
  } else {
    set out = on_false;
  }
}

However, this will result in a multiply used resource error.

The correct way to write this is instead

entity mux<T>(sel: bool, on_false: T, on_true: T, out: inv &T) {
  set out = if sel {on_true} else {on_false};
}

NOTE: This mux is only written like this to showcase how inverted wires are used. A better way to write a mux is

entity mux<T>(sel: bool, on_false: T, on_true: T) -> T {
    if sel {on_true} else {on_false}
}

Interfacing with Verilog

It is often desirable to interface with existing Verilog, either instantiating a Verilog module inside a Spade project, or including a Spade module as a component of a larger Verilog project. Both are quite easy to do as long as you have no generics on the Spade side, and no parameters on the Verilog side. Generics and parameters may be supported in the future.

Instantiating a Verilog module

If you have a Verilog module that you want to instantiate from Spade, you need to add a stub for it in your Spade project. This is done by defining a function, entity or pipeline [1] but using __builtin__ instead of the body of the unit. For example,

struct Output {
    valid: bool,
    value: int<16>
}
entity external_module(clk: clock, x: int<8>) -> Output __builtin__

While this works, Spade will "mangle" names to avoid namespace collisions and collisions with keywords, so this would in practice look for a module like

module \your_project::your_file::external_module(
    input clk_i,
    input[7:0] x_i,
    output[16:0] output__
);

Changing your module to follow this signature would work, but is not very convenient. The more convenient option is to add #[no_mangle(all)] to the entity:

#[no_mangle(all)]
entity external_module(
    clk: clock,
    x: int<8>
) -> Output __builtin__

Now, the resulting Verilog signature is

module external_module(
    input clk,
    input[7:0] x,
    output[16:0] output__
);

As you can see, it still has a single output__, which is inconvenient if you can't change the signature of the Verilog module, and annoying since you need to know how Spade packs structs in order to generate the correct signals. Spade currently does not even define the packing of structs, so we need to do something about this. The solution is to use inverted wires to generate Verilog outputs.

Note that you could have also written the module with the less economical

#[no_mangle]
entity external_module(
    #[no_mangle] clk: clock,
    #[no_mangle] x: int<8>
) -> Output __builtin__

Changing our module to

#[no_mangle(all)]
entity external_module(
    clk: clock,
    x: int<8>,
    output_valid: inv &bool,
    output_value: inv &int<16>,
) __builtin__

results in

module external_module(
    input clk,
    input[7:0] x,
    output output_valid,
    output[15:0] output_value
);

which is a normal looking Verilog signature.

One downside of this however, is that the interface to this module isn't very Spadey, so typically you will want to define a wrapper around the external module that provides a more Spade-like interface

use std::ports::new_mut_wire;
use std::ports::read_mut_wire;
// Put the wrapper inside a `mod` to allow defining a Spade-native unit of the same name.
mod extern {
    #[no_mangle(all)]
    entity external_module(
        clk: clock,
        x: int<8>,
        output_valid: inv &bool,
        output_value: inv &int<16>,
    ) __builtin__
}

struct Output {
    valid: bool,
    value: int<16>
}

entity external_module(clk: clock, x: int<8>) -> Output {
    let valid = inst new_mut_wire();
    let value = inst new_mut_wire();
    let _ = inst extern::external_module$(clk, x, output_valid: valid, output_value: value);
    Output$(
        valid: inst read_mut_wire(valid),
        value: inst read_mut_wire(value)
    )
}

With this, we have the best of both worlds. A canonical Spade-entity on the Spade side, and a canonical Verilog module on the other.

Finally, to use the Verilog module in a Spade project, the Verilog file containing the implementation must be specified in swim.toml under extra_verilog at the root, or extra_verilog in the synthesis section. This takes a list of globs that get synthesized with the rest of the project.

[1] See the documentation for units for more details. Most of the time, you probably want to use entity for external Verilog.

Instantiating Spade in a Verilog project

Instantiating Spade in a larger Verilog project is similar to going the other way around as just described. Mark the Spade unit you want to expose as #[no_mangle(all)]. Prefer using inv & instead of returning output values, as that results in a more Verilog-friendly interface.

To get the Verilog code, run swim build, which will generate build/spade.sv which contains all the Verilog code for the Spade project, including your exposed module.

Ws2812b Example

This chapter will guide you through building a Spade library for the ws2812b RGB LED and should serve as a practical example of "real world" Spade usage.

This chapter assumes a bit of familiarity with basic Spade concepts and is written primarily with software people in mind. As such, more weight will be put on the FPGA specifics than on Spade syntax and concepts.

The chapter starts off with a discussion on how to create a Spade project and how that project is laid out. After that, we will discuss the interfaces we want to and need to use, i.e. how to talk to the LEDs, and how to make the driver interface nice to use for other Spade code. Finally, we'll go over the implementation of the actual driver.

Creating a Project.

It is strongly advised to use the Swim build tool to write Spade projects. It manages rebuilding the Spade compiler, including the standard library and dependencies, testing and synthesis etc.

If you haven't installed swim already, do so by following the installation instructions.

After you have swim installed, we should create a new project. The easiest way to do this is to run swim init --board <fpga name>. To get a list of the boards we currently have templates for, run

swim init --list-boards

which should give you something like

Cloning into '/tmp/swim-templates'...
remote: Enumerating objects: 135, done.
remote: Counting objects: 100% (50/50), done.
remote: Compressing objects: 100% (49/49), done.
remote: Total 135 (delta 11), reused 0 (delta 0), pack-reused 85
Receiving objects: 100% (135/135), 30,73 KiB | 30,73 MiB/s, done.
Resolving deltas: 100% (37/37), done.
[INFO] Available boards:
ecpix5
go-board
icesugar-nano
tinyfpga-bx
ulx3s_85k

If your FPGA board is not on the list, you can also set up your project manually, but that's out of scope for this guide. Have a look at the templates repository for inspiration.

For this project, the exact board isn't super important. I like my ecpix5 so I will use that. Create the project using

swim init --board ecpix5 ws2812b

Note that although this project, being a driver for specific hardware, should really be a library rather than a standalone project, it is still useful to initialise it targeting a specific FPGA board in order to test it in hardware later.

Basic project layout

Inside the newly created directory we find the following files:

  • ecpix5.lpf
  • openocd-ecpix5.cfg
  • src
    • main.spade
    • top.v
  • swim.toml

FPGA specific files

The ecpix5.lpf file is a pin mapping file which tells the synthesis tool what physical pins correspond to the inputs and outputs from our top module.

If you are using an iCE40-based FPGA, this file is instead a pcf file, which has the same purpose but a different syntax.

We'll get back to this file when it is time to test on hardware

The openocd-ecpix5.cfg file is a file needed to program the FPGA. It is specific to the ecpix5 programmer and you don't really have to care what it does or why it is needed.

Since Spade is very much a work-in-progress language where breaking changes are common, it's easiest to have each project depend on a specific git version of the compiler. This is handled by swim, which will track a specific compiler version for us [1]. The first time we build the project using swim, it will download and compile the compiler.

Since the compilation process takes quite a while the first time you run it, now is a good time to call swim build.

The src directory contains our Spade source code. Each file is given a unique namespace based on the name, so anything you define inside main.spade will be under the namespace ws2812b::main::<unit name>.

Finally, there is the swim.toml file which describes our project

name = "ws2812b"

[synthesis]
top = "top"
command = "synth_ecp5"

[board]
name = "ecpix-5"
pin_file = "ecpix5.lpf"
config_file = "openocd-ecpix5.cfg"

The name is, as you might expect, the name of your project. If another project depends on your project, this is the namespace at which your project will reside.

The synthesis, pnr, upload, and packing fields tell swim what tools to call to synthesise the project and upload it to the board. Most things can be ignored here, but the top field is worth knowing about, as that is how you specify the top module (roughly equivalent to main in most software languages).

[1] You can read more about this in the swim README.

Basic swim usage

Swim has several subcommands which you can use. These commands call their prerequisites, so you only have to call the one you actually want to run. For example, you don't have to call swim build before swim test.

swim build

Compiles your Spade code to Verilog. The output ends up in build/spade.sv.

swim synth, swim pnr

Call the synthesis tool and place and route tool respectively.

swim upload

Build the project and upload it to the board

swim test

Run simulation to test your code. Note that by default, your project does not contain any test benches, so this will complain. We'll write some later in the guide.

Aliases

Most of these commands have aliases that you can use to be lazy and avoid typing.

  • b: build
  • syn: synthesise
  • u: upload
  • t, sim: test

In the next section, we will start discussing how to talk to the LEDs.

LED protocol overview

Now that we are familiar with the project layout, we can start writing the driver for our LEDs. To do so, a good place to start is the datasheet. By reading it we can find out how the protocol works:

The LEDs are chained together, with us talking to the data in pin on the first LED in the chain, and it relaying messages to the rest of the chain automatically.

Data transmission consists of 3 symbols:

  • 0 code
  • 1 code
  • RET code

Each LED has 24-bit color, 8 bits per channel, and the transmission order is GRB [1] with the most significant bit first. The first 24 bits of color control the first LED, the next 24 the second, and so on, until the RET code is sent, at which point data transmission restarts from the beginning.

[1] Because apparently standard color orders like RGB are too mainstream.

As a more graphical example, a transmission of the color information for a sequence of three LEDs looks like this:

| G7..0 | R7..0 | B7..0 | G7..0 | R7..0 | B7..0 | G7..0 | R7..0 | B7..0 | RET |...
|<        LED 1        >|<        LED 2        >|<        LED 3        >|     |< ...

Each color segment is a sequence of 1 or 0 codes depending on the desired color for that LED and color channel.
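The framing described above can be sketched in a few lines of Python. This is a software illustration of the bit order only (GRB, most significant bit first, RET at the end). The function names `color_bits` and `frame` are invented for this example and are not part of the driver.

```python
def color_bits(r, g, b):
    """Return the 24 bits for one LED: G, then R, then B, MSB first."""
    bits = []
    for channel in (g, r, b):
        for i in range(7, -1, -1):  # bit 7 (most significant) down to bit 0
            bits.append((channel >> i) & 1)
    return bits


def frame(colors):
    """Concatenate the bits of every LED in chain order, then end with RET."""
    symbols = []
    for (r, g, b) in colors:
        symbols.extend(color_bits(r, g, b))
    symbols.append("RET")
    return symbols


# Pure red: the green byte is all zeros, the red byte all ones, blue all zeros.
print(color_bits(255, 0, 0))
# Three LEDs need 3 * 24 bit symbols plus one RET.
print(len(frame([(255, 0, 0), (0, 255, 0), (0, 0, 255)])))  # 73
```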

We should also have a look at the waveform of the 0, 1 and RET codes which look like this (see the datasheet for prettier figures):

0 code

------+
      |
      +-----------
| T0H |    T0L   |

I.e. a signal that is High for T0H units of time, followed by Low for T0L units of time

1 code

----------+
          |
          +-------
|   T1H   |  T1L |

I.e. a signal that is High for T1H units of time, followed by Low for T1L units of time. It is very similar to the 0 code, but for the 1 code, the high duration is longer than the low duration.

RET code

The RET code is just a Low signal which lasts for Tret units of time.

NOTE: The datasheet usually refers to this signal as reset and the timing as Treset. In order to make the rest of this text less confusing, we use the name ret throughout, as we already have an FPGA reset signal in our design which serves a different purpose.
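As a rough illustration of the three waveforms, the Python sketch below builds them as lists of samples, one sample per clock cycle. The function names are made up, and the cycle counts are placeholders chosen to keep the output short. They are NOT the datasheet timings, which are discussed later.

```python
def code_waveform(high_cycles, low_cycles):
    """One 0 or 1 code: high for `high_cycles` samples, then low."""
    return [1] * high_cycles + [0] * low_cycles


def ret_waveform(ret_cycles):
    """The RET code: low for the whole duration."""
    return [0] * ret_cycles


# Placeholder durations in clock cycles, NOT taken from the datasheet.
T0H, T0L = 2, 4   # 0 code: short high, long low
T1H, T1L = 4, 2   # 1 code: long high, short low
TRET = 8

signal = code_waveform(T0H, T0L) + code_waveform(T1H, T1L) + ret_waveform(TRET)
# Draw high samples as '-' and low samples as '_'.
print("".join("-" if s else "_" for s in signal))
```

The only difference between the 0 code and the 1 code is how the total bit time is split between the high and low parts, which is what the output generator will eventually have to decide per bit.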

Durations

We'll leave the durations of the signals for now and get back to them when we start implementing things. If you're curious already, have a look at the datasheet.

With the discussion of the external protocol out of the way, the next section will discuss our internal protocol, i.e. what interface we expose to users of our driver.

Driver interface

Now that we know how we should talk to the LEDs, we should also consider how we want the interface to our library to work. Here we have a few options with various trade-offs.

Passing an array around

The most familiar option for those coming from the software world might be for the library to take a copy of, or a reference to, an array containing the values to set the LEDs to. However, this is quite a difficult interface to implement in an FPGA. If we were to copy the LED values, we would need 24 bits per LED to be connected between the driver and the user. Each of those bits needs an individual wire, so the number of wires quickly grows very large.

This interface would look something like

entity ws2812<#N>(clk: clock, rst: bool, to_write: [Color; N]) -> bool {
    // ...
}

"Function" to write single LED

Another option we might be tempted to try is to have an interface where you "call" a function to set a specific LED. This is difficult to do in practice however. In Spade, one does not "call" a function, instead you instantiate a block of hardware. One might work around that by passing something like an Option<(Index, Color)> to the driver, which updates the specified LED.

However, this is still not without flaws. First, we can't update a single LED, we need to send colors to all the LEDs before it too, so we'd need to store what the color of the other LEDs are. Second, it takes time to transmit the control signals, so one couldn't send new colors at any rate, the module must be ready to transmit before receiving the next command. This is technically solvable, but there are better options for this particular interface.

Letting the driver read from memory

Passing a reference is slightly more doable in an FPGA. Here, we might give the LED driver a read port to a memory from which it can read the color values at its own pace. This is certainly an option for us to use here, though Spade currently doesn't have great support for passing read ports to memories around. Until that is mitigated, we'll look for other options

This might look something like this, but the MemoryPort is not currently supported in Spade

entity ws2812<#N>(clk: clock, rst: bool, mem: MemoryPort<Color>, start_addr: int<20>) -> bool {
    // ...
}

For those unfamiliar, the #N syntax (and #NumLeds below) means that the entity is generic over an integer, here called N or NumLeds.

In current Spade, one would have to write it as

struct Ws2812Out {
    signal: bool,
    read_addr: int<20>,
}
entity ws2812<#NumLeds>(clk: clock, rst: bool, memory_out: Color, start_addr: int<20>) -> Ws2812Out {
    // ...
}

which decouples the read_addr from memory_out, and does not make clear the read delay between them.

Driver owned memory

Another, more Spade- and FPGA-friendly option is to have the driver itself own a memory where it stores the colors to write, and expose a write port to that memory for instantiators to write new values. This might look as follows:

entity ws2812<#NumLeds>(clk: clock, rst: bool, write_cmd: Option<(int<20>, Color)>) -> bool {
    // ...
}

Just in time output

Finally, an interface which might be unfamiliar coming from the software world is to have the user generate the color on the fly, i.e. the user provides a translation from LED index to LED color. This is quite a nice setup as it doesn't intrinsically require any memory; if color selection is simple, it can be made on the fly. This interface is best demonstrated graphically

       Control
       signals
          |
          v
  +---------------+
  | State machine |
  +---------------+
          |
          v
+-------------------+
|   User provided   |
| color translation |
+-------------------+
          |
          v
+------------------+
| Output generator |
+------------------+
          |
          V
         LED
       Signals

Here, as driver implementors, we are responsible for providing the state machine, whose output is a signal which says "Right now, we should emit bit B of the color for LED N". We'll represent it by an enum.

The color translator translates that into a known color, and the output generator generates the signals that actually drive the LED.

In some sense, this interface is the most general. Both the driver owned memory version, as well as the memory read port version can be implemented by plugging the read logic into the translation part. For that reason, we will implement this interface first.
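The three blocks in the diagram can be mimicked in software to see how they fit together. The Python sketch below is only an analogy with invented names (`control_sequence`, `rainbow`, `output_symbols`). It ignores timing entirely and assumes, for illustration, that bit 0 is the first bit transmitted.

```python
def control_sequence(num_leds):
    """Stand-in for the state machine: which LED and bit to emit, then RET."""
    for index in range(num_leds):
        for bit in range(24):
            yield ("Led", index, bit)
    yield ("Ret",)


def rainbow(index):
    """Example user-provided translation from LED index to (r, g, b)."""
    palette = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
    return palette[index % len(palette)]


def output_symbols(num_leds, translate):
    """Stand-in for the output generator: yields 0, 1 or RET symbols."""
    for ctrl in control_sequence(num_leds):
        if ctrl[0] == "Ret":
            yield "RET"
        else:
            _, index, bit = ctrl
            r, g, b = translate(index)
            word = (g << 16) | (r << 8) | b   # GRB order
            yield (word >> (23 - bit)) & 1    # most significant bit first


print(list(output_symbols(1, rainbow)))
```

Swapping `rainbow` for a function that reads from a list gives the "driver owned memory" behaviour, which is the sense in which this interface is the most general one.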

With all that discussion on interfaces out of the way, it is finally time to start implementing things. The next section will introduce the finite state machine, a real workhorse in any Spade project.

State Machine

Now it is finally time to write some code. The swim template project contains some example code in main.spade, feel free to run swim upload to test it if you'd like. However, for this project, we won't need any of it, so once you are done playing around with it, remove all code from main.spade.

We'll start off by writing the state machine that generates the drive signals for the rest of the circuit. Before we do that though, it is a good idea to think about the input and output signals we want.

For simplicity, the state machine will not take any input control signals, it will start running as soon as the reset signal is turned off, and write data as fast as possible until the end of time.

Output type

The output is a bit more interesting. As stated before, we want the Finite State Machine (FSM) to emit information about what we are currently drawing.

For those unfamiliar, a Finite State Machine is less scary than the words make it seem. It is a way to do computation by describing a series of states and how and when to change between the states.

For example, if we want to build a circuit to toggle an LED whenever a short pulse arrives [1], our FSM would consist of two states: On and Off. If no pulse arrives, the current state remains. If the pulse arrives, we transition from the current state to the opposite state.

It is usually convenient to look at small FSMs graphically, the following figure shows the states and transitions of the pulse example

[1] Perhaps a pulse from a button, though some extra circuitry would be needed to turn the "short" pulse of a human pressing the button into a pulse that is "short" for an electronic circuit :)
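The pulse-toggle FSM described above is small enough to write out as a transition function. The following Python sketch uses the state names On and Off from the text. The function name `next_state` and the input sequence are made up for the example.

```python
def next_state(state, pulse):
    """Transition function of the two-state toggle FSM."""
    if not pulse:
        return state                          # no pulse: stay where we are
    return "Off" if state == "On" else "On"   # pulse: go to the opposite state


state = "Off"
trace = []
for pulse in [False, True, False, False, True, True]:
    state = next_state(state, pulse)
    trace.append(state)
print(trace)  # ['Off', 'On', 'On', 'On', 'Off', 'On']
```

In hardware, the loop is replaced by the clock: the state lives in a register, and the transition function is the combinational logic that computes the register's next value.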

Before we discuss our state machine further, we should consider what output we want it to generate. Initially, we might do something like this:

enum OutputControl<#IndexWidth> {
    /// Currently emitting the RET signal
    Ret,
    /// Currently emitting the specified bit of the color for LED `index`
    Led{index: int<IndexWidth>, bit: int<6>}
}

We make this enum generic over the width of the LED indices to not waste bits and to allow an arbitrary number of LEDs.

The bit field is an int<6> because we want to be able to express 0..24. If Spade had better unsigned support, we'd be able to use uint<5> :)
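The arithmetic behind that choice is easy to check. The Python helpers below (names invented for this sketch) compute the value range of a two's complement and an unsigned integer of a given width.

```python
def signed_range(width):
    """Inclusive range of a two's complement integer with `width` bits."""
    return -(1 << (width - 1)), (1 << (width - 1)) - 1


def unsigned_range(width):
    """Inclusive range of an unsigned integer with `width` bits."""
    return 0, (1 << width) - 1


print(signed_range(5))    # (-16, 15): too small to express 0..24
print(signed_range(6))    # (-32, 31): large enough
print(unsigned_range(5))  # (0, 31): large enough with one bit less
```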

This enum has a few issues though, so let's make some improvements.

First, the data coming out of the color translation block will end up being quite similar to this enum, so we can use generics to share some code. The user will translate the index into a color, so we will allow arbitrary payload instead of that index

enum OutputControl<T> {
    /// Currently emitting the Ret signal
    Ret,
    /// Currently emitting the specified bit of the color for LED `index`
    Led{payload: T, bit: int<6>}
}

This is enough information to write the color translator, but to generate the output, it would be nice to have some more information. Specifically, because this is a time based interface, we could more easily generate the output waveforms if we knew how long we've been emitting the current bit. Let's add that to the enum

enum OutputControl<T> {
    /// Currently emitting the RET signal
    Ret,
    /// Currently emitting the specified bit of the color for LED `index`
    Led{payload: T, bit: int<6>, duration: int<12>}
}

If you are curious, the width of the duration field is 12 to support a counter counting from 0 to 1250. This was chosen because the total duration of a data bit is 1.25 microseconds, which takes 1250 clock cycles at 1 GHz, and we are unlikely to run our FPGA above that frequency. Better generics over clock frequencies are something that might happen down the line.
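The cycle count follows directly from the bit duration and the clock frequency. Here is a quick Python check. The helper `cycles_for` is invented for this sketch, and the 100 MHz figure is just an example frequency, not a property of any particular board.

```python
def cycles_for(duration_ns, clock_hz):
    """Number of whole clock cycles that fit in `duration_ns` nanoseconds."""
    return duration_ns * clock_hz // 1_000_000_000


BIT_DURATION_NS = 1250  # 1.25 microseconds, as stated above

print(cycles_for(BIT_DURATION_NS, 1_000_000_000))  # 1250 at 1 GHz
print(cycles_for(BIT_DURATION_NS, 100_000_000))    # 125 at 100 MHz

# The largest positive value of a signed 12-bit integer is 2**11 - 1 = 2047,
# so a count of 1250 fits in an int<12>.
assert cycles_for(BIT_DURATION_NS, 1_000_000_000) <= 2**11 - 1
```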

State machine entity

We can finally stop talking about interfaces and write some actual code. Let's start off writing an entity where we can put the logic to generate the OutputControl enum. This entity will be generic over the index width as discussed previously, and will take a number of LEDs to control as a normal parameter.

In practice, it would be a lot nicer to set the number of LEDs at compile time too, but Spade generics are not quite there yet.

entity state_gen<#IndexWidth>(
    clk: clock,
    rst: bool,
    num_leds: int<IndexWidth>
) -> OutputControl<int<IndexWidth>> {
    // TODO
}

Let's work on that // TODO next. Recall that when working in Spade, we always describe the behaviour of our circuit between one clock cycle and the next. However, we want to implement an interface that is time dependent, so we need to do some thinking. In a high level language, we'd want to do something like


while true {
    for t in 0..ret_duration {
        output = Ret;
        wait_clock_cycle;
    }

    for i in 0..num_leds {
        for bit in 0..24 {
            for t in 0..bit_duration {
                output = Led(i, bit, t);
                wait_clock_cycle;
            }
        }
    }
}

Unfortunately, loops are out of reach, so we will need to encode this logic in some other way. Most of the time, this is done by writing a state machine. The exact method is somewhat situation dependent and takes some practice. To be successful at this task, we have two basic constraints: we need enough information to know what state to jump to at all times, and we need enough information to know what output to generate. In Spade, we'll almost always represent the states with an enum.

enum State {
    // TODO
}
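To make the idea of replacing loops with a state machine more concrete, here is a software-only sketch in Python. It is not a translation of the Spade code in this chapter: the constants are small made-up values, tuples stand in for enum variants, and the function names are hypothetical. It shows one pass of the nested loops next to an equivalent "current state in, next state out" function.

```python
from typing import Iterator, Tuple

# Made-up, small parameters so that the sequences are easy to compare
RET_DURATION = 3
NUM_LEDS = 2
NUM_BITS = 2
BIT_DURATION = 2


def loop_version() -> Iterator[Tuple]:
    """One pass of the nested loops, yielding one output per clock cycle."""
    for t in range(RET_DURATION):
        yield ("Ret",)
    for i in range(NUM_LEDS):
        for bit in range(NUM_BITS):
            for t in range(BIT_DURATION):
                yield ("Led", i, bit, t)


def next_state(state: Tuple) -> Tuple:
    """The same behaviour, written as a step from one cycle to the next."""
    if state[0] == "Ret":
        _, t = state
        return ("Ret", t + 1) if t + 1 < RET_DURATION else ("Led", 0, 0, 0)
    _, i, bit, t = state
    if t + 1 < BIT_DURATION:
        return ("Led", i, bit, t + 1)   # innermost loop
    if bit + 1 < NUM_BITS:
        return ("Led", i, bit + 1, 0)   # next bit
    if i + 1 < NUM_LEDS:
        return ("Led", i + 1, 0, 0)     # next LED
    return ("Ret", 0)                   # start over


def output(state: Tuple) -> Tuple:
    """The output depends only on the current state."""
    return ("Ret",) if state[0] == "Ret" else state


def state_machine_version(cycles: int) -> list:
    """Run the state machine for a number of cycles and collect the outputs."""
    state, outputs = ("Ret", 0), []
    for _ in range(cycles):
        outputs.append(output(state))
        state = next_state(state)
    return outputs
```

Each loop variable has turned into a field of the state, which is the same pattern we will follow when filling in the enum.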

The RET signal

Let's start off with the first for loop to generate the RET signal. We need to keep track of how long we've been in RET, so we know when to jump over to the output generation loop. A good starting point is therefore a state, with a duration

enum State {
    Ret{duration: int<17>},
    // ...
}

The state_gen entity will need an instance of the State enum, which is updated every clock cycle. This is a perfect use for the reg statement and match expression:

entity state_gen<#IndexWidth>(
    clk: clock,
    rst: bool,
    num_leds: int<IndexWidth>
) -> OutputControl<int<IndexWidth>> {
    reg(clk) state reset(rst: State::Ret(0)) = match state {
        // Compute next state here
    };
}

What happened here? We have a register called state which we update by checking the state in the current clock cycle, to build a circuit that gives the state in the next. Since state depends on itself, it needs to be reset back to an initial value when the FPGA starts up, which is why we write reset(rst: State::Ret(0)). This will make the circuit send the RET signal to the LEDs when starting up, then operate as normal. We could have started emitting LED values too, but this makes the description easier and gives the LEDs a few microseconds to get up and running during power up.

How do we compute the next state in the Ret state? That depends on how long we have been in Ret already. If that time is longer than the minimum time in Ret, we can start emitting LED data, otherwise we'll stay in the Ret state. We'll write this logic as

entity state_gen<#IndexWidth>(
    clk: clock,
    rst: bool,
    num_leds: int<IndexWidth>
) -> OutputControl<int<IndexWidth>> {
    reg(clk) state reset(rst: State::Ret(0)) = match state {
        State::Ret(duration) => {
            if duration >= Tret {
                // First LED state
            }
            else {
                State::Ret(trunc(duration + 1))
            }
        },
        // ...
    };
}

You may be curious why we need trunc there. That's because Spade does not implicitly discard overflow: the result of duration + 1 is one bit wider than duration, so that it can hold the result even when the addition overflows the original width. To make it fit back into our state, we truncate the result of the addition, since we know that we have chosen duration to be large enough for overflow not to be an issue.
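The effect of truncating can be modelled in a few lines of Python. This is only a sketch: it assumes that truncation keeps the low bits of a two's complement number, and the helper name to_signed is made up for the illustration.

```python
def to_signed(value: int, width: int) -> int:
    """Interpret the low `width` bits of `value` as a two's complement number."""
    mask = (1 << width) - 1
    low_bits = value & mask
    sign_bit = 1 << (width - 1)
    # If the sign bit is set, the bit pattern represents a negative number
    return low_bits - (1 << width) if low_bits & sign_bit else low_bits


WIDTH = 17
max_duration = (1 << (WIDTH - 1)) - 1  # largest value of a 17 bit signed integer

small = to_signed(5 + 1, WIDTH)                # fits, nothing is lost
wrapped = to_signed(max_duration + 1, WIDTH)   # does not fit, wraps around
```

As long as duration stays well below max_duration, which our state machine guarantees by jumping to another state, truncating the sum loses no information.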

Timing

The keen eyed might have noticed Tret there. What is its value? It represents the minimum time that we should emit the RET signal, but duration is measured in clock cycles. Eventually, Spade might support being generic over clock cycles and allow you to reason about time natively. For now, we need to compute how many clock cycles Tret is manually. This of course depends on the clock frequency, a value which varies between FPGAs. At the time of writing this guide, out of the boards that swim currently supports natively, there are 4 different clock frequencies, so we probably want to be generic over it in order to write a library.

Since we'll need a few more time dependent parameters down the line, we'll define a Timing struct which we pass to the modules, which contains the relevant timings. We might write something like this

struct Timing {
    Tret: int<17>,
    T0h: int<12>,
    T0l: int<12>,
    T1h: int<12>,
    T1l: int<12>,
    bit_time: int<12>,
}

However, now the user needs to know what those implementation dependent times are, which probably requires going to the data sheet. To make their life simpler, let's change it to

struct Timing {
    // 280 microseconds
    us280: int<17>,
    // 0.4 microseconds
    us0_4: int<12>,
    // 0.8 microseconds
    us0_8: int<12>,
    // 0.45 microseconds
    us0_45: int<12>,
    // 0.85 microseconds
    us0_85: int<12>,
    // 1.25 microseconds
    us1_25: int<12>,
}

and update our entity

entity state_gen<#IndexWidth>(
    clk: clock,
    rst: bool,
    num_leds: int<IndexWidth>,
    t: Timing,
) -> OutputControl<int<IndexWidth>> {
    let t_ret = t.us280;
    reg(clk) state reset(rst: State::Ret(0)) = match state {
        State::Ret(duration) => {
            if duration >= t_ret {
                // First LED state
            }
            else {
                State::Ret(trunc(duration + 1))
            }
        },
        // TODO: next states
    };
    // TODO: Output
}

NOTE: If you've been following along with a datasheet, for example the first result on DuckDuckGo or the first result from Google, you may be confused by why we use 280 microseconds and not 50. It turns out that the manufacturers of the LEDs updated the protocol at some point without updating model numbers or datasheets. This took quite a few hours of debugging when the code worked on old LEDs, but not on a new strip.
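Since the values in the Timing struct have to be computed by hand for each clock frequency, a small script can do the arithmetic. The following Python sketch is one way to do it; the function names are made up for this example, and rounding to the nearest whole cycle is an assumption rather than something the protocol prescribes.

```python
def cycles(duration_ns: int, clk_hz: int) -> int:
    """Number of clock cycles in `duration_ns` nanoseconds, rounded to nearest."""
    return (duration_ns * clk_hz + 500_000_000) // 1_000_000_000


def timing_for(clk_hz: int) -> dict:
    """A value for each field of the Timing struct at the given clock frequency."""
    durations_ns = {
        "us280": 280_000,
        "us0_4": 400,
        "us0_8": 800,
        "us0_45": 450,
        "us0_85": 850,
        "us1_25": 1_250,
    }
    return {name: cycles(ns, clk_hz) for name, ns in durations_ns.items()}


# For example, a board with a 100 MHz clock
timing_100mhz = timing_for(100_000_000)
```

For a 100 MHz clock this gives 28000 cycles for us280 and 125 cycles for us1_25.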

Bit signals

To generate the bit signals, i.e. the nested for loop in the example above, we need to keep track of 3 things: which LED we're working on, which bit on that LED we're working on, and how long we've been in that state. Essentially 1 variable per loop level. We'll extend the state enum to fit:

enum State<#IndexWidth> {
    Ret{duration: int<17>},
    Led{idx: int<IndexWidth>, bit: int<6>, duration: int<12>}
}

How do we want the logic to work? At the "innermost level", if we aren't done emitting the current bit, we increase the duration by 1. If the duration reaches the bit time, we move on to the next bit, and if we are done with all bits, we move on to the next LED. Finally, if we reached the last LED, we'll go back to the RET state.

In Spade, we'll write that as

entity state_gen<#IndexWidth>(
    clk: clock,
    rst: bool,
    num_leds: int<IndexWidth>,
    t: Timing,
) -> OutputControl<int<IndexWidth>> {
    let t_ret = t.us280;
    let t_bit = t.us1_25;
    reg(clk) state reset(rst: State::Ret(0)) = match state {
        State::Ret(duration) => {
            if duration >= t_ret {
                State::Led(0, 0, 0)
            }
            else {
                State::Ret(trunc(duration + 1))
            }
        },
        State::Led$(idx, bit, duration) => {
            if duration == t_bit {
                if bit == 23 {
                    if idx == trunc(num_leds-1) {
                        State::Ret(0)
                    }
                    else {
                        State::Led$(idx: trunc(idx+1), bit: 0, duration: 0)
                    }
                }
                else {
                    State::Led$(idx, bit: trunc(bit+1), duration: 0)
                }
            }
            else {
                State::Led$(idx, bit, duration: trunc(duration + 1))
            }
        },
    };
    // TODO: Output
}

Spade supports passing arguments to units both by position, i.e. argument 1 is passed to parameter 1, 2 to 2 and so on, and also by name. To specify parameters by name, the calling parentheses are preceded by $, i.e. State::Led$(idx, bit, duration: trunc(duration + 1)) says to pass the variable called idx to the parameter idx, bit to bit, and trunc(duration + 1) to duration. This works the same way as the Rust struct initialisation syntax.

Finally, generating the output signal can be done by another match statement. Since State and OutputControl are very similar in this case, the resulting match statement is not very complex:

    match state {
        State::Ret(_) => OutputControl::Ret(),
        State::Led$(idx, bit, duration) => OutputControl::Led$(payload: idx, bit, duration)
    }

Putting it all together, we end up with the following code:

enum OutputControl<T> {
    /// Currently emitting the RET signal
    Ret,
    /// Currently emitting the specified bit of the color for LED `index`
    Led{payload: T, bit: int<6>, duration: int<12>}
}

struct Timing {
    // 280 microseconds
    us280: int<17>,
    // 0.4 microseconds
    us0_4: int<12>,
    // 0.8 microseconds
    us0_8: int<12>,
    // 0.45 microseconds
    us0_45: int<12>,
    // 0.85 microseconds
    us0_85: int<12>,
    // 1.25 microseconds
    us1_25: int<12>,
}

enum State<#IndexWidth> {
    Ret{duration: int<17>},
    Led{idx: int<IndexWidth>, bit: int<6>, duration: int<12>}
}

entity state_gen<#IndexWidth>(
    clk: clock,
    rst: bool,
    num_leds: int<IndexWidth>,
    t: Timing,
) -> OutputControl<int<IndexWidth>> {
    let t_ret = t.us280;
    let t_bit = t.us1_25;
    reg(clk) state reset(rst: State::Ret(0)) = match state {
        State::Ret(duration) => {
            if duration >= t_ret {
                State::Led(0, 0, 0)
            }
            else {
                State::Ret(trunc(duration + 1))
            }
        },
        State::Led$(idx, bit, duration) => {
            if duration == t_bit {
                if bit == 23 {
                    if idx == trunc(num_leds-1) {
                        State::Ret(0)
                    }
                    else {
                        State::Led$(idx: trunc(idx+1), bit: 0, duration: 0)
                    }
                }
                else {
                    State::Led$(idx, bit: trunc(bit+1), duration: 0)
                }
            }
            else {
                State::Led$(idx, bit, duration: trunc(duration + 1))
            }
        }
    };
    match state {
        State::Ret(_) => OutputControl::Ret(),
        State::Led$(idx, bit, duration) => OutputControl::Led$(payload: idx, bit, duration)
    }
}

While we hope that the above code will work on the first try, that is rarely the case in practice. The next section will discuss how we can test our design

Testing the state machine

Just like software, testing our code is vital. Unlike software however, we don't have (easy) access to fancy tools like debuggers, printfs or error messages when we run on hardware. Therefore, we usually simulate FPGA designs to make sure they work in simulation in order to avoid painful debugging in hardware.

Currently, writing simulation code in Spade is not possible, as the things you want to do in a simulator are quite different to describing hardware. Instead, tests are written in python using the cocotb testing framework.

If you haven't already, refer to the installation instructions to see how to install cocotb.

Setting up tests

To do our testing, we need a tiny bit more setup in swim: we need to tell swim where we put our test benches. To do so, create a directory called test.

Then edit swim.toml adding a simulation section like so:

[simulation]
testbench_dir = "test"

Inside the test folder we put our test benches in Python files. Let's create our first one by creating test/state_gen.py. Each Spade test file must start with a comment telling swim which unit is to be tested, the "top module". The path must be a fully namespaced name, and since our module resides in main.spade, it will be main::state_gen:

# top=main::state_gen

We'll also add an empty test to that file like this:

# top=main::state_gen
import cocotb
from spade import SpadeExt

@cocotb.test()
async def normal_operation(dut):
    s = SpadeExt(dut)

Each test is annotated by @cocotb.test() and is an async python function which takes a single parameter dut, the Design Under Test.

Running swim test (or swim t for the lazy :)) presents us with the following error:

Error:
   0: In {tb}
   1: main::state_gen is generic which is currently unsupported in test benches

This is a limitation of the Spade Python interface. To test our module, we'll need to create a wrapper entity without any generic parameters. For now we'll use one with 10 LEDs.

entity state_gen_10(clk: clock, rst: bool, t: Timing) -> OutputControl<int<5>> {
    inst state_gen(clk, rst, 10, t)
}

After updating the top to # top=main::state_gen_10 we can swim test again and we should see a nice PASS (along with some other output which we'll ignore for now)

[INFO] Building spade compiler
    Finished release [optimized] target(s) in 0.04s
[INFO] Built spade compiler
[INFO] build/spade.sv is up to date
[INFO] Building spade-python
    Finished release [optimized] target(s) in 0.04s
[INFO] Built spade-python
     -.--ns INFO     cocotb.gpi                         ..mbed/gpi_embed.cpp:109  in set_program_name_in_venv        Using Python virtual environment interpreter at /home/frans/Documents/spade/ws2812-spade/build/.env/bin/python
     -.--ns INFO     cocotb.gpi                         ../gpi/GpiCommon.cpp:99   in gpi_print_registered_impl       VPI registered
/home/frans/Documents/spade/ws2812-spade/build/spade.sv:177: Warning: Calling system function $value$plusargs() as a task.
/home/frans/Documents/spade/ws2812-spade/build/spade.sv:177:          The functions return value will be ignored.
/home/frans/Documents/spade/ws2812-spade/build/spade.sv:111: Warning: Calling system function $value$plusargs() as a task.
/home/frans/Documents/spade/ws2812-spade/build/spade.sv:111:          The functions return value will be ignored.
     0.00ns INFO     Running on Icarus Verilog version 11.0 (stable)
     0.00ns INFO     Running tests with cocotb v1.6.2 from /home/frans/Documents/spade/ws2812-spade/build/.env/lib/python3.10/site-packages/cocotb
     0.00ns INFO     Seeding Python random module with 1657993983
     0.00ns WARNING  Pytest not found, assertion rewriting will not occur
     0.00ns INFO     Found test state_gen.normal_operation
     0.00ns INFO     running normal_operation (1/1)
state_gen.py.vcd
VCD info: dumpfile state_gen.py.vcd opened for output.
     0.00ns INFO     normal_operation passed
     0.00ns INFO     **************************************************************************************
                     ** TEST                          STATUS  SIM TIME (ns)  REAL TIME (s)  RATIO (ns/s) **
                     **************************************************************************************
                     ** state_gen.normal_operation     PASS           0.00           0.00          4.45  **
                     **************************************************************************************
                     ** TESTS=1 PASS=1 FAIL=0 SKIP=0                  0.00           0.00          0.26  **
                     **************************************************************************************

[INFO] Building vcd translator
    Finished release [optimized] target(s) in 0.04s
[INFO] Built vcd translator
[INFO] Translating types in "./build/state_gen/state_gen.py.vcd"
[INFO] Translated VCD: ./build/state_gen/state_gen.py.translated.vcd
test/state_gen.py: PASS

NOTE: The first time you run swim test, it will set up a Python environment with the required libraries, which requires compiling a separate part of the Spade compiler. Don't be alarmed at the time swim test takes, or the amount of output, the first time you run it in a new project.

Writing Some Tests

Now we are ready to actually test our module. All Spade test functions start with s = SpadeExt(dut) which creates a nice Spade interface around the cocotb functions.

Since our entity is clocked, we need to generate a clock for it. This is done by starting a cocotb clock task like this. The exact clock frequency is not really important here, it only decides the mapping between simulation time and real time.

# import the clock generator
from cocotb.clock import Clock

# ...

    clk = dut.clk_i
    await cocotb.start(Clock(clk, 1, units='ns').start())

The design also takes a reset signal which we need to set to get our initial state defined. If we forget to do this, most of the signals will be undefined.

We can access the input ports of our design using s.i.<input_name> and give them values by assigning strings containing Spade expressions to them. The following code sets the reset signal to true:

s.i.rst = "true"

The s.i interface does not work well with cocotb built in functions like Clock. You can access the raw Verilog input ports on the dut via dut.<name>_i as above, which is useful if you want to pass them to special cocotb functions like Clock. However, most of the time you should use the Spade interface since that doesn't require you to know the Spade internal representation of types.

For our design to start running, we need to take it out of reset again. You might think that we can just add another line, s.i.rst = "false". However, this would give the design no time to see the change in reset. Instead, we need to let the simulation step forward a bit. The easiest way to do that is to let it step forward one clock cycle, which we do by waiting until the next time the clock goes from 1 to 0:

# import trigger
from cocotb.triggers import FallingEdge

# ...

    await FallingEdge(clk)
    s.i.rst = "false"

This will create a waveform that looks like this

     ---+   +---+
clk:    |   |   |
        +---+   +---...

     ---+
rst:    |
        +-----------...

In order to let the circuit catch up to the fact that the reset has been turned off, we'll advance the simulation another tiny time step (1 picosecond):

# import timer
from cocotb.triggers import Timer

# ...

    await Timer(1, units='ps')

You can find more things to wait for in the cocotb documentation for triggers.

Now we can do our first test, ensuring that the initial output of the circuit is RET. We can access the output of our dut with s.o, and run assertions on it like this:

s.o.assert_eq("OutputControl::Ret()")

If you return a struct from a unit, you can access its fields as normal Python attributes on s.o. For example s.o.x.y.assert_eq(...)

Our test file now looks like this:

# top=main::state_gen_10

from spade import SpadeExt

import cocotb
from cocotb.clock import Clock
from cocotb.triggers import FallingEdge, Timer

@cocotb.test()
async def normal_operation(dut):
    s = SpadeExt(dut)

    clk = dut.clk_i
    await cocotb.start(Clock(clk, 1, units='ns').start())

    s.i.rst = "true"
    await FallingEdge(clk)
    s.i.rst = "false"

    await Timer(1, units='ps')
    s.o.assert_eq("OutputControl::Ret()")

and calling swim test should tell us that all our assertions passed.

A failing test

Next, we may want to ensure that we output Ret in the next clock cycle as well. So, we'll advance the clock and assert that.

Changes to our state happen on the rising edge of the clock, so I prefer to do my assertions on the falling edge. That way I don't have to worry about whether values have or have not changed right at the RisingEdge.

    await FallingEdge(clk)
    s.o.assert_eq("OutputControl::Ret()")

Calling swim test results in the following output:

...
VCD info: dumpfile state_gen.py.vcd opened for output.
 1.50ns INFO     normal_operation failed
                 Traceback (most recent call last):
                   File "/home/frans/Documents/spade/ws2812-spade/build/state_gen/state_gen.py", line 22, in normal_operation
                     s.o.assert_eq("OutputControl::Ret()")
                   File "/home/frans/Documents/spade/ws2812-spade/spade/spade-python/spade/__init__.py", line 75, in assert_eq
                     assert False, message
                 AssertionError:
                 Assertion failed
                     expected: OutputControl::Ret()
                          got: UNDEF

                    verilog (0XXXXXXXXXXXXXXXXXXXXXXX != xxxxxxxxxxxxxxxxxxxxxxxx)

...

[INFO] Building vcd translator
    Finished release [optimized] target(s) in 0.04s
[INFO] Built vcd translator
[INFO] Translating types in "build/state_gen/state_gen.vcd"
[INFO] Translated VCD: build/state_gen/state_gen.translated.vcd
test/state_gen.py: FAIL [build/state_gen/state_gen.translated.vcd]
	Failed test cases:
	normal_operation

Oh no, something went wrong, why? To debug our tests, the best method by far is to look at the wave dump. It contains the value of all the signals in the design over time and can give plenty of debug information. To see it, we need to install a vcd viewer, and the de facto standard is gtkwave.

Swim translates Verilog values in the wave dump back into Spade values and stores the result in a new vcd file, whose path is printed along with the failing tests:

test/state_gen.py: FAIL [build/state_gen/state_gen.translated.vcd]

Let's open build/state_gen/state_gen.translated.vcd in gtkwave:

gtkwave build/state_gen/state_gen.translated.vcd

This should open a window that looks something like this:

The black portion shows the value of the signals we select over time. The left pane contains a list of the units in our design, in this case e_proj_main_state_gen_10. This is the verilog name of our module main::state_gen_10. If you select it, the signal list below will be populated by all the values in that module. In this case, it is just a wrapper around the actual design state_gen, so expand the module and select the submodule. This should give you a lot more signals

Now we have lots of signals to play with! Broadly, we can group them into several categories. Some signals start with p_. These contain the Spade value of the corresponding signal, for example p_clk_n... is a Spade value, and clk_n... is the raw verilog bits.

Names of the form _e_<numbers> and p_e_<numbers> are subexpressions that are not named in the Spade program. Unless you're debugging the compiler, you can ignore those.

Names of the form <name>_n<numbers> and p_<name>_n<numbers> are values which are named in your Spade code. These are the values you will actually want to look at.

Finally, there are some signals called <name>_i. These are input values. The Spade translation does not translate those, so it is better to look at the corresponding <name>_n<numbers> signals.

To add a signal to the waveform window, double click it. To debug this value, we'll want to look at a few signals: clk, rst, state and t, so go ahead and add those to the wave view.

Here we get quite a bit of information. We see that our state is defined until the first clock cycle after reset. We also see that all fields of t, our timing struct, are "HIGHIMP". The name is a bit confusing, but this is caused by us forgetting to set that parameter.

Going back to the Spade code, to compute the next state in Ret, we check if the duration we've been in Ret so far is greater than or equal to t_ret. However, we haven't set t, so t_ret is undefined, and comparing duration to an undefined value results in another undefined value.

Spade tries to do its best to avoid undefined values, and it is certainly harder to write undefined values in Spade than in Verilog. However, when forgetting to specify inputs, and when working with memories, they can pop up.
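The way a single unset input contaminates the state can be illustrated with a deliberately simplified Python model. Here None stands in for an undefined value, and the helper names are made up. Real simulators track undefined values bit by bit, so treat this as a sketch of the idea rather than a description of what the simulator does.

```python
from typing import Optional

UNDEF = None  # stands in for an undefined simulation value


def ge(a: Optional[int], b: Optional[int]) -> Optional[bool]:
    """`a >= b`, but undefined if either operand is undefined."""
    if a is UNDEF or b is UNDEF:
        return UNDEF
    return a >= b


def select(cond: Optional[bool], on_true, on_false):
    """An if expression: an undefined condition gives an undefined result."""
    if cond is UNDEF:
        return UNDEF
    return on_true if cond else on_false


def next_ret_state(duration: int, t_ret: Optional[int]):
    """Next state when currently in Ret, mirroring the shape of the Spade logic."""
    return select(ge(duration, t_ret), ("Led", 0, 0, 0), ("Ret", duration + 1))
```

With t_ret unset, next_ret_state(0, UNDEF) is undefined even though duration itself is perfectly well defined, which matches what we saw in the waveform.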

Let's specify the timings to fix this issue. Again, exact timing here isn't important, we'll set some values that make testing possible:

    s.i.t = """Timing$(
        us280: 28,
        us0_4: 4,
        us0_8: 8,
        us0_45: 4,
        us0_85: 8,
        us1_25: 12,
    )"""

    await FallingEdge(clk)
    s.o.assert_eq("OutputControl::Ret()")

With that change, our assertions pass.

More tests

Now we can get to testing the rest of the design. Since our state space is quite small in this case, we can ensure that all state transitions happen as they should. Since this is Python, we can write things like loops, helper functions etc.

First, let's ensure that we stay in Ret for the specified amount of time, i.e. us280 clock cycles:

    s.i.t = """Timing$(
        us280: 28,
        us0_4: 4,
        us0_8: 8,
        us0_45: 4,
        us0_85: 8,
        us1_25: 12,
    )"""

    for i in range(0, 28):
        await FallingEdge(clk)
        s.o.assert_eq("OutputControl::Ret()")

After that, we should be emitting the value of the first LED. Here we can write a function to check a whole LED output, since we'll do that quite a few times


async def check_led(clk, s, index):
    # Each bit of the LED should be emitted
    for b in range(0, 24):
        # Each bit should last us1_25 (12 in this test) clock cycles, with duration going from 0 to 11.
        # For simulation performance, we'll just check the first and last duration explicitly
        await FallingEdge(clk)
        s.o.assert_eq(f"OutputControl::Led$(payload: {index}, bit: {b}, duration: 0)")
        for d in range(0, 11):
            await FallingEdge(clk)
        s.o.assert_eq(f"OutputControl::Led$(payload: {index}, bit: {b}, duration: 11)")

We can now test all our LEDs by calling it in a loop, and finally ensure that we go back to the ret state at the right time. The final test bench looks like this:

# top=main::state_gen_10

from spade import SpadeExt

import cocotb
from cocotb.clock import Clock
from cocotb.triggers import FallingEdge, Timer


async def check_led(clk, s, index):
    # Each bit of the LED should be emitted
    for b in range(0, 24):
        # Each bit should last us1_25 (12 in this test) clock cycles, with duration going from 0 to 11.
        # For simulation performance, we'll just check the first and last duration explicitly
        await FallingEdge(clk)
        s.o.assert_eq(f"OutputControl::Led$(payload: {index}, bit: {b}, duration: 0)")
        for d in range(0, 11):
            await FallingEdge(clk)
        s.o.assert_eq(f"OutputControl::Led$(payload: {index}, bit: {b}, duration: 11)")

@cocotb.test()
async def normal_operation(dut):
    s = SpadeExt(dut)

    clk = dut.clk_i
    await cocotb.start(Clock(clk, 1, units='ns').start())

    s.i.rst = "true"
    await FallingEdge(clk)
    s.i.rst = "false"

    await Timer(1, units='ps')
    s.o.assert_eq("OutputControl::Ret()")

    s.i.t = """Timing$(
        us280: 28,
        us0_4: 4,
        us0_8: 8,
        us0_45: 4,
        us0_85: 8,
        us1_25: 12,
    )"""

    for i in range(0, 28):
        await FallingEdge(clk)
        s.o.assert_eq("OutputControl::Ret()")

    # Check all 10 leds
    for i in range(0, 10):
        await check_led(clk, s, i)

    # Ensure we get back to the ret state
    await FallingEdge(clk)
    s.o.assert_eq("OutputControl::Ret()")

Running it gives us another assertion error:

VCD info: dumpfile state_gen.py.vcd opened for output.
    28.50ns INFO     normal_operation failed
                     Traceback (most recent call last):
                       File "/home/frans/Documents/spade/ws2812-spade/build/state_gen/state_gen.py", line 44, in normal_operation
                         await check_led(clk, s, i)
                       File "/home/frans/Documents/spade/ws2812-spade/build/state_gen/state_gen.py", line 13, in check_led
                         s.o.assert_eq(f"OutputControl::Led$(payload: {index}, bit: {b}, duration: {d})")
                       File "/home/frans/Documents/spade/ws2812-spade/spade/spade-python/spade/__init__.py", line 75, in assert_eq
                         assert False, message
                     AssertionError:
                     Assertion failed
                     	 expected: OutputControl::Led$(payload: 0, bit: 1, duration: 0)
                     	      got: proj::main::OutputControl::Led(0,0,12)

                     	verilog (100000000001000000000000 != 100000000000000000001100)
    28.50ns INFO     **************************************************************************************
                     ** TEST                          STATUS  SIM TIME (ns)  REAL TIME (s)  RATIO (ns/s) **
                     **************************************************************************************
                     ** state_gen.normal_operation     FAIL          28.50           0.06        461.47  **
                     **************************************************************************************
                     ** TESTS=1 PASS=0 FAIL=1 SKIP=0                 28.50           0.07        432.88  **
                     **************************************************************************************

[INFO] Building vcd translator
    Finished release [optimized] target(s) in 0.04s
[INFO] Built vcd translator
[INFO] Translating types in "./build/state_gen/state_gen.py.vcd"
[INFO] Translated VCD: ./build/state_gen/state_gen.py.translated.vcd
test/state_gen.py: FAIL [./build/state_gen/state_gen.py.translated.vcd]
	Failed test cases:
	normal_operation

Try to see if you can figure out what happened. Looking at the waves can be helpful, but in this case it might be enough to look at what states it transitioned to.

If you can't figure it out, jump to the next section for the answer

Output Generation

First of all, the cause of the bug mentioned at the end of the last chapter was an incorrect equality check of the duration when transitioning between states. It should be

if duration == trunc(t_bit-1) {

instead of

if duration == t_bit {
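To see why this is an off-by-one error, we can count how many clock cycles are spent on a single bit for each version of the comparison. The Python sketch below does that; the function name is made up, and 12 is the us1_25 value from the test bench.

```python
def cycles_per_bit(last_duration: int) -> int:
    """Count the clock cycles spent on one bit when the state machine moves on
    to the next bit once `duration == last_duration`."""
    duration = 0
    cycles = 1  # the cycle in which duration is 0
    while duration != last_duration:
        duration += 1
        cycles += 1
    return cycles


T_BIT = 12  # us1_25 in the test bench

buggy = cycles_per_bit(T_BIT)      # comparing against t_bit
fixed = cycles_per_bit(T_BIT - 1)  # comparing against t_bit - 1
```

Because duration starts counting at 0, comparing against t_bit makes every bit last 13 cycles instead of 12, which is why the test saw a duration of 12 where it expected the next bit to start.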

Now that our state machine works, we have done most of the heavy lifting. We still need to translate our control signal into an actual LED output, which is what we'll work on next.

Since this will require no internal state, and is fairly simple logic we'll represent it as a function. We'll also represent color as a struct with r, g and b values. The output is a single bool, the actual control signal to be passed to the LEDs.

struct Color {
    r: int<8>,
    g: int<8>,
    b: int<8>
}

fn output_gen(control: OutputControl<Color>, t: Timing) -> bool {
    // TODO
}

The RET output is easy: it is simply a low signal. The 0 and 1 signals are a bit more complex. The output should be 1 initially, and then transition to 0 after t0h or t1h, depending on whether the current bit is a 0 or a 1.

The OutputControl::Led has information about which of the 24 bits should be emitted. To translate that into a bit value, we'll concatenate the color channels, and "index" the correct bit. (Currently, Spade does not support bit indexing, so we'll extract the bits using shifts and masks instead)

This logic can be written as follows:

struct Color {
    r: int<8>,
    g: int<8>,
    b: int<8>
}

fn output_gen(control: OutputControl<Color>, t: Timing) -> bool {
    let t0h = t.us0_4;
    let t1h = t.us0_8;
    match control {
        OutputControl::Ret => false,
        OutputControl::Led$(payload: color, bit, duration) => {
            let color_concat = (color.g `concat` color.r `concat` color.b);
            let val = ((color_concat >> sext((23-bit))) & 1) == 1;
            let step_time = if val {t1h} else {t0h};
            if duration > step_time {
                false
            }
            else {
                true
            }
        }
    }
}
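If the shifting and masking feels opaque, the same logic can be tried out in plain Python first. This is a sketch with made-up function names, and it treats each color channel as an 8 bit pattern between 0 and 255, which is an assumption made to keep the model simple.

```python
def led_bit(g: int, r: int, b: int, bit: int) -> bool:
    """Bit number `bit` (0 is sent first) of the 24 bit g, r, b concatenation."""
    color_concat = (g << 16) | (r << 8) | b
    # Shift the wanted bit down to position 0, then mask away everything else
    return ((color_concat >> (23 - bit)) & 1) == 1


def output_level(g: int, r: int, b: int, bit: int, duration: int,
                 t0h: int, t1h: int) -> bool:
    """The signal is high while duration is at most the high time of the bit."""
    step_time = t1h if led_bit(g, r, b, bit) else t0h
    return duration <= step_time
```

For example, with g set to 0b1000_0000 the first bit sent is a 1, so with t1h set to 80 the output stays high up to and including duration 80 and goes low at 81.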

Testing

Again, it is good practice to test the module. Testing it is very similar to the state machine, except here we don't have a clock. Instead, we'll set a signal value, advance the simulation by a tiny time step, and assert the output. Here is an example of the test bench. Feel free to extend it with more tests that you think are reasonable. Here it might also be helpful to define some helper functions which check that a specific input gives a specific waveform, for example.

# top=main::output_gen

from spade import SpadeExt

import cocotb
from cocotb.triggers import Timer

@cocotb.test()
async def ret_works(dut):
    s = SpadeExt(dut)

    s.i.t = """Timing$(
        us280: 2800,
        us0_4: 40,
        us0_8: 80,
        us0_45: 45,
        us0_85: 85,
        us1_25: 125,
    )"""

    s.i.control = "OutputControl::Ret()"
    await Timer(1, units='ps')
    s.o.assert_eq("false")



@cocotb.test()
async def one_at_bit_0(dut):
    s = SpadeExt(dut)

    s.i.t = """Timing$(
        us280: 2800,
        us0_4: 40,
        us0_8: 80,
        us0_45: 45,
        us0_85: 85,
        us1_25: 125,
    )"""

    # Sending 1 @ bit 0, time 0
    s.i.control = "OutputControl::Led(Color$(g: 0b1000_0000, r: 0, b: 0), 0, 0)"
    await Timer(1, units='ps')
    s.o.assert_eq("true")

    # Sending 1 @ bit 0, time 40
    s.i.control = "OutputControl::Led(Color$(g: 0b1000_0000, r: 0, b: 0), 0, 40)"
    await Timer(1, units='ps')
    s.o.assert_eq("true")

    # Sending 1 @ bit 0, time 80
    s.i.control = "OutputControl::Led(Color$(g: 0b1000_0000, r: 0, b: 0), 0, 80)"
    await Timer(1, units='ps')
    s.o.assert_eq("true")

    # Sending 1 @ bit 0, time 81
    s.i.control = "OutputControl::Led(Color$(g: 0b1000_0000, r: 0, b: 0), 0, 81)"
    await Timer(1, units='ps')
    s.o.assert_eq("false")

If you want to see a more fleshed-out test, have a look at https://gitlab.com/TheZoq2/ws2812-spade/-/blob/e3ede5d50abf176f0ea5f0dcf6bfdcfb8b2228d8/test/output_gen.py.
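
A waveform-checking helper of the kind suggested above could start from a function that computes the expected output for every duration value. The sketch below is plain Python and independent of cocotb; expected_waveform and its parameters are hypothetical names, and a real test would still have to drive s.i.control and compare against s.o at each step.

```python
def expected_waveform(bit_value, t0h, t1h, length):
    """Expected output for duration = 0 .. length - 1 (hypothetical helper)."""
    high_time = t1h if bit_value else t0h
    # The output is high while duration <= high_time and low afterwards.
    return [duration <= high_time for duration in range(length)]


wave = expected_waveform(bit_value=1, t0h=40, t1h=80, length=125)
print(sum(wave))  # number of steps for which the output stays high
```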

With our tests now passing, we can finally run the code in hardware, which we will discuss in the next and final section of this chapter.

Testing in hardware

We are finally at a point where we think the code is correct, and all the pieces are implemented. It's time to test it on hardware.

To do so, we need to set up a demo entity which instantiates the state and output generators, and selects a nice color for them. We'll do this in a separate file, called src/hw_test.spade.

use lib::main::state_gen;
use lib::main::output_gen;
use lib::main::OutputControl;
use lib::main::Timing;
use lib::main::Color;

#[no_mangle(all)]
entity demo(clk: clock, pmod0: &mut int<6>) {
    reg(clk) rst initial(true) = false;

    // Our target FPGA, the ecpix5 has a 100 MHz clock.
    let t = Timing$(
        us280: 28000,
        us0_4: 40,
        us0_8: 80,
        us0_45: 45,
        us0_85: 85,
        us1_25: 125,
    );
    let ctrl: OutputControl<int<4>> = inst state_gen(clk, rst, 4, t);

    reg(clk) timer: int<32> reset(rst: 0) = if timer > 100_000_000 {
        0
    }
    else {
        trunc(timer+1)
    };

    reg(clk) offset: int<2> reset(rst: 0) = if timer == 0 {
        trunc(offset+1)
    }
    else {
        offset
    };

    let brightness = 64;
    let colors = [
        Color(brightness, 0, 0),
        Color(0, brightness, 0),
        Color(0, 0, brightness),
        Color(0, brightness, brightness),
    ];

    let with_color = match ctrl {
        OutputControl::Ret => OutputControl::Ret(),
        OutputControl::Led$(payload: led_num, bit, duration) => {
            let led_num = trunc(led_num + sext(offset));
            OutputControl::Led$(payload: colors[led_num], bit, duration)
        },
    };

    let pin = output_gen(with_color, t);

    set pmod0 = if pin {0b1} else {0};
}

There is not much going on here. Since we're in a different file, we need to import the items defined in the other file. lib refers to the library we are currently building, and since our code is in main.spade, the items are put in the main namespace.

Since our top module, demo, is going to connect to the external world, we'll mark it as #[no_mangle(all)]. This tells the Spade compiler to name things exactly what they are called in the Spade code. The downside of this is that we might collide with Verilog keywords, and the module demo will not have a namespaced name.

For the output, we also use a &mut int<6>. &mut is a mutable wire, i.e. a wire where we can set a value using set. It is an int<6> because the IO port pmod0 on the ecpix5 board we've been using as an example is 6 bits wide. The physical pins that pmod0 is mapped to are specified in the lpf file.

The line reg(clk) rst initial(true) = false; generates a reset signal that is active the first clock cycle after the FPGA has started.

To generate the output, we create our timing struct, this time with correct timings for the 100 MHz FPGA we're targeting. We use an array to look up color values for each of the LEDs we're going to test, and output those signals.
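
The cycle counts in the Timing struct follow directly from the clock frequency. As a rough sketch, they could be derived as shown below; cycles and timing_for are hypothetical helper names, not part of the project, and integer division rounds down when a duration is not a whole number of cycles.

```python
NS_PER_SECOND = 1_000_000_000


def cycles(duration_ns, clock_hz):
    """Number of whole clock cycles in duration_ns, using integer arithmetic."""
    return duration_ns * clock_hz // NS_PER_SECOND


def timing_for(clock_hz):
    """Hypothetical helper mirroring the fields of the Timing struct."""
    return {
        "us280": cycles(280_000, clock_hz),
        "us0_4": cycles(400, clock_hz),
        "us0_8": cycles(800, clock_hz),
        "us0_45": cycles(450, clock_hz),
        "us0_85": cycles(850, clock_hz),
        "us1_25": cycles(1_250, clock_hz),
    }


print(timing_for(100_000_000))
```

Working in nanoseconds keeps everything in integers, which avoids rounding surprises from values such as 0.45.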

Then we instantiate everything, and finally set the output pin to the resulting value. Here the LED strip is connected to the first pin of pmod0.

We also need to tell the synthesis tool what entity should be our top module; to do so, change the synthesis.top value in swim.toml to demo

[synthesis]
top = "demo"

With all that done, we can run swim upload, and look at our new RGB LEDs.

The pattern is static and boring at the moment, so this is a great opportunity to play around a bit and make the LEDs do something more interesting!

All the code for this project can be found at https://gitlab.com/TheZoq2/ws2812-spade

Language Reference

This chapter is a reference for individual features in the language.

Items

Anything that appears at the top level of a Spade file is an item. This includes units, types, (sub)modules, etc.

As a user, you will rarely encounter the term Item, though it might appear in parser errors if you write something unexpected at the top level of a file.

Units

Units are the basic building blocks of a Spade project. They correspond to modules in Verilog and entities in VHDL. Units come in three flavors: functions, pipelines and entities.

Functions

Functions are combinational circuits (or pure, in software terms). That is, they have no internal state and cannot read or set mutable wires.

Pipelines

Pipelines have a specified delay between input and output, and have explicit staging statements.

Entities

Finally, entities are the most general units. They can have state, and the input-output delay is arbitrary. They therefore have roughly the same programming model as VHDL and Verilog.

Type Declarations

Struct

struct declarations include a name, optional generic arguments and a list of fields. The fields in turn have a name and a type which may use the generic arguments.

struct Empty {}

struct NonGeneric {
    field_a: int<8>,
    field_b: bool
}

struct Generic<T> {
    field_a: T,
    field_b: bool
}

Enum

enum declarations also include a name and optional generic arguments. Their body consists of a list of variants. Each variant in turn has a name, and an optional list of fields

enum Enum {
    EmptyVariant,
    OneField{val: int<8>},
    TwoFields{val1: bool, val2: bool}
}

enum GenericEnum<T> {
    EmptyVariant,
    OneField{val: T}
}

Statements

The body of any unit, or block is a list of statements followed by a resulting expression. Statements can declare things local to the block and contain expressions to be evaluated

Let bindings

Let bindings bind a pattern to a value.

Those not used to bindings and patterns can view a let binding as assigning a value to a variable.

The pattern has to be an irrefutable pattern

If the type specification is omitted, the type is inferred.

Syntax

let pattern [: type specification] = expression ;

Examples

Binding a value to a variable

let a = some_value;

Binding a value to the result of an if expression

let a = if x {0} else {2};

Unpacking a tuple

let (a, b) = some_value;

Unpacking a struct with an explicit type signature

let Struct$(a, b): Struct<int<8>> = some_value;

Registers

Free-standing (i.e. non-pipelining) registers are defined using reg(clk) .... The register definition is quite complex and includes:

  • The clock signal which triggers an update
  • A pattern to bind the current value of the register to. It must be irrefutable
  • An optional type specification. Like let bindings, the type is inferred if the type signature is omitted
  • An optional reset consisting of a reset trigger and a reset value. Whenever the reset trigger is true, the value of the register is asynchronously set to the reset value
  • An expression which gives the new value

On the rising edge of the clock signal, the register is updated to the value of the new value expression. The new value expression can include variables from the register itself.

Syntax

reg( clock: expression ) pattern [: type specification] [reset( reset trigger: expression : reset value expression)] = new value: expression ;

Examples

A register which counts from -128 to 127 (Note that because no initial value is specified, this will be undefined in simulation):

reg(clk) value: int<8> = trunc(value + 1);

A register which counts from 0 to 200 (inclusive) and is reset to 0 by rst:

reg(clk) value reset(rst: 0) =
    if value == 200 {
        0
    } else {
        trunc(value + 1)
    };
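
For readers who prefer to think in software terms, the two registers above can be modelled as functions that compute the next value from the current one. This is a behavioural sketch in Python, not generated code: the function names are made up, and the asynchronous reset is simplified to a check at the clock edge.

```python
def trunc_to_int8(value):
    """Keep the low 8 bits and reinterpret them as a signed int<8>."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def next_wrapping(value):
    # Models: reg(clk) value: int<8> = trunc(value + 1);
    return trunc_to_int8(value + 1)


def next_bounded(value, rst):
    # Models the register that counts from 0 to 200 and is reset to 0 by rst.
    if rst:
        return 0
    return 0 if value == 200 else value + 1


# Clock the bounded counter 201 times, starting from its reset value.
value = 0
for _ in range(201):
    value = next_bounded(value, rst=False)
print(value, next_wrapping(127))
```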

Pipeline stage markers

Stage markers (reg;) are used in pipelines to specify where pipeline registers should be inserted. After a reg statement, all variables above the statement will be put in registers and any reference to those variables refer to the registered version.

Syntax

Repeated

In cases where more than one stage should be inserted without any new statements in between, there is a shorthand syntax:

reg * n;

where n is an integer. This is compiled into n simple reg statements, i.e.

reg * 3;

is the same as

reg;
reg;
reg;
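
The effect of a run of stage markers on a single value can be pictured as a delay line. The Python sketch below is a behavioural model under that assumption; delay_line is a hypothetical name, and None stands in for the undefined content of registers that have not been written yet.

```python
from collections import deque


def delay_line(samples, stages, initial=None):
    """Pass samples through `stages` registers, one sample per clock cycle."""
    regs = deque([initial] * stages, maxlen=stages)
    out = []
    for sample in samples:
        # The oldest register drives the output before the clock edge...
        out.append(regs[0] if stages else sample)
        # ...and the new sample enters the first register on the edge.
        regs.append(sample)
    return out


print(delay_line([1, 2, 3, 4, 5], stages=3))
```

With stages=3, matching reg * 3, each input appears at the output three cycles later.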

Conditioned

A condition for the registers to accept values can also be specified in square brackets

reg[condition]

The semantics of this are explained in the section on dynamic pipelines

Pipeline stage labels

Pipeline stages can be given names to refer to them from other stages. This is done using 'name.

  'first
  let x = ...;
reg;

To refer to a named stage, use a stage reference.

Set

Set the value of a mutable wire to the specified value.

set wire = value;

Set statements can only appear at the top block of a unit. This might be surprising as you would expect to be able to write


if condition {
  set wire = value;
}

However, this is not well-defined in hardware because the wire needs some value, but no value is specified if condition does not hold. This particular point isn't true if an else branch is also specified, but the exact hardware that gets generated from imperative code like this is not obvious, particularly with more nesting.

Therefore, if you want to write

if condition {
  set wire = on_true;
} else {
  set wire = on_false;
}

you should move the set statement outside to make it unconditional, i.e.

set wire =
  if condition {
    on_true
  } else {
    on_false
  };

Syntax

set expression = expression;

Assert

Takes a boolean condition and evaluates it, raising a runtime error in simulation if it ever evaluates to false. In synthesis, this is ignored.

assert this_should_be_0 == 0;

NOTE: Assert statements are currently not supported for simulation with Verilator, only with Icarus.

Comptime

TODO

Real world example

Expressions

An expression is anything that has a value. Like most languages, this includes things like integer literals, instantiations and operators. However, unlike the languages you may be used to, almost everything in Spade is an expression and has a value, for example if-expressions and match-blocks.

This means, among other things, that you can assign the 'result' of an if-expression to a variable:

let a = if choose_b {
    b
}
else {
    c
};

Blocks

A block is an expression which can contain sub-statements. They are delimited by {}, contain zero or more statements and usually end with an expression for the whole block's value.

let a = {
    let partial_result = ...; // Bind a partial result to a variable

    // 'return' the result of compute_on as the result of the block
    compute_on(partial_result)
};

Variables defined inside blocks are only visible in the block. For example, you cannot use partial_result outside the block above.

Blocks are required in places like bodies of if-expressions and functions, but can be used in any place where an expression is expected.

if-expressions

Syntax

if expression block else block

The if-expression looks a lot like an if-statement in languages you may be used to, but unlike most languages where if is used to conditionally do something, in Spade, it is used to select values.

For example, the following function returns a if select_a is true, otherwise it returns b.

fn select(select_a: bool, a: int<8>, b: int<8>) -> int<8> {
    if select_a {
        a
    } else {
        b
    }
}

This code makes heavy use of blocks. The body of the function, as well as each if-branch is a block.

In traditional hardware description languages, this would instead look like

fn select(select_a: bool, a: int<8>, b: int<8>) -> int<8> {
    var result;
    if select_a {
        result = a;
    } else {
        result = b;
    }
    return result
}

but the Spade version is much closer to the actual hardware that is generated. Hardware in general does not support conditional execution. Instead, it evaluates both branches and selects the result.
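
The same idea can be sketched in Python by computing both candidate values up front and letting the condition only pick between them. This is an illustrative model; mux and example are hypothetical names.

```python
def mux(condition, on_true, on_false):
    """Both inputs are already computed; the condition only picks one."""
    return on_true if condition else on_false


def example(choose_sum, x, y):
    total = x + y        # this adder always exists in the circuit
    difference = x - y   # and so does this subtractor
    return mux(choose_sum, total, difference)


print(example(True, 5, 3), example(False, 5, 3))
```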

match-expression

Syntax

match expression { pattern => expression , ... }

The match-expression is used to select a value based on the value of a single expression. It is similar to case statements in many languages, but supports pattern-matching which allows you to bind sub-values to variables. Typically, match statements are used on enum values:

enum MaybeByte {
    Some{value: uint<8>},
    None
}

fn byte_or_zero(in: MaybeByte) -> uint<8> {
    match in {
        // Bind the inner value to a new variable and return it
        MaybeByte::Some(value) => value,
        MaybeByte::None => 0,
    }
}

but they can also be used on values of any other type.

If more than one pattern matches the value, the first pattern will be selected.

A match statement must cover all possible values of the matched expression. If this is not the case, the compiler emits an error.
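
The first-match rule can be modelled in Python as an ordered list of arms. This is only a sketch: match_first is a hypothetical helper, the predicates are stand-ins for patterns (Spade patterns are more restricted than arbitrary predicates), and the exhaustiveness check that Spade performs at compile time can only be detected at run time here.

```python
def match_first(value, arms):
    """Return the result of the first arm whose pattern accepts value."""
    for accepts, result in arms:
        if accepts(value):
            return result(value)
    # Spade rejects a non-exhaustive match at compile time; this model
    # can only notice the problem when no arm matches at run time.
    raise ValueError("no arm matched")


arms = [
    (lambda v: v == 0, lambda v: "zero"),
    (lambda v: v < 10, lambda v: "small"),  # also accepts 0, but comes second
    (lambda v: True, lambda v: "other"),    # plays the role of `_`
]

print(match_first(0, arms), match_first(5, arms), match_first(50, arms))
```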

Instantiation

The three kinds of units are instantiated in different ways in order to highlight to readers of the code what might happen beyond an instantiation. For example if you see a function instantiation, you know that there will be no state or other weird behavior behind the instantiation.

The following syntax is used to instantiate the different kinds of units:

  • Functions: unit()
  • Entities: inst unit()
  • Pipelines: inst(<depth>) unit(). The depth is the depth of the pipeline

Instantiation rules

Functions can be instantiated anywhere. Entities and pipelines can only be instantiated in entities or pipelines.

In addition, pipelines instantiated in other pipelines check the delay to make sure that values are ready before they are readable. For example,

    let x = inst(3) subpipe();
    let y = function();
reg;
    read(x); // Compilation error. x takes 3 cycles to compute, but is read after 1
    read(y); // Allowed, function is pure so its output is available immediately
reg * 2;
    // Allowed, x has 3 stages internally, this will be the first value out of the pipeline
    read(x)
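
The rule behind these errors can be summarised as a comparison of stage numbers. The Python sketch below captures that reading of the example; is_readable is a hypothetical name, and the compiler's actual check may be more involved.

```python
def is_readable(defined_at, latency, read_at):
    """True if a value produced in stage `defined_at` by a unit with the
    given latency can be read in stage `read_at`."""
    return read_at >= defined_at + latency


# x comes from inst(3) subpipe() in stage 0, y from a pure function.
print(is_readable(0, 3, 1))  # x read after one reg
print(is_readable(0, 0, 1))  # y read after one reg
print(is_readable(0, 3, 3))  # x read after reg and reg * 2
```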

Array Indexing

Arrays can be indexed using []. Indices can either be single runtime integers such as [i], or compile-time ranges, such as [10:15]. Arrays are written and indexed as in most software languages: the leftmost array element is at index 0.

For example, [a, b, c][0:2] returns a new array [a, b]

Single element indexing

Non-range indices can be runtime integers. The size of the index is the smallest power of two that can represent the size of the array. However, if the array size is not an even power of two, indexing outside the range causes undefined behavior.

Range indexing

The indices for range indexing can only be raw integers, i.e. not runtime values. The leftmost, i.e. beginning of the range is included, and the end of the range is exclusive. For example, a[0:1] creates an array with a single element containing the first element of a.

Examples

let array = [10, 11, 12, 13, 14];

let _ = array[0]; // 10
let _ = array[1]; // 11
let _ = array[2]; // 12
let _ = array[5]; // Out of bounds access (array length is 5), result is undefined

let _ = array[0:1]; // [10]
let _ = array[1:3]; // [11, 12]
let _ = array[0:5]; // [10, 11, 12, 13, 14]
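
Range indexing follows the same start-inclusive, end-exclusive convention as Python list slicing, so the examples above can be mirrored directly. The sketch below also lists which index values fall outside a 5-element array when the index is assumed to be 3 bits wide; both helper names and that width are assumptions made for illustration.

```python
array = [10, 11, 12, 13, 14]


def slice_range(values, start, end):
    """Mirror of values[start:end]: start is included, end is excluded."""
    return values[start:end]


def out_of_bounds_indices(length, index_bits):
    """Index values that an index_bits-wide index can hold but that do not
    refer to an element of the array."""
    return [i for i in range(2 ** index_bits) if i >= length]


print(slice_range(array, 0, 1), slice_range(array, 1, 3))
print(out_of_bounds_indices(len(array), index_bits=3))
```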

Tuple indexing

Tuples can also be indexed, though tuple indexing uses #, for example tup#0 for the leftmost tuple value. Tuple indices must be known at compile time.

Stage references

TODO

Patterns

Patterns are used to bind values to variables, and as 'conditions' in match-expressions. Patterns match a set of values, and bind (essentially assign) a set of partial values to variables.

Name patterns

The simplest pattern is a variable name, like x. It matches all values, and binds the value to the name, x in this case.

Literal patterns

Integers and booleans can be matched on literals of their type. For example, true only matches booleans that are true and 10 only matches integers whose value is 10. Literal patterns do not bind any variables.

Tuple patterns

Another simple pattern is the tuple-pattern. It matches tuples of a specific length, and binds all elements of the tuples to sub-patterns. All patterns can be nested

For example

let ((a, b), c) = ((1, 2), 3);

will result in a=1, b=2 and c=3.

If parts of a tuple pattern are conditional, the pattern will only match if the subpatterns do. For example,

match (x, y) {
    (true, _) => true,
    _ => false,
}

will only return true if x is true, and false otherwise

Struct and enum patterns

Named patterns are used to match structs and enum variants. They consist of the name of the type or variant, followed by an argument list if the type has arguments.

Argument lists can be positional: () or named: $(). In a positional argument list, the fields of the type are matched based on the order of the fields. In a named list, patterns are instead bound by name, either field_name: pattern or just field_name, which binds a new local variable field_name to the field. Argument matching in patterns works the same way as in argument lists during instantiation.

This is best shown by examples

struct S {
    x: int<8>,
    y: int<8>,
}

// Positional pattern, binds `a` to the value of `x` and `b` to the value of `y`
S(a, b)
// Named pattern with no shorthand. The whole pattern matches if the `y` field is `0`
// in which case `a` will be bound to the value of `x`
S$(y: 0, x: a)
// Shorthand named. This binds a local variable `y` to the value of the field `y`. The field `x` is ignored.
S$(y, x: _)

enum variants work the same way, but only match the variant of the specified name. For example:

enum E {
    A,
    B{val: int<8>}
}

match e {
    E::A => {},
    E::B(0) => {},
    E::B(val) => {}
}

Wildcard

The wildcard pattern _ matches all values but does not bind the value to any variable. It is useful as a catch-all in match blocks.

For example, if we want to do something special for 0 and 1, but don't care about other values we might write:

match integer {
    0 => {},
    1 => {},
    _ => {}
}

Refutability

A pattern which matches all values of its type is irrefutable while one which only matches conditionally is refutable.

For example, a pattern unpacking a tuple is irrefutable because all values of type (T, Y) will match (a, b)

let tuple: (T, Y) = ...;
match tuple {
    (a, b) => {...}
}

while one which matches an enum variant is refutable because the None option will not match

enum Option<T> {
    Some{val: T},
    None,
}
match x {
    Some(x) => {...} // refutable: None not covered
    ...
}

Full documentation of the type system is yet to be written.

Primitive Types

These are the types built into the Spade language which aren't defined in the standard library.

bool

Generics

In a lot of cases, you want code to be generic over many different types, therefore both types and units support generic parameters.

Defining generics

Units and types which are generic have their generic parameters specified inside angle brackets (<>) after the name. The generics can be either integers, denoted by #, or types, which have no # sign. In the body of the generic item, the generic parameters are referred to by their names.

For example a struct storing an array of arbitrary length and type is defined as

struct ContainsArray<T, #N> {
    inner: [T; N]
}

Using generics

When specifying generic parameters, angle brackets (<>) are also used. For example, a function which takes a ContainsArray with 5 8-bit integers is defined as

fn takes_array(a: ContainsArray<int<8>, 5>) {
    ...
}

Ports and Wires

See the user-level documentation

Dynamic Pipelines

NOTE: Dynamic pipelines are experimental and have soundness issues when nested. If you use them, make sure that there are no sub-pipelines that overlap with conditional registers.

For conditionally executing pipelines, an enable condition can be specified on the reg statement. If this condition is false, the old value of all pipeline registers for this stage will be held, rather than being updated to the new values.

The enable condition is specified as follows:

pipeline(1) pipe(clk: clock, condition: bool, x: bool) -> bool {
    reg[condition];
       x
}

where condition is a boolean expression. When it is true, all the registers for this stage are updated; when it is false, the register content is undefined [1].

The above code is compiled down to the equivalent of

entity pipe(clk: clock, condition: bool, x: bool) -> bool {
    reg(clk) condition_s1 = if condition {condition} else {condition_s1};
    reg(clk) x_s1 = if condition {x} else {x_s1};
    x_s1
}
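
The behaviour of one such register can be modelled in a few lines of Python. This is a sketch of the hold-when-disabled behaviour shown in the expansion above; EnabledReg is a hypothetical name.

```python
class EnabledReg:
    """A register that only takes a new value when its enable is true."""

    def __init__(self, initial=None):
        self.value = initial

    def clock(self, enable, new_value):
        # Mirrors: reg(clk) x_s1 = if condition {x} else {x_s1};
        if enable:
            self.value = new_value
        return self.value


x_s1 = EnabledReg(initial=0)
print(x_s1.clock(True, 5), x_s1.clock(False, 9), x_s1.clock(True, 9))
```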

Pipeline enable conditions propagate to stages above the enabled stage, in order to make sure that values are not flushed. This means that in the following code

pipeline(3) pipe(clk: clock, x: bool) -> bool {
    reg;
    reg[inst check_condition()];
    reg;
        x
}

the first two stages will be disabled and keep their old value when check_condition returns false while the registers in the final stage will update unconditionally.

If several conditions are used, they are combined, i.e. in

pipeline(5) pipe(clk: clock, x: bool) -> bool {
    reg;
    reg[inst check_condition()];
    reg;
    reg[inst check_other_condition()];
    reg;
        x
}

the first two stages will update only if both check_condition() and check_other_condition() are true, and the next two registers are only going to update if check_other_condition is true.
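
One way to picture this propagation is to compute, for every register, the AND of its own condition and all conditions further down the pipeline. The Python sketch below does exactly that; effective_enables is a hypothetical name, and an unconditional reg; is represented by True.

```python
def effective_enables(conditions):
    """conditions[i] is the condition on register i (True for a plain reg;).
    Register i updates only if its own condition and every condition
    further downstream hold."""
    enables = []
    downstream_ok = True
    for condition in reversed(conditions):
        downstream_ok = downstream_ok and condition
        enables.append(downstream_ok)
    return list(reversed(enables))


check_condition, check_other_condition = False, True
# The five registers of the example above, from first to last.
print(effective_enables(
    [True, check_condition, True, check_other_condition, True]
))
```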

stage.ready and stage.valid

In some cases it is necessary to check if a stage will be updated on the next cycle (ready) or if the values in the current cycle are valid. This is done using stage.valid and stage.ready.

stage.ready is true if the registers directly following the statement will update their values this cycle, i.e. if their condition and the conditions of all downstream registers are met.

stage.valid is true if the values in the current stage were enabled, i.e. if none of the conditions for any registers this value flowed through were false.

NOTE: stage.valid currently initializes as undefined, and needs time to propagate through the pipeline. It is up to the user to ensure that a reset signal is asserted long enough for stage.valid to stabilize.

Example: Processor

This is part of a processor that stalls the pipeline in order to allow 3 cycles for a load instruction. The program_counter entity takes a signal indicating whether it should count up, or stall. This signal is set to stage.ready, to ensure that if the downstream registers don't accept new instructions, the program counter will stall.

pipeline(5) cpu(clk: clock) -> bool {
        let pc = inst program_counter$(clk, stall: stage.ready);
    reg;
        let insn = inst(1) program_memory(clk);
        let stall = stage(+1).is_load || stage(+2).is_load || stage(+3).is_load;
    // The condition is an enable, so the stage only updates when not stalling
    reg[!stall];
        let is_load = is_load(insn);
    reg;
        let alu_out = alu(insn);
    reg;
    reg;
        let regfile_write = if stage.valid && insn_writes(insn) {Some(alu_out)} else {None()};

        true // NOTE: Dummy output, we need to return something
}

The last line, where regfile_write is set, uses stage.valid to ensure that results of an instruction are only written for valid signals, not signals being undefined due to a stalled register.

Example: Latency Insensitive Interface

A common design method in hardware is to use a ready/valid interface. Here, a downstream unit can communicate that it is ready to receive data by asserting a ready signal, and an upstream unit indicates that its data is valid using a valid signal. If both ready and valid are set, the upstream unit hands over a piece of data to the downstream unit. What follows is an example of a pipelined multiplier that propagates a ready/valid signal from its downstream unit to its upstream unit.

struct port Rv<T> {
    data: &T,
    valid: &bool,
    ready: &mut bool
}

pipeline(4) mul(clk: clock, a: Rv<int<16>>, b: Rv<int<16>>) -> Rv<int<32>> {
        let product = *a.data * *b.data;
        set a.ready = stage.ready;
        set b.ready = stage.ready;
    reg[*a.valid && *b.valid];
    reg;
    reg; 
        let downstream_ready = inst new_mut_wire();
    reg[inst read_mut_wire(downstream_ready)];
        Rv$(
            data: &product,
            valid: &stage.valid,
            ready: downstream_ready,
        )
}
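
The handshake rule itself, that data moves only in cycles where both signals are set, can be sketched in Python as follows. transfers is a hypothetical name, and the lists hold one sample of each signal per clock cycle.

```python
def transfers(valid, ready):
    """Return the cycles in which data moves from upstream to downstream."""
    return [
        cycle
        for cycle, (v, r) in enumerate(zip(valid, ready))
        if v and r
    ]


valid = [True, True, False, True]
ready = [False, True, True, True]
print(transfers(valid, ready))
```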
[1] Currently, the implementation holds the previous value of the register, which will also be done in hardware. However, this might change to setting the value to X for easier debugging, and to give more optimization opportunities for the synthesis tool.

Binding

Constructs by syntax

This is a list of syntax constructs in the language with a link to a description of the underlying language construct. The list is split in two: constructs which start with a keyword and those which do not

Keywords

Symbolic

Swim

Swim is a batteries-included build system and package manager for the Spade programming language. It manages rebuilds of Spade source code, the compiler and any additional Verilog, supports simulation using your favorite simulators (icarus, verilator), and automates synthesis for ECP5, iCE40 and Gowin using yosys and nextpnr. The generated Verilog can also, of course, be used with any other tool.

Learn how to:

Installing Swim

Swim can be installed using the Rust package manager, cargo. To install Rust and cargo, follow the instructions at rustup.rs.

Once cargo is installed, you can install the latest development version of Swim by running:

cargo install --git https://gitlab.com/spade-lang/swim

Remember to add the ~/.cargo/bin/ directory to your PATH if you haven't already.

If you want to use the simulation and "place and route" features you will also need a synthesis tool like yosys.

Using Swim

Run swim init <project name> to create a new project in a subdirectory named <project name>. It'll set up a basic project to serve as a jumping-off point.

To compile the Spade code to Verilog, run swim build or swim b. This will build the Spade compiler and compile your code to build/spade.sv.

swim help will list the builtin swim subcommands. Here is a short excerpt:

Commands:
  build         [aliases: b]
  synth         [aliases: syn]
  pnr           [aliases: p]
  upload        [aliases: u]
  simulate      [aliases: sim, test, t]
  init          Initialise a new swim project in the specified directory. Optionally copies from a template
  update        Updates all external dependencies that either have a set branch or tag, or hasn't been downloaded locally
  update-spade
  restore       Restore (discard) changes made to git-dependencies (including the compiler)
  clean
  help          Print this message or the help of the given subcommand(s)

For reference projects for configuration see swim-templates.

Custom subcommands

swim also supports custom subcommands as follows: if there is a binary named swim-xxx in your path, then calling swim xxx arg1 arg2 will dispatch to swim-xxx arg1 arg2. We have a list of known community subcommands like cargo's own community subcommand wiki -- if you've made one, feel free to add it!
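
The lookup rule can be illustrated with a few lines of Python. This is not how Swim is implemented (Swim's implementation is not shown in this chapter); resolve_subcommand is a hypothetical helper that only demonstrates the mapping from swim xxx arg1 arg2 to swim-xxx arg1 arg2.

```python
import shutil


def resolve_subcommand(argv, which=shutil.which):
    """Map ['xxx', 'arg1'] to ['/path/to/swim-xxx', 'arg1'].

    Returns None if no binary named swim-xxx is found in PATH."""
    if not argv:
        return None
    binary = which(f"swim-{argv[0]}")
    if binary is None:
        return None
    return [binary, *argv[1:]]


# The result depends on what is installed on the machine running this.
print(resolve_subcommand(["clean-all", "--help"]))
```

The which parameter exists only so the lookup can be replaced in tests.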

Simulation and test benches

Swim supports running test benches for your code. Before you do so, you must add a few lines to swim.toml

[simulation]
testbench_dir = "test"

Test benches are currently written in Verilog. Place your test benches in a directory called test. Swim will build and run each Verilog file in that directory separately, and if the exit code of the simulator is 0, the test is considered successful.

Finally, run swim test to test your code.

For sample projects for configuration see swim-templates.

Synthesis, place and route

Swim can also synthesise and place and route your project using yosys and nextpnr. Ensure those are installed, then add sections for [synthesis] and [pnr] to your config file. The exact options you need to specify depend on the architecture, but swim should tell you which fields you need to set. As an example, here is the synthesis configuration for an ECP5 based FPGA:

[synthesis]
top = "e_main"
extra_verilog = []
command = "synth_ecp5"

[pnr]
architecture = "ecp5"
device = "LFE5UM-85F"
pin_file = "pins.lpf"
package = "CABGA381"

To synthesise your code, call swim synth and to place and route, call swim pnr. Swim will make sure the prerequisite steps are performed for you, so if your end goal is pnr, you can call swim pnr directly.

For sample projects for configuration, see swim-templates.

Upload

Swim can also upload your code for a few FPGAs. To get started, add an [upload] section to your config. Like synthesis, the exact options depend on your target, so let the error messages from swim guide the configuration.

To upload, call swim upload.

Templates

If you're using a supported board you can copy a template repository which contains a project that's ready to upload. Check available boards with swim init --list-boards and then swim init --board <board>.

A note on the Spade compiler and submodules

As Spade is still early in development, it is useful to have each project pinned to a specific compiler version, rather than having a global copy of the compiler. This means that it will still build in the future even if breaking changes are made to the language.

By default, Swim tracks the compiler version in a file called swim.lock that is created on the first build. It is probably a good idea to track this using git or another VCS. If you want to update to the newest version of the Spade compiler, run swim update-spade and commit the updated swim.lock.

If you prefer keeping your own submodule (perhaps you want to make your own changes to the compiler?), you can also set up a path dependency and track it like any other submodule. For example:

git submodule add https://gitlab.com/spade-lang/spade.git spade
git commit -m "Add Spade submodule"

And then, instruct Swim to use a path to the compiler instead by changing your swim.toml to this:

compiler = { path = "spade" }

Using another compiler branch

You can depend on a specific branch by setting the compiler-field in your swim.toml:

compiler = { git = "https://gitlab.com/spade-lang/spade.git", branch = "another-branch" }

After setting the field, run swim update-spade to update the pinned compiler.

You can also change the repository, if you wish.

Using a global Spade compiler

If you prefer using a global compiler, you can set the compiler field to point to an absolute path to the root directory for a local Spade compiler repository:

compiler = { path = "/path/to/spade-repo" }

You can read more about configuring Swim in the docs.

Debugging spadec

In rare cases where you want to attach a debugger to the Spade compiler, you can use --debug-spadec.

Community Subcommands

Swim, like cargo, supports extending its functionality with community-defined subcommands. Feel free to add your own subcommand to this list!

Name            Description
swim-clean-all  Inspired by cargo-clean-all: recursively clean swim projects

Config

The main project configuration specified in swim.toml

Summary

# The name of the library. Must be a valid Spade identifier
# Anything defined in this library will be under the `name` namespace
name = "…"
# List of optimization passes to apply in the Spade compiler. The passes are applied
# in the order specified here. Additional passes specified on individual modules with
# #[optimize(...)] are applied before global passes.
optimizations = ["…", …]
# List of commands to run before anything else.
preprocessing = ["…", …] # Optional
# Paths to verilog files to include in all verilog builds (simulation and synthesis).
# Supports glob syntax
extra_verilog = ["…", …] # Optional
# Map of libraries to include in the build.
# 
# Example:
# ```toml
# [libraries]
# protocols = {git = "https://gitlab.com/TheZoq2/spade_protocols.git"}
# spade_v = {path = "deps/spade-v"}
# ```
libraries = {key: <Library>, …} # Optional
# Plugins to load. Specifies the location as a library, as well
# as arguments to the plugin
# 
# Example:
# ```toml
# [plugins.loader_generator]
# path = "../plugins/loader_generator/"
# args.asm_file = "asm/blinky.asm"
# args.template_file = "../templates/program_loader.spade"
# args.target_file = "src/programs/blinky_loader.spade"
# 
# [plugins.flamegraph]
# git = "https://gitlab.com/TheZoq2/yosys_flamegraph"
# ```
# 
# Plugins contain a `swim_plugin.toml` which describes their behaviour.
# See [crate::plugin::config::PluginConfig] for details
plugins = {key: <Plugin>, …} # Optional

# Where to find the Spade compiler. See [Library] for details
[compiler]
<Library>

[simulation] # Optional
<Simulation>

[synthesis] # Optional
<Synthesis>

# Preset board configuration which can be used instead of synthesis, pnr, packing and upload
[board] # Optional
<Board>

[pnr] # Optional
<Pnr>

[packing] # Optional
<PackingTool>

[upload] # Optional
<UploadTool>

[log_output]
<LogOutputLevel>

name String

The name of the library. Must be a valid Spade identifier. Anything defined in this library will be under the name namespace.

optimizations [String]

List of optimization passes to apply in the Spade compiler. The passes are applied in the order specified here. Additional passes specified on individual modules with #[optimize(...)] are applied before global passes.

compiler Library

Where to find the Spade compiler. See [Library] for details

preprocessing [String]

List of commands to run before anything else.

extra_verilog [String]

Paths to verilog files to include in all verilog builds (simulation and synthesis). Supports glob syntax
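As a rough illustration, the top-level fields described so far could be combined as in the fragment below. This is a sketch rather than a complete swim.toml: the project name, the script path and the glob are hypothetical placeholders, and the compiler is given in the Git form of a Library.

```toml
# Hypothetical project name; must be a valid Spade identifier
name = "blinky"

# Hypothetical script, run before anything else
preprocessing = ["python3 scripts/generate_rom.py"]

# Included in both simulation and synthesis builds; glob syntax is supported
extra_verilog = ["verilog/*.v"]

# Where to find the Spade compiler
[compiler]
git = "https://gitlab.com/spade-lang/spade.git"
```

Note that TOML requires plain keys such as name to appear before the first table header, which is why [compiler] comes last here.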

simulation Simulation

synthesis Synthesis

board Board

Preset board configuration which can be used instead of synthesis, pnr, packing and upload

pnr Pnr

packing PackingTool

upload UploadTool

libraries Map[String => Library]

Map of libraries to include in the build.

Example:

[libraries]
protocols = {git = "https://gitlab.com/TheZoq2/spade_protocols.git"}
spade_v = {path = "deps/spade-v"}

plugins Map[String => Plugin]

Plugins to load. Specifies the location as a library, as well as arguments to the plugin

Example:

[plugins.loader_generator]
path = "../plugins/loader_generator/"
args.asm_file = "asm/blinky.asm"
args.template_file = "../templates/program_loader.spade"
args.target_file = "src/programs/blinky_loader.spade"

[plugins.flamegraph]
git = "https://gitlab.com/TheZoq2/yosys_flamegraph"

Plugins contain a swim_plugin.toml which describes their behaviour. See [crate::plugin::config::PluginConfig] for details

log_output LogOutputLevel

LogOutputLevel

One of these strings:

  • "Full"
  • "Minimal"

Plugin

Summary

args = {key: "…", …}

[lib]
<Library>

lib Library

args Map[String => String]

UploadTool

One of the following:

icesprog

tool = "icesprog"

Fields

iceprog

tool = "iceprog"

Fields

tinyprog

tool = "tinyprog"

Fields

openocd

tool = "openocd"
config_file = "path/to/file"

Fields

config_file FilePath

fujprog

tool = "fujprog"

Fields

custom

Instead of running a pre-defined set of commands to upload, run the specified list of commands in a shell. #packing_result# will be replaced by the packing output

tool = "custom"
commands = ["…", …]

Fields

commands [String]
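A minimal sketch of a custom upload step might look like the following. It assumes that iceprog is available on the PATH and uses its -S option to program SRAM instead of flash; the choice of command is only an example, and any shell command can be used in its place.

```toml
[upload]
tool = "custom"
# Swim replaces #packing_result# with the packing output before running the command.
# The iceprog invocation is an assumption made for this example.
commands = ["iceprog -S #packing_result#"]
```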

PackingTool

One of the following:

icepack

tool = "icepack"

Fields

ecppack

tool = "ecppack"
idcode = "…" # Optional

Fields

idcode String

Pnr

One of the following:

ice40

architecture = "ice40"

[device]
<Ice40Device>
package = "…"
# If set, inputs and outputs of the top module do not need a corresponding field
# in the pin file. This is helpful for benchmarking when pin mapping is irrelevant, but
# when running in hardware, it is recommended to leave this off in order to get a warning
# when pins aren't set in the pin file.
allow_unconstrained = true|false
# Continue to the upload step even if the timing isn't met.
# This is helpful when you suspect that the place-and-route tool is conservative
# with its timing requirements, but gives no guarantees about correctness.
allow_timing_fail = true|false
# The path to a file which maps inputs and outputs of your top module to physical pins.
# On ECP5 chips, this is an `lpf` file, and on iCE40, it is a `pcf` file.
pin_file = "path/to/file"

Fields

device Ice40Device

package String

allow_unconstrained bool

If set, inputs and outputs of the top module do not need a corresponding field in the pin file. This is helpful for benchmarking when pin mapping is irrelevant, but when running in hardware, it is recommended to leave this off in order to get a warning when pins aren't set in the pin file.

allow_timing_fail bool

Continue to the upload step even if the timing isn't met. This is helpful when you suspect that the place-and-route tool is conservative with its timing requirements, but gives no guarantees about correctness.

pin_file FilePath

The path to a file which maps inputs and outputs of your top module to physical pins. On ECP5 chips, this is an lpf file, and on iCE40, it is a pcf file.
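For illustration, an iCE40 place-and-route section could be written roughly as follows. The device string comes from the Ice40Device list; the package name and the pin file path are assumptions chosen for this sketch, and the two allow_ flags are spelled out explicitly even though leaving them off is the recommended setting for hardware runs.

```toml
[pnr]
architecture = "ice40"
device = "iCE40UP5K"          # one of the Ice40Device strings
package = "sg48"              # assumed package name for this device
pin_file = "pins.pcf"         # hypothetical path; iCE40 pin constraints use the PCF format
allow_unconstrained = false   # warn if a top-level port has no pin assignment
allow_timing_fail = false     # stop before upload if timing is not met
```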

ecp5

architecture = "ecp5"

[device]
<Ecp5Device>
package = "…"
# If set, inputs and outputs of the top module do not need a corresponding field
# in the pin file. This is helpful for benchmarking when pin mapping is irrelevant, but
# when running in hardware, it is recommended to leave this off in order to get a warning
# when pins aren't set in the pin file.
allow_unconstrained = true|false
# Continue to the upload step even if the timing isn't met.
# This is helpful when you suspect that the place-and-route tool is conservative
# with its timing requirements, but gives no guarantees about correctness.
allow_timing_fail = true|false
# The path to a file which maps inputs and outputs of your top module to physical pins.
# On ECP5 chips, this is an `lpf` file, and on iCE40, it is a `pcf` file.
pin_file = "path/to/file"

Fields

device Ecp5Device

package String

allow_unconstrained bool

If set, inputs and outputs of the top module do not need a corresponding field in the pin file. This is helpful for benchmarking when pin mapping is irrelevant, but when running in hardware, it is recommended to leave this off in order to get a warning when pins aren't set in the pin file.

allow_timing_fail bool

Continue to the upload step even if the timing isn't met. This is helpful when you suspect that the place-and-route tool is conservative with its timing requirements, but gives no guarantees about correctness.

pin_file FilePath

The path to a file which maps inputs and outputs of your top module to physical pins. On ECP5 chips, this is an lpf file, and on iCE40, it is a pcf file.

Ecp5Device

One of these strings:

  • "LFE5U-12F"
  • "LFE5U-25F"
  • "LFE5U-45F"
  • "LFE5U-85F"
  • "LFE5UM-25F"
  • "LFE5UM-45F"
  • "LFE5UM-85F"
  • "LFE5UM5G-25F"
  • "LFE5UM5G-45F"
  • "LFE5UM5G-85F"

Ice40Device

One of these strings:

  • "iCE40LP384"
  • "iCE40LP1K"
  • "iCE40LP4K"
  • "iCE40LP8K"
  • "iCE40HX1K"
  • "iCE40HX4K"
  • "iCE40HX8K"
  • "iCE40UP3K"
  • "iCE40UP5K"
  • "iCE5LP1K"
  • "iCE5LP2K"
  • "iCE5LP4K"

Board

One of the following:

Ecpix5

name = "Ecpix5"
pin_file = "path/to/file" # Optional
config_file = "path/to/file" # Optional

Fields

pin_file FilePath

config_file FilePath

GoBoard

name = "GoBoard"
pcf = "path/to/file" # Optional

Fields

pcf FilePath

tinyfpga-bx

name = "tinyfpga-bx"
pcf = "path/to/file" # Optional

Fields

pcf FilePath

Icestick

name = "Icestick"
pcf = "path/to/file" # Optional

Fields

pcf FilePath
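As a small sketch, a project targeting one of the preset boards could use a [board] section in place of separate synthesis, pnr, packing and upload sections. The pin file path below is a hypothetical placeholder, and the field is optional.

```toml
[board]
name = "tinyfpga-bx"
# Optional; hypothetical path to a project-specific pin file
pcf = "pins/tinyfpga_bx.pcf"
```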

Synthesis

Summary

# The name of the unit to use as a top module for the design. The name must
# be an absolute path to the unit, for example `proj::main::top`, unless the
# module is marked `#[no_mangle]` in which case the name is used.
# 
# Can also be set to the name of a module defined in verilog if a pure verilog top
# is desired.
top = "…"
# The yosys command to use for synthesis
command = "…"
# Extra verilog files only needed during the synthesis process.
# Supports glob syntax
extra_verilog = ["…", …] # Optional

top String

The name of the unit to use as a top module for the design. The name must be an absolute path to the unit, for example proj::main::top, unless the module is marked #[no_mangle] in which case the name is used.

Can also be set to the name of a module defined in verilog if a pure verilog top is desired.

command String

The yosys command to use for synthesis

extra_verilog [String]

Extra verilog files only needed during the synthesis process. Supports glob syntax
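A synthesis section using these fields might look like the sketch below. The top name reuses the proj::main::top example from above, synth_ice40 is the yosys synthesis command for iCE40 targets, and the glob is a hypothetical placeholder.

```toml
[synthesis]
# Absolute path to the top unit; a bare name works only for #[no_mangle] units
top = "proj::main::top"
# yosys command used for synthesis
command = "synth_ice40"
# Hypothetical glob for Verilog files that are only needed when synthesizing
extra_verilog = ["synth_only/*.v"]
```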

Simulation

Summary

# Directory containing all test benches
testbench_dir = "path/to/file"
# Extra dependencies to install to the test venv via pip
python_deps = ["…", …] # Optional
# The simulator to use as the cocotb backend. Currently verified to support verilator and
# icarus, but other simulators supported by cocotb may also work.
# 
# Defaults to 'icarus'
# 
# Requires a relatively recent version of verilator
simulator = "…"
# The C++ version to use when compiling verilator test benches. Anything that
# clang or gcc accepts in the -std= field works, but the verilator wrapper requires
# at least c++17.
# Defaults to c++17
cpp_version = "…" # Optional
# Extra arguments to pass to verilator when building C++ test benches. Supports substituting
# `#ROOT_DIR#` to get project-relative directories
verilator_args = ["…", …] # Optional

testbench_dir FilePath

Directory containing all test benches

python_deps [String]

Extra dependencies to install to the test venv via pip

simulator String

The simulator to use as the cocotb backend. Currently verified to support verilator and icarus, but other simulators supported by cocotb may also work.

Defaults to 'icarus'

Requires a relatively recent version of verilator

cpp_version String

The C++ version to use when compiling verilator test benches. Anything that clang or gcc accepts in the -std= field works, but the verilator wrapper requires at least c++17. Defaults to c++17.

verilator_args [String]

Extra arguments to pass to verilator when building C++ test benches. Supports substituting #ROOT_DIR# to get project-relative directories
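The simulation fields can be combined as in the following sketch. The directory name, the pip package and the include directory are assumptions made for illustration; -I is verilator's include-path option.

```toml
[simulation]
testbench_dir = "test"        # hypothetical directory holding the test benches
python_deps = ["numpy"]       # example extra package for the test venv
simulator = "verilator"       # defaults to "icarus" when left unset
cpp_version = "c++20"         # must be at least c++17
# #ROOT_DIR# is substituted to give a project-relative directory
verilator_args = ["-I#ROOT_DIR#/verilog"]
```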

Library

Location of a library or external code. Either a link to a git repository, or a path relative to the root of the project.

compiler = {git = "https://gitlab.com/spade-lang/spade/"}
path = "compiler/"

One of the following:

Git

Downloaded from git and managed by swim


git = "…"
commit = "…" # Optional
tag = "…" # Optional
branch = "…" # Optional

Fields

git String

commit String

tag String

branch String
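As an illustration of the Git form, the fragment below pins the compiler to a tag and a library to a branch. The tag and branch names are hypothetical and should be replaced with ones that actually exist in the respective repositories; the sketch also sets only one of commit, tag and branch per entry, since how they interact is not described here.

```toml
[compiler]
git = "https://gitlab.com/spade-lang/spade.git"
tag = "v0.10.0"    # hypothetical tag name

[libraries.protocols]
git = "https://gitlab.com/TheZoq2/spade_protocols.git"
branch = "main"    # hypothetical branch name
```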

Path

A library at the specified path. The path is relative to swim.toml


path = "path/to/file"

Fields

path FilePath

PluginConfig

Summary

# True if this plugin needs the CXX bindings for the Spade compiler to be built
requires_cxx = true|false
# Commands required to build the plugin. Run before any project compilation steps
build_commands = ["…", …]
# The files which this plugin produces
builds = [<BuildResult>, …]
# Arguments which must be set in the `swim.toml` of projects using the plugin
required_args = ["…", …]
# Commands to run after building swim file but before anything else
post_build_commands = ["…", …]
# Commands which the user can execute
commands = {key: <PluginCommand>, …}

# Things to do during the synthesis process
[synthesis] # Optional
<SynthesisConfig>

requires_cxx bool

True if this plugin needs the CXX bindings for the Spade compiler to be built

build_commands [String]

Commands required to build the plugin. Run before any project compilation steps

builds [BuildResult]

The files which this plugin produces

required_args Set[String]

Arguments which must be set in the swim.toml of projects using the plugin

post_build_commands [String]

Commands to run after building swim file but before anything else

synthesis SynthesisConfig

Things to do during the synthesis process

commands Map[String => PluginCommand]

Commands which the user can execute
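To make the shape of a swim_plugin.toml more concrete, here is a sketch assembled from the fields above and from the PluginCommand, BuildStep, SynthesisConfig and BuildResult types described below. All paths, command lines and the report command name are hypothetical, and the nesting (an array of tables for builds, plain strings for build steps) is an interpretation of the summaries rather than a verified file.

```toml
requires_cxx = false
# Hypothetical build command, run before any project compilation steps
build_commands = ["cargo build --release"]
# Projects using the plugin must set args.asm_file in their swim.toml
required_args = ["asm_file"]
post_build_commands = []

# One entry per file that the plugin produces
[[builds]]
path = "generated/loader.spade"   # hypothetical output file
needed_in = "SpadeBuild"          # first build step that needs the file

# A user-invocable command, here hypothetically named "report"
[commands.report]
script = ["python3 scripts/report.py %args%"]
after = "Synthesis"

[synthesis]
# yosys commands appended after the normal flow; stat prints design statistics
yosys_post = ["stat"]
```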

PluginCommand

Summary

# List of system commands to run in order to execute the command
# 
# Commands specified by the user, i.e. whatever is after `swim plugin <command>`
# is string replaced into `%args%` in the resulting command string. The arguments
# are passed as strings, to avoid shell expansion
script = ["…", …]

# The build step after which to run this command
[after]
<BuildStep>

script [String]

List of system commands to run in order to execute the command

Commands specified by the user, i.e. whatever is after swim plugin <command> is string replaced into %args% in the resulting command string. The arguments are passed as strings, to avoid shell expansion

after BuildStep

The build step after which to run this command

BuildStep

One of these strings:

  • "Start": before any other processing takes place
  • "SpadeBuild"
  • "Simulation"
  • "Synthesis"
  • "Pnr"
  • "Upload"

SynthesisConfig

Summary

# Yosys commands to run after the normal yosys flow
yosys_post = ["…", …]

yosys_post [String]

Yosys commands to run after the normal yosys flow

BuildResult

Summary

# The path of a file built by this build step
path = "…"

# The first build step for which this file is required. This will trigger
# a re-build of this build step if the file was changed
[needed_in]
<BuildStep>

path String

The path of a file built by this build step

needed_in BuildStep

The first build step for which this file is required. This will trigger a re-build of this build step if the file was changed

Compiler Internals

This chapter describes some internals of the compiler and details about code generation. Normally, this is not relevant to users of the language.

Naming

This chapter describes the naming scheme used by the compiler when generating Verilog. The goal of the Verilog generator is not to generate readable Verilog, but there should be a clear two-way mapping between signal names in the source Spade code and the generated Verilog. This mapping should be clear both to users reading lists of signals, for example in VCD files, and to tools, for example VCD parsers.

Variables

Because Spade does not have the same scoping rules as Verilog, some deconfliction of names internal to a Verilog module is needed.

If a name x only occurs once in a unit, the corresponding Verilog name is \x . (This uses Verilog's escaped identifier syntax, and some tools may report the name as x.) If x occurs more than once, subsequent occurrences are given an index, assigned sequentially in the order that they are visited during AST lowering (see footnote 1). The kth occurrence of a name is suffixed by _n{k}.

Pipelined versions of names are suffixed with _s{n} where n is the absolute stage index of the stage.

Names of port type with mutable wires have an additional variable for the mutable bits. This follows the same naming scheme as the forward name, but is suffixed by _mut.

The following is an example of the naming scheme:

pipeline(1) pipe(
    x: bool, // "\x "
    y: (&bool, &mut bool) // "\y ", "\y_mut "
) {
        if true {
            let x = true; // "x_n1"
        } else {
            let x = false; // "x_n2"
        }
        let x = true; // "x_n3"
    reg; // "\x_s1 ", "x_n3_s1"
        let z = true; // "\z "
}

Spade makes no guarantees about name uniqueness between generated Verilog modules.

1

This is currently the lexical order of the occurrences, i.e. names which occur early in the module are given lower indices.

Type Representation

Description of the Verilog representation of Spade types

Mixed-direction types

Types with mixed direction wires are split in two separate variables, typically <name> and <name>_mut. The structure of the forward part is the same as if the backward part didn't exist, and the backward part is structured as if it were the forward part.

For example, (int<8>, &mut int<9>, int<2>, &mut int<3>) is stored as (int<8>, int<2>) and (int<9>, int<3>).

Tuples

Tuples are stored with their members packed side to side, with the 0th member on the left.

let x: (int<8>, int<2>, bool) = (a, b, c);

is represented as

logic [10:0] x;
assign x = {a,b,c};

Binary representation

aaaaaaaabbc

Enums

Enums are packed with the leftmost bits being the discriminant and the remaining bits being the payload. The payload is packed left-to-right, meaning that the rightmost bits are undefined if a variant is smaller than the largest variant.

enum A {
    V1(a: int<8>),
    V2(b: int<2>),
    V3(c: bool)
}

    9 8 7 6 5 4 3 2 1 0
    t t p p p p p p p p
    -------------------
V1: 0 0 a a a a a a a a
V2: 0 1 b b X X X X X X
V3: 1 0 c X X X X X X X