The Spade Book
By Frans Skarman, with contributions from the community
Spade is a Rust-inspired hardware description language.
Learn how to install Spade and set up your editor. Here are some suggestions to get started:
- For a gentle introduction to Spade, there is a (work in progress) tutorial that starts with the "Hello World" of hardware — blinky.
- If you want to jump straight into something more complex, there is a chapter on implementing a ws2812 RGB LED driver in Spade.
If you are more interested in a reference of all constructs in the language, see the language reference.
Also, if you have any questions or would like to discuss Spade with others, feel free to join the Discord community or Matrix channel.
Spade is a work in progress language and so is this documentation. Writing documentation is difficult since it is hard to know which details are obvious and which need more explanation. To make the documentation better, your feedback is invaluable, so if you notice anything that is unclear or needs more explanation, please reach out either via a GitLab issue or on Discord.
Installation
Before installing locally, there is a "playground" available at ▶️ play.spade-lang.org which you can use to play around with the language. The first few chapters of the book use that, so if you want to follow along with the tutorial, you can skip this chapter until prompted to install Spade locally.
At the moment, Spade works best on Linux systems, but macOS also works quite well with only a few minor issues1. On Windows, you have to use WSL.
There are a few ways to start using Spade:
- Manually
  - Install Rust: in order to install Spade manually, you need the Rust toolchain. If you don't have it installed, you can install it by following the instructions on https://rustup.rs/.
  - Restart your terminal.
  - Install Swim, the Spade build tool:
    cargo install --git https://gitlab.com/spade-lang/swim
  - You should now be able to run swim and get a list of available commands. You probably also want to install synthesis tools and simulators at this point.
  Troubleshooting
  - If the cargo install command fails with error: linker cc not found, you are most likely on a very freshly installed system and need to install a C compiler and linker. On distros like Ubuntu and Debian, you can do this with apt install build-essential.
  - If cargo install fails with error: failed to run custom build command for openssl-sys v0.9.110, you probably need to apt install pkg-config libssl-dev or the equivalent for your distro.
- With a package manager
  If you are on Arch Linux, you can install the swim-git package from the AUR: https://aur.archlinux.org/packages/swim-git
- Using Docker
  The Spade Docker image, which works on macOS as well, has all the necessary tooling and environment preconfigured.
  For example, here's how you would start an interactive shell where commands like swim are available:
  docker run -it --rm ghcr.io/ethanuppal/spade-docker:latest
  Make sure you have the Docker (or podman) daemon running in the background. Do note that the image only supports x86_64 and arm64.
You should now be able to create a swim project using swim init hello_world!
Synthesis Tools and Simulators
Spade compiles to Verilog code which is simulated and synthesised (compiled to hardware) by other tools — in particular, cocotb for simulation and yosys+nextpnr for synthesis.
Automated install
The easiest way to install those tools is via Swim by running:
swim install-tools
which downloads https://github.com/YosysHQ/oss-cad-suite-build into
~/.local/share/swim. If it is installed, swim will always use the cad-suite tools instead of system tools.
NOTE: If you need to uninstall those tools, remove
~/.local/share/swim/bin
Manual install
You can also install the tools manually. Refer to the individual installation instructions in that case. The tools you need are:
- For simulation:
  - Python 3.8 or later
  - Simulation via Icarus Verilog (using cocotb): https://github.com/steveicarus/iverilog
  - Simulation via Verilator: https://github.com/verilator/verilator
- For building for hardware:
  - Synthesis: https://yosyshq.net/yosys/
  - Place and route: https://github.com/YosysHQ/nextpnr. Build nextpnr-ice40 or nextpnr-ecp5 depending on your target FPGA
  - Bitstream generation and upload. Depends on your target FPGA
If you're just starting out, you probably don't need all of these. Start off by simulating your designs using cocotb and icarus, then you can move on to testing in hardware using yosys and nextpnr. If your simulations are too slow, you can try verilator.
Next steps
Now, move on to setting up your editor to work with Spade.
Editor Setup
Before installing locally, there is a "playground" available at ▶️ play.spade-lang.org which you can use to play around with the language. The first few chapters of the book use that, so if you want to follow along with the tutorial, you can skip this chapter until prompted to install Spade locally.
There are a variety of third-party plugins integrating Spade in different editors. For in-editor error messages and things like go to definition, you have to install the language server
cargo install --git https://gitlab.com/spade-lang/spade spade-language-server
Vim
If you use Neovim, you can use spade.nvim, which is maintained by Ethan at that GitHub repository. This plugin sets up syntax highlighting and LSP automatically along with some other quality-of-life features.
Otherwise, you can use https://gitlab.com/spade-lang/spade-vim, following the instructions at that repository for manual setup.
Vscode
Emacs
Helix
Helix supports Spade out of the box.
Zed
Other Editors
Made a plugin for your favorite editor? Submit a merge request to add it to this list!
Blinky
The traditional program to start learning any language is "hello, world!". However, printing a string in hardware is a complex task, so the "hello, world!" in hardware is usually blinking an LED.
This chapter has two versions:
- Blinky (for hardware people) for people who have some experience with digital hardware and want to learn Spade. This version focuses on the syntax of the language and makes comparisons to Verilog and VHDL, but assumes some familiarity with things like registers.
- Blinky (for software people) for people who are used to software development but are new to hardware. This version puts less emphasis on the syntax of the language, and more on the basic hardware it is describing.
Blinky (for software people)
This chapter will show the very basics of Spade and is aimed at people who are familiar with software development but are new to hardware. If you come here with some experience in hardware design with VHDL or Verilog, the Blinky (for hardware people) chapter is probably more useful.
Before blinking an LED, we can try turning on or off an LED. We can do this as
entity blinky() -> bool {
true
}
To Rust users, this will likely feel very familiar. To those familiar with
other languages, the last value at the end of a block is "returned", so this is
an entity which returns true.
If we connect the output signal of this to an LED, it would turn on. If you're curious, you can try it ▶️ on the playground
This isn't particularly exciting though, so let's do something more
interesting. Blinking an LED is the typical "Hello, World" in hardware, but
even that requires some complexity so we will build up to it. Let's first start
by making the LED turn off while we hold down a button, which first requires
taking a btn as an input:
entity blinky(btn: bool) -> bool {
and then changing the output to !btn
entity blinky(btn: bool) -> bool {
!btn
}
If you ▶️ try this, you can see that if you press the button, the LED turns off, and if you release it, it will turn on again. Here we're just simulating the resulting hardware, but if we connected this up to real hardware, it would also work!
If you think about this for a while, you may start wondering when this gets
"evaluated". In software, this "function" would be called once, giving it the value of
the button and generating a single result. But this somehow reacts to inputs!
While Spade, and many HDLs for that matter, may look like software, it is
important to note that we are not describing instructions for some processor
to execute, we are describing hardware to be built on a chip. The code we
wrote says "connect input btn to an inverter, whose output in turn should be
connected to the output of the module", which we externally connect to an LED.
If we want to approximate the behaviour from a software perspective, we can view the programming model of Spade either as continuously re-evaluating every value in the design, or as re-evaluating every value when the values it depends on change.
At this point, we can start thinking about actually making an LED blink. In software we'd probably accomplish this by writing something along the lines of
def main():
led_on = False
while True:
led_on = not led_on;
set_led(led_on);
sleep(0.5);
However, because we are describing hardware, not software, we can't really "loop". Every expression we write will correspond to some physical block of hardware, rather than instructions that get executed.
A Detour Over Software
Before talking about how we would make a LED blink in hardware and Spade, it is helpful to talk about how we might write a software function to "blink" an LED if we can't have loops inside our function. Remember that we can view our execution model as constantly re-evaluating our function to get its new values, roughly
def blinky():
return True
while True:
print(blinky())
On the surface, it might seem very difficult to make this thing blink, but it becomes possible if we have some way to maintain state between calls of the function. In software, we can achieve this by using a global variable for the state of the LED
LED_ON = False
def blinky():
global LED_ON
LED_ON = not LED_ON
return LED_ON
while True:
print(blinky())
If we run this program, we'll now get alternating True and False
True
False
True
False
...
There is a problem with this though: our value is "blinking" far too fast for us to see it blink. If this were hardware, the LED would just look dim as opposed to clearly switching between on and off, so we need to regulate it somehow. A quick way to do this would be to just call our function less often, for example, once per second. As we'll see, this is something we can kind of do in hardware, so let's try it!
import time
while True:
start = time.time()
print(blinky())
end = time.time()
# We want each iteration to take 0.5 seconds
# so we get a blinking frequency of 1 Hz.
# To avoid drifting if `blinky` ends up taking
# a long time, we'll compute how long the evaluation
# took and subtract that from the period
time.sleep(0.5 - (end - start))
That works, but has a major problem: now we cannot do anything more often than
once every half second, so if our program was to do more things than blinking an LED,
we're probably screwed. To solve this, we can reduce the sleep time to
something faster, but which we can still manage without having end - start become larger than the period. Being conservative, we'll aim for a frequency of 1 kHz:
import time
while True:
start = time.time()
print(blinky())
end = time.time()
time.sleep(0.001 - (end - start))
If we just run our blinky now, we're back to it blinking faster than we can see, so we'll need to adjust it to compute how long it has been running and toggle the LED accordingly:
COUNTER = 0
def blinky():
global COUNTER
if COUNTER == 1000:
COUNTER = 0
else:
COUNTER = COUNTER + 1
# The LED should be on in the second half of the counter interval
return COUNTER > 500
import time
while True:
start = time.time()
print(blinky())
end = time.time()
time.sleep(0.001 - (end - start))
Back To Hardware
At this point, you have got a sense of a (pretty cursed) programming model that approximates hardware pretty well, so we can get back to writing hardware.
Almost all primitive hardware blocks are pure (or combinational as it is known in hardware). They take their inputs and produce an output. This includes arithmetic operators, comparators, logic gates and "if expressions" (multiplexers). Using these to build up any form of state, like our counter, will be very difficult. Luckily, there is a special kind of hardware unit called a flip-flop which can remember a single bit value. These come in several flavours, and by far the most common is the D flip-flop, which has a signature that is roughly
entity dff(clk: clock, new_value: bool) -> bool
Its behaviour when the clock signal (clk) is unchanged is to simply remember
its current value. Flip flops become much more interesting when we start
toggling the clock. Whenever the clk signal changes from 0 to 1, it will
replace its currently stored value with the value that is on its new_value
input.
Hardware is often shown graphically, and a dff is usually drawn like this:
Using this, we can build our initial very fast blinking circuit like this:
entity blinky_dff(clk: clock) -> bool {
decl led_on;
let led_on = inst dff(clk, !led_on);
led_on
}
Don't worry too much about the syntax here: we define led_on as a dff
whose new value is !led_on. When the clk goes from 0 to 1, the dff will
take the value that is on its input (!led_on) and set it as its internal
value, which makes the LED blink. This might be easier to understand graphically:
We can also visualize the value of the signals in the circuit over time, which
looks roughly like
As soon as the clock switches from 0 to 1, the value of led_on switches to
new_value. This in turn makes the output of the inverter change to the
inverse which is now the "new new_value". Then nothing happens until the
clock toggles again at which point the cycle repeats.
At this point, you should be wondering what the initial state of the register is, as right now it only depends on itself. While it is possible to specify initial values in registers in FPGAs, that's not possible when building dedicated hardware, so the proper approach is to use two more inputs to the DFF that we left out for now: the reset. It consists of a bool which tells the flip-flop to reset its current value when it is 1, and a value to reset to. Again, looking at the signature, this would be roughly
entity dff(clk: clock, rst_trigger: bool, initial_value: bool, new_value: bool) -> bool
When rst_trigger is true, the internal value of the dff will get set to initial_value.
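To make this concrete, here is a small sketch of what the earlier blinky_dff could look like with these extra inputs. It assumes the hypothetical dff entity with the signature above actually existed, which it only does for the purposes of this explanation:
entity blinky_dff_reset(clk: clock, rst: bool) -> bool {
    decl led_on;
    // While rst is high, the stored value is forced to false; otherwise the
    // value keeps toggling on every rising clock edge, just like before
    let led_on = inst dff(clk, rst, false, !led_on);
    led_on
}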
Visualized as signal values over time, this looks like:
The clk and rst_trigger signals are typically fed to our hardware
externally. The clock is, as you may expect from reading clock signal
specifications of hardware, quite fast. Not quite the 3-5 GHz that you may
expect from a top of the line processor, but usually between 10 and 500 MHz in
FPGAs. This means that we need to pull the same trick we did in our software
model to make the blinking visible: maintain a counter of the current time and
use that to derive whether the LED should be on or not.
Our counter needs to be quite big to count on human time scales with a 10 MHz
clock, so building a counter from individual bools with explicit dffs for
each of them is infeasible. Therefore, we almost always use "registers" for our state. These are just banks of dffs with a shared clock and reset.
Additionally, using our dff entity isn't super ergonomic since it requires that decl keyword, so Spade has dedicated syntax for registers. It looks like this
reg(clk) value: uint<8> reset(rst: reset_value) = new_value;
which, admittedly, is quite dense syntax. It helps to break it down into pieces though:
- reg(clk) specifies that this is a register that is clocked by clk.1
- value is the name of the variable that will hold the current register value
- : uint<8> specifies the type of the register, in this case an 8 bit unsigned value. In most cases, the type of variables can be inferred, so this can be left out
- reset(rst: reset_value) says that the register should be set back to reset_value when rst is true. If the register does not depend on itself, it can be omitted
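As a small illustration of these pieces (the entity and signal names here are made up), here is a register that simply follows an input, delayed by one clock cycle. Since its new value does not depend on its own current value, the reset clause is left out:
entity delay_by_one(clk: clock, some_input: uint<8>) -> uint<8> {
    // The type annotation could also be omitted here since it can be
    // inferred from some_input
    reg(clk) delayed: uint<8> = some_input;
    delayed
}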
Blinky, Finally
We finally have all the background we need to drumroll 🥁 blink an LED! The code to do so looks like this
entity blinky(clk: clock, rst: bool) -> bool {
let duration = 100_000_000;
reg(clk) count: uint<28> reset(rst: 0) = if count == duration {
0
} else {
trunc(count + 1)
};
count > duration / 2
}
Looking at the Python code we wrote before, we can see some similarities. Our
global count has been replaced with a reg. reg has a special scoping rule
that allows it to depend on its own value, unlike normal let bindings which
cannot refer to their own value. The new value of the register is given in
terms of its current value. If it is duration, it is set to 0, otherwise it
is set to count + 1.
trunc is needed since Spade protects you from overflows and underflows by
extending signals when they have the potential to overflow. count + 1 can
require one more bit than count, so you need to explicitly convert the value
down to 28 bits. trunc is short for "truncate" which is the hardware way of
saying "throwing away bits".
Those unfamiliar with Rust or other functional languages may be a bit surprised that the if isn't written as
if count == duration {
count = 0
} else {
count = trunc(count + 1)
}
This is because Spade is expression based -- conditionals return values instead of having side effects. In hardware, we can't really re-assign a value conditionally: the input to the "new value" field of the register is a single signal, so all variables in Spade are immutable.
If you are used to C or C++, you can view if expressions as better ternary
operators (cond ? on_true : on_false), and Python users may view them as the
on_true if cond else on_false construct.
Play around
At this point it might be fun to play a little bit with the language. You could try modifying the code to:
- Add an additional input to the entity called btn which can be used to pause the counter
- Use btn to invert the blink pattern
You can try the code directly in your browser at ▶️ play.spade-lang.org
1. Most of the time when starting out you'll just have one clock, but as you build bigger systems, you'll eventually need multiple clocks. ↩
Blinky (for hardware people)
This chapter will show the very basics of Spade and is aimed at people who are already familiar with basic digital hardware and want to learn the language. If you come here as a software developer, the Blinky (for software people) chapter is probably more approachable.
A blinky circuit in Spade is written as
entity blinky(clk: clock, rst: bool) -> bool {
let duration = 100_000_000;
reg(clk) count: uint<28> reset(rst: 0) = if count == duration {
0
} else {
trunc(count + 1)
};
count > duration / 2
}
The first line defines a "unit" 1 called blinky which takes a clock and a
reset signal and returns (->) a bool which will be true when the blinking LED
should be on.
This highlights an important difference between Spade and traditional HDLs: most2 units in Spade take
a number of input signals and produce an output signal instead of operating
on a set of input and output ports.
In general, Spade units are much more "linear" than their VHDL and Verilog
counterparts - variables can only be read after their definition (unless
pre-declared using decl) and units do not mix inputs with outputs.
The first line in the body of the entity uses let to define a new variable called duration whose
value is the number of clock cycles in a blink period; here we assume a 100
MHz clock.
Spade is a statically typed language so duration will have a fixed type
known at compile time. However, the compiler uses type inference to infer the
types of variables where possible.
In this case, the duration variable is compared to count on the next line
which forces its type to be the same as count, i.e. uint<28> and the compiler
will ensure that the value fits in the inferred type's range.
If needed, the type of a variable can be specified explicitly using let duration: uint<28> = ....
The next few lines are a reg statement which is used to declare a register. The syntax
for these can be hard to take in at first, but it helps to break it up into pieces:
- reg(clk) specifies which clock is used to clock this register
- count is the name of the variable which will hold the register value
- : uint<28> specifies the type of the register. Normally this can be omitted but in this case the compiler is unable to infer the size without it since count only refers to itself and duration.
- reset(rst: 0) says that the register should be reset back to 0 whenever rst is asserted. At the moment, this is always done using an asynchronous reset.
Finally, the statement is ended with an = sign followed by an expression
that gives the new value of the register as a "function" of its previous value. Here, the register is set back to 0 if it has reached the duration, otherwise it is incremented by 1.
A significant difference between Spade and most other HDLs here is that its
semantics are not "imperative". We do not write
if count == duration {
count = 0
} else {
count = trunc(count + 1)
}
which is conceptually hard to map to hardware; instead, the if construct returns a value
which is assigned to the register's new value.
This is much closer to the multiplexers that will be generated here than the
imperative description is, and prevents bugs if one, for example, forgets to
give count a value in the else branch.
The trunc function call in the else branch is another effect of Spade's type
system. The type system is designed to prevent accidental destruction of
information.
Since a + 1 can require one more bit than a itself, the type
of count + 1 is uint<28+1>, which cannot be implicitly converted to a
uint<28>. The trunc function explicitly truncates the result back down to fit in
the register.
The final line count > duration / 2 is what sets the output of the unit.
Whenever count is greater than half the duration of the counter, its output
will be true.
The final expression in a unit is its return value which may feel unfamiliar
at first, but eventually feels quite natural, especially when combined with
other block-based constructs. For example, the same thing is true in
if-expressions. The 0 and trunc(count + 1) are the final expressions in
the blocks, and therefore their "return" values.
A note on division: You may question the use of / in the above example since division is usually a very expensive operation in hardware. However, divisions by powers of two are cheap, so Spade explicitly allows those. If the code was changed to / 3, you would get a compiler error telling you about the performance implication and telling you to explicitly use combinational division if you are OK with the performance.
error: Division can only be performed on powers of two
┌─ src/blinky.spade:10:24
│
10 │ count > duration / 3
│ ^ Division by non-power-of-two value
│
= help: Non-power-of-two division is generally slow and should usually be done over multiple cycles.
= If you are sure you want to divide by 3, use `std::ops::comb_div`
│
10 │ count > duration `std::ops::comb_div` 3
│ ~~~~~~~~~~~~~~~~~~~~
Play around
If you want to play around with the language at this point, you can try to modify the code to do some of these things:
- Add an additional input to the entity called btn which can be used to pause the counter
- Use btn to invert the blink pattern
You can try the code directly in your browser at ▶️ play.spade-lang.org
1. A "unit" in Spade is similar to entity in VHDL and module in Verilog. ↩
2. The input -> output flow is not always well suited to hardware; in those cases, ports may be used. ↩
Common Language Constructs
This chapter goes through common constructs that most languages have such as variables, expressions, basic types, and conditionals. The focus is on how they work in Spade and how that is different from other languages.
Basic Expressions and Primitive Types
Expressions are the fundamental building block of Spade code. Anything with a
value is an expression - from an integer literal like 5 to arithmetic
operations like + all the way up to blocks which at the end of the day
consist of several sub-expressions.
Integers and booleans
Like most languages, Spade has a few primitive types that basic operations are applied
to. The most common primitive types in Spade are bool, int and uint which are
booleans, signed integers, and unsigned integers respectively. When building custom
hardware, we are not restricted to integers of a few fixed sizes like 8, 16 and 32 bits,
so both int and uint take a generic parameter that specifies their size. For example
uint<8> or int<10>.
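As a quick sketch of what such declarations look like (this assumes a surrounding unit body for the let bindings):
let flag: bool = true;
let count: uint<8> = 255;   // 8 bit unsigned, can hold 0 to 255
let offset: int<10> = -200; // 10 bit signed, can hold -512 to 511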
Sometimes you will also encounter an error talking about Number. This is a special type
which the compiler uses until it can figure out if a number is signed or unsigned. This
will become more relevant later when we talk about type inference.
Operators
Spade's operators are generally the same as any C-like language both in terms of which operators are available and their precedence.
Arithmetic
To start off, Spade naturally has operators for arithmetic: +, -, and *.
These prevent overflow by extending the output to guarantee that the result
fits. For addition and subtraction this means that the output is one bit larger
than the input and the input operands have to be the same size. For
multiplication, the output size is the sum of the input sizes.
It is often necessary to change the number of bits to accommodate this. The
sext function sign extends signed integers, the zext function zero extends
unsigned integers, and the trunc function truncates (removes bits) both
signed and unsigned integers.
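As a small sketch of these rules (the function name is made up, and the commented types follow the sizing rules described above):
fn width_examples(a: uint<8>, b: uint<8>) -> (uint<9>, uint<16>, uint<8>) {
    let sum = a + b;           // addition grows the result by one bit: uint<9>
    let prod = a * b;          // multiplication adds the input sizes: uint<16>
    let narrowed = trunc(sum); // explicitly throw the extra bit away: uint<8>
    (sum, prod, narrowed)
}
Going the other way, zext and sext add bits without changing the value, for example let widened: uint<16> = zext(a);.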
Logic
Spade supports logic not (!), and (&&), or (||), and xor
(^^), as well as the corresponding bitwise operators (~, &, |, and ^).
However, Spade does not allow implicit casts between integers and bool, so using
a bitwise operator on a bool or a logic operator on an integer is not possible.
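A minimal sketch of that distinction (the names are made up; the commented-out line shows the kind of expression the rule above rejects):
fn logic_example(a: bool, b: bool, x: uint<8>, y: uint<8>) -> (bool, uint<8>) {
    let either = a || b;     // logic operators work on bool
    let masked = x & y;      // bitwise operators work on integers
    // let invalid = a & b;  // not allowed: bitwise operator on a bool
    (either, masked)
}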
Comparison
The comparison operators (==, !=, >, <, >=, <=) work as you would expect.1
Shifts
Spade supports logic left and right shifts (<<) and (>>) as well as
arithmetic right shifts (>>>).2
Arithmetic right shifts may be unfamiliar, so here is a short explanation of what they
do: When you right shift a value, the most significant bit needs to be filled in. With
a logic shift, this is done by a 0. A consequence of this is that the sign of the shifted
value flips if it is negative. Arithmetic right shift instead fills the new most significant bits
with the most significant bit (the sign bit) of the input. For example
- +10 in binary 0b01010 arithmetic shifted right by 2 becomes 0b00010
- -10 in binary 0b10110 arithmetic shifted right by 2 becomes 0b11101
Division and Modulo
Spade also has division and modulo operators, but because division and modulo by non-powers of two are more
expensive to implement than the arithmetic operations, the / and % operators can
only be used to divide by powers of two. If you absolutely need division by other values, the std::ops::comb_div
function can be used instead, as the compiler helpfully points out:
error: Division can only be performed on powers of two
┌─ src/blinky.spade:10:24
│
10 │ count > duration / 3
│ ^ Division by non-power-of-two value
│
= help: Non-power-of-two division is generally slow and should usually be done over multiple cycles.
= If you are sure you want to divide by 3, use `std::ops::comb_div`
│
10 │ count > duration `std::ops::comb_div` 3
│ ~~~~~~~~~~~~~~~~~~~~
Integer Type Conversion
As mentioned previously, to cast a number to a lower number of bits, the
trunc function is used, while sext and zext are used to add bits to
signed and unsigned integers respectively. In order to convert between signed
and unsigned types, the .to_int() and .to_uint() methods can be used.
Numbers
Numbers can be written in decimal without a prefix, in hexadecimal with a 0x prefix,
and in binary with a 0b prefix. You can also use _ in numbers to split up groups to
make them more readable. For example
- 1_000_000 for big numbers
- 0b1100_0101 for grouping binary digits
- 0xff00_1234 for grouping hexadecimal digits
You can also add a uN or iN suffix to numbers to specify their sign and size. For
example, 10u8 is an 8 bit unsigned value and 123i13 is a 13 bit signed value.
Integer literals without prefix do not have a size on their own, and unlike Verilog and VHDL in which integer literals are limited to 32 bits by default, Spade allows arbitrarily large integers 3. The compiler also guarantees that the value will be representable by the type it is used as. For example,
let x: uint<8> = 512;
will result in a compilation error.
Booleans
Boolean literals are as you would expect: true and false
Tuples and Arrays
Like many languages, Spade supports compound types in the form of arrays and tuples. Arrays are used when you want several values of the same type to process together, while tuples are used when you want to group values of different types.
Arrays are written as a list of values enclosed in [], for example [1, 5, 3, x, y]. You can also create arrays of N copies of the same value using
[value; N]. For example, an array of 10 zeros is [0; 10].
To access individual elements, use array[x] where x is an unsigned int.
You can also use array[N:M] to access sub-arrays. These are inclusive on the
left and exclusive on the right, so [0, 1, 2, 3, 4, 5][1:5] results in [1, 2, 3, 4]. Range indices must be constant values while individual element
indices can be runtime values.
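Putting those pieces together, a minimal sketch (again assuming a surrounding unit body):
let values = [1u8, 2, 3, 4, 5, 6];
let third = values[2];    // 3; this index could also be a runtime value
let middle = values[1:5]; // [2, 3, 4, 5]; range indices must be constants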
Tuples are written as comma-separated values enclosed in (). For example (10, x, false).
Tuple elements can be accessed using the # operator, for example (10, x, false)#0 is 10. Most of the time, accessing tuples through pattern matching (destructuring) is more convenient. We will talk more about pattern matching later, but for now you can write
let (x, y, z) = some_tuple;
which will make x take on the value of the first element, y the second and z the third.
-
with one small caveat, they can only be used on integers for now. ↩
-
Arithmetic left shift is the same operation as logic left shift. ↩
-
Technically, there are implementation limits that will cause problems if you try to create an integer literal with more than \(2^{32}\) bits 😉 ↩
Spicy Expressions
The expressions discussed in the previous sections should feel familiar to hardware developers and software developers alike, but Spade also has a few expressions that are more unusual. Rust users can probably skip ahead, since these expressions are basically the same as in Rust. For everyone else, let's talk about the more spicy 🌶️ expressions in Spade:
If expressions
"Control flow" in Spade is handled a little bit different than what you may be used to, unless
you're coming from a Rust or functional programming background.
In most languages you use an if expression to "conditionally" execute code if conditions happen. For example, an absolute value operation could be written as
def abs(x):
    result = x
    if x < 0:
        result = -x
    return result
However, in hardware, there is no way to "conditionally execute" a block of code. Hardware can only compute all branches, and select the corresponding output at the end, typically using a multiplexer
In order to reflect this, Spade is expression based and if expressions select values
rather than conditionally executing branches. The above example would be written as
fn abs(x: int<16>) -> int<16> {
if x < 0 {
-x
} else {
x
}
}
where the output of the function is the result of the if expression, i.e. -x if x
is negative, and x if it is positive.
Conditionals being expressions means you can do some interesting things with them, for example, you can use them as parts of arithmetic:
let result = x + if add_one {1} else {0};
This particular example is strange and probably ill-advised, but this sort of technique can come in handy.
Blocks
The other unusual expression Spade has is the block, which we've seen some
examples of already; the abs function above has 3 blocks, but you may not have
thought of them as blocks.
A block is written as {} which contains a list of statements (variables,
assertions etc.), and an optional final expression which becomes the value of the block
itself.
For example,
let result = {
let sum = x + y;
sum * z
};
This is effectively the same as writing let result = (x + y) * z but it allows
you to break things into variables that are local to the block. This may seem strange
at first, but hopefully makes more sense when you find out that these blocks are
the bodies of both functions and if-expressions. For example you can of course define
variables inside the body of if-expressions
let result = if op1 {
let sum = x + y;
sum * z
} else {
x + z
};
Variables
We have seen some variables already, so this section will primarily be used to clarify a few things about them.
First, variables can be defined using let, for example
let x = 0;
Types
Spade is a strongly and statically typed language which means that every expression
has a fixed and static type, and that almost all casts are explicit; the compiler will not
automatically convert a bool to an int for example. Unlike languages such as C, C++
or Java though, Spade uses type inference to infer the type of variables based on their
definition and use. In the example above, x doesn't yet have a fully known type: it is a
numeric value, but the exact number of bits is not known. However, if x is used later in
a way that constrains its type, the compiler will infer it to that specific type:
fn takes_uint8(a: uint<8>) // ...
takes_uint8(x);
Again, Spade is statically typed, so conflicting types are not allowed:
fn takes_int16(a: int<16>) // ...
takes_uint8(x);
takes_int16(x); // Type mismatch. `x` was uint<8> previously but is now int<16>
In some cases, the compiler is unable to infer the type of a variable. In such cases,
you can specify the type manually using : type after the variable name. For example:
let x: uint<8> = 0;
Scoping rules
Unlike most HDLs, Spade has more software-like scoping rules in the sense that variables are only visible below their definition. For example, this code would fail to compile
let x = y; // y used before its declaration
let y = 0;
This helps prevent combinational loops 1, and makes code easier to read as it forces its structure to be ordered "topologically", with values which depend on previous values being defined after those values.
decl
In some cases however, a hardware design requires feedback. For example, two registers which
depend on each other's value. In this case, Spade has a special decl keyword which pre-declares
a variable for later use.
decl y;
reg(clk) x = y;
reg(clk) y = x;
Generally, decl should be used sparingly, and unless you really know what
you are doing, make sure to have a register in every "dependency loop",
otherwise you will end up with combinational loops 1
Block scopes
Also like software, variables declared in a block as discussed in the previous section are local to that block and any sub-blocks.
let sub_result = {
let x = true;
{
let a = !x; // Allowed, the use is in a deeper nesting than the definition
}
};
let b = !x; // Disallowed, `x` is only visible inside the block it was declared
Variables are immutable
It is never possible to give a variable a new value. For example, as discussed in the previous chapter, you cannot write
let x = 0;
if cond {
x = 1;
}
and you instead have to assign x to the result of an if expression:
let x = if cond {
    1
} else {
    0
};
Immutability by default is common in many modern software languages, but most
allow opting out of it. Rust has the mut keyword, in JavaScript you can declare
a variable with let instead of const, and in C-style languages you just don't declare
a variable as const. However, Spade has no such feature: all variables are immutable and
there is no way around that.
At this point, you may be asking if it is even possible to write anything useful with no mutable variables, or your mind may be wandering back to the initial blinky example where the value of our counter changed constantly. These two thoughts are related, and the thing that ties them together is that the value of a variable is not immutable: it can change as the inputs to the circuit change, but the subcircuit that a variable refers to is fixed forever.
As an example, in the following code
let sum = a + b;
the value of sum changes as a and b change, but sum really
refers to a set of physical wires in the chip that we are compiling to -- the
output of an adder that has a and b as inputs.
1. A combinational loop is a value which depends on itself without any registers to break the dependency loop. In almost all cases, this will result in an undefined value. ↩ ↩2
Units
The basic building blocks of a Spade project are units. A unit takes a set of input signals, "processes" them, and usually produces a resulting output signal. We already saw an example of a unit in the blinky chapter, but here we will go into them in a bit more detail.
The basic syntax for defining all three flavors of unit is the same. They
start with fn, entity or pipeline depending on their "flavor", which we
will talk about soon; then the name of the unit is specified. The unit inputs
are specified inside () with each argument on the form name: type. The
output of the unit is specified after the parameter list as -> type, and
finally the body of the unit is specified.
As an example, the blinky from the previous chapter has the following definition
entity blinky(clk: clock, rst: bool) -> bool
which means it
- is an entity
- called blinky
- which takes 2 inputs: clk with type clock and rst with type bool
- returns a bool
At this point you are probably wondering why we keep calling them "units" when they are defined as entity. The reason for this is that units come in three "flavors": function, entity and pipeline. While they all take inputs and produce outputs, their semantics are somewhat different
- Entities are the most general units, but as we will see, they also come with the fewest guarantees. If you need registers but don't want to use a pipeline, you should use an entity.
- Functions are a special case of entities which don't allow registers or instantiation of non-functions. This means that they cannot contain any state, which in hardware terms means they are combinational, and in software terms means they are pure. While any function can be written as an entity, it is good practice to use functions whenever possible as it tells readers of the code that the unit is non-stateful.
- Pipelines are a special unit which, as the name implies, is used when building pipelines. You will learn more about these in a later chapter.
In general, you should prefer to use functions and pipelines where possible,
and only resort to entities in cases where you need state and the
hardware you are building is not pipeline-like, for example our blinky module.
Instantiating Units
Units are not very useful if they cannot be instantiated. Functions are
instantiated using the same syntax as function calls in C-like software
languages: function_name(parameter1, parameter2).
Entities on the other hand need the inst keyword before the instantiation,
for example inst entity_name(parameter1, parameter2).
This is done to alert you as a writer of the code, and future readers of the
code that the unit you are instantiating can have underlying state.
If you do not see inst, you know that the unit is a function and therefore
pure, which allows you to make more assumptions about the behaviour of your
circuit without having to read through the source code of what you are
instantiating.
Finally, when instantiating pipelines, you specify the pipeline depth after
the inst keyword, for example inst(10). This will be described in more detail later.
Passing arguments
Of course, most functions need their arguments specified, and there are two ways to pass arguments to units in Spade: by position or by name.
Positional arguments work like they do in most languages: the first value passed is matched with the first argument, the second with the second and so on. It is the syntax we have seen so far.
Named arguments have a $ sign before the argument list and allow you to specify
the name of each argument along with the value it should receive as arg: value.
As an example, if we want to instantiate the following entity
entity some_entity(x: uint<8>, y: uint<8>) -> uint<8> // ...
with x=10 and y=15 we can do so with positional arguments as
inst some_entity(10, 15)
or using named arguments
inst some_entity$(x: 10, y: 15)
// or
inst some_entity$(y: 15, x: 10)
In many cases when specifying arguments by name, you have a variable at the
instantiation site with the same name as the argument you want
to pass it to. You could of course specify arg: arg, but Spade also allows
you to use a short-hand syntax and only specify arg in this case.
Continuing with our example, yet another way to instantiate it is therefore:
let x = 10;
let y = 15;
inst some_entity$(x, y)
You can even mix and match shorthand names with long names, which is especially useful
if you have signals with common names such as clk and rst:
entity do_something(clk: clock, rst: bool) -> uint<8> {
let x = 10;
inst takes_clk_rst$(clk, rst, x, y: trunc(x + 5))
}
However, note that you cannot mix positional and non-positional arguments!
Which style to use depends on your application and code; you should strive for the variant that gives the most readable code. Sometimes that means you pass arguments by position because the order is obvious, while other times you opt to pass arguments by name because your unit takes too many signals to keep track of their positions.
For Software People: Instantiation vs calling
Instantiation is similar in behaviour to "calling" in software terms, but because we are building hardware, we cannot simply "transfer control flow" to another function. Instead, we copy the hardware inside the function to our "chip" and connect its inputs and outputs as appropriate.
As an example, if we define the following functions
fn add(a: uint<16>, b: uint<16>) -> uint<16> {
trunc(a + b)
}
fn mul(a: uint<16>, b: uint<16>) -> uint<16> {
trunc(a * b)
}
fn sel(a: uint<16>, b: uint<16>, cond: bool) -> uint<16> {
if cond {a} else {b}
}
which generate the following hardware
and then use them as part of a bigger function:
fn mul_or_add(a: uint<16>, b: uint<16>, multiply: bool) -> uint<16> {
sel(add(a, b), mul(a, b), multiply)
}
it generates this hardware:
This is important to keep in mind as a very important metric for resource usage in hardware is the area of the chip being used. In software, an expensive function only used very rarely is relatively cheap since the time taken for the program to run is the main cost. However, in hardware, as soon as a unit is instantiated, you pay the cost upfront, regardless of whether it is used millions of times per second or just once over the lifetime of the chip.
In addition, it is important to keep in mind how much area each function and operator uses. In the diagrams drawn here, the multiplier looks as big as the adder, but in practice, the size of the adder grows as \(O(n)\) in the number of bits, while the multiplier grows as \(O(n^2)\). In FPGAs, things are even trickier as they have built-in multipliers. While you have spare multipliers, they are free in terms of other resources, but the multipliers themselves are finite. The resource usage of different units is generally something you will learn over time.
Naming conventions
While not strictly required, unit names are usually written using
snake_case, as are variable names. User defined types use PascalCase
while constant values use SCREAMING_SNAKE_CASE.
Exercises
Modify the blinky code from the previous chapter to do the following
- Break the check for count > (duration / 2) into a function
  - Call with named arguments
  - And positional arguments
- Break the counter logic out into its own unit
  - Should it be an entity or function?
Here is a link to the code on the ▶️ playground
Brief intro to generic parameters
We will discuss the type system in more detail later, but you will most likely come across a few generic functions before then, so here is a quick introduction.
In the functions we have seen so far, the type of the arguments has been
specified explicitly, for example, sel in the example above takes two
uint<16> and a bool. However, this is quite restrictive: we may want sel
to operate on other sized integers, or other types entirely. There is nothing in
that function that requires 16-bit unsigned integers.
We can redefine sel to make the values it selects "generic" as follows:
fn sel<T>(a: T, b: T, cond: bool) -> T {
if cond {a} else {b}
}
which defines a new local type T that can stand in for any other type in the
implementation, as long as that same type is used everywhere T is.
We can now instantiate sel with different types
let x_16: uint<16> = 10;
let y_16 = 10u16; // You can specify the type of integers using `u<size>` or `i<size>`
let max_16: uint<16> = sel(x_16, y_16, x_16 > y_16);
let (x_32, y_32) = (0, 0);
let selected: int<32> = sel(x_32, y_32, select_x);
In some cases, type inference is unable to infer the generic parameters of
an instance, which you can resolve by specifying them using the "turbofish"1
syntax (::<>). Like function arguments, type parameters can be specified positionally or by name using ::<> or ::$<>:
// We don't have enough information about what type the integers have here, we'll
// get a compiler error
let selected = sel(10, 20, select_10);
// Turbofish solves that
let selected = sel::<uint<8>>(10, 20, select_10);
let selected = sel::$<T: uint<8>>(10, 20, select_10);
Types
The powerful type system is one of the features that make Spade stand out the most compared to contemporary HDLs. For software developers already familiar with Rust, the Spade type system is very similar to that of Rust, to the point where you can skim this section and assume that you're writing Rust.
Structs
Like most languages, Spade supports structs to encapsulate related values. For example, a struct containing a field named a with type int<8> and b with type bool can be written as
struct IntAndBool {
a: int<8>,
b: bool,
}
Like many languages, accessing struct fields is done with .
let x = instance.a;
let y = instance.b;
Structs are initialized as if they are functions. This means you can either do so with positional arguments:
let instance = IntAndBool(0, true);
or with named arguments:
let instance = IntAndBool$(a: 0, b: true);
Tuples
Sometimes, defining a named struct just to group values is overkill, which makes tuples
a useful alternative. Tuple types are written as (type1, type2, ...) and values are written
as (a, b, ...). As an example, a function that wraps a bool and int<8> in a tuple can
be written as
fn example(a: int<8>, b: bool) -> (int<8>, bool) {
// ^^^^^^^^^^^^^^ Return a tuple
(a, b)
// ^^^^^^ Construct the return value
}
Tuple elements can be accessed using the # operator, i.e.
let a = tup#0;
let b = tup#1;
though in most cases it is better to use destructuring
Destructuring
Tuples, structs, and most other types can be destructured to gain access to the inner values. For example, the tuple indexing above can be replaced with
let (a, b) = tup;
The big advantage of destructuring over indexing is that you cannot forget to
account for all fields. If someone adds another field to tup, you will get a
compilation error saying that field also needs to be taken into account.
Structs can also be destructured, and like instantiation it can be done with both positional and named arguments
let IntAndBool(x, y) = instance;
let IntAndBool$(a: x, b: y) = instance;
Like named arguments, you can also use shorthand notation, to bind a field name to a variable of the same name:
let IntAndBool$(a, b) = instance;
// Is the same as
let a = instance.a;
let b = instance.b;
Destructuring can also be done recursively, for example:
let (IntAndBool(x, y), z) = instance_and_bool;
Arrays
Arrays, too, work like in most other languages. They are a collection of a fixed
number of values of the same type. Array types are written as [T; N] where T is
the contained type and N is the number of elements. For example, an array
of ten 8-bit integers is written as [int<8>; 10]. For more details on initializing and accessing arrays, see the previous Basic Expressions chapter.
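As a small sketch of an array type in a unit signature (the function name is made up):
fn first_of_four(values: [int<8>; 4]) -> int<8> {
    // Index 0 is the first of the four elements
    values[0]
}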
More fancy types
Beyond these types which are similar to those found in many languages, Spade has some more powerful type system features that are discussed in the next section.
Enums and Pattern Matching
The previously discussed type system features behave very similarly to other languages,
but the enums in Spade are considerably more powerful than those found in languages like
C and VHDL.
Like C and VHDL enums, an enum in Spade takes on one of several values. For example, it can be used to represent different colors:
enum Color {
Red,
Green,
Blue
}
Unlike C, enums are namespaced and statically typed, so you cannot convert them to or from
integers, and to create them, you need to use their full name, i.e.
Color::Red, not just Red. (The namespacing system will be discussed in more
detail in a later chapter.)
To access an enum, you use the match expression, which, like if, "returns" a value for every branch. For example,
to convert the Color enum into RGB values, you would write
fn to_rgb(color: Color) -> (uint<8>, uint<8>, uint<8>) {
match color {
Color::Red => (255, 0, 0),
Color::Green => (0, 255, 0),
Color::Blue => (0, 0, 255),
}
}
Payload
So far, enums do not seem to be "considerably more powerful" than what you may be used to, but that is because we haven't discussed payloads.
In addition to being one of several variants (Red, Green, Blue in this case), each variant can have associated values which are only present when the enum is that particular variant. For example, we can augment our Color enum with two new variants: Gray and Custom, like this
enum Color {
Red,
Green,
Blue,
Gray{brightness: uint<8>},
Custom{r: uint<8>, g: uint<8>, b: uint<8>},
}
When the enum variant is Gray, there is an additional field available
containing the brightness of the color, and when the variant is Custom, the
full RGB value is available. However, these are only accessible when the enum
is of the right variant, and this is checked by the compiler.
To access these variants, we use the match block again:
fn to_rgb(color: Color) -> (uint<8>, uint<8>, uint<8>) {
match color {
Color::Red => (255, 0, 0),
Color::Green => (0, 255, 0),
Color::Blue => (0, 0, 255),
// Grayscale has the brightness value in all three channels
Color::Gray(br) => (br, br, br),
// For custom colors, just map the channels directly
Color::Custom(r, g, b) => (r, g, b)
}
}
Initializing enum variants is done the same way as structs: they are simply functions and as such can be instantiated with both named and positional arguments
let red = Color::Red; // Variants with no members do not need ()
// Positional arguments
let red = Color::Custom(255, 0, 0);
let bright_gray = Color::Gray(200);
// Named arguments
let red = Color::Custom$(r: 255, g: 0, b: 0);
let bright_gray = Color::Gray$(brightness: 200);
Pattern Matching
The patterns in a match can be much more complex than simply matching on enum variants. To showcase this, we'll define a new enum with some more interesting variants
enum Example {
Empty,
Int{val: int<8>},
Tuple{value: (int<8>, bool)},
Struct{value: IntAndBool},
Color{color: Color},
}
struct IntAndBool { a: int<8>, b: bool }
Like the destructuring we saw of structs and tuples before, you can destructure things recursively inside the match statement. For example:
match example {
Example::Empty => 0,
Example::Int(val) => val,
Example::Tuple((a, b)) => a,
Example::Struct(IntAndBool(a, b)) => a,
Example::Color(_) => 0,
};
In the last branch, we use _ to ignore a member we don't care about, in this case the color.
If we do care about the color, we can do pattern matching recursively on enums too:
match example {
Example::Empty => 0,
Example::Int(_) => 1,
Example::Tuple(_) => 2,
Example::Struct(_) => 3,
Example::Color(Color::Red) => 4,
Example::Color(Color::Green) => 5,
Example::Color(Color::Blue) => 6,
Example::Color(_) => 7,
};
You may notice that the last branch in this example leaves out a few of the color members
we defined previously, and has a branch with Example::Color(_). This demonstrates an important
aspect of the match expression: the first branch which matches a value will be taken, so in
this case red, green and blue are handled by the explicit branches, while Gray and Custom
are handled by the "fallback branch". If the Example::Color(_) branch was placed before the others, it would match all color values.
The compiler checks that all match expressions are complete, i.e. there are no values for which
no branch will match. As an example, if the final Example::Color(_) branch is omitted, the following
error is produced.
error: Non-exhaustive match: patterns
Example::Color(color: Color::Gray(brightness: 0..255)),
Example::Color(color: Color::Custom(r: 0..255, g: 0..255, b: 0..255)) not covered
┌─ src/enums.spade:86:1
│
86 │ ╭ match example {
87 │ │ Example::Empty => 0,
88 │ │ Example::Int(_) => 1,
89 │ │ Example::Tuple(_) => 2,
· │
93 │ │ Example::Color(Color::Blue) => 6,
94 │ │ };
│ ╰─^
patterns Example::Color(color: Color::Gray(brightness: 0..255)),
Example::Color(color: Color::Custom(r: 0..255, g: 0..255, b: 0..255))
not covered
Patterns can also contain values. For example, you can pattern match on a bool
match b {
true => 1,
false => 0,
};
or integers:
match 0u8 {
0 => 0,
1 => 0,
2 => 0,
_ => 1,
};
Of course, these value patterns can be used recursively in other patterns:
let is_black = match color {
Color::Gray(0) => true,
Color::Custom(0, 0, 0) => true,
_ => false,
};
Example: A state machine
One common use case of enums and match statements in Spade is to write state
machines. To showcase this, we will write a simple state machine that blinks an
LED thrice after a button is pressed.
We will have two "main states": Idle and Blinking, where idle is waiting for the button
to be pressed, and blinking is doing the actual blinking.
To blink, we also have to keep track of
- How many times we have left to blink
- How long we have blinked
We can encode this as an enum like this
enum State {
Idle,
Blink{blinks_left: uint<2>, duration_left: uint<16>}
}
And then use pattern matching to implement the state machine itself like this:
reg(clk) state reset(rst: State::Idle) =
match (state, btn) {
// If we're not blinking and the user isn't pressing the button,
// stay in the idle state
(State::Idle, false) => state,
// If we're in idle, and the user clicks the button, go to the blink
// state
(State::Idle, true) => {
State::Blink$(blinks_left: 2, duration_left: 5_000)
},
// If we have no blinks left, and are done with this blink,
// go back to idle
(State::Blink$(blinks_left: 0, duration_left: 0), _) => {
State::Idle
},
// If we have blinks left (blinks_left != 0), but we're done
// with this blink, start the next blink
(State::Blink$(blinks_left, duration_left: 0), _) => {
State::Blink$(blinks_left: trunc(blinks_left-1), duration_left: 5_000)
},
// Otherwise, decrement the duration
(State::Blink$(blinks_left, duration_left), _) => {
State::Blink$(blinks_left, duration_left: trunc(duration_left - 1))
},
};
We match on both the current state and the input ((state, btn)), and ignore the input
while we are blinking with a wildcard (_). In the blinking state, we use integer patterns
and priority to handle three cases:
- We are done blinking
- We are done with this blink
- None of the above
The output of the module can also be written with a match block on the current state:
match state {
State::Idle => false,
State::Blink$(blinks_left: _, duration_left) => {
if duration_left < 2_500 {
false
} else {
true
}
}
}
You can try this example ▶️ in the playground
Exercises
Modify the blink_thrice entity to
- Blink 4 times
- Have a cooldown of one second between button presses
Here is a link to the code on the ▶️ playground
Pipelines
Pipelines are an important construct in most hardware designs, and one of the key unique features of Spade is its native support for pipelining.
Like the blinky chapter, this one has two versions:
- Pipelines (for software people) for software developers, which introduces what pipelines are in addition to how they are expressed in Spade.
- Pipelines (for hardware people) for people who are already familiar with hardware design and how pipelines work.
Pipelines (for software people)
This section is yet to be written. For now, see Pipelines (for hardware people).
Pipelines (for hardware people)
Pipelining is traditionally a tedious and error prone process. Designers need to ensure that all signals are in sync by manually inserting pipeline registers and more importantly, ensure that the correct registers are used for the correct expression. The problem is made even worse when the depth of a pipeline needs to change for some reason. Then the developer has to ensure that all register references are updated accordingly throughout the design.
Spade natively includes a pipelining construct that ensures that pipelines without feedback are correct by construction and which makes it significantly easier to write and reason about pipelines with feedback.
A basic pipeline
Let's look at a basic example of a pipeline which computes multiplication or
addition of two numbers depending on an Op signal:
enum Op {
Add,
Mul
}
pipeline(1) compute(clk: clock, op: Op, x: int<18>, y: int<18>) -> int<36> {
let sum = x + y;
let prod = x * y;
reg;
match op {
Op::Add => sext(sum), // Sign extend to match mul
Op::Mul => prod,
}
}
The head of a pipeline looks similar to the entity and fn
definitions that we saw before but includes a number in parentheses. This
number is the depth of the pipeline, i.e. the number of registers it
contains, which is the same as its latency from input to output.
While the compiler could in theory infer this number from the body, it always
has to be specified since it is a very important part of the public "API" of the
pipeline. Without reading the body of the pipeline, you know how many clock cycles
you have to wait between input and output.
The first two lines of the body of the pipeline are somewhat uninteresting: they compute a sum and a product and store them in corresponding variables.
The next line reg; is another pipeline specific construct. It is used to add
a new stage to the pipeline which is done by creating a new pipelining register for
every variable above the reg; statement, and re-mapping any references to
those variables to the pipelined version below the reg; statement.
The final match statement selects whether to use the "sum" or "product"
value depending on the op variable. Crucially, because this is a pipeline,
the compiler ensures that the three variables are delayed the same amount, so
there will be no interleaving of op from the previous cycle with the sum
and prod from the current cycle.
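To build some intuition for what reg; does, here is a rough hand-written equivalent of the same pipeline expressed as a plain entity. This is only a sketch for illustration; the register names are made up and it is not literally what the compiler emits:
entity compute_manual(clk: clock, op: Op, x: int<18>, y: int<18>) -> int<36> {
    let sum = x + y;
    let prod = x * y;
    // The single `reg;` corresponds to one pipeline register per live variable
    reg(clk) op_s1 = op;
    reg(clk) sum_s1 = sum;
    reg(clk) prod_s1 = prod;
    match op_s1 {
        Op::Add => sext(sum_s1),
        Op::Mul => prod_s1,
    }
}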
All this means that the resulting hardware looks like this:
Nested Pipelines
Spade of course also supports nested pipelines. Let's extend the example above to showcase how that is done.
pipeline(1) mul(clk: clock, x: int<18>, y: int<18>) -> int<36> {
let result = x * y;
reg;
result
}
pipeline(1) compute(clk: clock, op: Op, x: int<18>, y: int<18>) -> int<36> {
let sum = x + y;
let prod = inst(1) mul(clk, x, y);
reg;
match op {
Op::Add => sext(sum), // Sign extend to match mul
Op::Mul => prod,
}
}
Here, the multiplier from the previous example has been broken out into its
own sub-pipeline with its own internal register. Since the compiler is aware of
this, it will ensure that the signals are still in sync, in this case by not
inserting an extra register for the prod signal.
Spade also requires you to specify the depth of a pipeline when instantiating it. This is done to make sure that when you change the depth of a pipeline, you also consider how that change affects the behaviour of every place where the pipeline is instantiated.
Compiler guarantees
If you synthesize the previous example on a typical FPGA, you may realize that
we are not using the multipliers in the DSP blocks as efficiently as we could -
they have built in optional pipelining registers that allow us to raise the
\(f_{max}\). This means we could get higher performance from our design by
adding 2 more regs to our mul pipeline. Traditionally, this would require
updating a bunch of code, but with Spade, all we have to do is make the change
to mul:
pipeline(1) mul(clk: clock, x: int<18>, y: int<18>) -> int<36> {
let result = x * y;
reg;
reg;
reg;
result
}
The astute reader will notice that the latency of this pipeline is now wrong, oh no 😱. Luckily, even if you didn't notice this problem, the compiler did:
error: Pipeline depth mismatch. Expected 1 got 3
┌─ src/pipelines_hw.spade:40:1
│
40 │ ╭ pipeline(1) mul(clk: clock, x: int<18>, y: int<18>) -> int<36> {
│ - Type 1 inferred here
41 │ │ let result = x * y;
42 │ │ reg;
43 │ │ reg;
44 │ │ reg;
45 │ │ result
46 │ │ }
│ ╰─^ Found 3 stages in this pipeline
│
= note: Expected: 3
Got: 1
Error: aborting due to previous error
Let's update the code accordingly, and while we're at it change the repeated
reg; to reg*3; which is a shorthand for the same thing:
pipeline(3) mul(clk: clock, x: int<18>, y: int<18>) -> int<36> {
let result = x * y;
reg * 3;
result
}
Now mul looks correct, but if we look at the bigger picture we're not out of the woods yet. Our compute pipeline as currently described is now this abomination, which will have a very different output than before:
Luckily, the compiler once again has our back here. If we compile the new code
pipeline(3) mul(clk: clock, x: int<18>, y: int<18>) -> int<36> {
let result = x * y;
reg * 3;
result
}
pipeline(1) compute(clk: clock, op: Op, x: int<18>, y: int<18>) -> int<36> {
let sum = x + y;
let prod = inst(1) mul(clk, x, y);
reg;
match op {
Op::Add => sext(sum), // Sign extend to match mul
Op::Mul => prod,
}
}
error: Pipeline depth mismatch
┌─ src/pipelines_hw.spade:61:21
│
53 │ pipeline(3) mul(clk: clock, x: int<18>, y: int<18>) -> int<36> {
│ - swim_test_project::pipelines_hw::m3::mul has depth 3
·
61 │ let prod = inst(1) mul(clk, x, y);
│ ^ Expected depth 3, got 1
│
= note: Expected: 3
Got: 1
This means we have to update the inst(1) to inst(3) to match the definition of mul, which
gives us yet another compiler error:
error: Use of swim_test_project::pipelines_hw::m3::prod before it is ready
┌─ src/pipelines_hw.spade:65:18
│
65 │ Op::Mul => prod,
│ ^^^^ Is unavailable for another 2 stages
│
= note: Requesting swim_test_project::pipelines_hw::m3::prod from stage 1
= note: But it will not be available until stage 3
This error is saying that there aren't enough pipeline registers between our
definition of prod and its use, which is the error we were seeing graphically
before. We'll update our compute pipeline accordingly which finally gives
pipeline(3) mul(clk: clock, x: int<18>, y: int<18>) -> int<36> {
let result = x * y;
reg * 3;
result
}
pipeline(3) compute(clk: clock, op: Op, x: int<18>, y: int<18>) -> int<36> {
let sum = x + y;
let prod = inst(3) mul(clk, x, y);
reg * 3;
match op {
Op::Add => sext(sum), // Sign extend to match mul
Op::Mul => prod,
}
}
At this point, the compiler is happy, and we should be too because the hardware correctly uses the DSP blocks giving faster performance, and its output is still the same as before (though of course, the latency has changed).
Fearless Refactoring
At this point it is worth taking a step back and analyzing what happened. We
started out with a pipeline that computed a correct value, but that was not
implemented as efficiently as it could have been. To fix this, we made a minimal change
to the mul pipeline to more efficiently use the DSP blocks.
Then, by running the compiler and mindlessly addressing the things it
complained about, we updated the rest of our code to reflect this change.
Once the compiler stopped complaining, our code still had the correct output
but ran faster!
If our code is used elsewhere in the project, or by someone else in another project, the compiler would start complaining there until all the issues are fixed.
This is something that happens in several places in Spade, the type system being another notable example. You make a small localized change, then the compiler tells you every place you need to change to reflect that change in order to get back to hardware that still works correctly. Essentially, you can refactor code without having to think about the consequences.
Feedback
The pipelines discussed so far are useful if you're building a compute pipeline where you have no dependence between values. However, this is not always the case. A notable example of this is processors which are often pipelined but where values certainly are not independent. In this case, the guaranteed correctness when adding or removing registers is no longer possible, but being able to reason about pipelines structurally as individual stages rather than a soup of control registers mixed with pipeline registers is still very helpful.
For cases like this, Spade has support for "stage references", where you can refer to
values from previous or future stages using stage(...).
As an example, to write a pipeline that computes the sum of a window "around the current" value, we can write
pipeline(2) window(clk: clock, x: int<16>) -> int<18> {
reg;
reg;
x + stage(-1).x + sext(stage(-2).x)
}
where we use relative stage references to refer to x from the stage above,
and from 2 stages above. The corresponding hardware looks like this:
As you can see, negative references refer to stages "above" the current stage while positive references refer to stages "below". Since stages "above" have gone through fewer registers, they are values from the "future" while positive references are values "from the past".
You can also use labels ('label) to refer to stages, for example, if you
wanted to refer to a variable without delay you can define the first stage as
'first and then refer to variables from that stage using stage(first).
pipeline(2) without_delay(clk: clock, x: int<16>) -> int<16> {
'first
reg;
reg;
stage(first).x
}
Dynamic pipelines
Spade has experimental support for stalling of pipelines as documented in the language reference section. However, make sure you follow the note at the top of that page to avoid unexpected bugs.
Spade projects and swim
So far the Spade playground has been a convenient way to get started with the language, but as our projects get more complex we will move on to real Spade projects. If you haven't already, go back and read the installation chapter to install Swim and its surrounding tools.
swim is the build tool for Spade. It manages the files in your project, sets
up and calls the Spade compiler for you, can install and run backend tools for
you, and is used to run tests. This is the main program you will interact with
while writing Spade.
For software developers, think of
swimas thecargo,npm,pipetc. of the Spade world.
Creating a project
You can create a Spade project with swim init. If you run it in an empty
directory, it will initialize a project with the same name as the directory.
You can also run swim init project_name to create a new directory for your
project.
Project structure
A swim project has two important parts. The source code in the src folder,
and a project configuration file in swim.toml. For now, you can just write
your code in src/main.spade until we discuss namespacing later.
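As a sketch (the unit name and behaviour are just placeholders), a first src/main.spade could be as small as:
// src/main.spade
entity top(clk: clock, rst: bool) -> bool {
    // A register that toggles every clock cycle
    reg(clk) led reset(rst: false) = if led { false } else { true };
    led
}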
Building your project
Swim has a few sub-commands for building your code. First, swim build simply compiles the Spade code into Verilog.
From there, you can either simulate or synthesize your code. To simulate, use
swim sim (or swim test or swim t), though unlike the playground this
requires writing tests, which we will cover in the next section.
Hint You don't actually need to run both
swim build and swim test; swim will call swim build automatically if it is required, so you can simply run swim test.
Uploading to real hardware
For now, in this tutorial you can continue running simulation and skip this section, but if you want to try your code on real hardware, this is how to do it:
If you want to synthesize your code and run it on an FPGA, use swim upload which will
synthesize, place and route, pack, and then upload the result to whatever FPGA
board you have configured. These steps require configuring the project for your
target FPGA though; see the project configuration
section for details, or set
your project up with a template as described below.
Hint Like
swim test, you don't actually need to run bothswim buildandswim upload,uploadwill runbuildautomatically if it is required.
swim upload also consists of several steps, namely swim synth, swim pnr, swim pack, and then the upload itself. These can be helpful to run individually in some cases, but most of the time simply running swim upload is the easiest option.
Project templates
You can also create a project from a template which will pre-populate the
configuration file for synthesis for a particular FPGA. If you have an FPGA
board already, run swim init --list-boards. If your board is supported, you
can then run swim init --board <your_board> to quickly get started.
If your board isn't supported yet, you can also fill in the
[simulation], [pnr], [packing] and [upload] fields manually; see the project configuration section for details.
Simulation and Testing
Because it is time-consuming and difficult to debug hardware, most hardware projects use simulation to speed up the development process and ease debugging.
Tests are not written in Spade itself; instead they are written in Python using cocotb, or in C++ using Verilator. Cocotb is easier to set up and nicer to use but can be quite slow.
If you haven't already, install the tools by following the installation instructions
Cocotb
Any Python files in the test directory will be run with cocotb. The first line must be a
comment of the form
# top = <path to unit under test>
For now, you probably have your code in main.spade, in which case simply writing top = your_unit_name will work.
In general, the paths used here follow the same rules as the namespacing.
Tests are asynchronous functions annotated with @cocotb.test(). They take a single
input which is the design under test.
When working with Spade, you generally want to be able to use Spade values rather than
pure bit vectors. To do so, import SpadeExt from spade and instantiate the SpadeExt class, passing the dut to its constructor.
You can then access the inputs of your unit using .i.<input name> and the
output using .o. If the output of the unit is a struct, you can refer to
individual fields using .o.<field name>
As an example, consider this unit which computes a+b and a*b with a latency of
one cycle:
struct Output {
sum: int<9>,
product: int<16>
}
pipeline(1) add_mul(clk: clock, a: int<8>, b: int<8>) -> Output {
let result = Output$(
sum: a+b,
product: a*b
);
reg;
result
}
A test bench for this module looks like this (this assumes that the Spade code
is in src/cocotb_sample.spade. If this is not the case, adjust the # top=cocotb_sample::add_mul part to reflect your module name):
# top = cocotb_sample::add_mul
import cocotb
from spade import SpadeExt
from cocotb.clock import Clock
from cocotb.triggers import FallingEdge

@cocotb.test()
async def test(dut):
    s = SpadeExt(dut) # Wrap the dut in the Spade wrapper

    # To access unmangled signals as cocotb values (without the spade wrapping) use
    # <signal_name>_i
    # For cocotb functions like the clock generator, we need a cocotb value
    clk = dut.clk_i

    await cocotb.start(Clock(
        clk,
        period=10,
        units='ns'
    ).start())

    await FallingEdge(clk)
    s.i.a = "2"
    s.i.b = "3"
    await FallingEdge(clk)
    s.o.sum.assert_eq("5")
    s.o.product.assert_eq("6")

    s.i.a = "3"
    s.i.b = "2"
    await FallingEdge(clk)
    s.o.sum.assert_eq("5")
    s.o.product.assert_eq("6")

    s.i.a = "0"
    s.i.b = "1"
    await FallingEdge(clk)
    s.o.sum.assert_eq("1")
    s.o.product.assert_eq("0")
You can then run the test using swim test or swim t. If you want to run all
tests in a specific file, run swim test <pattern>. All files which contain the
pattern will be run, for example, swim test abc will run tests in both abc.py
and cdeabc.py.
You can also run individual tests using -t <test name>, though here the name has
to be exactly the name of the test, not a pattern.
For more information on the cocotb API, refer to its documentation.
Viewing the waveform
One cool thing about HDL simulation is that it records how all signals change over time, allowing you to see exactly how your circuit behaves. These traces are called waveforms, and they are stored in a file which you can view with a waveform viewer.
Once all tests are run, Swim will print the result of each test, along with a .vcd file in which the waveform is stored:
...
ok test/cocotb_sample.py 0/1 failed
🭼 test ok [build/cocotb_sample_test/cocotb_sample.vcd]
This file can then be opened in a waveform viewer for visualization. We recommend using surfer since it was originally developed for Spade and comes with translation from raw bit vectors back to proper Spade values.
If you have Surfer installed already, you can use surfer <path-to-vcd>, where <path-to-vcd> is the path printed by swim. If you haven't installed Surfer, but installed the simulation tools with swim install-tools, you can use swim command surfer <path-to-vcd> to open it.
If you have a relatively modern terminal, Swim also supports clickable links for opening wave files for each test, but it needs an initial automated setup. Just run
swim setup-links. Now Swim will print two clickable links, one for surfer and one for gtkwave:
ok test/cocotb_sample.py 0/1 failed
 🭼 test ok [build/cocotb_sample_test/cocotb_sample.vcd ([🏄] [🌊])]
Tips and Tricks
Tests cannot be run on generic units. Therefore, it is often a good idea to
define a wrapper function, typically called a harness, around your units in which you
specify concrete types for the generic variables.
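For example, a generic unit and a concrete harness for it might look like the following sketch (all names here are made up):
// A generic unit we want to test
fn add_generic<#N>(a: uint<N>, b: uint<N>) -> uint<N> {
    trunc(a + b)
}

// A concrete harness that the test bench targets instead,
// e.g. with `# top = <path>::add_harness`
fn add_harness(a: uint<8>, b: uint<8>) -> uint<8> {
    add_generic(a, b)
}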
Verilator
C++ source files with the .cpp extension in the test directory will be
simulated using Verilator. Like cocotb, these files consist of a set of test
cases, but the test cases are defined using macros rather than attributes.
The Spade path to the top module is specified using a comment, i.e.
// top = path::to::module
where the path is relative to the current project.
After that, the Verilog name of the top module must be specified using a
#define. Unless you know the details of how Verilator name mangling works,
you almost certainly want to specify #[no_mangle(all)] on your unit under test.
The last thing you need to do before defining your tests is to include <verilator_util.hpp>.
After all your tests have been defined, end the test file with MAIN, which defines
a main function that is compatible with Swim.
// top=main::main
#define TOP main
#include <verilator_util.hpp>
TEST_CASE(it_works, {
// Your test code here
return 0;
})
MAIN
Accessing inputs and outputs
Like the cocotb API, there is a wrapper around Spade types to allow easier interactions with your design.
As an example, consider testing the following code
struct SubStruct {
b: int<10>,
c: int<5>,
}
struct SampleOutput {
a: Option<int<20>>,
sub: SubStruct
}
#[no_mangle]
fn sample(a: Option<int<20>>, b: int<10>, c: int<5>) -> SampleOutput {
SampleOutput(a, SubStruct(b, c))
}
In the TEST_CASE macro, you have access to two variables: dut and s.
dut is the raw Verilator interface around your module. s is the Spade
wrapper, which has a field i for its inputs and o for its output.
You can set the value of inputs to your module with s.i->input_name = "<Spade expression>", for example:
s.i->a = "Some(5)";
s.i->b = "10";
s.i->c = "5";
Similarly, you can compare the output to a Spade expression
using s.o == "<Spade expression>"
If your unit under test returns a struct, you can also access its fields and sub-fields
as fields on the output struct, for example s.o->field->subfield == "<Spade expression>".
Finally, to assert that an output value is what you expect, you can use the ASSERT_EQ macro
which takes s.o or a subfield, and compares it against a Spade expression. The advantage
of using this macro over a C++ assert is that you get a diff print, both with the Spade value
and the underlying bits.
For example, tests for the fields in our example look like:
ASSERT_EQ(s.o, "SampleOutput$(a: Some(5), sub: SubStruct$(b: 10, c: 5))");
ASSERT_EQ(s.o->a, "Some(5)");
ASSERT_EQ(s.o->sub, "SubStruct$(b: 10, c: 5)");
ASSERT_EQ(s.o->sub->b, "10");
ASSERT_EQ(s.o->sub->c, "5");
Clock generation
Clocks need to be ticked manually in Verilator. The Spade clock type does not
allow direct assignment, so the clock needs to be accessed via the Verilator
dut. Spade mangles input names as <name>_i, so if you want to set clk,
you would set dut->clk_i. Alternatively, you can mark the clock input with #[no_mangle].
The following code will tick the clock once:
dut->clk_i = 1;
ctx->timeInc(1);
dut->eval();
dut->clk_i = 0;
ctx->timeInc(1);
dut->eval();
Since this is so common, it is helpful to define a macro for it:
#define TICK \
dut->clk_i = 1; \
ctx->timeInc(1); \
dut->eval(); \
dut->clk_i = 0; \
ctx->timeInc(1); \
dut->eval();
which can then be used like this:
s.i->a = "5";
s.i->b = "10";
TICK;
ASSERT_EQ(s.o, "15");
Alternative test directory
If desired, you can change the name of the test directory by specifying the new name
in swim.toml as follows
[simulation]
testbench_dir = "not/test"
Unless you have good reason to do this, it is better to leave the default directory.
Namespaces
Like most modern languages, everything you define in Spade is defined in some
namespace. Namespaces are separated by ::, for example path::to::thing.
There are two ways you can define namespaces. First, you can explicitly write a module with
mod
mod submodule {
fn test() {}
}
and in the same file you can refer to that function as
submodule::test()
In addition, every file in a Spade project has its own namespace. First, the namespace of the whole project is set by the name specified in swim.toml, for example
# swim.toml
name = "documentation_example"
Every Spade project needs a main.spade file, and the content there is put
directly in the namespace of the project. I.e. if main.spade contains
// main.spade
fn free_standing() {}
mod submodule {
fn test() {}
}
the full path to the two functions will be
documentation_example::free_standing;
documentation_example::submodule::test;
If there are more files in the src directory, those need a corresponding mod filename; in main.spade, and the full path to anything defined in those files will be
documentation_example::filename::function_in_filename;
In main.spade you can refer to things in the submodules simply as filename::function_in_filename, but in sibling files in the same project, you need to use the full path. I.e. in other_filename.spade you need to write
// other_filename.spade
documentation_example::filename::function_in_filename;
to access the function.
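As a minimal sketch, assuming the project contains src/filename.spade and src/other_filename.spade, main.spade would declare:
// main.spade
mod filename;
mod other_filename;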
NOTE: Most of the time, you don't want to write the name of the current project in your imports, instead you can write lib to refer to the current project, i.e. you can change the last example above to
// other_filename.spade
lib::filename::function_in_filename;
Importing with use
Writing the full path to every function would get very tedious quickly, so you can import things with the use statement. For example, you can write
use lib::filename::function_in_filename;
function_in_filename()
or import the whole filename namespace:
use lib::filename;
filename::function_in_filename()
External dependencies
Spade supports fetching external dependencies, currently from git repositories. For example, you can import an implementation of common digital protocols from https://gitlab.com/spade-lang/lib/protocols by adding the following to swim.toml
[libraries]
protocols.git = "https://gitlab.com/spade-lang/lib/protocols"
Everything in that project will then get placed in a protocols namespace in your project. For example, you can import the UART transmitter implementation in uart.spade from that project with
use protocols::uart::uart_tx;
Ports and wires
If you prefer documentation in video form there is a talk available on this topic.
Note that the syntax of
&muthas changed toinv &since that talk
Units in Spade, unlike those in most HDLs, are similar to functions in programming languages in the sense that they receive a set of values, and their output is another set of values. For example, a function that adds 2 numbers is written as
fn add(x: uint<8>, y: uint<8>) -> uint<9> {
x + y
}
This makes sense for a lot of hardware where there is a clear flow of values from inputs to outputs, but this is not always the case. wires and ports are a language feature that helps deal with these cases.
To understand wires and ports, it helps to look at a motivating example. If you're building a project consisting of 2 modules that communicate with each other via some other module, such as a memory, you want your hardware to look something like this:
Without using ports, you'd have to write the signature of this hierarchy as
pipeline(1) mem(clk: clock, addr1: uint<16>, addr2: uint<16>) -> (T, T)
pipeline(4) mod1(clk: clock, inputs: I, data: T) -> (uint<16>, O)
pipeline(3) mod2(clk: clock, inputs: I, data: T) -> (uint<16>, O)
entity top(clk: clock) {
decl memout1, memout2;
let (addr1, mod1_out) = inst(4) mod1(clk, I(), memout1);
let (addr2, mod2_out) = inst(3) mod2(clk, I(), memout2);
let (memout1, memout2) = inst(1) mem(clk, addr1, addr2);
}
Writing it like this is tedious, and more importantly, error-prone as there is no way to communicate which signals correspond to each other. One might assume that the left output of the memory result is the data corresponding to address 1, but there is nothing to enforce this.
In addition, the pipelines internally have to prevent the addresses and returned data from being pipelined:
pipeline(4) mod1(clk: clock, inputs: I, mem_out: T) -> (uint<16>, O) {
'start
reg;
// ...
reg;
'mem_read
let mem_addr = inst mem_ctrl();
reg;
let result = inst compute(stage(start).mem_out);
reg;
(stage(mem_read).mem_addr, result)
}
This is another pain point and more importantly a source of errors. Graphically, the structure is more like the following which is as hard to follow as the code that describes it:
Wires
The solution to the pipelining problem is a new type called a wire denoted
by &. Wires, unlike values, are not delayed in pipelines and can intuitively
be viewed as representing physical wires connecting modules rather than values
to be computed on.
To "read" the value of a wire, the * operator is used and to turn a value
into a wire, & is used.
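As a small sketch of the two operators (the variable names are made up):
let value: uint<8> = 5;
let as_wire = &value;     // `&` turns the value into a wire of type &uint<8>
let read_back = *as_wire; // `*` reads the wire back as a uint<8> value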
With this change, the pipeline example can be rewritten as
pipeline(4) mod1(clk: clock, inputs: I, mem_out: &T) -> (&uint<16>, &O) {
reg;
// ...
reg;
let mem_addr = &inst mem_ctrl();
reg;
let result = inst compute(*mem_out);
reg;
(mem_addr, &result)
}
For now, it is not possible to return a compound type that mixes wires and values, which is why the output of the module was changed to
&O.
Inverted wires
There is still at least one big problem with the current structure: returning addresses as outputs and taking values as inputs is problematic as there is no clear link between input and output, and the return value of a unit ends up being a mix of both control signals like addresses, and values computed by the unit.
The solution to this problem is inverted wires, denoted inv &. These wires
flow the opposite way to the normal flow of values. A unit which accepts an
inverted wire as an input is able to set the value of that wire. A unit which
returns an inverted wire is able to read the value that was set by the "other
end"
Inverted wires are created using the port expression which returns (T, inv T)
let (read_side, write_side) = port;
The set statement is used to set the value of an inverted wire. For example:
set adder_out = &(a + b);
Rewriting the pipeline once again using inverted wires results in
pipeline(4) mod1(clk: clock, inputs: I, mem_addr: inv &uint<16>, mem_out: &T) -> O {
reg;
// ...
reg;
set mem_addr = &inst mem_ctrl();
reg;
let result = inst compute(*mem_out);
reg;
result
}
The code can be made even neater by grouping all the memory signals together into a tuple:
pipeline(4) mod1(clk: clock, inputs: I, mem: (inv &uint<16>, &T)) -> O {
reg;
// ...
reg;
set mem#0 = &inst mem_ctrl();
reg;
let result = inst compute(*mem#1);
reg;
result
}
Wires are passed around as if they were values, so our memory can now return all its signals, both inputs and outputs. As an example, to convert from a memory that does not use ports to one that does, we can write:
// A mockup memory which takes 2 addresses and returns two values.
pipeline(1) fake_memory(clk: clock, addrs: [uint<16>; 2]) -> [T;2]
pipeline(1) mem(clk: clock) -> ((inv &uint<16>, &T), (inv &uint<16>, &T)) {
let (addr1_read, addr1) = port;
let (addr2_read, addr2) = port;
let [out1, out2] = inst(1) fake_memory(clk, [*addr1_read, *addr2_read]);
reg;
((addr1, &out1), (addr2, &out2))
}
This finally allows us to write a neat top module for our running example:
entity top(clk: clock) {
let (m1, m2) = inst(1) mem(clk);
let out1 = inst(4) mod1(clk, I(), m1);
let out2 = inst(3) mod2(clk, I(), m2);
// ...
}
Ports
It is often desirable to define structs of related wires, for example the
wires we've used in the memory interface.
While we can wrap them all in tuples like
we did above, it is often desirable to give things names with structs.
To put wires in structs, we need to define them as struct port, which tells the compiler
that the struct is of port kind. Ports are a broader concept than just these structs:
wires, their inversions, compound types of wires like tuples, and even clocks are all
ports, as opposed to values as discussed previously.
Most of the time, what is and what is not a port is unimportant, but they have two important
properties:
- Ports are not pipelined.
- Generic arguments cannot be ports.
We can define a struct port for our memory example as
struct port MemoryPort<T> {
addr: inv &uint<16>,
// A practical memory will usually also have a write value:
write: inv &Option<T>,
read: &T,
}
inv for real
The inv type is not only used to invert wires; it can also be used to invert whole ports.
Effectively, this flips the direction of all wires in the port.
This is very useful if there is no "owner" of a particular port as is the case with the
memory example.
We could tweak our memory example to use an inverted port by making the memory module also
accept the port as an (inverted) input.
pipeline(1) mem<T>(clk: clock, p1: inv MemoryPort<T>, p2: inv MemoryPort<T>) {
let [out1, out2] = inst(1) fake_memory(clk, [*p1.addr, *p2.addr]);
reg;
set p1.read = &out1;
set p2.read = &out2;
}
entity top(clk: clock) {
let (m1, m1_inv) = port;
let (m2, m2_inv) = port;
let _ = inst(1) mem::<uint<32>>(clk, m1_inv, m2_inv);
let out1 = inst(4) mod1(clk, I(), m1);
let out2 = inst(3) mod2(clk, I(), m2);
// ...
}
Inverted wires must be set
It is important that a circuit which uses inverted wires has a well-defined value for all wires. In practice, this means that each wire must be set exactly once, which is enforced by the compiler.
If you create an inv & wire, or receive one as
an argument, you must therefore either set its value yourself, or hand it off to a sub-unit you
instantiate.
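As a small sketch (unit names made up), these are the two valid ways to discharge that obligation:
// Option 1: set the wire yourself
entity sets_the_wire(value: uint<16>, addr: inv &uint<16>) {
    set addr = &value;
}

// Option 2: hand the wire off to a sub-unit that sets it
entity hands_it_off(value: uint<16>, addr: inv &uint<16>) {
    let _ = inst sets_the_wire(value, addr);
}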
For example, if we make an error while writing the top module in our running
example and accidentally pass m1 to both mod1 and mod2
entity top(clk: clock) {
let (m1, m2) = inst(1) mem(clk);
let out1 = inst(4) mod1(clk, I(), m1);
let out2 = inst(3) mod2(clk, I(), m1);
// ^^ Should be m2
}
We get a compilation error:
error: Use of consumed resource
┌─ src/wires.spade:234:39
│
3 │ let out1 = inst(4) mod1(clk, I(), m1);
│ -- Previously used here
4 │ let out2 = inst(3) mod2(clk, I(), m1);
│ ^^ Use of consumed resource
Similarly, if we remove the last line so that m2 is never given a value, we get another error:
error: swim_test_project::wires::m9::m2.addr is unused
┌─ src/wires.spade:231:10
│
231 │ let (m2, m2_inv) = port;
│ ^^ swim_test_project::wires::m9::m2.addr is unused
│
= note: swim_test_project::wires::m9::m2.addr is a inv & value which must be set
Conditional assignment
Since Spade is expression based, setting the value of an inv & wire inside an if branch is not supported. For example, you may be tempted to write a multiplexer as
entity mux<T>(sel: bool, on_false: T, on_true: T, out: inv &T) {
if sel {
set out = &on_true;
} else {
set out = &on_false;
}
}
However, this will result in a multiply used resource error.
The correct way to write this is instead
entity mux<T>(sel: bool, on_false: T, on_true: T, out: inv &T) {
set out = &if sel {on_true} else {on_false};
}
NOTE This mux is only written like this to showcase how inverted wires are used. A better way to write a mux is
entity mux<T>(sel: bool, on_false: T, on_true: T) -> T { if sel {on_true} else {on_false} }
Interfacing with Verilog
It is often desirable to interface with existing Verilog, either instantiating a Verilog module inside a Spade project, or including a Spade module as a component of a larger Verilog project. Both are quite easy to do as long as you have no generics on the Spade side.
Instantiating a Verilog module
If you have a Verilog module that you want to instantiate from Spade, you need
to add a stub for it in your Spade project. This is done by defining a
function, entity or pipeline1 as extern. For example,
struct Output {
valid: bool,
value: int<16>
}
extern entity external_module(clk: clock, x: int<8>) -> Output;
While this works, Spade will "mangle" names to avoid namespace collisions and collisions with keywords, so this would in practice look for a module like
module \your_project::your_file::external_module (
input clk_i,
input[7:0] x_i,
output[16:0] output__
);
Changing your module to follow this signature would work, but is not very convenient. The more convenient thing is to add #[no_mangle(all)] to the entity:
#[no_mangle(all)]
extern entity external_module(
clk: clock,
x: int<8>,
output: inv &Output
);
Now, the resulting Verilog signature is
module external_module(
input clk_i,
input[7:0] x_i,
output[16:0] output
);
Spade currently does not define how structs are packed, and the output__ name seen in the mangled signature is an implementation detail that might change later.
Because of this, #[no_mangle(all)] refuses to accept units with return types, which is why the entity above takes output: inv &Output instead of returning Output.
The solution is to use inverted wires to generate Verilog outputs, and since the struct packing is undefined, it is also better to split the output into its individual fields.
Note that you could have also written the module with the less economical
#[no_mangle] extern entity external_module( #[no_mangle] clk: clock, #[no_mangle] x: int<8> ) -> Output;i.e., manually apply #[no_mangle] to all the parameters.
This has the advantage of allowing a return type, but it's completely useless for the reason stated above, so just use
#[no_mangle(all)]!
Changing our module to
#[no_mangle(all)]
extern entity external_module(
clk: clock,
x: int<8>,
output_valid: inv &bool,
output_value: inv &int<16>,
);
results in
module external_module(
input clk_i,
input[7:0] x_i,
output output_valid,
output[15:0] output_value
);
which is a normal looking Verilog signature.
One downside of this, however, is that the interface to this module isn't very Spadey, so typically you will want to define a wrapper around the external module that provides a more Spade-like interface:
use std::ports::new_mut_wire;
use std::ports::read_mut_wire;
// Put the wrapper inside a `mod` to allow defining a Spade-native unit of the same name.
mod verilog {
#[no_mangle(all)]
extern entity external_module(
clk: clock,
x: int<8>,
output_valid: inv &bool,
output_value: inv &int<16>,
);
}
struct Output {
valid: bool,
value: int<16>
}
entity external_module(clk: clock, x: int<8>) -> Output {
let (valid, valid_inv) = port;
let (value, value_inv) = port;
let _ = inst verilog::external_module$(clk, x, output_valid: valid_inv, output_value: value_inv);
Output$(valid: *valid, value: *value)
}
With this, we have the best of both worlds. A canonical Spade-entity on the Spade side, and a canonical Verilog module on the other.
Finally, to use the Verilog module in a Spade project, the Verilog file containing the implementation must be specified in swim.toml under verilog at the root or verilog in the synthesis section.
[verilog]
include = []
sources = []
[synthesis.verilog]
include = []
sources = []
- sources takes a list of globs that get synthesized with the rest of the project.
- include takes a list of directories for Verilog search paths.
Instantiating Spade in a Verilog project
Instantiating Spade in a larger Verilog project is similar to going the other
way around as just described. Mark the Spade unit you want to expose as
#[no_mangle(all)]. Prefer using inv &
instead of returning output values, as that results in a more Verilog-friendly
interface.
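For example, a Spade unit exposed to Verilog might look like the following sketch (the unit name and behaviour are made up):
#[no_mangle(all)]
entity exposed_toggler(clk: clock, rst: bool, out: inv &bool) {
    // A register that toggles every clock cycle
    reg(clk) state reset(rst: false) = if state { false } else { true };
    set out = &state;
}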
To get the Verilog code, run swim build, which will generate build/spade.sv
which contains all the Verilog code for the Spade project, including your
exposed module.
-
See the documentation for units for more details. Most of the time, you probably want to use
entityfor external Verilog. ↩
Ws2812b Example
This chapter will guide you through how to build a Spade library for the ws2812b RGB led and should serve as a practical example for "real world" Spade usage.
This assumes a bit of familiarity with basic Spade concepts, and is written primarily with software people in mind, as such more weight will be put on the FPGA specifics than on Spade syntax and concepts.
The chapter starts off with a discussion on how to create a Spade project and how that project is laid out. After that, we will discuss the interfaces we want to and need to use, i.e. how to talk to the LEDs, and how to make the driver interface nice to use for other Spade code. Finally, we'll go over the implementation of the actual driver.
Creating a Project
It is strongly advised to use the Swim build tool to write Spade projects. It manages rebuilding the Spade compiler, including the standard library and dependencies, testing and synthesis etc.
If you haven't installed swim already, do so by following the installation instructions.
After you have swim installed, we should create a new project. The easiest way
to do this is to run swim init --board <fpga name>. To get a list of the boards
we currently have templates for, run
swim init --list-boards
which should give you something like
Cloning into '/tmp/swim-templates'...
remote: Enumerating objects: 135, done.
remote: Counting objects: 100% (50/50), done.
remote: Compressing objects: 100% (49/49), done.
remote: Total 135 (delta 11), reused 0 (delta 0), pack-reused 85
Receiving objects: 100% (135/135), 30,73 KiB | 30,73 MiB/s, done.
Resolving deltas: 100% (37/37), done.
[INFO] Available boards:
ecpix5
go-board
icesugar-nano
tinyfpga-bx
ulx3s_85k
If your FPGA board is not on the list, you can also set up your project manually, but that's out of scope for this guide. Have a look at the templates repository for inspiration.
For this project, the exact board isn't super important. I like my ecpix5 so I will use that.
Create the project using
swim init --board ecpix5 ws2812b
Note that even though this project, being a driver for specific hardware, should probably be a library rather than a standalone project, it is still useful to initialise it targeting a specific FPGA board in order to test it in hardware later.
Basic project layout
Inside the newly created directory we find the following files:
- ecpix5.lpf
- openocd-ecpix5.cfg
- src
- main.spade
- top.v
- swim.toml
FPGA specific files
The ecpix5.lpf file is a pin mapping file which tells the synthesis tool what physical
pins correspond to the inputs and outputs from our top module.
If you are using an ice40-based FPGA, this file is instead a
pcf file which has the same purpose but different syntax.
We'll get back to this file when it is time to test on hardware.
The openocd-ecpix5.cfg file is a file needed to program the FPGA. It is
specific to the ecpix5 programmer and you don't really have to care what it
does or why it is needed.
Since Spade is a very work in progress language with breaking changes being very
common, it's easiest to have each project depend on a specific git version of
the compiler. This is handled by swim, which will track a specific compiler
version for us.1 The first time we build the
project using swim, it will download and compile the compiler.
Since the compilation process takes quite a while the first time you run it, now is a good time to call
swim build
The src directory contains our Spade source code. Each additional file is given a unique
namespace based on its name, while the contents of main.spade are placed directly in the
project namespace, i.e. anything you define there will be under ws2812b::<unit name>.
Finally, there is the swim.toml file which describes our project
name = "ws2812b"
[synthesis]
top = "top"
command = "synth_ecp5"
[board]
name = "ecpix-5"
pin_file = "ecpix5.lpf"
config_file = "openocd-ecpix5.cfg"
The name is, as you might expect, the name of your project. If another
project depends on your project, this is the namespace at which your project
will reside.
The synthesis, pnr, upload, and packing fields tell swim what tools to call to
synthesise the project and upload it to the board. Most things can be ignored
here, but the top field is worth knowing about, as that
is how you specify the top module (roughly equivalent to main in most software languages).
Basic swim usage
Swim has several subcommands which you can use. These commands call their prerequisites so you only have to call the one you actually want to run. I.e. you don't have to call swim build before swim test.
swim build
Compiles your Spade code to verilog. The output ends up in build/spade.sv
swim synth, swim pnr
Call the synthesis tool and place and route tool respectively.
swim upload
Build the project and upload it to the board
swim test
Run simulation to test your code. Note that by default, your project does not contain any test benches, so this will complain. We'll write some later in the guide.
Aliases
Most of these commands have aliases that you can use to be lazy and avoid typing.
- b: build
- syn: synthesise
- u: upload
- t, sim: test
In the next section, we will start discussing how to talk to the LEDs.
-
You can read more about this in the swim README. ↩
LED protocol overview
Now that we are familiar with the project layout, we can start writing the driver for our LEDs. To do so, a good place to start is the datasheet. By reading it we can find out how the protocol works:
The LEDs are chained together; we talk to the data-in pin of the first LED in the chain, and it relays messages to the rest of the chain automatically.
Data transmission consists of 3 symbols:
- 0 code
- 1 code
- RET code
Each LED has 24-bit color, 8 bits per channel, and the transmission order is GRB 1 with the most significant bit first. The first 24 bits of color control the first LED, the next 24 the second, and so on, until the RET code is sent, at which point data transmission re-starts from the beginning.
As a more graphical example, a transmission of the color information for a sequence of three LEDs look like this:
| G7..0 | R7..0 | B7..0 | G7..0 | R7..0 | B7..0 | G7..0 | R7..0 | B7..0 | RET |...
|<       LED 1        >|<       LED 2        >|<       LED 3        >|     |< ...
Each color segment is a sequence of 1 or 0 codes depending on the desired
color for that led and color channel.
We should also have a look at the waveform of the 0, 1 and RET codes which look like this (see the datasheet for prettier figures):
0 code
------+
|
+-----------
| T0H | T0L |
I.e. a signal that is High for T0H units of time, followed by Low for T0L
units of time
1 code
----------+
|
+-------
| T1H | T1L |
I.e. a signal that is High for T1H units of time, followed by Low for T1L
units of time. It is very similar to the 0 code, but for the 1 code, the high
duration is longer than the low duration.
RET code
The RET code is just a Low signal which lasts for Tret units of time.
NOTE: The datasheet usually refers to this signal as
resetand the timing asTreset. In order to make the rest of this text less confusing, we use the nameretthroughout, as we already have a FPGA reset signal in our design which has different purposes.
Durations
We'll leave the durations of the signals for now and get back to them when we start implementing things. If you're curious already, have a look at the datasheet.
With the discussion of the external protocol out of the way, the next section will discuss our internal protocol, i.e. what interface we expose to users of our driver.
-
Because apparently standard color orders like RGB are too mainstream ↩
Driver interface
Now that we know how we should talk to the LEDs, we should also consider how we want the interface to our library to work. Here we have a few options with various trade-offs.
Passing an array around
The most familiar coming from a software world might be for the library to take a copy or a reference to an array containing the values to set the LEDs to. However, this is quite a difficult interface to implement in an FPGA. If we were to copy the LED values, we would need 24 bits per LED to be connected between the driver and user. Those bits would need individual wires, so the number of wires would quickly grow very large.
This interface would look something like
entity ws2812<#N>(clk: clock, rst: bool, to_write: [Color; N]) -> bool {
// ...
}
"Function" to write single LED
Another option we might be tempted to try is to have an interface where you
"call" a function to set a specific LED. This is difficult to do in practice
however. In Spade, one does not "call" a function, instead you instantiate a
block of hardware. One might work around that by passing something like an
Option<(Index, Color)> to the driver, which updates the specified LED.
However, this is still not without flaws. First, we can't update just a single LED; we need to send colors to all the LEDs before it too, so we'd need to store the colors of the other LEDs. Second, it takes time to transmit the control signals, so one couldn't send new colors at an arbitrary rate; the module must be ready to transmit before receiving the next command. This is technically solvable, but there are better options for this particular interface.
Letting the driver read from memory
Passing a reference is slightly more doable in an FPGA. Here, we might give the LED driver a read port to a memory from which it can read the color values at its own pace. This is certainly an option for us to use here, though Spade currently doesn't have great support for passing read ports to memories around. Until that is mitigated, we'll look for other options
This might look something like this, but the MemoryPort type is not currently supported in Spade:
entity ws2812<#NumLeds>(clk: clock, rst: bool, mem: MemoryPort<Color>, start_addr: int<20>) -> bool {
    // ...
}
For those unfamiliar, the #NumLeds syntax means that the entity is generic over an integer called NumLeds.
In current Spade, one would have to write it as
struct Ws2812Out {
    signal: bool,
    read_addr: int<20>,
}
entity ws2812<#NumLeds>(clk: clock, rst: bool, memory_out: Color, start_addr: int<20>) -> Ws2812Out {
    // ...
}
which decouples the read_addr from memory_out, and does not make clear the read delay between them.
Driver owned memory
Another, more Spade- and FPGA-friendly option is to have the driver itself own a memory where it stores the colors to write, and expose a write port to that memory for instantiators to write new values. This might look as follows:
entity ws2812<#NumLeds>(clk: clock, rst: bool, write_cmd: Option<(int<20>, Color)>) -> bool {
// ...
}
Just in time output
Finally, an interface which might be unfamiliar coming from the software world is to have the user generate the color on the fly, i.e. the user provides a translation from LED index to LED color. This is quite a nice setup as it doesn't intrinsically require any memory; if color selection is simple, it can be made on the fly. This interface is best demonstrated graphically
Control
signals
|
v
+---------------+
| State machine |
+---------------+
|
v
+-------------------+
| User provided |
| color translation |
+-------------------+
|
v
+------------------+
| Output generator |
+------------------+
|
V
LED
Signals
Here, as driver implementors, we are responsible for providing the state
machine, whose output would be some signal which says "Right now, we should
emit bit B of the color for LED N". We'll represent it by an enum.
The color translator translates that into a known color, and the output generator generates the signals that actually drive the led.
In some sense, this interface is the most general. Both the driver owned memory version, as well as the memory read port version can be implemented by plugging the read logic into the translation part. For that reason, we will implement this interface first.
With all that discussion on interfaces out of the way, it is finally time to start implementing things. The next section will introduce the finite state machine, a real work horse in any Spade project.
State Machine
Now it is finally time to write some code. The swim template project contains
some example code in main.spade, feel free to run swim upload to test it if
you'd like. However, for this project, we won't need any of it, so once you are
done playing around with it, remove all code from main.spade.
We'll start off by writing the state machine that generates the drive signals for the rest of the circuit. Before we do that though, it is a good idea to think about the input and output signals we want.
For simplicity, the state machine will not take any input control signals, it will start running as soon as the reset signal is turned off, and write data as fast as possible until the end of time.
Output type
The output is a bit more interesting. As stated before, we want the Finite State Machine (FSM) to emit information about what we are currently drawing.
For those unfamiliar, a Finite State Machine is less scary than the words make it seem. It is a way to do computation by describing a series of states and how and when to change between the states.
For example, if we want to build a circuit to toggle an LED whenever a short pulse arrives 1, our FSM would consist of two states: On and Off. If no pulse arrives, the current state remains. If the pulse arrives, we transition from the current state to the opposite state.
It is usually convenient to look at small FSMs graphically, the following figure shows the states and transitions of the pulse example
Before we discuss our state machine further, we should consider what output we want it to generate. Initially, we might do something like this:
enum OutputControl<#IndexWidth> {
/// Currently emitting the RET signal
Ret,
/// Currently emitting the specified bit of the color for LED `index`
Led{index: int<IndexWidth>, bit: int<6>}
}
We make this enum generic over the width of the led indices to not waste bits and allow an arbitrary number of LEDs
The
bitfield is anint<6>because we want to be able to express0..24. If Spade had better unsigned support, we'd be able to useuint<5>:)
This enum has a few issues though, so let's make some improvements.
First, the data coming out of the color translation block will end up being quite similar to this enum, so we can use generics to share some code. The user will translate the index into a color, so we will allow arbitrary payload instead of that index
enum OutputControl<T> {
/// Currently emitting the Ret signal
Ret,
/// Currently emitting the specified bit of the color for LED `index`
Led{payload: T, bit: int<6>}
}
This is enough information to write the color translator, but to generate the
output, it would be nice to have some more information. Specifically, because
this is a time based interface, we could more easily generate the output
waveforms if we knew how long we've been emitting the current bit. Let's add that
to the enum
enum OutputControl<T> {
/// Currently emitting the RET signal
Ret,
/// Currently emitting the specified bit of the color for LED `index`
Led{payload: T, bit: int<6>, duration: int<12>}
}
If you are curious, the width of the duration field is 12 to support a counter counting from 0 to 1250. This was chosen because the total duration of a data bit is 1.25 microseconds, which takes 1250 clock cycles at 1 GHz, and we are unlikely to run our FPGA above that frequency. Better generics over clock frequencies is something that might happen down the line
State machine entity
We can finally stop talking about interfaces and write some actual code. Let's
start off writing an entity where we can put the logic to generate the
OutputControl enum. This entity will be generic over the index width as
discussed previously, and will take a number of LEDs to control as a normal
parameter.
In practice, it would be a lot nicer to set the number of LEDs at compile time too, but Spade generics are not quite there yet.
entity state_gen<#IndexWidth>(
clk: clock,
rst: bool,
num_leds: int<IndexWidth>
) -> OutputControl<int<IndexWidth>> {
// TODO
}
Let's work on that // TODO next. Recall that when working in Spade, we always
describe the behaviour of our circuit between one clock cycle and the next.
However, we want to implement an interface that is time dependent, so we
need to do some thinking. In a high level language, we'd want to do something like
while true {
    for t in 0..ret_duration {
        output = Ret;
        wait_clock_cycle;
    }
    for i in 0..num_leds {
        for bit in 0..24 {
            for t in 0..bit_duration {
                output = Led(i, bit, t);
                wait_clock_cycle;
            }
        }
    }
}
Unfortunately, loops are out of reach, so we will need to encode this logic in some other way. Most of the time, this is done by writing a state machine. The exact method is somewhat situation dependent and takes some practice. To be successful at this task, we have two basic constraints: we need enough information to know what state to jump to at all times, and we need enough information to know what output to generate. In Spade, we'll almost always represent the states with an enum
enum State {
// TODO
}
The RET signal
Let's start off with the first for loop to generate the RET signal. We need to keep track of how long we've been in RET, so we know when to jump over to the output generation loop. A good starting point is therefore a state, with a duration
enum State {
Ret{duration: int<17>},
// ...
}
The state_gen will need an instance of the state enum, which is updated at
every clock cycle. A perfect use for the reg statement and match expression
entity state_gen<#IndexWidth>(
clk: clock,
rst: bool,
num_leds: int<IndexWidth>
) -> OutputControl<int<IndexWidth>> {
reg(clk) state reset(rst: State::Ret(0)) = match state {
// Compute next state here
};
}
What happened here? We have a register called state which we update by
checking the state in the current clock cycle, to build a circuit that gives
the state in the next. Since state depends on itself, it needs to be reset back
to an initial value when the FPGA starts, which is why we write reset(rst: State::Ret(0)). This will make the circuit send the RET signal to the LEDs
when starting up, then operate as normal. We could have started emitting LED
values too, but this makes the description easier and gives the LEDs a few
microseconds to get up and running during power up.
How do we compute the next state in the Ret state? That depends on how long we have been in Ret already. If that time is longer than the minimum time in Ret, we can start emitting LED data, otherwise we'll stay in the Ret state. We'll write this logic as
entity state_gen<#IndexWidth>(
clk: clock,
rst: bool,
num_leds: int<IndexWidth>
) -> OutputControl<int<IndexWidth>> {
reg(clk) state reset(rst: State::Ret(0)) = match state {
State::Ret(duration) => {
if duration >= Tret {
// First LED state
}
else {
State::Ret(trunc(duration + 1))
}
},
// ...
};
}
You may be curious why we need
trunc there. That's because Spade does not implicitly cast away overflow: duration + 1 is 1 bit larger than duration in order to fit a potential overflow. To make it fit back into our state, we truncate the result of the addition, since we know that we have chosen duration to be large enough for overflow not to be an issue.
Timing
The keen eyed might have noticed Tret there. What is its value? It represents
the minimum time that we should emit the ret signal, but duration is in
clock cycles. Eventually, Spade might support being generic over clock cycles
and allow you to reason about time natively. For now, we need to compute how
many clock cycles Tret is manually. This of course depends on the clock
frequency, a value which varies between FPGAs. At the time of writing this
guide, out of the boards that swim currently supports natively, there are 4
different clock frequencies, so we probably want to be generic over it in order
to write a library.
Since we'll need a few more time dependent parameters down the line, we'll
define a Timing struct which we pass to the modules, which contains the
relevant timings. We might write something like this
struct Timing {
Tret: int<17>,
T0h: int<12>,
T0l: int<12>,
T1h: int<12>,
T1l: int<12>,
bit_time: int<12>,
}
However, now the user needs to know what those implementation dependent times are, which probably requires going to the data sheet. To make their life simpler, let's change it to
struct Timing {
// 280 microseconds
us280: int<17>,
// 0.4 microseconds
us0_4: int<12>,
// 0.8 microseconds
us0_8: int<12>,
// 0.45 microseconds
us0_45: int<12>,
// 0.85 microseconds
us0_85: int<12>,
// 1.25 microseconds
us1_25: int<12>,
}
and update our entity
entity state_gen<#IndexWidth>(
clk: clock,
rst: bool,
num_leds: int<IndexWidth>,
t: Timing,
) -> OutputControl<int<IndexWidth>> {
let t_ret = t.us280;
reg(clk) state reset(rst: State::Ret(0)) = match state {
State::Ret(duration) => {
if duration >= t_ret {
// First LED state
}
else {
State::Ret(trunc(duration + 1))
}
},
// TODO: next states
};
// TODO: Output
}
NOTE: If you've been following along with a datasheet, for example the first result on duck duck go or the first result from google, you may be confused about why we use 280 microseconds and not 50. It turns out that the manufacturers of the LEDs updated the protocol at some point without updating model numbers or datasheets. This took quite a few hours of debugging when the code worked on old LEDs, but not on a new strip.
Bit signals
To generate the bit signals, i.e. the nested for loop in the example above, we need to keep track of 3 things: which LED we're working on, which bit on that LED we're working on, and how long we've been in that state. Essentially 1 variable per loop level. We'll extend the state enum to fit:
enum State<#IndexWidth> {
Ret{duration: int<17>},
Led{idx: int<IndexWidth>, bit: int<6>, duration: int<12>}
}
How do we want the logic to work? At the "innermost level", if we aren't done
emitting the current bit, we increase the duration by 1. If the duration
reaches the bit time, we move on to the next bit, and if we are done with all
bits, we move on to the next LED. Finally, if we reached the last LED, we'll go
back to the RET state.
In Spade, we'll write that as
entity state_gen<#IndexWidth>(
clk: clock,
rst: bool,
num_leds: int<IndexWidth>,
t: Timing,
) -> OutputControl<int<IndexWidth>> {
let t_ret = t.us280;
let t_bit = t.us1_25;
reg(clk) state reset(rst: State::Ret(0)) = match state {
State::Ret(duration) => {
if duration >= t_ret {
State::Led(0, 0, 0)
}
else {
State::Ret(trunc(duration + 1))
}
},
State::Led$(idx, bit, duration) => {
if duration == t_bit {
if bit == 23 {
if idx == trunc(num_leds-1) {
State::Ret(0)
}
else {
State::Led$(idx: trunc(idx+1), bit: 0, duration: 0)
}
}
else {
State::Led$(idx, bit: trunc(bit+1), duration: 0)
}
}
else {
State::Led$(idx, bit, duration: trunc(duration + 1))
}
},
};
// TODO: Output
}
Spade supports passing arguments to units both by position, i.e. argument 1 is passed to parameter 1, 2 to 2 and so on, and by name. To specify arguments by name, the calling parentheses are preceded by
`$`, i.e. `State::Led$(idx, bit, duration: trunc(duration + 1))` says to pass the variable called `idx` to the parameter `idx`, `bit` to `bit`, and `trunc(duration + 1)` to `duration`. This works the same way as the Rust struct initialisation syntax.
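For example, the `Timing` struct defined above can be constructed either positionally or by name (the values here are the ones we'll use for testing later):

```
// Positional: values are matched to the fields in declaration order
let a = Timing(28, 4, 8, 4, 8, 12);
// Named: the parentheses are preceded by $ and each value names its field
let b = Timing$(us280: 28, us0_4: 4, us0_8: 8, us0_45: 4, us0_85: 8, us1_25: 12);
```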
Finally, generating the output signal can be done by another match statement. Since State and OutputControl are very similar in this case, the resulting match statement is not very complex:
match state {
State::Ret(_) => OutputControl::Ret(),
State::Led$(idx, bit, duration) => OutputControl::Led$(payload: idx, bit, duration)
}
Putting it all together, we end up with the following code:
enum OutputControl<T> {
/// Currently emitting the RET signal
Ret,
/// Currently emitting the specified bit of the color for LED `index`
Led{payload: T, bit: int<6>, duration: int<12>}
}
struct Timing {
// 280 microseconds
us280: int<17>,
// 0.4 microseconds
us0_4: int<12>,
// 0.8 microseconds
us0_8: int<12>,
// 0.45 microseconds
us0_45: int<12>,
// 0.85 microseconds
us0_85: int<12>,
// 1.25 microseconds
us1_25: int<12>,
}
enum State<#IndexWidth> {
Ret{duration: int<17>},
Led{idx: int<IndexWidth>, bit: int<6>, duration: int<12>}
}
entity state_gen<#IndexWidth>(
clk: clock,
rst: bool,
num_leds: int<IndexWidth>,
t: Timing,
) -> OutputControl<int<IndexWidth>> {
let t_ret = t.us280;
let t_bit = t.us1_25;
reg(clk) state reset(rst: State::Ret(0)) = match state {
State::Ret(duration) => {
if duration >= t_ret {
State::Led(0, 0, 0)
}
else {
State::Ret(trunc(duration + 1))
}
},
State::Led$(idx, bit, duration) => {
if duration == t_bit {
if bit == 23 {
if idx == trunc(num_leds-1) {
State::Ret(0)
}
else {
State::Led$(idx: trunc(idx+1), bit: 0, duration: 0)
}
}
else {
State::Led$(idx, bit: trunc(bit+1), duration: 0)
}
}
else {
State::Led$(idx, bit, duration: trunc(duration + 1))
}
}
};
match state {
State::Ret(_) => OutputControl::Ret(),
State::Led$(idx, bit, duration) => OutputControl::Led$(payload: idx, bit, duration)
}
}
While we hope that the above code will work on the first try, that is rarely the case in practice. The next section will discuss how we can test our design
Testing the state machine
Just like software, testing our code is vital. Unlike software however, we don't have (easy) access to fancy tools like debuggers, printfs or error messages when we run on hardware. Therefore, we usually simulate FPGA designs to make sure they work in simulation in order to avoid painful debugging in hardware.
Currently, writing simulation code in Spade is not possible, as the things you want to do in a simulator are quite different to describing hardware. Instead, tests are written in python using the cocotb testing framework.
If you haven't already, refer to the installation instructions to see how to install cocotb.
Setting up tests
To do our testing, we need to do a tiny bit more setup in swim. We need to tell swim where we put our test benches, so create a directory called `test`.
Then edit swim.toml adding a simulation section like so:
[simulation]
testbench_dir = "test"
Inside the test folder we put our test benches in python files. Let's create
our first one by creating test/state_gen.py. Each Spade test file must start
with a comment telling swim which unit is to be tested, the "top module", like
so. The simulator already knows that we're in the project ws2812b, so we can
simply put it as state_gen
# top=state_gen
We'll also add an empty test to that file like this:
# top=state_gen
import cocotb
from spade import SpadeExt
@cocotb.test()
async def normal_operation(dut):
s = SpadeExt(dut)
Each test is annotated by @cocotb.test() and is an async python function
which takes a single parameter dut, the Design Under Test.
Running swim test (or swim t for the lazy :)) presents us with the following error1
Error:
0: In {tb}
1: state_gen is generic which is currently unsupported in test benches
This is a limitation of the Spade Python interface. To test our module, we'll need to create a dummy entity without any generic parameters; for now we'll use one with 10 LEDs.
entity state_gen_10(clk: clock, rst: bool, t: Timing) -> OutputControl<int<5>> {
inst state_gen(clk, rst, 10, t)
}
After updating the top to # top=state_gen_10 we can run swim test again and we
should see a nice PASS (along with some other output which we'll ignore for now)
HEAD is now at f1c17dc0 feat!: Use Rust syntax for exclusive ranges
[INFO] /home/frans/Documents/spade/ws2812-spade/build/spade.sv is up to date
[INFO] Checking if spade-python needs rebuilding. (This may print an error, it is expected)
⚠️ Warning: `project.version` field is required in pyproject.toml unless it is present in the `project.dynamic` list
🍹 Building a mixed python/rust project
🔗 Found pyo3 bindings with abi3 support
🐍 Not using a specific python interpreter
🛠️ Using zig for cross-compiling to x86_64-unknown-linux-gnu
Finished `release` profile [optimized] target(s) in 0.15s
📦 Built wheel for abi3 Python ≥ 3.8 to /home/frans/Documents/spade/ws2812-spade/build/dist/spade-0.13.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
[INFO] Finished: Checking if spade-python needs rebuilding. (This may print an error, it is expected)
HEAD is now at f1c17dc0 feat!: Use Rust syntax for exclusive ranges
PASS test/state_gen.py [normal_operation]
ok test/state_gen.py 0/1 failed
🭼 normal_operation ok [build/state_gen_normal_operation/state_gen.vcd]
Writing Some Tests
Now we are ready to actually test our module. All Spade test functions start
with s = SpadeExt(dut) which creates a nice Spade interface around the cocotb
functions.
Since our entity is clocked, we need to generate a clock for it. This is done by starting a cocotb clock task like this. The exact clock frequency is not really important here, it only decides the mapping between simulation time and real time.
# import the clock generator
from cocotb.clock import Clock
# ...
clk = dut.clk_i
await cocotb.start(Clock(clk, 1, units='ns').start())
The design also takes a reset signal which we need to set to get our initial state defined. If we forget to do this, most of the signals will be undefined.
We can access the input ports of our design using s.i.<input_name> and give
them values by assigning strings containing Spade expressions to them. The following code sets the reset signal to true:
s.i.rst = "true"
The `s.i` interface does not work well with cocotb built-in functions like `Clock`. You can access the raw Verilog input ports on the dut via `dut.<name>_i` as above, which is nice if you want to pass them to special cocotb functions like `Clock`. However, most of the time you should use the Spade interface since that doesn't require you to know the Spade internal representation of types.
For our design to start running, we need to take it out of reset again. You might
think that we can just add another line s.i.rst = "false". However, this would
give the design no time to see the change in reset. Instead, we need to let the
simulation step forward a bit. The easiest way to do that is to let it step forward
one clock cycle, which we do by waiting until the next time the clock goes from 1 to 0.
# import trigger
from cocotb.triggers import FallingEdge
# ...
await FallingEdge(clk)
s.i.rst = "false"
This will create a waveform that looks like this
      ---+   +---+
clk:     |   |   |
         +---+   +---...

      ---+
rst:     |
         +-----------...
In order to let the circuit catch up to the fact that the reset has been turned off, we'll advance the simulation another tiny time step (1 picosecond):
# import timer
from cocotb.triggers import Timer
# ...
await Timer(1, units='ps')
You can find more things to wait for in the cocotb documentation for triggers.
Now we can do our first test, ensuring that the initial output of the circuit is RET. We can access the output of our dut with s.o, and run assertions on it like this:
s.o.assert_eq("OutputControl::Ret()")
If you return a struct from a unit, you can access its fields as normal Python fields on the
`s.o` field. For example `s.o.x.y.assert_eq(...)`
Our test file now looks like this:
# top=state_gen_10
from spade import SpadeExt
import cocotb
from cocotb.clock import Clock
from cocotb.triggers import FallingEdge, Timer
@cocotb.test()
async def normal_operation(dut):
s = SpadeExt(dut)
clk = dut.clk_i
await cocotb.start(Clock(clk, 1, units='ns').start())
s.i.rst = "true"
await FallingEdge(clk)
s.i.rst = "false"
await Timer(1, units='ps')
s.o.assert_eq("OutputControl::Ret()")
and calling swim test should tell us that all our assertions passed.
A failing test
Next, we may want to ensure that we output Ret in the next clock cycle as well. So, we'll advance the clock and assert that.
Changes to our state happen on the rising edge of clocks, so I prefer to do my assertions on the falling edge. That way I don't have to worry about whether values have or have not changed right at the rising edge.
await FallingEdge(clk)
s.o.assert_eq("OutputControl::Ret()")
Calling swim test results in the following output:
...
AssertionError:
Assertion failed
expected: OutputControl::Ret()
got: UNDEF
verilog ('0XXXXXXXXXXXXXXXXXXXXXXX' != 'xxxxxxxxxxxxxxxxxxxxxxxx')
assert False
**************************************************************************************
** TEST STATUS SIM TIME (ns) REAL TIME (s) RATIO (ns/s) **
**************************************************************************************
** state_gen.normal_operation FAIL 0.50 0.03 14.56 **
**************************************************************************************
** TESTS=1 PASS=0 FAIL=1 SKIP=0 0.50 0.06 8.67 **
**************************************************************************************
VCD info: dumpfile /home/frans/Documents/spade/ws2812-spade/build/state_gen_normal_operation/state_gen.vcd opened for output.
FAIL test/state_gen.py [normal_operation]
FAIL test/state_gen.py 1/1 failed
🭼 normal_operation FAILED [build/state_gen_normal_operation/state_gen.vcd]
Error:
0: 1 test case failed
Oh no, something went wrong, why? To debug our tests, the best method by far is to look at the wave dump. It contains the value of all the signals in the design over time and can give plenty of debug information. To see it, we need to install a vcd viewer, and for this guide we will use Surfer. There are other options, such as gtkwave, but Surfer has good support for the Spade type system, so that is what I will go with.
Swim translates the Verilog values in the wave dump back into Spade values and stores the result in a new vcd file whose path is printed along with the failing tests:
normal_operation FAILED [build/state_gen_normal_operation/state_gen.vcd]
Let's open build/state_gen_normal_operation/state_gen.vcd in surfer:
surfer build/state_gen_normal_operation/state_gen.vcd
This should open a window that looks something like this:
[Screenshot of the Surfer window after opening the VCD file]
The black portion shows the value of the signals we select over time. The left
pane contains a list of the units in our design, in this case
ws2812b::state_gen_10. If you select it, the signal list below will be populated
by all the values in that module. In this case, it is just a wrapper around the
actual design state_gen, so expand the module and select the submodule. This should give you a lot more signals
[Screenshot of the signal list after selecting the state_gen submodule]
Now we have lots of signals to play with! Broadly, we can group them into
several categories. First of all are those with no prefix (clk, state, t). These contain the Spade value
of the corresponding signal, and will often have a matching signal with the suffix _n. For example clk... is a Spade value, and
clk_n... is the raw verilog bits.
Names on the form _e_<numbers> and p_e_<numbers> are subexpressions that
are not named in the Spade program. Unless you're debugging the compiler, you
can ignore those.
Names on the form <name>_n<numbers> and p_<name>_n<numbers> are values which
are named in your Spade code. These are the values you will actually want to look at
Finally, there are some signals called <name>_i. These are input
values. The Spade translation does not translate those, so it is better to look
at the corresponding <name> signals.
To add a signal to the waveform window, double click it. To debug this value,
we'll want to look at a few signals clk, rst, state and t,
so go ahead and add those to the wave view
[Screenshot of the wave view with clk, rst, state and t added]
Here we get quite a bit of information. We see that our state is only defined until
the first clock cycle after reset. We also see that all the fields of t, our
timing struct, are "HIGHIMP". The name is a bit confusing, but this is caused by
us forgetting to set that parameter.
Going back to the Spade code, to compute the next state in Ret, we check if
the duration we've been in Ret so far is greater than t_ret. However, we
haven't set t, so we are essentially comparing duration to t_ret, which
is undefined, resulting in another undefined value.
Spade tries its best to avoid undefined values; it is certainly harder to produce undefined values in Spade than in Verilog. However, when forgetting to specify inputs, or when working with memories, they can pop up.
Let's specify the timings to fix this issue. Again, exact timing here isn't important, we'll set some values that make testing possible:
s.i.t = """Timing$(
us280: 28,
us0_4: 4,
us0_8: 8,
us0_45: 4,
us0_85: 8,
us1_25: 12,
)"""
await FallingEdge(clk)
s.o.assert_eq("OutputControl::Ret()")
With that change, our assertions pass.
More tests
Now we can get to testing the rest of the design. Since our state space is quite small in this case, we can ensure that all state transitions happen as they should. And since this is Python, we can write things like loops, helper functions etc.
First, let's ensure that we stay in Ret for the specified amount of time,
i.e. us280 clock cycles:
s.i.t = """Timing$(
us280: 28,
us0_4: 4,
us0_8: 8,
us0_45: 4,
us0_85: 8,
us1_25: 12,
)"""
for i in range(0, 28):
await FallingEdge(clk)
s.o.assert_eq("OutputControl::Ret()")
After that, we should be emitting the value of the first LED. Here we can write a function to check a whole LED output, since we'll do that quite a few times
async def check_led(clk, s, index):
    # Each bit of the LED should be emitted
    for b in range(0, 24):
        # And each duration from 0 to us1_25 in each bit
        # For simulation performance, we'll just check the first and last bit explicitly
        await FallingEdge(clk)
        s.o.assert_eq(f"OutputControl::Led$(payload: {index}, bit: {b}, duration: 0)")
        for d in range(0, 5):
            await FallingEdge(clk)
        s.o.assert_eq(f"OutputControl::Led$(payload: {index}, bit: {b}, duration: 5)")
We can now test all our LEDs by calling it in a loop, and finally ensure that we go back to the ret state at the right time. The final test bench looks like this:
# top=state_gen_10
from spade import SpadeExt
import cocotb
from cocotb.clock import Clock
from cocotb.triggers import FallingEdge, Timer
async def check_led(clk, s, index):
    # Each bit of the LED should be emitted
    for b in range(0, 24):
        # And each duration from 0 to us1_25 in each bit
        # For simulation performance, we'll just check the first and last bit explicitly
        await FallingEdge(clk)
        s.o.assert_eq(f"OutputControl::Led$(payload: {index}, bit: {b}, duration: 0)")
        for d in range(0, 5):
            await FallingEdge(clk)
        s.o.assert_eq(f"OutputControl::Led$(payload: {index}, bit: {b}, duration: 5)")
@cocotb.test()
async def normal_operation(dut):
s = SpadeExt(dut)
clk = dut.clk_i
await cocotb.start(Clock(clk, 1, units='ns').start())
s.i.rst = "true"
await FallingEdge(clk)
s.i.rst = "false"
await Timer(1, units='ps')
s.o.assert_eq("OutputControl::Ret()")
s.i.t = """Timing$(
us280: 28,
us0_4: 4,
us0_8: 8,
us0_45: 4,
us0_85: 8,
us1_25: 12,
)"""
for i in range(0, 28):
await FallingEdge(clk)
s.o.assert_eq("OutputControl::Ret()")
# Check all 10 leds
for i in range(0, 10):
await check_led(clk, s, i)
# Ensure we get back to the ret state
await FallingEdge(clk)
s.o.assert_eq("OutputControl::Ret()")
Running it gives us another assertion error:
AssertionError:
Assertion failed
expected: OutputControl::Led$(payload: 0, bit: 1, duration: 0)
got: Led(0,0,6)
verilog ('100000000001000000000000' != '100000000000000000000110')
assert False
**************************************************************************************
** TEST STATUS SIM TIME (ns) REAL TIME (s) RATIO (ns/s) **
**************************************************************************************
** state_gen.normal_operation FAIL 287.50 0.08 3516.24 **
**************************************************************************************
** TESTS=1 PASS=0 FAIL=1 SKIP=0 287.50 0.12 2393.34 **
**************************************************************************************
VCD info: dumpfile /home/frans/Documents/spade/ws2812-spade/build/state_gen_normal_operation/state_gen.vcd opened for output.
FAIL test/state_gen.py [normal_operation]
FAIL test/state_gen.py 1/1 failed
🭼 normal_operation FAILED [build/state_gen_normal_operation/state_gen.vcd]
Error:
0: 1 test case failed
Try to see if you can figure out what happened. Looking at the waves can be helpful, but in this case it might be enough to look at what states it transitioned to.
If you can't figure it out, jump to the next section for the answer
-
The first time you run `swim test`, it will set up a Python environment with the required libraries, which requires compiling a separate part of the Spade compiler. Don't be alarmed at the time `swim test` takes, or the amount of output, the first time you run it in a new project. ↩
Output Generation
First of all, the cause of the bug mentioned at the end of the last chapter was an incorrect equality check on the duration when transitioning between states. It should be
if duration == trunc(t_bit-1) {
instead of
if duration == t_bit {
Now that our state machine works, we have done most of the heavy lifting. We still need to translate our control signal into an actual LED output, which is what we'll work on next.
Since this requires no internal state and is fairly simple logic, we'll represent
it as a function. We'll also represent color as a struct with r, g and b
values. The output is a single bool, the actual control signal to be passed to
the LEDs.
struct Color {
r: int<8>,
g: int<8>,
b: int<8>
}
fn output_gen(control: OutputControl<Color>, t: Timing) -> bool {
// TODO
}
The ret output is easy, it is simply a low signal. The 0 and 1 signals are a
bit more complex. The output should be 1 initially, and then transition to 0 after
t0h or t1h depending on whether the current bit is a 0 or a 1.
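For example, with the timings from the datasheet this means a 1 bit is high for 0.8 µs and then low for the remaining 0.45 µs of the 1.25 µs bit time, while a 0 bit is high for 0.4 µs and low for the remaining 0.85 µs.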
The OutputControl::Led has information about which of the 24 bits should be
emitted. To translate that into a bit value, we'll concatenate the color
channels, and "index" the correct bit. (Currently, Spade does not support bit indexing,
so we'll extract the bits using shifts and masks instead)
This logic can be written as follows:
struct Color {
r: int<8>,
g: int<8>,
b: int<8>
}
fn output_gen(control: OutputControl<Color>, t: Timing) -> bool {
let t0h = t.us0_4;
let t1h = t.us0_8;
match control {
OutputControl::Ret => false,
OutputControl::Led$(payload: color, bit, duration) => {
let color_concat = (color.g `concat` color.r `concat` color.b);
let val = ((color_concat >> sext((23-bit))) & 1) == 1;
let step_time = if val {t1h} else {t0h};
if duration > step_time {
false
}
else {
true
}
}
}
}
Testing
Again, it is good practice to test the module. Testing it is very similar to the state machine, except here we don't have a clock. Instead, we'll set a signal value, advance the simulation by a tiny time step, and assert the output. Here is an example of the test bench. Feel free to extend it with more tests that you think are reasonable. Here it might also be helpful to define some helper functions which check that a specific input gives a specific waveform, for example.
# top=output_gen
from spade import *
@cocotb.test()
async def ret_works(dut):
s = SpadeExt(dut)
s.i.t = """Timing$(
us280: 2800,
us0_4: 40,
us0_8: 80,
us0_45: 45,
us0_85: 85,
us1_25: 125,
)"""
s.i.control = "OutputControl::Ret()"
await Timer(1, units='ps')
s.o.assert_eq("false")
@cocotb.test()
async def one_at_bit_0(dut):
s = SpadeExt(dut)
s.i.t = """Timing$(
us280: 2800,
us0_4: 40,
us0_8: 80,
us0_45: 45,
us0_85: 85,
us1_25: 125,
)"""
# Sending 1 @ bit 0, time 0
s.i.control = "OutputControl::Led(Color$(g: 0b1000_0000, r: 0, b: 0), 0, 0)"
await Timer(1, units='ps')
s.o.assert_eq("true")
# Sending 1 @ bit 0, time 40
s.i.control = "OutputControl::Led(Color$(g: 0b1000_0000, r: 0, b: 0), 0, 40)"
await Timer(1, units='ps')
s.o.assert_eq("true")
# Sending 1 @ bit 0, time 80
s.i.control = "OutputControl::Led(Color$(g: 0b1000_0000, r: 0, b: 0), 0, 80)"
await Timer(1, units='ps')
s.o.assert_eq("true")
# Sending 1 @ bit 0, time 81
s.i.control = "OutputControl::Led(Color$(g: 0b1000_0000, r: 0, b: 0), 0, 81)"
await Timer(1, units='ps')
s.o.assert_eq("false")
If you want to see a more fleshed out test, have a look at https://gitlab.com/TheZoq2/ws2812-spade/-/blob/94f4f2884bfb1fbedb41aee657367c0dc9b252e3/test/output_gen.py.
With our tests now passing, we can finally run the code in hardware, which we will discuss in the next and final section of this chapter.
Testing in hardware
We are finally at a point where we think the code is correct, and all the pieces are implemented. It's time to test it on hardware.
To do so, we need to set up a demo entity which instantiates the state and
output generators, and selects a nice color for them. We'll do this in a
separate file, called src/hw_test.spade
use lib::main::state_gen;
use lib::main::output_gen;
use lib::main::OutputControl;
use lib::main::Timing;
use lib::main::Color;
#[no_mangle(all)]
entity demo(clk: clock, pmod0: inv &int<6>) {
reg(clk) rst initial(true) = false;
// Our target FPGA, the ecpix5 has a 100 MHz clock.
let t = Timing$(
us280: 28000,
us0_4: 40,
us0_8: 80,
us0_45: 45,
us0_85: 85,
us1_25: 125,
);
let ctrl: OutputControl<int<4>> = inst state_gen(clk, rst, 4, t);
reg(clk) timer: int<32> reset(rst: 0) = if timer > 100_000_000 {
0
}
else {
trunc(timer+1)
};
reg(clk) offset: int<2> reset(rst: 0) = if timer == 0 {
trunc(offset+1)
}
else {
offset
};
let brightness = 64;
let colors = [
Color(brightness, 0, 0),
Color(0, brightness, 0),
Color(0, 0, brightness),
Color(0, brightness, brightness),
];
let with_color = match ctrl {
OutputControl::Ret => OutputControl::Ret(),
OutputControl::Led$(payload: led_num, bit, duration) => {
let led_num = trunc(led_num + sext(offset));
OutputControl::Led$(payload: colors[led_num], bit, duration)
},
};
let pin = output_gen(with_color, t);
set pmod0 = if pin {0b1} else {0};
}
There is not much going on here. Since we're in a different file, we need to
include the stuff defined in the other file. lib refers to the library we are
currently building, and since our code is in main.spade, the items are put in
the main namespace
Since our top module, demo, is going to connect to the external world,
we'll mark it as #[no_mangle(all)]. This tells the Spade
compiler to name things exactly what they are called in
the Spade code. The downside of this is that we might collide with Verilog
keywords, and the module demo will not have a namespaced name.
For the output, we also use an inv &int<6>. inv & is an inverted wire, i.e. a
wire where we can set a value using set. It is an int<6> because the IO
port pmod0 on the ecpix5 board we've been using as an example is 6 bits
wide. The physical pins pmod0 is mapped to are specified in the lpf file.
The line reg(clk) rst initial(true) = false; generates a reset signal that is
active the first clock cycle after the FPGA has started.
To generate the output, we create our timing struct, this time with correct timings for the 100 MHz FPGA we're targeting. We use an array to look up color values for each of the LEDs we're going to test, and output those signals.
Then we instantiate everything, and finally set the output pin to the resulting value. Here the LED strip is connected to the first pin of pmod0
We also need to tell the synthesis tool what entity should be our top module; to do so, change the synthesis.top value in swim.toml to demo
[synthesis]
top = "demo"
With all that done, we can run swim upload, and look at our new RGB LEDs.
The pattern is static and boring at the moment, so this is a great opportunity to play around a bit and make the LEDs do something more interesting!
All the code for this project can be found at https://gitlab.com/TheZoq2/ws2812-spade
Introduction
This tutorial is part of a guest lecture in the Agile Hardware Design Course at DTU, and is designed to emphasize some of the advantages Spade has over other languages when doing agile hardware design. In particular, the tutorial highlights how
- Pipelines allow you to reason about timing in a way that allows you to refactor your code without much thought.
- The type system allows you to extend functionality of your project without being afraid that your changes affect other parts of the project in unforeseen ways.
Before starting the tutorial, you should install Spade and set up your editor which is covered in the next section.
Installing Spade
Windows
Spade currently does not natively support Windows, so you have to do the rest of the tutorial inside the Windows Subsystem for Linux (WSL). If you have not installed it yet, do so by following the instructions at https://learn.microsoft.com/en-us/windows/wsl/install
Once done, open a WSL shell by just typing wsl in cmd, and then follow the Linux instructions
Linux
-
First, install Rust by running
`curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh`
and accepting the default options when prompted
-
Restart your terminal to get access to the new binaries this installed
-
If you are on a fresh Linux or WSL install, you have to install a few packages for the next step:
`sudo apt install build-essential libssl-dev pkg-config git`
(Or the equivalent packages on your distro of choice)
-
Install the Spade build tool called `swim` with
`cargo install --git https://gitlab.com/spade-lang/swim`
Install some additional tools that Swim needs with
swim install-tools
macOS
-
Install Rust by running
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -
Restart your terminal to get access to the new binaries this installed
-
Install the Spade build tool called `swim` with
`cargo install --git https://gitlab.com/spade-lang/swim`
Install some additional tools that Swim needs with
swim install-tools
Testing the setup
If all commands executed successfully, you should now be able to run the first example application. First, clone the tutorial repo
git clone https://gitlab.com/spade-lang/agile-tutorial.git
cd into the first project
cd agile-tutorial/game
And run
swim build && swim command cargo run
If everything works out, this should open a window containing a black background with stars, and a purple cube that moves around in a circle.
Set up your editor
You may also want to set up your editor to support Spade, see the Editor Setup section for instructions.
Pipelines and Timing Agility
Timing, be it the maximum frequency that your hardware reaches or the number of clock cycles a module needs to output its result, is an unavoidable aspect of hardware design, yet most HDLs give you very little help when reasoning about it. The pipelining system in Spade gives you a way to reason about timing together with the compiler, often allowing you to refactor your code to accommodate new functionality or improve your \(f_{max}\) without much thought. That is what we will explore in this first part of the tutorial.
Agile hardware design is all about gradually building up a project piece by piece, and accommodating any changes to requirements that arise as you develop. With this in mind, we will start out with an existing project which, if you haven't done so already, can be cloned from
git clone https://gitlab.com/spade-lang/agile-tutorial.git
Inside this repo is a game folder which is where we will do the rest of this part of the tutorial. First, run simulation to see what we are working with so far
cd game
swim build && swim cmd cargo run
The first time you run it, it may take some time so let's discuss what this
will do in the meantime. swim build will first compile the Spade compiler,
then compile the Spade code to Verilog which can be fed to simulators and
synthesis tools.
swim cmd cargo run then simulates the resulting code, and interprets the VGA
signals that the simulation generates to render the result in a window.
Once the compilation is done, it should show a purple box moving around in a circle over a background of stars. For this "sprint", we are going to make the graphics a bit nicer.
Exploring the Existing Code
Before we do that however, let us familiarize ourselves with the code in the
project. All the code is in the src folder and most of the code is in
main.spade, with a few utilities in other files that we will discuss if
needed. There is also a main.rs file in there which contains the logic that is used
to interpret the VGA signals that the simulation emits, and show them to you in a window.
At the top of main.spade are a few mod and use statements which specify which other files are part of the project, and import a few things from other libraries. The details are not important for this tutorial.
After the imports is the definition of a GameState struct which, as the name
implies, stores the current state of the game.
struct GameState {
Currently, it has three fields
recording the position and angle of the player.
The type of these fields is
Fp<26, 8> which is a fixed-point number with 26 bits in total, and 8
fractional bits.
You can do most normal arithmetic operations on fixed point
numbers, though due to language limitations, it is currently not done with +,
- etc., but with methods such as .add and .sub.
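As a rough sketch (the names here are made up, not taken from the project, and Fp comes from a library the project already imports), stepping a coordinate looks something like this:

```
// .add is used instead of + on fixed point values
fn step(pos: Fp<26, 8>, speed: Fp<26, 8>) -> Fp<26, 8> {
    pos.add(speed)
}
```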
The next block we encounter is an impl block which is used to add methods to the GameState struct.
impl GameState {
The first method, called next, is a pipeline with a latency of 1, and its
job is to compute the next game state from the current game state.
pipeline(1) next(...)
Currently, it does some trigonometry to the new location and angle of the player.
The second method, called draw_player, is (predictably) responsible for
drawing the player.
pipeline(0) draw_player(...)
It gets passed a pixel and its job is to return the Color of the player at
that pixel.
Since the player is not present everywhere it returns an Option<Color> so
it can return None on the pixels where the player should not be drawn
The logic to draw the purple box that currently represents the player works
like this. First, the coordinates of the player are converted to integers so we
can compare them to the pixel coordinates. Then, the player coordinates are
subtracted from the pixel coordinates.
If the result of that subtraction on the x-axis is greater than 0, we are on
the left side of the player, and if it is smaller than the player width, the
coordinate is inside the player. With the same logic on the y-axis we get
in_x and in_y which are both true if the player should be drawn on the
current pixel.
The next block is responsible for converting a pixel to a color for the whole screen, not just the player.
pipeline(0) pixel_to_color(...)
Its implementation is relatively simple, it computes both the background color and player color, and selects which one to draw based on whether or not the player color is present.
The previous functions take a pixel coordinate and convert it to a color
pipeline(1) render_pipeline(...)
In VGA and HDMI, there are some additional signals that must be generated. The
render_pipeline pipeline gets passed all the VGA signals with pixel
coordinates, uses the pixel_to_color pipeline to convert from pixels to
colors, while passing along the rest of the control signals. The details of
this unit are slightly out of scope for this tutorial. You will have to come in
here to adjust some of the pipelining numbers, but you shouldn't have to make
functional changes.
The rest of the file contains the top level module that ties everything together. If you are curious, feel free to glance through it, but it shouldn't require any changes in this tutorial.
entity top(...)
Task 1
Your first task is to make the "player sprite" a little bit more interesting
than the current purple block. To do so, there is a texture pipeline
available in sprite.spade. You pass it an x and a y coordinate, and it
returns the color value of the texture at that point, if any. Since it is read
from a memory, it has a latency of 1.
Update the drawing code to use this texture for the player
Hint: To convert between `int` and `uint`, use `.to_uint()`
Task 2
One big advantage of the pipelining abstraction is that it allows you to add registers and move existing registers around without affecting what is being computed. This is especially useful when trying to reach a desired target frequency, so let's try that!
The current background of stars is a bit boring, but there is a fancier one available in src/background.spade called fancy_background. Let's try using it instead of the existing stars by replacing the stars instantiation in main.spade
// Replace
let background_color = inst(0) background::stars(clk, pixel);
// With
let background_color = inst(0) background::fancy_background(clk, pixel);
If you run this as usual, it should look different. It will most likely also run noticeably slower in simulation, since it has to simulate a more complex circuit for every pixel. This is something we can live with, but there is a bigger problem which becomes apparent if you synthesize the circuit
# Run synthesis and place & route
swim pnr
it reveals that we're quite far from hitting our target frequency of 50 MHz
[INFO] Place and route maximum frequencies:
$glbnet$_e_1739[88]: 306.1 MHz (target: 250 MHz)
$glbnet$_e_1739[89]: 13.4 MHz (target: 50 MHz)
(Ignore the strange names, those are an annoying consequence of how the code for clock generation in this example is written.)
Your job now is to fix this and make sure we can reach the required 50 MHz. Start by looking at the timing report to try to work out which part of the design is taking the longest.
Once you're done, re-simulate the circuit, does it still work?
NOTE: The second 250 MHz clock sometimes also fails timing. Don't worry about this for now, it is only used for driving the serial HDMI signal and is unrelated to what you are doing. 250 MHz is on the edge of what the ECP5 FPGA we're targeting is capable of so the randomness in the PnR process means it fails occasionally.
Extra task 1
With us back to reasonable timing, we can go on to more fancy graphics. The
game state has an angle in addition to the x and y coordinates. It would
be helpful to the player if the ship is drawn at that angle so let's do that.
src/trig.spade contains implementations of sin and cos which you can use
with trig::sin and trig::cos. Use these functions to rotate the player
sprite according to its angle.
Fearless Refactoring with Types
Refactoring is an unavoidable aspect of agile design, as requirements change and you learn more about your problem space, your code has to change to adjust. However, doing so is often an almost scary process, are you sure that the changes you make do not affect another part of the program? Good use of types in a powerful type system can mitigate this and allow you to make big changes to your code, almost without thinking while being relatively confident that you cover all cases. This is what we will explore in this lab.
A Simple Processor
In this lab, you will work with a simple processor design whose ISA is defined by a single enum, Insn,
enum Insn {
Nop,
Set{dest: uint<4>, value: uint<16>},
Add{dest: uint<4>, opa: uint<4>, opb: uint<4>},
Jump{offset: int<16>},
Out{op: uint<4>}
}
While doing this means you lose out on some tricks you can do with instruction encoding to build a more efficient processor, it does make adapting the processor very easy. This may for example be embedded deep in your chip to do management tasks that are not particularly well suited to writing a big FSM, but where using a whole RISC-V core is excessive.
Setting up
Like the last exercise, you will work in an existing codebase. If you didn't clone the code for the tutorial already, do so
git clone https://gitlab.com/spade-lang/agile-tutorial.git
Then navigate into the processor folder which is where we will do the rest of this part of the tutorial.
Again, we can compile the Spade code and run the simulation with the following commands
cd processor
swim build && swim cmd cargo run
This prints a sequence of timestamps (@xxxx) and an output value for those timestamps.
A Tour of the Processor
The processor we are working with is very simple, it is a non-pipelined
processor that spreads execution out over several clock cycles in order to deal
with things like latency from memories.
In order to reduce the risk of documentation/code mismatches, we will not describe the processor in detail here, instead have a look at the src/main.spade file and the comments in it.
Task 1
For the next few tasks, we will gradually extend the processor to support more instructions. First, in the example program which increments a counter in a loop, we have to allocate a register to store the increment value. Instead, we should add an AddImmediate instruction to allow adding a constant instead.
With that change, the following program should have the same output as the original
Insn::Set$(dest: 0, value: 0),
Insn::Set$(dest: 2, value: 100),
Insn::AddImmediate$(dest: 0, opa: 1, value: 1),
Insn::Out$(op: 0),
Insn::Jump$(offset: -3),
Insn::Jump$(offset: 0)
The point here is that the language will let you make this change without giving it too much thought, just following the compiler. To try this, make the required change to the
`Insn` enum, or even just the program, and then follow the compiler errors until your code compiles again.
Debugging
If everything went well, the compiler told you all the points you had to adjust, and you adjusted those points correctly, but in case something went wrong you probably want to do some debugging in a waveform viewer. The Surfer waveform viewer was built for Spade and includes automatic translation of types from their simulated bit values back into Spade types. To look at the waveforms from your simulation, simply run
swim cmd surfer build/vcd.vcd
and you should get a waveform viewer where you can look at all the signals in your design to help guide you to a solution to whatever problem you have.
Task 2
Let us keep expanding the instruction set of our little processor, this time by adding a conditional jump instruction. Like RISC-V, it should take two registers and a comparison operator, for example LessThan, GreaterThan etc., and jump if opa Operator opb is true. For the operator, use
enum Cmp {
Lt,
Gt,
Eq
}
Add the instruction, and use it in the example program to exit the counter loop if the counter goes past 100.
Task 3
Software is nice, but it is of course often useful to have some "accelerated" instructions in a processor to speed things up. Let's try that with a hardware divider which we can get from https://gitlab.com/spade-lang/lib/dividers/-/blob/main/src/main.spade?ref_type=heads#L21. You can add this as a dependency by adding a line in the [libraries] section in swim.toml:
dividers.git = "https://gitlab.com/spade-lang/lib/dividers.git"
In order to save resources, this divider, as the serial part of the name implies, computes a value over multiple clock cycles so to integrate it in your core you either have to pause the core until the divider is done, or use a two stage process where one instruction starts the computation, and another reads the result when it is ready.
Pick one of these options and add a divide instruction to your core.
NOTE: The `ready` "input" to the `serial_div` entity is out of scope for this tutorial. For now, just set it to `port#1`.
Language Reference
This chapter is a reference for individual features in the language.
Items
Anything that appears at the top level of a Spade file is an item. This includes units, types, (sub)modules etc.
As a user, you will rarely encounter the term Item, though it might appear in
parser errors if you write something unexpected at the top level of a file.
Units
Units are the basic building blocks of a Spade project, they correspond to modules in Verilog, and entities in VHDL. Units come in three flavors: functions, pipelines and entities.
Functions
Functions are combinational circuits (or pure, in software terms), that is they have no internal state, and can not read or set mutable wires.
Pipelines
Pipelines have a specified delay between input and output, and have explicit staging statements.
Entities
Finally, entities are the most general units, they can have state, and the input-output delay is arbitrary. They therefore have roughly the same programming model as VHDL and Verilog.
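As a rough sketch (the unit names and bodies are made up for illustration), the three flavors look like this:

```
// A function: purely combinational, no state
fn add_one(x: int<8>) -> int<9> {
    x + 1
}

// A pipeline: the output appears a fixed number of cycles (here 2) after the input
pipeline(2) add_one_slowly(clk: clock, x: int<8>) -> int<9> {
    reg;
    reg;
    x + 1
}

// An entity: arbitrary behaviour, may contain state
entity counter(clk: clock, rst: bool) -> int<8> {
    reg(clk) count: int<8> reset(rst: 0) = trunc(count + 1);
    count
}
```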
Type Declarations
Struct
struct declarations include a name, optional generic arguments and a list of
fields. The fields in turn have a name and a type which may use the generic
arguments.
struct Empty {}
struct NonGeneric {
field_a: int<8>,
field_b: bool
}
struct Generic<T> {
field_a: T,
field_b: bool
}
Enum
enum declarations also include a name and optional generic arguments. Their body
consists of a list of variants. Each variant in turn has a name, and an optional
list of fields
enum Enum {
EmptyVariant,
OneField{val: int<8>},
TwoFields{val1: bool, val2: bool}
}
enum GenericEnum<T> {
EmptyVariant,
OneField{val: T}
}
Statements
The body of any unit, or block is a list of statements followed by a resulting expression. Statements can declare things local to the block and contain expressions to be evaluated
Let bindings
Let bindings bind a pattern to a value.
Those not used to bindings and patterns can view a let binding as assigning a value
to a variable.
The pattern has to be an irrefutable pattern
If the type specification is omitted, the type is inferred.
Syntax
`let` *pattern* [`:` *type specification*] `=` *expression*`;`
Examples
Binding a value to a variable
let a = some_value;
Binding a value to the result of an if expression
let a = if x {0} else {2};
Unpacking a tuple
let (a, b) = some_value;
Unpacking a struct with an explicit type signature
let Struct$(a, b): Struct<int<8>> = some_value;
Registers
Free-standing (i.e. non-pipelining registers) are defined using reg(clk) ...
The register definition is quite complex and includes
- The clock signal which triggers an update
- A pattern to bind the current value of the register to. It must be irrefutable
- An optional type specification. Like let bindings, the type is inferred if the type signature is omitted
- An optional reset consisting of a reset trigger and a reset value.
Whenever the reset trigger is `true`, the value of the register is asynchronously set to the reset value[^1]
- An expression which gives the new value
On the rising edge of the clock signal, the value of the register is updated to the new value. The new value expression can refer to the current value of the register itself.
Syntax
`reg(`*clock expression*`)` *pattern* [`:` *type specification*] [`reset(`*reset trigger expression*`:` *reset value expression*`)`] `=` *new value expression*`;`
Examples
A register which counts from -128 to 127 (Note that because no initial value is specified, this will be undefined in simulation):
reg(clk) value: int<8> = trunc(value + 1);
A register which counts from 0 to 200 (inclusive) and is reset to 0 by rst:
reg(clk) value reset(rst: 0) =
if value == 200 {
0
} else {
trunc(value + 1)
};
Pipeline stage markers
Stage markers (reg;) are used in pipelines to specify where pipeline registers should be inserted.
After a reg statement, all variables above the statement will be put in registers and any reference
to those variables refers to the registered version.
Syntax
`reg;`
`reg * `*integer*`;`
`reg[`*expression*`];`
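As a rough sketch (the unit is made up for illustration), a two stage pipeline using plain stage markers might look like this:

```
pipeline(2) add_one_twice(clk: clock, x: int<8>) -> int<8> {
    let once: int<8> = trunc(x + 1);
    reg; // once (and x) are now registered
    let twice: int<8> = trunc(once + 1); // refers to the registered version of once
    reg; // a second pipeline register, giving a total latency of 2
    twice
}
```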
Repeated
In cases where more than one stage should be inserted without any new statements in between, there is a shorthand syntax:
`reg * n;`
where n is an integer. This is compiled into n simple reg statements, i.e.
reg * 3;
is the same as
reg;
reg;
reg;
Conditioned
A condition for the registers to accept values can also be specified in square brackets
reg[condition]
The semantics of this are explained in the section on dynamic pipelines
Pipeline stage labels
Pipeline stages can be given names to refer to them from other stages. This is done using 'name.
'first
let x = ...;
reg;
To refer to a named stage, use a []
Set
Set the value of a mutable wire to the specified value.
set wire = value;
Set statements can only appear at the top block of a unit. This might be surprising as you would expect to be able to write
if condition {
    set wire = value;
}
However, this is not well-defined in hardware because the wire needs some value, but no value is specified if condition does not hold. This particular point isn't true if an else branch is also specified, but the exact hardware that gets generated from imperative code like this is not obvious, particularly with more nesting.
Therefore, if you want to write
if condition {
set wire = on_true;
} else {
set wire = on_false
}
you should move the set statement outside to make it unconditional, i.e.
set wire =
if condition {
on_true
} else {
on_false
}
Syntax
`set` *expression* `=` *expression*`;`
Assert
Takes a boolean condition and evaluates it, raising a runtime error in
simulation if it ever evaluates to false. In synthesis, this is ignored
assert this_should_be_0 == 0;
NOTE: Assert statements are currently not supported in simulation with Verilator, only with Icarus.
Comptime
TODO
Expressions
An expression is anything that has a value. Like most languages, this includes things like integer literals, instantiations and operators. However, unlike the languages you may be used to, almost everything in Spade is an expression and has a value, for example if-expressions and match-blocks.
This means, among other things, that you can assign the 'result' of an
if-expression to a variable:
let a = if choose_b {
b
}
else {
c
};
Blocks
A block is an expression which can contain sub-statements.
They are delimited by {}, contain zero or more statements
and usually end with an expression for the whole block's value.
let a = {
let partial_result = ...; // Bind a partial result to a variable
// 'return' the result of compute_on as the result of the block
compute_on(partial_result)
};
Variables defined inside blocks are only visible in the block. For example, you cannot use partial_result outside the block above.
Blocks are required in places like bodies of if-expressions and functions,
but can be used in any place where an expression is
expected.
if-expressions
Syntax
`if` *expression* *block* `else` *block*
The if-expression looks a lot like an if-statement in languages you may be
used to, but unlike most languages where if is used to conditionally do
something, in Spade, it is used to select values.
For example, the following function returns a if select_a is true, otherwise it returns b.
fn select(select_a: bool, a: int<8>, b: int<8>) -> int<8> {
if select_a {
a
} else {
b
}
}
This code makes heavy use of blocks. The body of the function, as well as each if-branch is a block.
In traditional hardware description languages, this would instead look like
fn select(select_a: bool, a: int<8>, b: int<8>) -> int<8> {
var result;
if select_a {
result = a;
} else {
result = b;
}
return result
}
but the Spade version is much closer to the actual hardware that is generated. Hardware in general does not support conditional execution, it will evaluate both branches and select the result.
match-expression
Syntax
`match` *expression* `{` *pattern* `=>` *expression*`,` ... `}`
The match-expression is used to select a value based on the value of a single
expression. It is similar to case statements in many languages, but supports
pattern-matching which allows you to bind sub-values to variables. Typically, match
statements are used on enum values:
enum MaybeByte {
Some{value: uint<8>},
None
}
fn byte_or_zero(in: MaybeByte) -> uint<8> {
match in {
// Bind the inner value to a new variable and return it
MaybeByte::Some(value) => value,
MaybeByte::None => 0,
}
}
but they can also be used on any values
If more than one pattern matches the value, the first pattern will be selected.
A match statement must cover all possible values of the matched expression. If this is not the case, the compiler emits an error.
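For example, this sketch only compiles because the final wildcard arm covers every remaining value of the matched integer:

```
fn is_zero_or_one(x: int<8>) -> bool {
    match x {
        0 => true,
        1 => true,
        _ => false,
    }
}
```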
Instantiation
The three kinds of units are instantiated in different ways in order to highlight to readers of the code what might happen beyond an instantiation. For example if you see a function instantiation, you know that there will be no state or other weird behavior behind the instantiation.
The following syntax is used to instantiate the different kinds of units:
- Functions:
unit() - Entities:
inst unit() - Pipelines
inst(<depth>) unit(). The depth is the depth of the pipeline
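For example, reusing the made-up units from the sketch in the Units section above:

```
entity instantiation_example(clk: clock, rst: bool, x: int<8>) -> int<9> {
    // A function can be instantiated anywhere
    let sum = add_one(x);
    // An entity requires the inst keyword
    let count = inst counter(clk, rst);
    // A pipeline requires inst and its depth
    inst(2) add_one_slowly(clk, x)
}
```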
Instantiation rules
Functions can be instantiated anywhere. Entities and pipelines can only be instantiated in entities or pipelines.
In addition, pipelines instantiated in other pipelines check the delay to make sure that values are ready before they are read. For example,
let x = inst(3) subpipe();
let y = function();
reg;
read(x); // Compilation error. x takes 3 cycles to compute, but is read after 1
read(y); // Allowed, function is pure so its output is available immediately
reg * 2;
// Allowed, x has 3 stages internally, this will be the first value out of the pipeline
read(x)
Array Indexing
Arrays can be indexed using []. Indices can either be single runtime integers
such as [i], or compile-time ranges, such as [10..15]. Arrays are written
and indexed as in most software languages: the leftmost array element is at index
0.
For example, [a, b, c][0..2] returns a new array [a, b]
Single element indexing
Non-range indices can be runtime integers. The size of the index is the smallest power of two that can represent the size of the array. However, if the array size is not an even power of two, indexing outside the range causes undefined behavior.
Range indexing
The indices for range indexing can only be raw integers, i.e. not runtime values.
The leftmost, i.e. beginning of the range is included, and the end of the range
is exclusive. For example, a[0..1] creates an array with a single element
containing the first element of a.
Examples
let array = [10, 11, 12, 13, 14];
let _ = array[0]; // 10
let _ = array[1]; // 11
let _ = array[2]; // 12
let _ = array[5]; // Out of bounds access (array length is 5), result is undefined
let _ = array[0..1]; // [10]
let _ = array[1..3]; // [11, 12]
let _ = array[0..5]; // [10, 11, 12, 13, 14]
Tuple indexing
Tuples can also be indexed, though tuple indexing uses #, for example tup#0
for the leftmost tuple value. Tuple indices can only be known at compile time
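For example, a small sketch:

```
fn tuple_example(tup: (int<8>, bool)) -> int<8> {
    let value = tup#0; // the first (leftmost) element
    let flag = tup#1;  // the second element
    if flag { value } else { 0 }
}
```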
Stage references
TODO
Patterns
Patterns are used to bind values to variables, and as 'conditions' in
match-expressions. Patterns match a set of values,
and bind (essentially assigns) a set of partial values to variables.
Name patterns
The simplest pattern is a
variable name, like x. It matches all values, and binds the value to the
name, x in this case.
Literal patterns
Integers and booleans can be matched on literals of their type. For example,
true only matches booleans that are true and 10 only matches integers
whose value is 10. Literal patterns do not bind any variables.
Tuple patterns
Another simple pattern is the tuple-pattern. It matches tuples of a specific length, and binds all elements of the tuples to sub-patterns. All patterns can be nested
For example
let ((a, b), c) = ((1, 2), 3);
will result in a=1, b=2 and c=3.
If parts of a tuple pattern are conditional, the pattern will only match if the subpatterns do. For example,
match (x, y) {
(true, _) => true,
_ => false,
}
will only return true if x is true, and false otherwise
Struct and enum patterns
Named patterns are used to match structs and enum variants. They consist of the name of the type or variant, followed by an argument list if the type has arguments.
Argument lists can be positional: () or named: $(). In a positional
argument list, the fields of the type are matched based on the order of the
fields. In a named list, patterns are instead bound by name, either
field_name: pattern or just field_name which binds a new local variable
field_name to field. Argument matching in patterns works the same way as in
argument lists during instantiation
This is best shown by examples
struct S {
x: int<8>,
y: int<8>,
}
// Positional pattern, binds `a` to the value of `x` and `b` to the value of `y`
S(a, b)
// Named pattern with no shorthand. The whole pattern matches if the `y` field is `0`
// in which case `a` will be bound to the value of `x`
S$(y: 0, x: a)
// Shorthand named. This binds a local variable `y` to the value of the field `y`. The field `x` is ignored.
S$(y, x: _)
enum variants work the same way, but only match the enum of the specified name. For example
enum E {
A,
B{val: int<8>}
}
match e {
E::A => {},
E::B(0) => {},
E::B(val) => {}
}
Wildcard
The wildcard pattern is _. It matches all values but does not bind the value to any
variable. It is useful as a catch-all in match blocks
For example, if we want to do something special for 0 and 1, but don't care about other
values we might write:
match integer {
0 => {},
1 => {},
_ => {}
}
Refutability
A pattern which matches all values of its type is irrefutable while one which only matches conditionally is refutable.
For example, a pattern unpacking a tuple is irrefutable because all values of type (T, Y) will
match (a, b)
let tuple: (T, Y) = ...;
match tuple {
(a, b) => {...}
}
while one which matches an enum variant is refutable because the None option will
not match
enum Option<T> {
Some{val: T},
None,
}
match x {
Some(x) => {...} // refutable: None not covered
...
}
Full documentation of the type system is yet to be written.
Primitive Types
These are the types built into the Spade language which aren't defined in the standard library.
bool
Generics
In a lot of cases, you want code to be generic over many different types, therefore both types and units support generic parameters.
Defining generics
Units and types which are generic have their generic parameters specified inside
angle brackets (<>) after the name. The generics can be either integers
denoted by #, or types which do not have # sign. In the body of the generic
item, the generic types are referred to by their names
For example a struct storing an array of arbitrary length and type is defined as
struct ContainsArray<T, #N> {
inner: [T, N]
}
Using generics
When specifying generic parameters, angle brackets (<>) are also used. For example, a function
which takes a ContainsArray with 5 8-bit integers is defined as
fn takes_array(a: ContainsArray<int<8>, 5>) {
...
}
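When instantiating a generic type or unit, the generic arguments can usually be inferred from the surrounding types, so they rarely need to be written out. A minimal sketch (the unit name and values are illustrative only):
fn five_bytes() -> ContainsArray<int<8>, 5> {
// T and N are inferred as int<8> and 5 from the return type
ContainsArray([1, 2, 3, 4, 5])
}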
Ports and Wires
See the user-level documentation.
Dynamic Pipelines
NOTE Dynamic pipelines are experimental and have soundness issues when nested. If you use them, make sure that there are no sub-pipelines that overlap with conditional registers.
For conditionally executing pipelines, an enable condition can be specified on a reg statement. If this condition is false, the pipeline registers for that stage hold their old values rather than being updated to new values.
The condition is specified as follows:
pipeline(1) pipe(clk: clock, condition: bool, x: bool) -> bool {
reg[condition];
x
}
where condition is a boolean expression: when it is true, all registers for this stage are updated, and when it is false, the register contents are undefined1.
The above code is compiled down to the equivalent of
entity pipe(clk: clock, condition: bool, x: bool) -> bool {
reg(clk) condition_s1 = if condition {condition} else {condition_s1};
reg(clk) x_s1 = if condition {x} else {x_s1};
x_s1
}
Pipeline enable conditions propagate to the stages upstream of the conditional register, in order to make sure that values are not flushed. This means that in the following code
pipeline(3) pipe(clk: clock, x: bool) -> bool {
reg;
reg[inst check_condition()];
reg;
x
}
the first two stages will be disabled and keep their old values when
check_condition returns false, while the registers in the final stage will
update unconditionally.
If several conditions are used, they are combined, i.e. in
pipeline(5) pipe(clk: clock, x: bool) -> bool {
reg;
reg[inst check_condition()];
reg;
reg[inst check_other_condition()];
reg;
x
}
the first two stages will update only if both check_condition() and
check_other_condition() are true, while the next two registers will only
update if check_other_condition() is true.
stage.ready and stage.valid
In some cases it is necessary to check if a stage will be updated on the next
cycle (ready) or if the values in the current cycle are valid. This is done
using stage.valid and stage.ready.
stage.ready is true if the registers directly following the statement will
update their values this cycle, i.e. if its condition and the conditions of all
downstream registers are met.
stage.valid is true if the values in the current stage are valid, i.e. if none
of the conditions of the registers that the values flowed through were false.
NOTE:
stage.valid currently initializes as undefined, and needs time to propagate through the pipeline. It is up to the user to ensure that a reset signal is asserted long enough for stage.valid to stabilize.
Example: Processor
This is part of a processor that stalls the pipeline in order to allow 3 cycles for a load instruction.
The program_counter entity takes a signal indicating whether it should count up or stall.
This signal is set to stage.ready, to ensure that if the downstream registers don't accept new instructions, the program counter will stall.
pipeline(5) cpu(clk: clock) -> bool {
let pc = inst program_counter$(clk, stall: stage.ready);
reg;
let insn = inst(1) program_memory(clk);
let stall = stage(+1).is_load || stage(+2).is_load || stage(+3).is_load;
reg[stall];
let is_load = is_load(insn);
reg;
let alu_out = alu(insn);
reg;
reg;
let regfile_write = if stage.valid && insn_writes(insn) {Some(alu_out)} else {None()};
true // NOTE: Dummy output, we need to return something
}
The line where regfile_write is set uses stage.valid to ensure that the
result of an instruction is only written back for valid values, not for values
that are undefined due to a stalled register.
Example: Latency Insensitive Interface
A common design method in hardware is to use a ready/valid interface. Here, a
downstream unit communicates that it is ready to receive data by asserting a
ready signal, and the upstream unit indicates that its data is valid using a
valid signal. If both ready and valid are set, the upstream unit hands
over a piece of data to the downstream unit.
What follows is an example of a pipelined multiplier that propagates a
ready/valid signal from its downstream unit to its upstream units:
struct port Rv<T> {
data: &T,
valid: &bool,
ready: inv &bool
}
pipeline(4) mul(clk: clock, a: Rv<int<16>>, b: Rv<int<16>>) -> Rv<int<32>> {
let product = *a.data * *b.data;
set a.ready = stage.ready;
set b.ready = stage.ready;
reg[*a.valid && *b.valid];
reg;
reg;
let downstream_ready = inst new_mut_wire();
reg[inst read_mut_wire(downstream_ready)];
Rv {
data: &product,
valid: &stage.valid,
ready: downstream_ready,
}
}
-
Currently, the implementation holds the previous value of the register, and this is also what the generated hardware does. However, this might change to setting the value to
X for easier debugging, and to give the synthesis tool more optimization opportunities. ↩
Binding
Constructs by syntax
This is a list of syntax constructs in the language, each with a link to a description of the underlying language construct. The list is split in two: constructs which start with a keyword and those which do not.
Keywords
| Syntax | Construct |
|---|---|
| entity ... | Entity definition |
| enum ... | Enum definition |
| fn ... | Function definition |
| inst a(...) ... | entity instantiation |
| inst(<depth>) a(...) ... | pipeline instantiation |
| let | let binding |
| pipeline(<depth>) ... | Pipeline definition |
| reg(<clk>) ...; | Register definition |
| reg; | Pipeline stage marker |
| reg * <number>; | Pipeline stage marker |
| reg[<condition>]; | Pipeline stage marker |
| struct ... | Struct definition |
Symbolic
| Syntax | Construct |
|---|---|
| <T>, <#N> | Generic arguments |
| a(...) ... | function instantiation |
| inst a(...) ... | entity instantiation |
| inst(<depth>) a(...) ... | pipeline instantiation |
| a[..] | array indexing |
| a[..:..] | array range indexing |
| a#.. | tuple indexing |
Swim
Swim is a batteries-included build system and package manager for the Spade programming language. It manages rebuilds of Spade source code, the compiler and any additional Verilog, supports simulation using your favorite simulators (icarus, verilator), and automates synthesis for ECP5, iCE40 and Gowin devices using yosys and nextpnr. The generated Verilog can also, of course, be used with any other tool.
Learn how to:
Installing Swim
Swim can be installed using the Rust package manager, cargo.
To install Rust and cargo, follow the instructions at rustup.rs.
Once cargo is installed, you can install the latest development version of Swim by
running:
cargo install --git https://gitlab.com/spade-lang/swim
Remember to add the ~/.cargo/bin/ directory to your PATH if you haven't already.
If you want to use the simulation and place-and-route features, you will also need a synthesis tool like yosys.
Using Swim
Run swim init <project name> to create a new project in a subdirectory
named <project name>. It'll set up a basic project to serve as a jumping-off
point.
To compile the Spade code to Verilog, run swim build or swim b. This will
build the Spade compiler and compile your code to build/spade.sv.
swim help will list the builtin swim subcommands. Here is a short excerpt:
Commands:
build [aliases: b]
synth [aliases: syn]
pnr [aliases: p]
upload [aliases: u]
simulate [aliases: sim, test, t]
init Initialise a new swim project in the specified directory. Optionally copies from a template
update Updates all external dependencies that either have a set branch or tag, or hasn't been downloaded locally
update-spade
restore Restore (discard) changes made to git-dependencies (including the compiler)
clean
help Print this message or the help of the given subcommand(s)
For reference projects showing configuration, see swim-templates.
Custom subcommands
Swim also supports custom subcommands as follows: if there is a binary named swim-xxx in your PATH, then calling swim xxx arg1 arg2 will dispatch to swim-xxx arg1 arg2.
We have a list of known community subcommands, similar to cargo's community subcommand wiki. If you've made one, feel free to add it!
Simulation and test benches
Swim supports running test benches for your code. Before you do so, you must
add a few lines to swim.toml:
[simulation]
testbench_dir = "test"
Test benches are currently written in Verilog; place them in a
directory called test. Swim will build and run each Verilog file in that
directory separately, and if the exit code of the simulator is 0, the test is
considered successful.
Finally, run swim test to test your code.
For sample projects showing configuration, see swim-templates.
Synthesis, place and route
Swim can also synthesise and place-and-route your project using yosys and nextpnr.
Ensure those are installed, then add [synthesis] and [pnr] sections to
your config file. The exact options you need to specify depend on the architecture, but swim should tell you which fields you need to set. As an example, here is the configuration for an ECP5-based FPGA:
[synthesis]
top = "e_main"
verilog.source = []
command = "synth_ecp5"
[pnr]
architecture = "ecp5"
device = "LFE5UM-85F"
pin_file = "pins.lpf"
package = "CABGA381"
To synthesise your code, call swim synth and to place and route, call swim pnr.
Swim will make sure the prerequisite steps are performed for you, so if your end goal is pnr, you can call swim pnr directly.
For sample projects showing configuration, see swim-templates.
Upload
Swim can also upload your design to a few supported FPGA boards. To get started, add an
[upload] section to your config. Like synthesis, the exact options depend on
your target, so let the error messages from swim guide the configuration.
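For instance, a board programmed with iceprog only needs the tool name; the other tools and their fields are listed under UploadTool in the config reference below (a minimal sketch):
[upload]
tool = "iceprog"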
To upload, call swim upload.
Templates
If you're using a supported board, you can copy a template repository which
contains a project that's ready to upload. List the available boards with
swim init --list-boards, then create a project for one with swim init --board <board>.
A note on the Spade compiler and submodules
As Spade is still early in development, it is useful to have each project pinned to a specific compiler version, rather than having a global copy of the compiler. This means that your project will still build in the future even if breaking changes are made to the language.
By default, Swim tracks the compiler version in a file called swim.lock that
is created on the first build. It is probably a good idea to track this using
git or another VCS. If you want to update to the newest version of the Spade
compiler, run swim update-spade and commit the updated swim.lock.
If you prefer keeping your own submodule (perhaps you want to make your own changes to the compiler?), you can also set up a path dependency and track it like any other submodule. For example:
git submodule add https://gitlab.com/spade-lang/spade.git spade
git commit -m "Add Spade submodule"
And then, instruct Swim to use a path to the compiler instead by changing your
swim.toml to this:
compiler = { path = "spade" }
Using another compiler branch
You can depend on a specific branch by setting the compiler-field in your
swim.toml:
compiler = { git = "https://gitlab.com/spade-lang/spade.git", branch = "another-branch" }
After setting the field, run swim update-spade to update the pinned compiler.
You can also change the repository, if you wish.
Using a global Spade compiler
If you prefer using a global compiler, you can set the compiler field to point
to an absolute path to the root directory for a local Spade compiler repository:
compiler = { path = "/path/to/spade-repo" }
You can read more about configuring Swim in the docs.
Debugging spadec
In rare cases where you want to attach a debugger to the Spade compiler, you
can use --debug-spadec.
Community Subcommands
Swim, like cargo, supports extending its functionality with community-defined subcommands. Feel free to add your own subcommand to this list!
| Name | Description |
|---|---|
| swim-clean-all | Inspired by cargo-clean-all: recursively clean swim projects |
Config
The main project configuration is specified in swim.toml.
Summary
# The name of the library. Must be a valid Spade identifier
# Anything defined in this library will be under the `name` namespace
name = "…"
# List of optimization passes to apply in the Spade compiler. The passes are applied
# in the order specified here. Additional passes specified on individual modules with
# #[optimize(...)] are applied before global passes.
optimizations = ["…", …]
# List of commands to run before anything else.
preprocessing = ["…", …] # Optional
# Map of libraries to include in the build.
#
# Example:
# ```toml
# [libraries]
# protocols = {git = "https://gitlab.com/TheZoq2/spade_protocols.git"}
# spade_v = {path = "deps/spade-v"}
# ```
libraries = {key: <Library>, …} # Optional
# Plugins to load. Specifies the location as a library, as well
# as arguments to the plugin
#
# Example:
# ```toml
# [plugins.loader_generator]
# path = "../plugins/loader_generator/"
# args.asm_file = "asm/blinky.asm"
# args.template_file = "../templates/program_loader.spade"
# args.target_file = "src/programs/blinky_loader.spade"
#
# [plugins.flamegraph]
# git = "https://gitlab.com/TheZoq2/yosys_flamegraph"
# ```
#
# Plugins contain a `swim_plugin.toml` which describes their behaviour.
# See [crate::plugin::config::PluginConfig] for details
plugins = {key: <Plugin>, …} # Optional
# Where to find the Spade compiler. See [Library] for details
[compiler]
<Library>
# Verilog to import in both simulation and synthesis.
[verilog] # Optional
<ImportVerilog>
[simulation]
<Simulation>
[synthesis] # Optional
<Synthesis>
# Preset board configuration which can be used instead of synthesis, pnr, packing and upload
[board] # Optional
<Board>
[pnr] # Optional
<Pnr>
[packing] # Optional
<PackingTool>
[upload] # Optional
<UploadTool>
[log_output]
<LogOutputLevel>
name String
The name of the library. Must be a valid Spade identifier
Anything defined in this library will be under the name namespace
optimizations [String]
List of optimization passes to apply in the Spade compiler. The passes are applied in the order specified here. Additional passes specified on individual modules with #[optimize(...)] are applied before global passes.
compiler Library
Where to find the Spade compiler. See [Library] for details
preprocessing [String]
List of commands to run before anything else.
verilog ImportVerilog
Verilog to import in both simulation and synthesis.
simulation Simulation
synthesis Synthesis
board Board
Preset board configuration which can be used instead of synthesis, pnr, packing and upload
pnr Pnr
packing PackingTool
upload UploadTool
libraries Map[String => Library]
Map of libraries to include in the build.
Example:
[libraries]
protocols = {git = "https://gitlab.com/TheZoq2/spade_protocols.git"}
spade_v = {path = "deps/spade-v"}
plugins Map[String => Plugin]
Plugins to load. Specifies the location as a library, as well as arguments to the plugin
Example:
[plugins.loader_generator]
path = "../plugins/loader_generator/"
args.asm_file = "asm/blinky.asm"
args.template_file = "../templates/program_loader.spade"
args.target_file = "src/programs/blinky_loader.spade"
[plugins.flamegraph]
git = "https://gitlab.com/TheZoq2/yosys_flamegraph"
Plugins contain a swim_plugin.toml which describes their behaviour.
See [crate::plugin::config::PluginConfig] for details
log_output LogOutputLevel
UploadTool
One of the following:
icesprog
tool = "icesprog"
Fields
iceprog
tool = "iceprog"
Fields
tinyprog
tool = "tinyprog"
Fields
openocd
tool = "openocd"
config_file = "path/to/file"
Fields
config_file FilePath
fujprog
tool = "fujprog"
Fields
openFPGALoader
tool = "openFPGALoader"
board = "…"
Fields
board String
custom
Instead of running a pre-defined set of commands to upload, run the specified list of commands in a shell. #packing_result# will be replaced by the packing output.
tool = "custom"
commands = ["…", …]
Fields
commands [String]
PackingTool
One of the following:
icepack
tool = "icepack"
Fields
ecppack
tool = "ecppack"
idcode = "…" # Optional
Fields
idcode String
gowin_pack
tool = "gowin_pack"
device = "…"
Fields
device String
Pnr
One of the following:
ice40
architecture = "ice40"
[device_args]
<Ice40Args>
# If set, inputs and outputs of the top module do not need a corresponding field
# in the pin file. This is helpful for benchmarking when pin mapping is irrelevant, but
# when running in hardware, it is recommended to leave this off in order to get a warning
# when pins aren't set in the pin file.
allow_unconstrained = true|false
# Continue to the upload step even if the timing isn't met.
# This is helpful when you suspect that the place-and-route tool is conservative
# with its timing requirements, but gives no guarantees about correctness.
allow_timing_fail = true|false
# The path to a file which maps inputs and outputs of your top module to physical pins.
# On ECP5 chips, this is an `lpf` file, and on iCE40, it is a `pcf` file.
pin_file = "path/to/file"
Fields
device_args Ice40Args
allow_unconstrained bool
If set, inputs and outputs of the top module do not need a corresponding field in the pin file. This is helpful for benchmarking when pin mapping is irrelevant, but when running in hardware, it is recommended to leave this off in order to get a warning when pins aren't set in the pin file.
allow_timing_fail bool
Continue to the upload step even if the timing isn't met. This is helpful when you suspect that the place-and-route tool is conservative with its timing requirements, but gives no guarantees about correctness.
pin_file FilePath
The path to a file which maps inputs and outputs of your top module to physical pins.
On ECP5 chips, this is an lpf file, and on iCE40, it is a pcf file.
ecp5
architecture = "ecp5"
[device_args]
<Ecp5Args>
# If set, inputs and outputs of the top module do not need a corresponding field
# in the pin file. This is helpful for benchmarking when pin mapping is irrelevant, but
# when running in hardware, it is recommended to leave this off in order to get a warning
# when pins aren't set in the pin file.
allow_unconstrained = true|false
# Continue to the upload step even if the timing isn't met.
# This is helpful when you suspect that the place-and-route tool is conservative
# with its timing requirements, but gives no guarantees about correctness.
allow_timing_fail = true|false
# The path to a file which maps inputs and outputs of your top module to physical pins.
# On ECP5 chips, this is an `lpf` file, and on iCE40, it is a `pcf` file.
pin_file = "path/to/file"
Fields
device_args Ecp5Args
allow_unconstrained bool
If set, inputs and outputs of the top module do not need a corresponding field in the pin file. This is helpful for benchmarking when pin mapping is irrelevant, but when running in hardware, it is recommended to leave this off in order to get a warning when pins aren't set in the pin file.
allow_timing_fail bool
Continue to the upload step even if the timing isn't met. This is helpful when you suspect that the place-and-route tool is conservative with its timing requirements, but gives no guarantees about correctness.
pin_file FilePath
The path to a file which maps inputs and outputs of your top module to physical pins.
On ECP5 chips, this is an lpf file, and on iCE40, it is a pcf file.
gowin
architecture = "gowin"
[device_args]
<GowinArgs>
# If set, inputs and outputs of the top module do not need a corresponding field
# in the pin file. This is helpful for benchmarking when pin mapping is irrelevant, but
# when running in hardware, it is recommended to leave this off in order to get a warning
# when pins aren't set in the pin file.
allow_unconstrained = true|false
# Continue to the upload step even if the timing isn't met.
# This is helpful when you suspect that the place-and-route tool is conservative
# with its timing requirements, but gives no guarantees about correctness.
allow_timing_fail = true|false
# The path to a file which maps inputs and outputs of your top module to physical pins.
# On ECP5 chips, this is an `lpf` file, and on iCE40, it is a `pcf` file.
pin_file = "path/to/file"
Fields
device_args GowinArgs
allow_unconstrained bool
If set, inputs and outputs of the top module do not need a corresponding field in the pin file. This is helpful for benchmarking when pin mapping is irrelevant, but when running in hardware, it is recommended to leave this off in order to get a warning when pins aren't set in the pin file.
allow_timing_fail bool
Continue to the upload step even if the timing isn't met. This is helpful when you suspect that the place-and-route tool is conservative with its timing requirements, but gives no guarantees about correctness.
pin_file FilePath
The path to a file which maps inputs and outputs of your top module to physical pins.
On ECP5 chips, this is an lpf file, and on iCE40, it is a pcf file.
GowinDevice
One of these strings:
- "GW1NR-UV9QN881C6/I5"
- "GW1N-LV1QN48C6/I5"
- "GW1NZ-LV1QN48C6/I5"
- "GW1NSR-LV4CQN48PC7/I6"
- "GW1NR-LV9QN88PC6/I5"
- "GW2AR-LV18QN88C8/I7"
- "GW2A-LV18PG256C8/I7"
- "GW1N-UV4LQ144C6/I5"
- "GW1NS-UX2CQN48C5/I4"
- "GW1NR-LV9LQ144PC6/I5"
Lv18QNFamily
One of these strings:
- "GW2A-18C"
- "GW2AR-18C"
- "GW2ANR-18C"
- "String": Specify a raw string for the family instead of the swim-provided families. Used if swim doesn't support the configuration you want.
Lv9QN88PC6Family
One of these strings:
- "GW1N-9C"
- "String": Specify a raw string for the family instead of the swim-provided families. Used if swim doesn't support the configuration you want.
Lv1qn48c6I5Family
One of these strings:
- "GW1NZ-1"
- "String": Specify a raw string for the family instead of the swim-provided families. Used if swim doesn't support the configuration you want.
Ecp5Device
One of these strings:
- "LFE5U-12F"
- "LFE5U-25F"
- "LFE5U-45F"
- "LFE5U-85F"
- "LFE5UM-25F"
- "LFE5UM-45F"
- "LFE5UM-85F"
- "LFE5UM5G-25F"
- "LFE5UM5G-45F"
- "LFE5UM5G-85F"
Ice40Device
One of these strings:
- "iCE40LP384"
- "iCE40LP1K"
- "iCE40LP4K"
- "iCE40LP8K"
- "iCE40HX1K"
- "iCE40HX4K"
- "iCE40HX8K"
- "iCE40UP3K"
- "iCE40UP5K"
- "iCE5LP1K"
- "iCE5LP2K"
- "iCE5LP4K"
Board
One of the following:
Ecpix5
name = "Ecpix5"
pin_file = "path/to/file" # Optional
config_file = "path/to/file" # Optional
Fields
pin_file FilePath
config_file FilePath
GoBoard
name = "GoBoard"
pcf = "path/to/file" # Optional
Fields
pcf FilePath
tinyfpga-bx
name = "tinyfpga-bx"
pcf = "path/to/file" # Optional
Fields
pcf FilePath
Icestick
name = "Icestick"
pcf = "path/to/file" # Optional
Fields
pcf FilePath
Synthesis
Summary
# The name of the unit to use as a top module for the design. The name must
# be an absolute path to the unit, for example `proj::main::top`, unless the
# module is marked `#[no_mangle]` in which case the name is used.
#
# Can also be set to the name of a module defined in verilog if a pure verilog top
# is desired.
top = "…"
# The yosys command to use for synthesis
command = "…"
# Extra verilog files only needed during the synthesis process.
[verilog] # Optional
<ImportVerilog>
top String
The name of the unit to use as a top module for the design. The name must
be an absolute path to the unit, for example proj::main::top, unless the
module is marked #[no_mangle], in which case the name is used as-is.
Can also be set to the name of a module defined in Verilog if a pure Verilog top is desired.
command String
The yosys command to use for synthesis
verilog ImportVerilog
Extra Verilog files only needed during the synthesis process.
Simulation
Summary
# Directory containing all test benches
testbench_dir = "path/to/file"
# Extra dependencies to install to the test venv via pip
python_deps = ["…", …] # Optional
# The simulator to use as the cocotb backend. Currently verified to support verilator and
# icarus, but other simulators supported by cocotb may also work.
#
# Defaults to 'icarus'
#
# Requires a relatively recent version of verilator
simulator = "…"
# The C++ version to use when compiling verilator test benches. Anything that
# clang or gcc accepts in the -std= field works, but the verilator wrapper requires
# at least c++17.
# Defaults to c++17
cpp_version = "…" # Optional
# Extra arguments to pass to verilator when building C++ test benches. Supports substituting
# `#ROOT_DIR#` to get project-relative directories
verilator_args = ["…", …] # Optional
testbench_dir FilePath
Directory containing all test benches
python_deps [String]
Extra dependencies to install to the test venv via pip
simulator String
The simulator to use as the cocotb backend. Currently verified to support verilator and icarus, but other simulators supported by cocotb may also work.
Defaults to 'icarus'
Requires a relatively recent version of verilator
cpp_version String
The C++ version to use when compiling verilator test benches. Anything that clang or gcc accepts in the -std= field works, but the verilator wrapper requires at least c++17. Defaults to c++17
verilator_args [String]
Extra arguments to pass to verilator when building C++ test benches. Supports substituting
#ROOT_DIR# to get project-relative directories
ImportVerilog
Summary
# Search paths for Verilog include directives.
include = ["path/to/file", …] # Optional
# Paths to Verilog files to import. Supports glob syntax.
sources = ["…", …] # Optional
include [FilePath]
Search paths for Verilog include directives.
sources [String]
Paths to Verilog files to import. Supports glob syntax.
Library
Location of a library or external code. Either a link to a git repository, or a path relative to the root of the project.
compiler = {git = "https://gitlab.com/spade-lang/spade/"}
# or
compiler = {path = "compiler/"}
One of the following:
Git
Downloaded from git and managed by swim
git = "…"
commit = "…" # Optional
tag = "…" # Optional
branch = "…" # Optional
Fields
git String
commit String
tag String
branch String
Path
A library at the specified path. The path is relative to swim.toml
path = "path/to/file"
Fields
path FilePath
PluginConfig
Summary
# True if this plugin needs the CXX bindings for the Spade compiler to be built
requires_cxx = true|false
# Commands required to build the plugin. Run before any project compilation steps
build_commands = ["…", …]
# The files which this plugin produces
builds = [<BuildResult>, …]
# Arguments which must be set in the `swim.toml` of projects using the plugin
required_args = ["…", …]
# Commands to run after building swim file but before anything else
post_build_commands = ["…", …]
# Commands which the user can execute
commands = {key: <PluginCommand>, …}
# Things to do during the synthesis process
[synthesis] # Optional
<SynthesisConfig>
requires_cxx bool
True if this plugin needs the CXX bindings for the Spade compiler to be built
build_commands [String]
Commands required to build the plugin. Run before any project compilation steps
builds [BuildResult]
The files which this plugin produces
required_args Set[String]
Arguments which must be set in the swim.toml of projects using the plugin
post_build_commands [String]
Commands to run after building swim file but before anything else
synthesis SynthesisConfig
Things to do during the synthesis process
commands Map[String => PluginCommand]
Commands which the user can execute
PluginCommand
Summary
# List of system commands to run in order to execute the command
#
# Commands specified by the user, i.e. whatever is after `swim plugin <command>`
# is string replaced into `%args%` in the resulting command string. The arguments
# are passed as strings, to avoid shell expansion
script = ["…", …]
# The build step after which to run this command
[after]
<BuildStep>
script [String]
List of system commands to run in order to execute the command
The arguments specified by the user, i.e. whatever comes after swim plugin <command>,
are string-replaced into %args% in the resulting command string. The arguments
are passed as strings to avoid shell expansion.
after BuildStep
The build step after which to run this command
BuildStep
One of these strings:
- "Start": Before any other processing takes place
- "SpadeBuild"
- "Simulation"
- "Synthesis"
- "Pnr"
- "Upload"
SynthesisConfig
Summary
# Yosys commands to run after the normal yosys flow
yosys_post = ["…", …]
yosys_post [String]
Yosys commands to run after the normal yosys flow
BuildResult
Summary
# The path of a file built by this build step
path = "…"
# The first build step for which this file is required. This will trigger
# a re-build of this build step if the file was changed
[needed_in]
<BuildStep>
path String
The path of a file built by this build step
needed_in BuildStep
The first build step for which this file is required. This will trigger a re-build of this build step if the file was changed
Compiler Internals
This chapter describes some internals of the compiler and details about code generation. Normally, this is not relevant to users of the language.
Naming
This chapter describes the naming scheme used by the compiler when generating Verilog. The goal of the Verilog generator is not to generate readable Verilog, but there should be a clear two-way mapping between signal names in the source Spade code and the generated Verilog. This mapping should be clear both to users reading signal lists (for example, in VCD files) and to tools (for example, VCD parsers).
Variables
Because Spade does not have the same scoping rules as Verilog, some deconfliction of names internal to a Verilog module is needed.
If a name x only occurs once in a unit, the corresponding Verilog name is \x .
(This uses Verilog's escaped identifier syntax, where the trailing space terminates
the identifier, so some tools may report the name as just x.)
If x occurs more than once, subsequent occurrences are given an index, ordered sequentially
in the order that they are visited during AST lowering1.
The kth occurrence of a name is suffixed with _n{k}.
Pipelined versions of names are suffixed with _s{n} where n is the absolute
stage index of the stage.
Names of port types with mutable wires have an additional variable for the mutable
bits. This follows the same naming scheme as the forward name, but is suffixed with
_mut.
The following is an example of the naming scheme:
pipeline(1) pipe(
x: bool, // "\x "
y: (&bool, inv &bool) // "\y ", "y_o "
) {
if true {
let x = true; // "x_n1"
} else {
let x = false; // "x_n2"
}
let x = true; // "x_n3"
reg; // "\x_s1 ", "x_n3_s1"
let z = true; // "\z "
}
Spade makes no guarantees about name uniqueness between generated Verilog modules.
-
This is currently the lexical order of the occurrences, i.e. names which occur early in the module are given lower indices. ↩
Type Representation
Description of the Verilog representation of Spade types
Mixed-direction types
Types with mixed direction wires are split into two separate variables, typically <name> and <name>_mut. The structure of the forward part is the same as if the backward part didn't exist, and the backward part is structured as if it were the forward part.
For example, (int<8>, inv &int<9>, int<2>, &mut int<3>) is stored as (int<8>, int<2>) and (int<9>, int<3>).
Tuples
Tuples are stored with their members packed side by side, with the 0th member on the left.
let x: (int<8>, int<2>, bool) = (a, b, c);
is represented as
logic[10:0] x;
assign x = {a,b,c};
Binary representation
aaaaaaaabbc
Enums
Enums are packed with the leftmost bits being the discriminant and the remaining bits being the payload. The payload is packed left-to-right, meaning that the rightmost bits are undefined if a variant is smaller than the largest variant.
enum A {
V1{a: int<8>},
V2{b: int<2>},
V3{c: bool}
}
     9 8 7 6 5 4 3 2 1 0
     t t p p p p p p p p
     -------------------
V1:  0 0 a a a a a a a a
V2:  0 1 b b X X X X X X
V3:  1 0 c X X X X X X X