A photo of the Bang Bang Boom Box (B⁴), our hardware testing apparatus
Take a look at the avionics system design and
PSPieChart posts to get context on this document.
This document is a direct continuation of Avionics System Design and details information about flight and GSE software for Purdue Space Program Liquid's next rocket, the CraterMaker Special.
Flight Software
Development Environment
The RP2040 has a very well-documented and easy-to-use SDK written in C.
Everything can be configured through CMake.
The first step to start developing software for the rocket was determining the
source tree organization and writing a build system.
I decided on using a single CMakeLists.txt at the project root so that it'd be
easier for new members to work without worrying too much about the build system.
The build process also includes ccache for insanely fast clean builds.
This is how the source code is organized:
PSPL_CMS_Avionics_Code/
├── README.md <- This file
├── src/ <- contains main programs for each board
│   └── someprogram/ <- Main program folder, i.e., for a board
│       ├── someprogram.h <- All header stuff for this program
│       ├── main.c <- Contains the entrypoint
│       └── other.c <- Other source files
├── lib/ <- common libraries shared by all boards
│   └── example/ <- example library
│       ├── example.c
│       └── include/ <- header files, added to include path
│           ├── example.h <- top level header, included with `#include <example.h>`
│           └── example/ <- secondary headers
│               └── whatever.h <- included with `#include <example/whatever.h>`
├── external/ <- Gitignored, for automatically downloaded libraries
├── include/ <- Global include directory, used for configs and stuff
├── build/ <- not uploaded to the repo, contains compilation outputs
│   └── bin/ <- contains the files to be uploaded to the boards
├── .clang-format <- file containing autoformatter rules
├── .gitignore <- file containing things to not be uploaded to GitHub
├── Doxyfile <- Doxygen Documentation Generator Configs, mostly autogenerated
├── CMakeLists.txt <- Main build script
└── Makefile <- User-friendly build script, calls CMake
Finally, there's a top-level Makefile which handles dependencies, compilation,
and generating the compilation database (for IntelliSense).
Doxygen is set up to generate documentation, but the generated docs are rarely
used, and we usually just refer to the docstrings.
The dependencies for flight software development are very minimal, needing only
ARM GCC and newlib to get started. And as of writing this, everything clean
builds (with ccache cleared) in under 15 seconds.
Peripheral I/O
The main peripherals on this rocket are ADCs, storage devices, radios, Ethernet
(more on this later), and digital output for pyro channels.
Digital outputs are easy enough, pico-sdk has all the infrastructure
needed for that.
One of the requirements we have for peripherals on the rocket, with which most of our
selected parts comply, is that they communicate over SPI.
SPI is a fantastic protocol and can be abstracted away to the level of
selecting a peripheral, and sending and receiving a bitstream to the peripheral
at a specified rate.
The process of writing the drivers themselves is mostly straightforward -
implement the protocols that the chip manufacturers specify in the datasheet.
The first version of the driver is always a fully synchronous version, i.e., all
communication with the device is commanded by the main thread. This made it
easy to meet deadlines when it comes to upcoming tests.
The next step is implementing interrupt-driven I/O. For our purposes, this isn't
necessary 99% of the time, but its absence becomes a problem whenever we run any
long-running subroutines. On the hardware side, DRDY lines from peripherals are
connected to GPIO pins, which can be used to trigger a subroutine to, say, read
data off of an ADC and add it to the queue to be stored and transmitted.
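As a rough sketch of the "add it to the queue" step, the handoff between an interrupt handler and the main loop could be a single-producer, single-consumer ring buffer. Everything here (`sample_queue_t`, `queue_push`, `queue_pop`, `on_drdy`, the capacity) is a hypothetical name chosen for illustration, not the flight code. On hardware, `on_drdy` would be registered as a GPIO interrupt callback and would read the sample over SPI; here the sample is passed in as an argument so the sketch builds on a host machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_CAPACITY 64u /* assumed size; must be a power of two */
static_assert((QUEUE_CAPACITY & (QUEUE_CAPACITY - 1u)) == 0u,
              "capacity must be a power of two");

typedef struct {
    int64_t samples[QUEUE_CAPACITY];
    volatile uint32_t head; /* written only by the interrupt handler */
    volatile uint32_t tail; /* written only by the main loop */
} sample_queue_t;

/* Producer side: returns false (dropping the sample) when the queue is full. */
static bool queue_push(sample_queue_t *q, int64_t sample) {
    uint32_t head = q->head;
    if (head - q->tail == QUEUE_CAPACITY) {
        return false;
    }
    q->samples[head % QUEUE_CAPACITY] = sample;
    q->head = head + 1; /* advance the index after the slot is written */
    return true;
}

/* Consumer side: returns false when there is nothing to read. */
static bool queue_pop(sample_queue_t *q, int64_t *out) {
    uint32_t tail = q->tail;
    if (tail == q->head) {
        return false;
    }
    *out = q->samples[tail % QUEUE_CAPACITY];
    q->tail = tail + 1;
    return true;
}

/* Hypothetical DRDY handler body; the queue it feeds is drained by the main loop. */
static sample_queue_t adc_queue;

static void on_drdy(int64_t sample_from_adc) {
    (void)queue_push(&adc_queue, sample_from_adc);
}
```

This assumes the handler and the main loop run on the same core. Sharing the queue across both RP2040 cores would need memory barriers or a lock on top of this.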
Implementing DMA isn't really at the highest priority though - the only place
where it's applicable would be the Ethernet and Flash drivers. Devices
like the ADC read short bursts of data triggered by interrupts, and we decided
that it's not worth the effort to implement DMA for these devices.
Fun fact: the RP2040's DMA system is Turing complete. More info here.
Network Stack
The network stack was nicely handled for us by the W5500 chip,
which is what we use to add Ethernet capability to the RP2040. Communicating
over SPI, it includes a hardwired TCP/IP stack, with 8 sockets.
All we need to do is write a driver to access this functionality, and abstract away details such as socket creation, buffer management, configuration, etc. I won't be going into much detail about this driver, but you can read the datasheet for the chip here.
Communication Protocols
This is where the fun begins
At the launch site, the rocket, the GSE system (Black Cat Launch System, BCLS),
and the launch operators' computers are all connected over
Ethernet.
While the rocket is on the ground, there are two primary requirements for
communication with the rocket.
- We need live sensor data to be available to operators
- We need remote procedure call capability
The EMU board is fully responsible for the launch countdown and autosequence of
this rocket, so RPC capability is important.
But first - how do we stream live sensor data from the boards to each other and
the ground?
My solution - SensorNet. The concept is very simple: it starts with this
struct definition.
SensorNet
typedef struct {
    u16 sensor_id; // globally unique
    u64 time;      // microseconds since UNIX epoch
    u64 counter;   // per sensor
    i64 data;
} sensornet_packet_t;
The system is very straightforward - take one or more of these packets, put them
on a UDP packet, and send them on their merry way.
Decoding is simple enough. If it's another flight board decoding, it just has to
cast the packet it read to sensornet_packet_t*, which works because the boards
share the same struct layout and byte order. On the ground side, just
manually decode the binary data. There are plenty of libraries for most
programming languages to handle this.
Doing things this way has the blazing fast serialization time of 0ms.
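To make the layout concrete, here is a minimal sketch of both sides. It assumes `u16`, `u64`, and `i64` are aliases for the `<stdint.h>` fixed-width types, a little-endian sender, and natural alignment, which puts 6 padding bytes after `sensor_id` for a 32-byte packet. The helper names (`sensornet_pack`, `read_le`, `sensornet_decode`) are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed aliases; the flight code's own typedefs may differ. */
typedef uint16_t u16;
typedef uint64_t u64;
typedef int64_t i64;

typedef struct {
    u16 sensor_id; /* globally unique */
    u64 time;      /* microseconds since UNIX epoch */
    u64 counter;   /* per sensor */
    i64 data;
} sensornet_packet_t;

/* With natural alignment, 6 padding bytes follow sensor_id. */
static_assert(sizeof(sensornet_packet_t) == 32, "unexpected struct layout");
static_assert(offsetof(sensornet_packet_t, time) == 8, "unexpected struct layout");

/* "Serialization" on a flight board: copy the struct bytes into the datagram.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t sensornet_pack(const sensornet_packet_t *pkts, size_t count,
                             uint8_t *buf, size_t buf_len) {
    size_t bytes = count * sizeof(sensornet_packet_t);
    if (bytes > buf_len) {
        return 0;
    }
    memcpy(buf, pkts, bytes);
    return bytes;
}

/* Read an unsigned little-endian integer of `len` bytes. */
static uint64_t read_le(const uint8_t *p, size_t len) {
    uint64_t v = 0;
    for (size_t i = 0; i < len; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/* Ground-side decode of one packet using fixed wire offsets, so it does not
 * depend on the receiving machine's byte order. */
static sensornet_packet_t sensornet_decode(const uint8_t *p) {
    sensornet_packet_t pkt;
    pkt.sensor_id = (u16)read_le(p + 0, 2);
    pkt.time = read_le(p + 8, 8);
    pkt.counter = read_le(p + 16, 8);
    pkt.data = (i64)read_le(p + 24, 8);
    return pkt;
}
```

Copying with `memcpy` rather than casting the receive buffer directly also sidesteps alignment problems when the buffer doesn't start on an 8-byte boundary.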
On the networking side of this - each flight board (EMU, LFC, UFC) will have a
UDP multicast group IP assigned to it, and any devices interested in data from
that board can simply open a socket listening to that multicast IP.
CommandNet
Now this is slightly more complicated.
I concluded that we need two features from CommandNet
- The ability to directly set the values of certain variables
- The ability to call functions that don't have any parameters.
I see this as basically a more complicated version of the "set registers, run
instruction" model.
In order to simplify the implementation of both sides of this protocol, I
decided to go with MessagePack, which is a JSON-like binary format for
serializing information. Neither throughput nor latency was an important factor
for this application, so we prioritized simplicity. These packets would then be
sent over a TCP connection to the rocket.
This is the protocol "spec":
Request-response protocol for sending commands over a TCP socket
Request format:
[request type(u8), args...]
EXEC_CMD: [EXEC_CMD, command name]
ALL_CMDS: [ALL_CMDS]
SET_VAR: [SET_VAR, variable name, value]
GET_VAR: [GET_VAR, variable name]
ALL_VARS: [ALL_VARS]
Response format:
[status(u8), args...]
EXEC_CMD: [status(SUCCESS/ERROR)]
ALL_CMDS: [status(SUCCESS/ERROR), [command 0 name, ...]]
SET_VAR: [status(SUCCESS/ERROR), old value]
GET_VAR: [status(SUCCESS/ERROR), value]
ALL_VARS: [status(SUCCESS/ERROR), [[var 0 name, var 0 value], ...]]
All names are strings (I know, strings are evil), and all values are 64-bit
integers.
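For a feel of what goes on the wire, here is a hand-rolled sketch that encodes a `SET_VAR` request as MessagePack. The numeric request codes are hypothetical, since the spec above doesn't list them, and a real implementation would more likely use an existing MessagePack library. The sketch always writes the value as a MessagePack `int 64`, which is valid but not the most compact encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical numeric codes; the real values are not given in this post. */
enum { EXEC_CMD = 0, ALL_CMDS = 1, SET_VAR = 2, GET_VAR = 3, ALL_VARS = 4 };

/* Encode [SET_VAR, name, value]; returns bytes written, or 0 on error. */
static size_t encode_set_var(const char *name, int64_t value,
                             uint8_t *buf, size_t buf_len) {
    assert(name != NULL && buf != NULL);
    size_t name_len = strlen(name);
    /* array header + type + string header + name + int header + 8 bytes */
    size_t total = 1 + 1 + 1 + name_len + 1 + 8;
    if (name_len > 31 || total > buf_len) {
        return 0; /* this sketch only handles fixstr names (up to 31 bytes) */
    }
    size_t n = 0;
    buf[n++] = 0x90 | 3;                   /* fixarray with 3 elements */
    buf[n++] = (uint8_t)SET_VAR;           /* positive fixint */
    buf[n++] = (uint8_t)(0xa0 | name_len); /* fixstr header */
    memcpy(&buf[n], name, name_len);
    n += name_len;
    buf[n++] = 0xd3;                       /* int 64, big-endian payload */
    for (int shift = 56; shift >= 0; shift -= 8) {
        buf[n++] = (uint8_t)((uint64_t)value >> shift);
    }
    return n;
}
```

With these assumed codes, `encode_set_var("ab", 1, ...)` produces 14 bytes: `93 02 a2 61 62 d3` followed by the eight bytes of the value.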
A full CommandNet exchange would take place as follows:
+---------+             +---------+
| Client  |             | Rocket  |
+---------+             +---------+
     |                       |
     |    Open TCP Socket    |
     |---------------------->|
     |                       |
     |     Send request      |
     |---------------------->|
     |                       |
     |     Send response     |
     |<----------------------|
     |                       |
     |   Close TCP Socket    |
     |<----------------------|
     |                       |
+---------+             +---------+
| Client  |             | Rocket  |
+---------+             +---------+
The way the network at the launch site is set up, it's possible that
people outside our team can plug into the network and send packets to the
rocket - this isn't a problem we expect to encounter, but it's one we have to be
ready for anyway.
To solve this, we simply add AES-256-CBC symmetric key encryption, with a key
derived from a passphrase.
Now, instead of sending the bare MessagePack encoded packet, we send a
base64-encoded ciphertext. The reason for using base64 is so that it's easy to
know where a transmission ends by just checking if there's a newline.
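The framing half of this can be sketched on its own. The example below treats the ciphertext as opaque bytes (the AES step is left out entirely), base64-encodes it, and terminates it with a newline. Because the base64 alphabet never contains `'\n'`, the receiver can find the end of a message by scanning for it. `frame_message` and `frame_length` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Base64-encode `len` bytes and append '\n' as the end-of-message marker.
 * Returns the number of bytes written, or 0 if `out` is too small. */
static size_t frame_message(const uint8_t *in, size_t len,
                            char *out, size_t out_len) {
    assert(in != NULL && out != NULL);
    size_t needed = 4 * ((len + 2) / 3) + 1;
    if (needed > out_len) {
        return 0;
    }
    size_t n = 0;
    for (size_t i = 0; i < len; i += 3) {
        size_t rem = len - i; /* bytes left, at least 1 */
        uint32_t chunk = (uint32_t)in[i] << 16;
        if (rem > 1) chunk |= (uint32_t)in[i + 1] << 8;
        if (rem > 2) chunk |= (uint32_t)in[i + 2];
        out[n++] = B64[(chunk >> 18) & 0x3f];
        out[n++] = B64[(chunk >> 12) & 0x3f];
        out[n++] = rem > 1 ? B64[(chunk >> 6) & 0x3f] : '=';
        out[n++] = rem > 2 ? B64[chunk & 0x3f] : '=';
    }
    out[n++] = '\n';
    return n;
}

/* Receiver side: length of the first complete frame (including '\n'),
 * or 0 if no newline has arrived yet. */
static size_t frame_length(const char *rx, size_t rx_len) {
    for (size_t i = 0; i < rx_len; i++) {
        if (rx[i] == '\n') {
            return i + 1;
        }
    }
    return 0;
}
```

A receiver would keep appending TCP data to a buffer and call `frame_length` until it returns non-zero, then base64-decode and decrypt that many bytes.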
EMU State Machine
The state machine is a failsafe method of performing launch control.
Interacting with this state machine is the primary objective of CommandNet. It
looks like this:
Most of the launch will be spent in the Countdown - Go/No Go loop.
At preset points in the countdown, the launch operators will need to issue a
Go/No Go to the current poll. If the poll times out or a "No Go" is
issued, the countdown is placed on hold until manual intervention.
The countdown also automatically goes on hold if there's an off-nominal
situation, whatever that may be.
If all goes well, and the final poll is answered, the state machine goes into
autosequence at a set point in the countdown (T-auto).
Because every poll has to be answered, if communication with the rocket is lost
before autosequence starts, the rocket will not launch.
The autosequence phase contains the ignition sequence. A manual abort can be
issued during the autosequence, which would place the vehicle in a safe state.
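The transitions described above can be written down as a small pure function. This is only a sketch of the prose: the state and event names are invented for the example, and the real EMU state machine may well have more states and transitions than this.

```c
#include <assert.h>

/* Hypothetical state and event names, reconstructed from the description. */
typedef enum {
    ST_COUNTDOWN, ST_POLL, ST_HOLD, ST_AUTOSEQUENCE, ST_SAFE
} emu_state_t;

typedef enum {
    EV_POLL_POINT,   /* countdown reached a preset Go/No Go point */
    EV_GO,
    EV_NO_GO,
    EV_POLL_TIMEOUT, /* nobody answered the poll in time */
    EV_OFF_NOMINAL,
    EV_RESUME,       /* manual intervention ends a hold */
    EV_T_AUTO,       /* countdown reached T-auto */
    EV_ABORT
} emu_event_t;

/* Returns the next state; events that don't apply leave the state unchanged. */
static emu_state_t emu_step(emu_state_t state, emu_event_t ev) {
    assert((unsigned)ev <= (unsigned)EV_ABORT); /* catch corrupted events in debug builds */
    switch (state) {
    case ST_COUNTDOWN:
        if (ev == EV_POLL_POINT) return ST_POLL;
        if (ev == EV_OFF_NOMINAL) return ST_HOLD;
        if (ev == EV_T_AUTO) return ST_AUTOSEQUENCE;
        break;
    case ST_POLL:
        if (ev == EV_GO) return ST_COUNTDOWN;
        if (ev == EV_NO_GO || ev == EV_POLL_TIMEOUT || ev == EV_OFF_NOMINAL) {
            return ST_HOLD;
        }
        break;
    case ST_HOLD:
        if (ev == EV_RESUME) return ST_COUNTDOWN;
        break;
    case ST_AUTOSEQUENCE:
        if (ev == EV_ABORT) return ST_SAFE;
        break;
    case ST_SAFE:
        break;
    }
    return state;
}
```

In this sketch `EV_T_AUTO` is ignored while a poll is outstanding or the countdown is on hold, which is one way to get the "no answer, no launch" behavior: a lost link turns into a poll timeout, and the machine sits in hold.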
DevOps
Back to the boring stuff.
All this software needs to be tested on actual hardware, and it's inconvenient
and expensive for everyone working on software to have test hardware.
To solve this problem, Taylor
devised the Bang Bang Boom Box, pictured at the start of this page.
The box contains a Raspberry Pi which is connected to whatever board is
currently being worked on, through USB and GPIO lines to the RESET and BOOTSEL
pins.
The source files include simple scripts which can then flash binaries to the
board. The RP2040 has a UF2 ROM bootloader, so when it's RESET with BOOTSEL held
active, it shows up as a USB storage device, and the firmware file can be copied
into it to write it to device flash.
The flight software can either be compiled on the Pi itself or compiled elsewhere and copied over.
The Pi is added to a Tailscale network, so it can be
securely accessed anywhere there's internet.
GSE Software
This basically comes down to implementing everything in this data path diagram.
Every maroon block in this image is a separate piece of software that needs to
be written.
These are the ones with development underway:
PSPieChart
See this post
SensorNet Server
This is a crucial piece of software with this core requirement:
- Log every single SensorNet packet that hits its socket
We selected Node.js as the platform for this because it made it easy to integrate
REST endpoints for setting and retrieving sensor calibrations, and to add
WebSocket support so that packets can be forwarded to PSPieChart.
This server includes endpoints to configure sensor ID to name mappings,
calibrations, custom expressions for derived data, and retrieval of historical
data.
We initially picked InfluxDB to store the sensor data. However, we found that Influx
sucks for this use case. The insert buffer had to be fairly large to avoid
a penalty on insert throughput, and the latency to retrieve past data was high.
Given these challenges, I decided to write my own "database"
(I'll probably make a separate post about this later). It's not ACID-compliant (yet), so it's not
fair to call it a true database, but it got the job done.
The concept is simple: memory map a file to a petabyte-sized chunk of the address space,
insert points at the tail end of the file, and keep a counter in the file header. Add some
basic file versioning capabilities, and you have a database.
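A scaled-down sketch of that idea using POSIX `mmap` is below. It is not the actual database: the header fields, point layout, function names, and the `TABLE_MAX_POINTS` cap are all assumptions for illustration, and instead of reserving a huge region it sizes a sparse file to a fixed capacity up front.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define TABLE_MAX_POINTS (1u << 20) /* assumed cap for this sketch */

typedef struct {
    uint64_t version; /* file format version */
    uint64_t count;   /* number of valid points after the header */
} table_header_t;

typedef struct {
    uint64_t time; /* microseconds since UNIX epoch */
    int64_t value;
} point_t;

typedef struct {
    table_header_t *hdr;
    point_t *points;
    size_t map_len;
} table_t;

/* Map (creating if needed) a table file; returns 0 on success, -1 on error. */
static int table_open(table_t *t, const char *path) {
    assert(t != NULL && path != NULL);
    size_t map_len = sizeof(table_header_t) +
                     (size_t)TABLE_MAX_POINTS * sizeof(point_t);
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        return -1;
    }
    /* Sparse file: disk blocks are only allocated for pages that get written. */
    if (ftruncate(fd, (off_t)map_len) != 0) {
        close(fd);
        return -1;
    }
    void *base = mmap(NULL, map_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (base == MAP_FAILED) {
        return -1;
    }
    t->hdr = base;
    t->points = (point_t *)((uint8_t *)base + sizeof(table_header_t));
    t->map_len = map_len;
    if (t->hdr->version == 0) {
        t->hdr->version = 1; /* a freshly created file reads as zeros */
    }
    return 0;
}

/* Store the point at the tail, then bump the counter in the header.
 * Assumes a single writer per table. Returns -1 when the table is full. */
static int table_append(table_t *t, point_t p) {
    uint64_t n = t->hdr->count;
    if (n >= TABLE_MAX_POINTS) {
        return -1;
    }
    t->points[n] = p;
    t->hdr->count = n + 1;
    return 0;
}

static void table_close(table_t *t) {
    munmap(t->hdr, t->map_len);
}
```

Writes land in the page cache and the kernel flushes them to disk in the background, so the caller never issues an explicit write. Reading point `i` is just `t.points[i]`.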
Now that I think about it, I think inserts could be considered atomic, since they are single-threaded
(per table) and there's a counter keeping track of inserted points, which should ensure atomicity.
In any case, this database automatically buffered writes through the magic of Linux's VFS, and
the latency to retrieve past points is really low since it doesn't force you to do any sort
of reduction, and can instead just sample points at a fixed time interval. One of the reasons
Influx retrieval was slow was because it forced you to apply some sort of reduction (max, min,
avg, etc.).
CommandNet Server
This is just a Python implementation of the protocol, with a REST API.