June 27, 2023

Building a Host for the Spin Runtime - and Creating the World's Laziest Control Plane

Ivan Towlson

Application developers work with the Spin command line to scaffold, build, and run applications. It’s the most visible facet of the Spin project. But Spin isn’t only a command line: it’s also a runtime. The command line hosts that runtime in order to run applications, but other things host that runtime too, such as Fermyon Cloud or the containerd shim. If you’re interested in writing your own host - because you need something exotic, because you’re a cloud vendor yourself, or just because you’re a huge geek like me - this is the post for you!

And it’s simpler than you might think from looking at the Spin project. A runtime doesn’t have to care about developer features like scaffolding new applications, or building from source code, or even human-editable manifests. And most of the gnarly stuff you can copy from the Spin source code anyway, as we shall see.

But Why Would You?

Still, why bother? Well, as an application developer, you wouldn’t (except, of course, out of curiosity). You’d use an established operating environment such as Fermyon Cloud or Kubernetes.

As an operator, though, you might if you had specialized needs - an on-premise solution without the overhead of Kubernetes, for example, or to run applications with non-HTTP triggers. An intriguing case is to run Spin applications within an infrastructure layer, such as the database.

And, of course, if you’re one of those lucky folks building and running a cloud yourself, you might want to offer Spin as an option in your application or functions service.

Spin as a Runtime

The Spin GitHub repository contains quite a bit of code, because it includes both the Spin command line, with all its fancy tools, and the runtime. The runtime itself is much more modest.

The core Spin runtime is in two Rust crates, spin-core and spin-app. Roughly, spin-core handles the low-level details of driving Spin’s Wasm engine, and spin-app defines what an application looks like. For the purposes of this article, we’re not going to dive right down into spin-core. Instead, we’ll use the spin-trigger and spin-trigger-http crates, which provide conventional ways of loading and running applications.

This is a compromise. spin-trigger-http is designed for the Spin command line, which is single-application, so there are some multi-application optimizations it doesn’t allow for. You can bypass spin-trigger-http and implement the HTTP pipeline yourself, but that requires more knowledge, more code, and more testing. As always in engineering, the right trade-off depends on your resources and needs.

The last element is some way to load actual applications. As I mentioned, we don’t need to care about human-editable manifests, or a convenient local development loop: that’s what the command line is for. A service can limit itself to loading applications in their “published” format, from a registry. This is the same distribution infrastructure that Docker and Kubernetes use, and it doesn’t mean that applications need to be public, any more than they do on Kubernetes - registries can be private. Registry loading is provided by the spin-oci crate.

With these few packages, it’s possible to develop a simple HTTP application host.

It’s reasonably easy to extend this to support other triggers. But this sample sticks to HTTP.

The Plan of Attack

If you read the spin up source code - and who doesn’t - you’ll see that most of spin up is concerned with getting hold of a gadget called a LockedApp. The LockedApp is then flung over the wall to the trigger, which is concerned with loading the Wasm engine and application code, and hooking that up to a listener such as an HTTP server.

By following a similar plan of attack, we can reuse a lot of existing Spin code, either by referencing it from Spin crates, or by the noble art of copying and pasting.

A host also needs a way of knowing which applications to load (and with what settings, for example which port each application should listen on). You could think of this as the control plane of the application. From a platform and operator point of view, that is of course important, and potentially a significant differentiator of services. But from a Spin point of view, it’s not interesting, and I’ll largely gloss over it.

So the plan falls into two main parts:

  1. Get from a registry reference to a LockedApp (and a few other bits that we need)
  2. Load the LockedApp and run the corresponding trigger

Getting Started

All right, let’s put the plan into action. Create a new Rust project:

cargo new spinhost --bin
cd spinhost

and add the Spin crates, plus a couple of others we’ll need, to Cargo.toml:

[dependencies]
# Spin runtime
spin-app = { git = "https://github.com/fermyon/spin", tag = "v1.3.0" }
spin-core = { git = "https://github.com/fermyon/spin", tag = "v1.3.0" }
spin-oci = { git = "https://github.com/fermyon/spin", tag = "v1.3.0" }
spin-trigger = { git = "https://github.com/fermyon/spin", tag = "v1.3.0" }
spin-trigger-http = { git = "https://github.com/fermyon/spin", tag = "v1.3.0" }
# Utility packages
anyhow = "1.0"
futures = "0.3.28"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0.82"
tempfile = "3.3.0"
tokio = { version = "1.23", features = ["full"] }
url = "2.4.0"

This is going to be an async program, so change the declaration of main to:

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    todo!()
}

The Control Plane

It’s time to meet the world’s laziest control plane:

pub struct App {
    pub reference: String,  // registry reference
    pub address: std::net::SocketAddr,  // for HTTP server to listen on
    pub state_dir: String,  // where storage like key-value and SQLite lives
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let apps = [
        App {
            reference: "ghcr.io/itowlson/dioxus-test:v1".to_owned(),
            address: "127.0.0.1:4001".parse()?,
            state_dir: "app1".to_owned(),
        },
        App {
            reference: "ghcr.io/itowlson/dioxus-test:v2".to_owned(),
            address: "127.0.0.1:4002".parse()?,
            state_dir: "app2".to_owned(),
        },
    ];

    todo!()
}

In reality, you’d read the set of applications from a database or something like that, and you’d offer way more settings like TLS information and so on. But you’re not reading this blog to see database code, so this will do for now.
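
If you want one small step up from hard-coding, a lazy-but-less-lazy option is to read the app list from a JSON file. Here’s a minimal sketch of that, assuming you add #[derive(serde::Deserialize)] to the App struct above; the load_apps helper and the apps.json layout are my own inventions for illustration, not anything Spin prescribes:

// Requires #[derive(serde::Deserialize)] on the App struct above.
// serde can deserialize a SocketAddr from a string such as "127.0.0.1:4001".
//
// Hypothetical helper: read a JSON array of apps, e.g.
//   [{"reference": "ghcr.io/...", "address": "127.0.0.1:4001", "state_dir": "app1"}]
async fn load_apps(path: &std::path::Path) -> anyhow::Result<Vec<App>> {
    let text = tokio::fs::read_to_string(path).await?;
    let apps: Vec<App> = serde_json::from_str(&text)?;
    Ok(apps)
}

main would then populate apps from load_apps(std::path::Path::new("apps.json")).await? instead of the literal array.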

The apps I use here are from my learning efforts with the Dioxus Web framework. A separate post explains how they were built, but for this post we only need the published form.

Getting from a Reference to a LockedApp

The first part of the plan calls for us to get from a registry reference to a LockedApp. Conveniently, spin up can load from registry references, so we can just reuse code from there. The key functions on the registry path are prepare_app_from_oci and write_locked_app.

use anyhow::{anyhow, Context, Result};
use spin_app::locked::LockedApp;
use spin_oci::OciLoader;
use std::path::Path;
use url::Url;

async fn prepare_app_from_oci(reference: &str, working_dir: &Path) -> Result<LockedApp> {
    let mut client = spin_oci::Client::new(false, None)
        .await
        .context("cannot create registry client")?;

    OciLoader::new(working_dir)
        .load_app(&mut client, reference)
        .await
}

async fn write_locked_app(
    locked_app: &LockedApp,
    working_dir: &Path,
) -> Result<String, anyhow::Error> {
    let locked_path = working_dir.join("spin.lock");
    let locked_app_contents =
        serde_json::to_vec_pretty(&locked_app).context("failed to serialize locked app")?;
    tokio::fs::write(&locked_path, locked_app_contents)
        .await
        .with_context(|| format!("failed to write {:?}", locked_path))?;
    let locked_url = Url::from_file_path(&locked_path)
        .map_err(|_| anyhow!("cannot convert to file URL: {locked_path:?}"))?
        .to_string();

    Ok(locked_url)
}

Both of these need a working directory, for which it’s easiest to use a temporary directory. The Rust tempfile crate provides an implementation that will delete the directory when the object is dropped. Here’s the beginning of a function that implements the plan:

async fn run(app: &App) -> anyhow::Result<()> {
    let working_dir = tempfile::tempdir()?;

    let locked_app = prepare_app_from_oci(&app.reference, working_dir.path()).await?;
    let locked_url = write_locked_app(&locked_app, working_dir.path()).await?;

    todo!("the rest of the plan")
}

Getting from LockedApp to a Running Server

The trigger loading code is a bit trickier, because we can’t just copy and paste it: there’s a lot more of it, some of it is intended to be executed as part of a command line trigger, and yeesh, there are a lot of options. Again, a production service might well need a lot of those options, but for this post I want to focus on just getting something running.

A command line trigger is a Spin subcommand (or plugin), and it enters at the TriggerExecutorCommand::run function. Reading through that, the nub of it is where it builds and runs an ‘executor’. With suitable changes, that segment can be copied into run:

async fn run(app: &App) -> anyhow::Result<()> {
    let working_dir = tempfile::tempdir()?;

    let locked_app = prepare_app_from_oci(&app.reference, working_dir.path()).await?;
    let locked_url = write_locked_app(&locked_app, working_dir.path()).await?;

    // --- New code starts here ---

    // `trigger.run` needs the trigger configuration. In the Spin CLI this is loaded
    // as `self.run_config`; here we have to construct it from the App object.
    let http_run_config = spin_trigger_http::CliArgs {
        address: app.address.clone(), tls_cert: None, tls_key: None
    };

    // `build` needs to know if there is any initialization for host services
    // like key-value storage and SQLite. In the Spin CLI these are loaded as
    // `self.key_values` and `self.sqlite_statements`; here we skip providing
    // for initialization.
    let init_data = spin_trigger::HostComponentInitData::default();

    // And now back to your regularly scheduled copy-and-pasting.
    let loader = spin_trigger::loader::TriggerLoader::new(working_dir.path(), false);
    let trigger = build_executor(&app, loader, locked_url, init_data).await?;

    let run_fut = trigger.run(http_run_config);

    todo!("manage that running server")
}

spin_trigger doesn’t expose the TriggerExecutorCommand::build_executor function, but its main job is to call TriggerExecutorBuilder::build. Here is a simplified adaptation that sets up a few dependencies then dives straight in:

use spin_app::Loader;
use spin_trigger::{
    HostComponentInitData, RuntimeConfig, TriggerExecutorBuilder, TriggerExecutor
};
use spin_trigger_http::HttpTrigger;

async fn build_executor(
    app: &App,
    loader: impl Loader + Send + Sync + 'static,
    locked_url: String,
    init_data: HostComponentInitData,
) -> Result<HttpTrigger> {
    let runtime_config = build_runtime_config(&app.state_dir)?;

    let mut builder = TriggerExecutorBuilder::new(loader);
    builder.wasmtime_config_mut().cache_config_load_default()?;

    builder.build(locked_url, runtime_config, init_data).await
}

fn build_runtime_config(state_dir: impl Into<String>) -> Result<RuntimeConfig> {
    let mut config = RuntimeConfig::new(None);
    config.set_state_dir(state_dir);
    Ok(config)
}

Notice also that build_executor returns an HttpTrigger instead of the generic Executor type of the original. In the HTTP case, Executor is set to HttpTrigger, so this is simply substituting the concrete type for the generic one.

If you compare these to the original Spin functions, you’ll see I’ve made a few changes - either for the sake of simplicity, or because a server would want to handle it in a different way. One notable simplification is that the built-in key-value and SQLite APIs will always use the default implementation of local SQLite files: if you were following this approach in a real server, you’d want to tweak build_runtime_config to replace that with something more robust.

With just this modest amount of code, we’ve got an application that loads multiple Spin applications from a registry, and starts HTTP servers for them. All we need to do now is keep them alive long enough to test them…

Managing the Trigger Lifecycle

In a host like Fermyon Cloud, users deploy new applications and delete old ones all the time. The “world’s laziest control plane” isn’t up to that. Still, our host does need to keep the triggers we started going, so that users can actually use the applications. A lazy way is to make the run function block forever after starting the trigger, but it’s better not to make that run’s concern: keeping lifecycle management out of run leaves the way open to reuse the same run function in a more dynamic environment. Instead, we can change run to return the running trigger so that the control plane can manage it (in our case, admittedly, by blocking on it forever).

In Tokio, a running task is represented as a JoinHandle. So change run to spawn the trigger future, and return the resulting join handle:

// This now returns a `JoinHandle`
pub async fn run(app: &App) ->
    anyhow::Result<tokio::task::JoinHandle<anyhow::Result<()>>>
{
    let working_dir = tempfile::tempdir()?;

    let locked_app = prepare_app_from_oci(&app.reference, working_dir.path()).await?;
    let locked_url = write_locked_app(&locked_app, working_dir.path()).await?;

    let http_run_config = spin_trigger_http::CliArgs {
        address: app.address.clone(), tls_cert: None, tls_key: None
    };
    let init_data = HostComponentInitData::default();

    let loader = spin_trigger::loader::TriggerLoader::new(working_dir.path(), false);
    let trigger = build_executor(&app, loader, locked_url, init_data).await?;

    let run_fut = trigger.run(http_run_config);

    // --- New code starts here

    let join_handle = tokio::task::spawn(async move {
        let _wd = working_dir;  // Keep the TempDir in scope! Letting it drop would delete the directory
        run_fut.await
    });

    Ok(join_handle)
}

All that’s left is for main to start the applications and block on the resulting triggers. Oh, and to dump errors just in case something goes wrong:

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let apps = [ /* as before */ ];

    let mut running_apps = vec![];

    for app in &apps {
        running_apps.push(run(app).await?);
    }

    let results = futures::future::join_all(running_apps).await;
    dump_errors(&results);

    Ok(())
}

fn dump_errors(results: &[Result<anyhow::Result<()>, tokio::task::JoinError>]) {
    for r in results {
        if let Err(e) = r {
            println!("{e:#}");
        }
        if let Ok(Err(e)) = r {
            println!("{e:#}");
        }
    }
}

Putting it All Together

With this all in place, it’s time to try it out!

$ cargo run --release
    Finished release [optimized] target(s) in 1.99s
     Running `target/release/spinhost`

Serving http://127.0.0.1:4001
Available Routes:
  dioxus-test: http://127.0.0.1:4001 (wildcard)

Serving http://127.0.0.1:4002
Available Routes:
  dioxus-test: http://127.0.0.1:4002 (wildcard)
  images: http://127.0.0.1:4002/images (wildcard)

And clicking on the link shows that all is working as we’d hope.

[Screenshot: the app running on port 4002]

If all isn’t as you hope, you can get the working code from here.

Wrapping Up

You might think this is a heck of a lot of effort to go to for a badly centred cat picture, and to be honest you’d have a point. But hopefully you’ve also found it interesting to see how Spin goes about loading applications and running triggers, and about different environments in which Spin can run user applications - all isolated from each other, and sandboxed from the hosting environment, by the WebAssembly runtime. What I’ve shown here is the bare bones of a ‘serverless’ platform, but also imagine being able to run user-provided Spin applications on database triggers or storage uploads, within the database or storage system.

The code I’ve shown you is based on a demo called lepton, which you can find at https://github.com/itowlson/lepton and which is ready to run out of the box. lepton uses a manifest file to control which apps to run, rather than hardwiring it (the world’s second laziest control plane), and the repository also contains a big sibling, tauon, which shows starting and stopping applications based on the contents of a control directory.
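
To give a flavour of what that kind of lifecycle management looks like, here’s a minimal sketch (my own, not tauon’s actual code) of stopping a running app. Because run returns a JoinHandle, a control plane can keep the handles in a map keyed by an app name and abort the task when the app should go away:

use std::collections::HashMap;

// Hypothetical registry of running apps: app name -> the JoinHandle returned by `run`.
type RunningApps = HashMap<String, tokio::task::JoinHandle<anyhow::Result<()>>>;

// Stop one app by aborting its Tokio task. Dropping the trigger future shuts down
// its HTTP listener, and dropping the TempDir captured inside the task deletes the
// working directory.
fn stop_app(running: &mut RunningApps, name: &str) {
    if let Some(handle) = running.remove(name) {
        handle.abort();
    }
}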

I hope you’ve enjoyed the adventure! If you want to chat more about building Spin hosts, drop in to the Fermyon Discord server, or say hi on Twitter @fermyontech and @spinframework. Or if you sensibly want somebody else to take care of it all for you, check out Fermyon Cloud!

