How I Built an AI Inferencing API With Llama2 on Spin
Hey, my name is Caleb and I’m a software engineer at Fermyon. Just a couple of weeks ago we launched the private preview for Fermyon Serverless AI. I wasn’t working directly on this feature but was pretty excited about it, so I spent some time last week kicking the tires and built a little demo. It was super fun to build, and I’ve never had an easier time working with LLMs. Let me tell you about it.
Fermyon @ The AI Conference 2023
Join us for the grand finale of this month’s conferences, an entirely new experience for us and all attendees! We invite you to San Francisco next week for an inaugural, action-packed event: The AI Conference 2023!
A “Silly Walk” through Fermyon Serverless AI
AI should be put to good use. And what use is better than generating Pythonesque quotes from a large language model (LLM)? Let’s take a quick tour of the basics of Fermyon Serverless AI by creating an homage to one of the most famous Monty Python sketches.
If you’re new to AI and want a quick and entertaining way to get started, this post is for you.
Announcing Spin v1.5
Today, we are excited to introduce Spin 1.5, which includes performance improvements, a few bug fixes, and an exciting new set of features:
- support for running AI inferencing for Large Language Models (LLMs) and for generating sentence embeddings
- improved performance when handling concurrent requests by using Wasmtime’s pooling memory allocator
- support for intra-component outbound HTTP with allowed_http_hosts = ["self"]
- SQLite support in the TinyGo SDK
Let’s dive into some of the highlights from this release!
WasmCon 2023: The Rise and Realization of the WebAssembly Component Model
In retrospect, the defining image of WasmCon 2023 was one that I didn’t even think to take a picture of at the time. In the style of a metro map, Bailey Hayes’ keynote laid out a set of work streams, all converging on a central point: the delivery of the first stable version of the WebAssembly (Wasm) Component Model. It asked: are we there yet? And the answer was: actually, pretty much, yes.
The component model wasn’t the only story of WasmCon, of course. The “better together” theme reflected not only how projects within the Wasm community benefit each other, but also Wasm’s growing role in the broader cloud ecosystem. But if there was one major thread across the two days, this was it: the realization of the Wasm component model vision.