October 13, 2023

Serverless AI Inferencing Using Python

Tim McCallum

serverless ai python

Serverless AI has never been more accessible, and Python enthusiasts are in for a treat! Read on to discover how Spin bridges the gap between Python and Fermyon Serverless AI. There are no models to download. Just create, build and deploy! This step-by-step walkthrough of serverless AI inferencing using Python will have you up and running in minutes. Let’s go!

Installing Spin

We will be using Spin to build our Serverless AI application in this article. If you haven’t already, please go ahead and install Spin.

Upgrading Spin: If you already have Spin installed, please see the Spin upgrade page of the developer documentation.


There are a few prerequisites to take care of before we start, and they take only a few minutes. Once they are out of the way, creating, building and deploying our Serverless AI application takes only a few minutes more.

Templates and Plugins

If you’ve used Spin’s install script or Homebrew to install Spin, your templates and plugins should already be installed, updated and ready. If you need it, our developer documentation has information about managing templates and managing plugins.

Fermyon Serverless AI

Signing up is easy: check out our Serverless AI page and create a free Fermyon Cloud account.

Python Serverless AI Inferencing Application

This article is based on a pre-existing Serverless AI inferencing application (so feel free to reference that application, at any point, if you like). Creating, building and deploying a brand-new application should only take a few minutes at most. So here we go. Let’s dive straight in and create our new Python Serverless AI inferencing application to get the full experience:

$ spin new http-py
Enter a name for your new application: sentiment-analysis
Description: A Serverless AI application written in Python and deployed to Fermyon Cloud
HTTP base: /
HTTP path: /...

Configuring our application for Serverless AI inferencing on Fermyon Cloud is done by adding a single line to the application manifest (the spin.toml file):

$ cd sentiment-analysis
$ vi spin.toml

Simply paste the ai_models = ["llama2-chat"] line directly underneath the [[component]] section:

# -- snip --
ai_models = ["llama2-chat"]

The Python source code is pretty straightforward. Paste the following code over the existing contents of the sentiment-analysis/app.py file, replacing them completely:

from spin_http import Response
from spin_llm import llm_infer
import json
import re

PROMPT = """<<SYS>>
You are a bot that generates sentiment analysis responses. Respond with a single positive, negative, or neutral.
Follow the pattern of the following examples:

User: Hi, my name is Bob
Bot: neutral

User: I am so happy today
Bot: positive

User: I am so sad today
Bot: negative

User: """

def handle_request(request):
    request_body = json.loads(request.body)
    sentence = request_body["sentence"].strip()
    result = llm_infer("llama2-chat", PROMPT + sentence)
    # Strip the "\nBot: " marker that the model echoes before its answer.
    response_body = json.dumps({"sentence": re.sub(r"\nBot: ", "", result.text)})
    return Response(
        200, {"content-type": "application/json"}, bytes(response_body, "utf-8")
    )
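The handler does three things: it appends the user's sentence to the few-shot prompt, asks the model to continue the pattern, and strips the "Bot: " marker from the reply. The following sketch isolates that logic so it can be tried with a plain Python interpreter, outside Spin. It makes several assumptions: fake_llm_infer is a hypothetical stand-in for spin_llm.llm_infer that always answers "positive", the prompt is shortened, and build_prompt, clean_result and analyse are illustrative names that do not appear in app.py.

```python
import json
import re
from types import SimpleNamespace

# Shortened stand-in for the article's PROMPT (few-shot examples omitted).
PROMPT = """<<SYS>>
You are a bot that generates sentiment analysis responses.

User: """


def build_prompt(sentence: str) -> str:
    """Append the trimmed user sentence to the prompt, as the handler does."""
    return PROMPT + sentence.strip()


def clean_result(text: str) -> str:
    """Remove the newline + "Bot: " marker from the model's reply."""
    return re.sub(r"\nBot: ", "", text)


def fake_llm_infer(model: str, prompt: str) -> SimpleNamespace:
    """Hypothetical stub: mimics an object with a .text attribute."""
    return SimpleNamespace(text="\nBot: positive")


def analyse(body: bytes, infer=fake_llm_infer) -> bytes:
    """Mirror the handler's flow: parse JSON, infer, clean, encode."""
    sentence = json.loads(body)["sentence"]
    result = infer("llama2-chat", build_prompt(sentence))
    return json.dumps({"sentence": clean_result(result.text)}).encode("utf-8")


print(analyse(b'{"sentence": "Everything is awesome!"}'))
# prints b'{"sentence": "positive"}'
```

Keeping the prompt assembly and reply clean-up in small functions like these makes it possible to unit test them without building a WebAssembly module or calling a real model.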

We now build our application using the spin build command:

$ spin build
Building component sentiment-analysis with `spin py2wasm app -o app.wasm`
Spin-compatible module built successfully
Finished building all Spin components

To deploy as a Fermyon Cloud Serverless AI application we simply type the spin deploy command:

$ spin deploy
Uploading sentiment-analysis version 0.1.0-rc3a6c2bd to Fermyon Cloud...
Waiting for application to become ready ...
Available Routes:
  sentiment-analysis: https://sentiment-analysis-abc-xyz.fermyon.app/ (wildcard)

Note that all Fermyon Cloud applications are initially hosted on a domain made up of the application’s name (here, sentiment-analysis) followed by some unique, randomly assigned characters. If you would like to give your Fermyon Cloud application a stronger sense of branding or identity, you can apply your own custom domain. You do not even need to provide an SSL certificate during the custom domain process: Fermyon Cloud creates certificates internally using Let’s Encrypt and keeps renewing them for the lifetime of your application.

To test the application, we can use curl to make a request. For example:

$ curl -X POST --data '{"sentence":"Everything is awesome!"}' https://sentiment-analysis-abc-xyz.fermyon.app/

{"sentence": "positive"}
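The same request can be made from Python using only the standard library. This is a minimal sketch, not an official client: APP_URL is the placeholder address from the deploy output above and must be replaced with your own, build_request and classify are hypothetical helper names, and the JSON Content-Type header is an addition (the curl command does not set it and the handler does not inspect it).

```python
import json
import urllib.request

# Placeholder URL from the article's deploy output; substitute your own.
APP_URL = "https://sentiment-analysis-abc-xyz.fermyon.app/"


def build_request(url: str, sentence: str) -> urllib.request.Request:
    """Build the POST request carrying the JSON body the handler expects."""
    payload = json.dumps({"sentence": sentence}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(url: str, sentence: str, timeout: float = 30.0) -> str:
    """Send the sentence to the deployed app and return its sentiment label."""
    with urllib.request.urlopen(build_request(url, sentence), timeout=timeout) as resp:
        return json.loads(resp.read())["sentence"]


# Example call (requires a deployed application and network access):
# print(classify(APP_URL, "Everything is awesome!"))
```

Separating build_request from classify lets you inspect the outgoing request without touching the network, which is handy when debugging payload problems.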

Visit Fermyon Developer Home for more information and please reach out on Discord if you have any further questions.

Thanks for reading!
