Empowering Developers With AI Inferencing: A New Coding Essential

Whether you’re a seasoned developer or a curious novice, you can embrace AI inferencing to boost your coding capabilities. AI is no longer the exclusive domain of the rare AI specialist. Read on to discover how accessible AI actually is and what new and interesting things we are building with it.

Artificial Intelligence (AI) Inferencing Is a Basic Coding Skill

As September wound to a close, the first AI Conference took place on the campus of UC San Francisco. While many conferences have failed to gain critical mass after the pandemic-related shutdowns, the AI Conference maxed out the facility. I am told that while over 1,000 people attended, just as many were on the waitlist. This is a testament to the present popularity of a burgeoning technology.

While AI itself is a broad area, the most popular topic seemed to be Large Language Models (LLMs). This is an area we at Fermyon have found very interesting. Being able to use trained models like LLaMa2 to perform an array of tasks based on simple text prompts and refinements has opened a world of possibilities. Summarizing text, interacting in chat, analyzing sentiment, and even generating new text are all good uses for an LLM.

The audience at the AI Conference surprised me. Of course, there were plenty of Ph.D.-holding experts well-versed in the technical nuances of AI. But there were far more who were still in the early stages of learning and applying the nascent technology. At our booth, we had dozens and dozens of conversations about what new and interesting things could be achieved.

This reinforced what has now become a core belief of mine:

AI inferencing, especially with LLMs, is no longer the domain of the AI specialist. It is now just another tool in the full-stack developer’s toolbox.

Those who have not already dabbled with coding against an LLM are in danger of falling behind the curve, as working with inferencing is a skill akin to writing SQL: It’s just something one needs to be proficient with.

Developers unfamiliar with inferencing are often confused by this claim, and I frequently hear the same question: “Are you suggesting we use AI to generate our code?” While that is certainly a possibility, it is not what I mean when I suggest that inferencing is a new and necessary skill.

Inferencing is the process of querying an LLM with a prompt. For example, we can ask an LLM to analyze a speaker’s tone, suggest texts similar to a given article, or even just describe something in a silly way. But this is all done in code, and as a developer, you handle the results in code. In that sense, it really is no different from querying a database: You submit a query, you receive a response, and then it is up to you what to do with the response.
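
To make the database comparison concrete, here is a minimal Python sketch of that submit, receive, and handle loop. It is an illustration rather than the Spin SDK: `InferFn`, `build_prompt`, `classify_sentiment`, and `fake_infer` are names I made up for this example, and `fake_infer` is a stand-in so the code runs without a real model. In a real app, you would pass in a function that calls your actual LLM.

```python
from typing import Callable

# Hypothetical type: any function that sends a prompt to an LLM
# and returns the model's text reply.
InferFn = Callable[[str], str]

LABELS = ("positive", "negative", "neutral")


def build_prompt(text: str) -> str:
    """Wrap the input text in an instruction the model can follow."""
    return (
        "Classify the sentiment of the following text as "
        "positive, negative, or neutral. Reply with one word.\n\n"
        f"Text: {text}\nSentiment:"
    )


def classify_sentiment(text: str, infer: InferFn) -> str:
    """Submit the query, receive the response, then decide what to do with it."""
    reply = infer(build_prompt(text)).strip().lower()
    # Models do not always follow instructions exactly, so validate the reply.
    for label in LABELS:
        if reply.startswith(label):
            return label
    return "unknown"


def fake_infer(prompt: str) -> str:
    """Stand-in for a real model call (an assumption, so the sketch runs offline)."""
    text = prompt.split("Text:", 1)[1]
    return " Positive\n" if "love" in text.lower() else "Neutral"


if __name__ == "__main__":
    print(classify_sentiment("I love this conference!", fake_infer))
```

Note that the reply is treated like any other untrusted query result: it is normalized and checked against the expected labels before the rest of the program uses it.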

Much has been made recently of vector databases. The reason is that they can be used to augment what you can do with an LLM. More specifically, you can take an existing LLM (like LLaMa2) and, without retraining it, supply it with your own data so it can generate more specific answers. To do this, you generate embeddings: numeric vectors that capture the meaning of a piece of text. Those embeddings can then be stored in a special database. When you are querying, you look up the stored text whose embeddings are most similar to your prompt and augment the prompt with that data.
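
The flow can be sketched in a few lines of Python. Everything here is a simplified assumption: `embed` is a toy word-count "embedding" standing in for a real embedding model (which would return a dense vector of floats), and `VectorStore` is an in-memory stand-in for a vector database. The names are mine, not part of any SDK.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real embedding model returns dense floats."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.rows = []  # list of (text, embedding) pairs

    def add(self, text: str) -> None:
        self.rows.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list:
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def augment_prompt(question: str, store: VectorStore, k: int = 1) -> str:
    """Prepend the most relevant stored text to the question."""
    context = "\n".join(store.search(question, k))
    return f"Use this context to answer.\n\nContext:\n{context}\n\nQuestion: {question}"


store = VectorStore()
store.add("Spin is a framework for building WebAssembly apps.")
store.add("Bartholomew is a content management system.")
store.add("Vector databases store embeddings for similarity search.")

if __name__ == "__main__":
    print(augment_prompt("What do vector databases store?", store))
```

The model itself never changes. Only the prompt does, which is why this approach works with an off-the-shelf LLM.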

In the Spin Up Hub, we have an example that uses embeddings and a vector database. The code crawls through the content of a Bartholomew site and ingests all of that text into an embeddings database. Then, for each article on the site, it generates a “related links” section, where the LLM generates those links based on the similarity of the text to other articles on the site.
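
As a rough illustration of the "related links" idea, and not the actual Spin Up Hub code, the Python sketch below ranks every other article by cosine similarity once each article has an embedding. The URLs and the three-dimensional vectors are made-up values for illustration, and `related_links`, `limit`, and `min_score` are hypothetical names.

```python
import math

# Made-up three-dimensional vectors for illustration only. Real embeddings
# come from an embedding model and typically have hundreds of dimensions.
EMBEDDINGS = {
    "/blog/llm-intro": [0.9, 0.1, 0.0],
    "/blog/embeddings": [0.8, 0.3, 0.1],
    "/blog/release-notes": [0.0, 0.2, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def related_links(embeddings, limit=2, min_score=0.5):
    """For each article, list the most similar other articles, best first."""
    links = {}
    for url, vec in embeddings.items():
        scored = sorted(
            (
                (cosine(vec, other_vec), other_url)
                for other_url, other_vec in embeddings.items()
                if other_url != url
            ),
            reverse=True,
        )
        # Drop weak matches so unrelated articles are not linked.
        links[url] = [u for score, u in scored[:limit] if score >= min_score]
    return links


if __name__ == "__main__":
    for url, related in related_links(EMBEDDINGS).items():
        print(url, "->", related)
```

The `min_score` cutoff is a design choice: without it, every article would get links even when nothing on the site is actually similar.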

Rather than requiring you to bring your own vector database, though, Fermyon simply enables vector functions on our SQL Database. This means you have one less thing to learn when writing these new inferencing apps.

We’re excited about the possibilities that in-code LLM interactions afford developers. Our Serverless AI API Guide covers how the Spin SDK surfaces the Serverless AI interface to a variety of different languages.

If one thing was clear from the AI Conference, it’s that we are far from alone in this enthusiasm. The next wave of cloud computing is about efficiency, performance, and reducing costs. But it’s also about incorporating the most powerful computing tools at hand in a way that lets you write code quickly and easily.

If you’d like to try your hand at writing a first inferencing app, start with the Spin AI inferencing tutorial. To use Fermyon Serverless AI, simply sign up for our Serverless AI private beta.
