August 19, 2022

The WebAssembly (Wasm) Content Management System (CMS) and Search Engine Optimization (SEO): A Short Story

Tim McCallum

spin cms wasm webassembly cloud rust seo microservices bartholomew

Meet Charlie. Charlie is in the process of growing a thriving online business and is elated about the possibility of writing and publishing new content almost every day. Charlie already knows a little bit about tech, e.g. the Linux, Apache, MySQL and PHP (LAMP) stack, but does not want to personally put a lot of time and effort into installing, configuring and maintaining servers, databases, security, networking and so forth. Charlie is also aware that technology keeps changing, and specifically that Virtual Machines (VMs), containers and cloud-computing services are ubiquitous nowadays. In a nutshell, Charlie needs the most frictionless Content Management System (CMS) with optimal Search Engine Optimization (SEO) compliance.

Discovering WebAssembly (Wasm)

A little bit of online research is leading Charlie to discover a relatively new technology called WebAssembly (Wasm). Charlie is learning that Wasm is not only being utilized in the browser but is also increasingly becoming complementary to other technologies such as containers, cloud-computing services and, more generally, the distributed computing paradigm known as edge computing.

All of this makes Charlie extremely curious about:

  • minimizing costs
  • reducing carbon footprint
  • eliminating maintenance tasks
  • increasing web page speeds
  • improving overall site performance
  • integrating rich media (images, videos)
  • integrating social media
  • simplifying workflows
  • scheduling and automating publishing
  • saving time and just focusing on the content!

Bartholomew - A Wasm CMS Optimized for SEO

Charlie’s newfound passion to learn about the future of Wasm eventually led to a product called Bartholomew, an open-source CMS offering from Fermyon with well-documented SEO support.

Bartholomew - Convenience

As a recent article explains, Bartholomew is not a toy, and not just another Wasm demo. Bartholomew is a production-grade application with a large feature set, including the use of Markdown (with embedded HTML if required) for creating content. It also offers TOML for content metadata and Handlebars for templating. Simply put, creating content in Bartholomew is a breeze.

Bartholomew - Flexibility

In addition to the above feature set, which is designed to enable optimal convenience, Bartholomew also offers the use of Rhai. Rhai is an embedded scripting language for Rust which provides a way to safely and easily add scripting to any application. This is a very powerful aspect of the CMS because it gives users the flexibility to create their own custom functionality and extend the CMS as their needs grow. You will soon see how simple Rhai is to use and how Bartholomew uses Rhai to enhance SEO by creating a sitemap.xml file on the fly. More about that in just a minute …

Bartholomew - Cost Savings & Performance

Bartholomew is deliberately built with Wasm at its core. Specifically, Wasm modules play a key role in every interaction that each user has with the CMS. This type of architecture essentially treats each Wasm module as a microservice, which means that the compute resources that execute these microservices are only provisioned when a user needs them. Fermyon is re-thinking microservices. The advantages of using Wasm in terms of cost savings and performance are as follows: the stand-alone, stack-based Wasm VM can spin up, execute the business logic and then spin down at a moment’s notice. If set up correctly, this integration of Wasm can arguably offer higher speeds and lower operating costs than competing cloud-based architectures.


We are about to go slightly off-topic. However, this brief digression shows an excellent (and current) example of the intersection between introducing Wasm and re-thinking microservices. Please allow me to introduce Finicky Whiskers, the world’s most adorable load generator.

Finicky Whiskers is composed of several microservices, all running as WebAssembly components. If you would like to learn more about how this game works, there is a four-part blog series that will help you dive deeper into the technology that runs this game behind the scenes.

I digress, back to CMS and SEO!


Back to CMS & SEO

Charlie wants everyone, from casual readers to the titans in Charlie’s industry, to be fully engaged with all of the online content that is published on the CMS. The first logical step is Google verification.

Google Verification using Bartholomew

Charlie has to officially verify the CMS with Google. This is an important first step because the verification process provides Charlie with access to the Google Search Console. The Google Search Console (formerly Google Webmaster Tools) allows webmasters to check indexing status, search queries, and crawling errors and also optimize the visibility of their site. More on the Search Console in just a minute …

Let’s take a look at how the verification process is accomplished using Bartholomew.

Markdown

The first step in the Google verification process is where Google provides the owner of the specific website with a specially named file, e.g. abcdefg.html. Google now wants Charlie, the owner, to make this file available on the site so that Google can fetch it as proof that Charlie is the owner and controls access to the site. This is a really simple task. First, Charlie creates a Markdown file (in Bartholomew’s content directory) called abcdefg.md. Bartholomew uses templating, so Charlie just has to be explicit about a couple of things inside that new .md file. Specifically, Charlie makes sure that there is a template name (Charlie will create the template next) and that the content type of this file is set to text/html. These settings are shown in the contents of the new abcdefg.md file below.

title = "Google Verification"
description = "Google verification file which provides us with access to Google Search Console"
date = "2022-07-11T00:01:01Z"
template = "google_verification"
content_type = "text/html"

Template

As discussed previously, Bartholomew uses Handlebars templating. Charlie goes ahead and creates a new google_verification.hbs file in Bartholomew’s template directory and populates it with the content which Google requested to be in that file.

google-site-verification: abcdefg.html

Charlie clicks a button, Google fetches the file from Charlie’s site and the verification is complete!

Search Engine Optimization (SEO) using Bartholomew

In addition to just verifying the ownership of a site/domain, Charlie can see that there are specific SEO requirements in relation to how the Googlebot indexes content. Googlebot is the web crawler software used by Google.

Let’s take a look at how the SEO compliance (i.e. sitemap and robots.txt) is accomplished using Bartholomew.

Generating a Sitemap

Google expects the standard sitemap protocol to be implemented. Thankfully, Bartholomew automatically builds a sitemap file based on the entire set of content in the CMS. The heavy lifting is performed by a script written in Rhai (the scripting language which we spoke about earlier). Here is a snippet, as an example.

// This function lists all of the posts, filtering a few.
//
// It returns an array of objects of the form:
//  [
//    #{ uri: "path/to/page", page: PageObject }
// ]

// These should be skipped.
let disallow = [
    "/sitemap", // Don't list self.
    // "/tag", // tag will list all of the tags on a site. If you prefer this not be indexed, uncomment this line.
    "/index", // This is a duplicate of /
    "/atom",
    "/robots",
];

// Param 1 should be `site.pages`
let pages = params[0];

let site_pages = [];
let keys = pages.keys();
for item in keys {
    let path = item.sub_string(8);
    let page = pages[item];

    path = path.sub_string(0, path.index_of(".md"));
    if !disallow.contains(path) {
        site_pages.push(#{
            uri: path,
            page: page,
            priority: prioritize(path),
            frequency: "weekly",
        });
    }

}

// This is an example of how we could prioritize based on information about the page.
//
// Specifically, here we use path to boost docs and blogs while reducing the priority
// of author pages and features.
fn prioritize(path) {
    let boost = ["/blog/", "/docs/"];
    for sub in boost {
        if path.contains(sub) {
            return 0.8
        }
    }
    let nerf = ["/author/", "/features/"];
    for sub in nerf {
        if path.contains(sub) {
            return 0.3
        }
    }
    0.5
}

// Return the blogs sorted newest to oldest
fn sort_by_date(a, b) {
    if a.page.head.date < b.page.head.date {
        1
    } else {
        -1
    }
}

// Sort by the value of the page date.
site_pages.sort(Fn("sort_by_date"));

site_pages
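
To make the path handling concrete, here is a minimal Python sketch that mirrors the filtering and prioritization steps of the Rhai script above. It is an illustration only and not part of Bartholomew: the function names and the sample file names are hypothetical, and it assumes (as `sub_string(8)` in the script suggests) that every key in `site.pages` starts with the eight-character prefix `/content`. The date sorting step is left out for brevity.

```python
# Illustrative mirror of the Rhai sitemap script; not Bartholomew code.
DISALLOW = {"/sitemap", "/index", "/atom", "/robots"}
CONTENT_PREFIX = "/content"  # assumed prefix removed by sub_string(8)


def to_uri(key):
    """Turn a key such as '/content/blog/post.md' into '/blog/post'."""
    path = key[len(CONTENT_PREFIX):]
    # Keep everything before the first ".md", as the script does.
    return path.partition(".md")[0]


def prioritize(uri):
    """Boost blog and docs pages, lower author and feature pages."""
    if any(part in uri for part in ("/blog/", "/docs/")):
        return 0.8
    if any(part in uri for part in ("/author/", "/features/")):
        return 0.3
    return 0.5


def sitemap_entries(keys):
    """Build one sitemap entry per page that is not on the disallow list."""
    entries = []
    for key in keys:
        uri = to_uri(key)
        if uri not in DISALLOW:
            entries.append(
                {"uri": uri, "priority": prioritize(uri), "frequency": "weekly"}
            )
    return entries


# Hypothetical content files, used only for this example.
sample_keys = [
    "/content/index.md",
    "/content/blog/hello-world.md",
    "/content/author/charlie.md",
    "/content/about.md",
]
entries = sitemap_entries(sample_keys)
for entry in entries:
    print(entry["uri"], entry["priority"])
```

In this sample, `/index` is dropped because it duplicates `/`, the blog post receives the boosted priority and the author page receives the reduced one.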

In conjunction with the above scripting, the aforementioned Handlebars templating allows this work to be performed dynamically, using variables that are common to the script and the template, as shown in the sitemap.hbs file’s contents below.

<?xml version="1.0" encoding="UTF-8" ?>
{{!
    For sitemap.xml, see https://www.sitemaps.org/protocol.html
    For date/time format, see https://www.w3.org/TR/NOTE-datetime
}}
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>{{site.info.base_url}}/</loc>
        <changefreq>daily</changefreq>
        <priority>0.8</priority>
    </url>
    {{#each (sitemap site.pages) }}
    <url>
        <loc>{{../site.info.base_url}}{{uri}}</loc>
        {{#if page.head.date }}<lastmod>{{date_format "%Y-%m-%dT%H:%M:%SZ" page.head.date}}</lastmod>{{/if}}
        <changefreq>{{frequency}}</changefreq>
        <priority>{{priority}}</priority>
    </url>
    {{/each}}
</urlset>
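
One way to sanity-check what a template like this produces is to parse the result with an XML parser. The following is a minimal sketch in Python, assuming a hypothetical site whose base URL is https://example.com; the sample output, the date in it and the `read_sitemap` helper are made up for illustration and are not part of Bartholomew.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (see sitemaps.org).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical rendered output, shortened to two URLs.
rendered = """<?xml version="1.0" encoding="UTF-8" ?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>https://example.com/</loc>
        <changefreq>daily</changefreq>
        <priority>0.8</priority>
    </url>
    <url>
        <loc>https://example.com/blog/hello-world</loc>
        <lastmod>2022-08-19T00:00:00Z</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.8</priority>
    </url>
</urlset>
"""


def read_sitemap(xml_text):
    """Parse sitemap XML and return one dictionary per <url> element."""
    # Parsing raises xml.etree.ElementTree.ParseError if the XML is malformed.
    root = ET.fromstring(xml_text.encode("utf-8"))
    urls = []
    for url in root.findall("sm:url", SITEMAP_NS):
        urls.append(
            {
                "loc": url.findtext("sm:loc", namespaces=SITEMAP_NS),
                # lastmod is optional, so this is None when it is absent.
                "lastmod": url.findtext("sm:lastmod", namespaces=SITEMAP_NS),
                "priority": float(url.findtext("sm:priority", namespaces=SITEMAP_NS)),
            }
        )
    return urls


urls = read_sitemap(rendered)
for url in urls:
    print(url["loc"], url["lastmod"], url["priority"])
```

If the template ever emitted malformed XML (an unclosed tag, for instance), the parse step would fail, which makes this a cheap check to run before submitting the sitemap to a search engine.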

If you ever need assistance with any of the scripting, templating or Markdown mentioned here, please go ahead and jump into our Discord server. We are here to assist and would love to see what you are building.


From a display point of view, Charlie again just uses Markdown: Charlie creates a sitemap.md file in the site’s content directory, references the name of the template (sitemap) and ensures that the content type is set to text/xml. The above process will generate an XML sitemap called sitemap.xml at the root of the site. Perfect!

title = "Sitemap XML file"
description = "This is the sitemap.xml file"
date = "2021-12-29T22:36:33Z"
template = "sitemap"
content_type = "text/xml"
---

This is the autogenerated sitemap. Note that the suffix .xml is replaced with .md by Bartholomew.

Creating a Robots File

Charlie can actually control the Googlebot and tell it which files it may access on the site. This is done via the use of a robots.txt file.

Similarly to the process above, Charlie creates a robots.md Markdown file in the content directory and also a robots.hbs in the template directory. These are shown below (in that order).

title = "Robots"
description = "This is the robots.txt file"
date = "2021-12-30T03:17:26Z"
template = "robots"
content_type = "text/plain"
---

This is the robots.txt file. It is autogenerated.

{{!
For info on what can be placed here, see http://www.robotstxt.org/
See also: https://developers.google.com/search/docs/advanced/robots/intro
}}
User-agent: *
Sitemap: {{site.info.base_url}}/sitemap.xml
Disallow: /index
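
To see how a crawler is likely to interpret these rules, the rendered file can be fed to the robots.txt parser in Python's standard library. This is a minimal sketch, assuming a hypothetical base URL of https://example.com; the rendered text below is what the template would produce under that assumption, and the `site_maps()` call requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rendered robots.txt for a site at https://example.com
rendered = """\
User-agent: *
Sitemap: https://example.com/sitemap.xml
Disallow: /index
"""

parser = RobotFileParser()
parser.parse(rendered.splitlines())

# The Sitemap line is independent of the User-agent group it appears in.
print(parser.site_maps())

# Disallow matches by path prefix, so /index is blocked for every crawler.
print(parser.can_fetch("Googlebot", "https://example.com/index"))

# Paths that do not start with /index remain crawlable.
print(parser.can_fetch("Googlebot", "https://example.com/blog/hello-world"))
```

Because Disallow rules match on the start of the path, the single rule above also covers any other path that begins with /index, which is worth keeping in mind when naming content files.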

Google Search Console

The above steps of a) verifying and b) complying with the SEO requirements will give Charlie great control over what is indexed by Googlebot and by other web crawlers. From a Google Search Console perspective specifically, Charlie can now enjoy features and benefits such as on-demand page indexing, e.g. asking Google to go ahead and index specific content (like a new blog post), and much more.

Google Search Analytics

Google Analytics tracks and reports website traffic, showing not only where users are visiting from but also how long they are staying, which pages they are reading and so forth. Charlie can even see how many users are on the site in real time.

Charlie has managed to drive search traffic to the content in a big way but can now also analyse how users interact with the content.

This is not a Drill

We at Fermyon are also passionate about creating content. In fact, the day we launched Fermyon, we decided that our first project would be to host our own website on our new Wasm stack. Yes, that’s right, this very blog post that you are reading is presented using Bartholomew. For example, you can see the Fermyon robots.txt file and the sitemap.xml file implemented; just as shown in Charlie’s story above. At Fermyon, we actually run Bartholomew CMS using Hippo deploying to a Nomad cluster.

A Little Help?

If you like the idea of a frictionless, production-ready CMS which can help grow a business or a brand whilst reducing costs, please reach out. For example, if you are trying to set up your own Bartholomew CMS and have any questions, please go ahead and jump into our Discord space. You will find channels in there specifically related to Bartholomew and more.

There are a few other ways to get in touch; see the Fermyon contact details section below … and thanks for reading.

Fermyon Contact Details

We would love to hear what you are building and also help you out to ensure that you create Wasm applications beyond your wildest dreams.

Discord: We have a great Discord presence. Please join us in Discord and ask questions and share your experiences with Fermyon products like Spin.

Twitter: Following and subscribing to our Twitter is a great way to keep in touch.

GitHub: We can be reached via GitHub.

Email: Please feel free to Email us.

Become an Insider

If you would like to “Become an Insider”, please fill out this brief form to get early access, deeper insights and other insider invitations.

We are Hiring

We have a number of job openings available in areas of training, software engineering, developer relations, community management and more.

