
🍦 DIY Static Site Generation (SSG)

In this blog post I describe how I wrote my own solution to static-site generation (SSG) using less than 100 lines of node code. It even generated this blog post! (How meta.)

by yours truly

Click here to skip to the meat-and-potatoes. Otherwise you will end up reading a story about why I wrote this article. It reads like a preface to an online recipe and ymmv. Last chance, click here NOW to go to the good shit.

... ok now that the nerds are gone ... this one is for all the online recipe fans out there ... click here to reveal a bonus preface ...

oof, got'ya ... ok for real now, let's skip ahead ...


Writing can be as tedious as publishing, but the pain of publishing can be turned into a quick and easy experience with the help of the right software. Sites like Blogger and Medium, as well as products like WordPress, are among the easiest and most popular solutions for blogging. In my case, I wanted to make my own solution because damn it, I'm a web developer and this is my website. How hard can it really be to write software to publish blog posts?

So I took a stab at writing my own solution, and that's what I'll describe in this blog post. I set out to create a solution that matched a few criteria for what I imagined would be my ideal self-publication experience. In particular, I needed my custom blogging software to give me the ability to:

  • write blog posts in plain HTML, nothing fancy like markdown,
  • save my blog posts to files, avoiding the need for databases,
  • easily integrate into my current site styles, for consistency,
  • avoid javascript frameworks, to keep my site light and quick,
  • create the system myself, for the sake of fun.

I realized I needed static site generation.

Static Site Generation (SSG)

What the he|| is SSG and why would we use it? Cloudflare defines SSG as ...

... a tool that generates a full static HTML website based on raw data and a set of templates. Essentially, a static site generator automates the task of coding individual HTML pages and gets those pages ready to serve to users ahead of time. Because these HTML pages are pre-built, they can load very quickly in users' browsers.

So SSG is a tool that generates static files and because those files are static, pages load quickly. But what exactly do SSG tools do to generate those static files, and when does this process happen? FreeCodeCamp describes SSG as a process of ...

... compiling and rendering a website or app at build time. The output is a bunch of static files, including the HTML file itself and assets like JavaScript and CSS.

The same FreeCodeCamp article above mentions Next.js as an option for applications built in React, which would give me excellent control and out-of-the-box config for performing SSG.

Unfortunately, Next.js is not a good option for me simply because my personal website does not use React. I briefly unpacked this tangent; click here to read more.

This probably deserves a blog post in-and-of-itself, but for illustration, most interactivity on my site is either written in typescript and compiled down to a javascript module (e.g., see my show-html-comments module; it's responsible for the HTML comment easter egg on my about page), or a small web component (e.g., see my Polaroid-like carousel). (And because I wanted my blog posts to be written in HTML, I could even write one-off javascript interactivity, like the "cooking recipe" easter egg functionality at the beginning of this post.)

I just haven't written interactivity that requires stateful layers or super complex user experiences to justify reaching for javascript frameworks (yet). Instead, my use cases are small and discrete, which can be perfectly addressed by importing small and discrete global javascript modules.

This allows me to easily add-and-remove scripts to different pages, as well as gain a performance boost because each script is bundled and loaded separately. So although I am a big fan of React, for now I'd prefer an option that will keep my javascript overhead light.

Anyways, I'm in the mood to build an SSG tool myself with vanilla javascript and node. So let's begin.

My Solution

Currently my personal website follows a simple folder structure that is traditionally used in server directories, which is recommended in the modern web dev guide:


  www
  └── css
      └──  ... css assets in here
  └── html
      ├── index.html
      └── ... other html
  └── js
      └──  ... js assets in here

To get started, I updated my package.json with the following scripts that will trigger my custom SSG script in two situations, hot module reloading and builds:


  "start": "yarn start:server & yarn start:scss & yarn start:ssg",
  "start:server": "wds --config ./wds.config.mjs --app-index www/html/index.html",
  "start:scss": "sass --no-source-map -w www/css --style compressed",
  "start:ssg": "yarn start:ssg:blog",
  "start:ssg:blog": "esr .bin/watch.mjs --dir www/html/blog --ignore index.html --cmd 'yarn build:ssg:blog'",
  "build": "rm -rf public && yarn build:ssg && yarn sass && rollup -c ./rollup.config.mjs",
  "build:ssg": "yarn build:ssg:blog",
  "build:ssg:blog": "esr .bin/ssg.mjs --template www/html/blog/template.html && prettier --write www/html/blog/**/index.html",

A couple of critical additions here:

  1. start runs start:ssg and therefore runs start:ssg:blog in parallel with the rest of the local development setup.
  • Notice that start:ssg:blog calls .bin/watch.mjs.
  • Also notice that yarn build:ssg:blog is passed as the "--cmd" argument to the .bin/watch.mjs script.
  2. build runs build:ssg and therefore runs build:ssg:blog.
  • Notice that build:ssg:blog calls .bin/ssg.mjs.

watch.mjs is a handy script that will watch a directory for file changes and trigger a side-effect command in response. In this case, the command yarn build:ssg:blog runs in response to file changes. Essentially this provides hot module reloading (HMR) in my local development environment for all SSG'd content. This means that I can make changes to a blog file, save it, and immediately see the SSG'd version of that blog post in my browser.

Much of the heavy lifting for this functionality is provided by the chokidar node library, meaning the source code for this watch script is as little as 20 lines of node code. Not bad!

/**
 * watch.mjs -- file change side-effect script
 *
 * Given a directory and a command, this script will watch the target
 * directory and run the provided command whenever any files change.
 *
 * @arg {string} cmd The command to run, e.g., `--cmd 'yarn build'`
 * @arg {string} dir The target directory, e.g., `--dir www/html/blog`
 * @arg {string | undefined} ignore File patterns to ignore, e.g., `built-file.html`
 *
 */

import chokidar from "chokidar";
import { spawn } from "child_process";
import args from "./args.mjs";

const {
  dir,
  cmd,
  ignore: ignored,
} = args(["dir", "cmd"], { optional: ["ignore"] });

console.log(
  `watch.mjs watching for changes in "${dir}" and will run "${cmd}" in response \n`
);

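chokidar.watch(dir, { ignoreInitial: true }).on("all", function (event, path) {
  if (ignored && path.match(ignored)) return; // skip generated files to break infinite loops
  // split the command string into the executable and its arguments
  // (named commandArgs so it does not shadow the imported args helper)
  const [command, ...commandArgs] = cmd.split(" ");
  spawn(command, commandArgs, {
    stdio: "inherit", // stream the command's output to this terminal
  });
});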
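
Both scripts import a small args helper from ./args.mjs whose source is not shown in this post. The sketch below is only a guess at how such a helper could work, inferred from the two call sites: the function body, the third argv parameter, and the error message are all my assumptions, not the real implementation.

```javascript
// Hypothetical sketch of an args helper; the real args.mjs is not shown here.
// In an actual args.mjs this function would be the module's default export.
function args(required, { optional = [] } = {}, argv = process.argv.slice(2)) {
  const known = new Set([...required, ...optional]);
  const parsed = {};
  for (let i = 0; i < argv.length; i++) {
    const token = argv[i];
    if (!token.startsWith("--")) continue; // skip stray values
    const name = token.slice(2);
    if (!known.has(name)) continue; // ignore flags we were not asked for
    parsed[name] = argv[++i]; // the token after the flag is its value
  }
  // a name is truly required only if it is not also listed as optional
  const missing = required.filter(
    (name) => !optional.includes(name) && parsed[name] === undefined
  );
  if (missing.length > 0) {
    throw new Error(`Missing required argument(s): ${missing.join(", ")}`);
  }
  return parsed;
}

// Example with an explicit argv, mirroring the start:ssg:blog script.
const example = args(["dir", "cmd"], { optional: ["ignore"] }, [
  "--dir", "www/html/blog",
  "--ignore", "index.html",
  "--cmd", "yarn build:ssg:blog",
]);
console.log(example.dir, example.ignore, example.cmd);
```

Because the shell hands the quoted --cmd value over as a single token, the helper can stay this simple, and watch.mjs splits the command string on spaces itself.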

For both HMR and builds, I call ssg.mjs. This script accepts two arguments: "--template" (required) and "--base" (optional). If a base is not provided, the template's directory is assumed to be the base directory. The template is HTML that includes content "slots" within HTML comments using the following syntax: <!--ssg:xyz.html-->. For example, here is the template I used when this blog post was generated.

Content slots are identified by parsing the template for the matching syntax. For example, if the slot <!--ssg:article.html--> is defined, then the SSG script will walk recursively from the base directory to find files matching "article.html". For each directory that contains a file matching a slot, a page is generated. If multiple slots are defined and multiple matches are found in a specific directory, the content of every matching file is injected into the template and one file is generated.
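
As a rough illustration of the slot parsing described above, here is a minimal sketch that pulls slot names out of a template string. The template markup is made up for the example; the regular expression is one way to match the <!--ssg:xyz.html--> syntax and assumes a runtime with lookbehind support (any recent version of node).

```javascript
// Match only the file name between "<!--ssg:" and "-->".
const ssgSlotSyntax = /(?<=<!--ssg:).*?(?=-->)/g;

// A made-up template with two slots, purely for illustration.
const template = `<!DOCTYPE html>
<html>
  <body>
    <header><!--ssg:header.html--></header>
    <main><!--ssg:article.html--></main>
  </body>
</html>`;

// match() returns null when nothing matches, so fall back to an empty list.
const slots = template.match(ssgSlotSyntax) || [];
console.log(slots); // [ 'header.html', 'article.html' ]
```

Each slot name doubles as the file name to look for in every directory, which is why the slot syntax includes the ".html" extension.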

Global slot content can be defined at the base directory. For example, if the slot <!--ssg:header.html--> is defined at "/base-dir/header.html", then the content within "/base-dir/header.html" will be considered the global or fallback content to use for each page. For example, if the SSG script finds a directory with only "article.html" defined, then the global content for the header defined in "/base-dir/header.html" will be used for that page. If however that same directory contained a "header.html", that content is used for that slot instead.
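
The override rule can be sketched without touching the file system. In the example below, the resolveSlots and render helpers, the slot contents, and the template are all hypothetical stand-ins for files on disk; the point is only to show local content winning over the global fallback, and missing content collapsing to an empty string.

```javascript
// Illustrative only: "files" are plain objects mapping file names to contents.
function resolveSlots(slots, localFiles, globalContent) {
  const resolved = { ...globalContent }; // copy so overrides never leak between pages
  for (const slot of slots) {
    if (localFiles[slot]) {
      resolved[slot] = localFiles[slot]; // local file overrides the global one
    } else if (resolved[slot] === undefined) {
      resolved[slot] = ""; // no local or global content: clear the comment
    }
  }
  return resolved;
}

function render(template, content) {
  return template.replace(/<!--ssg:(.*?)-->/g, (_, slot) => content[slot] ?? "");
}

const slots = ["header.html", "article.html"];
const template = "<!--ssg:header.html--><!--ssg:article.html-->";
const globalContent = { "header.html": "<h1>Blog</h1>" };

// A directory with only article.html falls back to the global header.
const october = render(
  template,
  resolveSlots(slots, { "article.html": "<p>October</p>" }, globalContent)
);
// A directory with both files overrides the header locally.
const november = render(
  template,
  resolveSlots(
    slots,
    { "article.html": "<p>November</p>", "header.html": "<h1>Special</h1>" },
    globalContent
  )
);
console.log(october);  // <h1>Blog</h1><p>October</p>
console.log(november); // <h1>Special</h1><p>November</p>
```

Copying the global content before applying overrides matters: if the shared object were mutated, one directory's local header would become the fallback for every directory processed after it.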

Here is a diagram to further illustrate how the script will take a template and apply it to all matching slots from the base directory:


    blog                     -- "base" - nested folders are targeted for iteration
    └── index.html           -- this file is ignored
    └── blog-template.html   -- "template" - this file is the template
    |                             -- within this file are two slots:
    |                             -- <!--ssg:article.html--> and <!--ssg:header.html-->
    └── article.html         -- this is the GLOBAL article slot default / fallback
    |                             -- (global slots are not required; script will fail gracefully)
    └── header.html          -- this is the GLOBAL header slot default / fallback
    └── 2020/10                
        └── article.html     -- this is the LOCAL article slot override
        └── index.html       -- SSG'd file w/ LOCAL article and GLOBAL header
    └── 2020/11                
        └── article.html     -- this is the LOCAL article slot override
        └── header.html      -- this is the LOCAL header slot override
        └── index.html       -- SSG'd file w/ both LOCAL article AND header

For a more complex example, take a look at my blog directory as-of the publication of this blog post. (It looks similar to the above diagram, but with more slots!)

With just under 60 lines of source code, ssg.mjs will take a template, walk a directory, and generate static content for all matching slots.

 /**
  * ssg.mjs -- static site generation script
  * Given a template and a target directory base, this script will
  * iterate over the nested directories from the base directory and
  * generate files from the content that matches the slots in the template.
  * @arg {string | undefined} base
  * The directory to target for ssg iteration. If not defined,
  * the template's path is used for iteration.
  * @arg {string} template The path to the template.html file.
  * The template file should include one or more HTML comments with
  * slots defined using the syntax: `<!--ssg:xyz.html-->`.
  * For each directory in the base path that includes at least 1 slot,
  * an `index.html` is generated with each slot replaced by its content.
  * @example Folder structure:
     blog                     <-- "base" - nested folders are targeted for iteration
     └── index.html           <-- this file is ignored
     └── blog-template.html   <-- "template" - this file is the template
     |                             - within this file are two slots:
     |                             <!--ssg:article.html--> and <!--ssg:header.html-->
     └── article.html         <-- this is the GLOBAL article slot default / fallback
     |                           (global slots are not required; script will fail gracefully)
     └── header.html          <-- this is the GLOBAL header slot default / fallback
     └── 2020/10                
         └── article.html     <-- this is the LOCAL article slot override
         └── index.html       <-- SSG'd file w/ LOCAL article and GLOBAL header
     └── 2020/11                
         └── article.html     <-- this is the LOCAL article slot override
         └── header.html      <-- this is the LOCAL header slot override
         └── index.html       <-- SSG'd file w/ both LOCAL article AND header
  */
 
 import { readFileSync, writeFileSync } from "fs";
 import glob from "glob";
 import { dirname, join } from "path";
 import args from "./args.mjs";
 
 // setup helper data / fxs
 const ssgSlotSyntax = /(?<=<!--ssg:).*?(?=-->)/g;
 
 function content(slots, __path, content = {}) {
   slots.forEach((slot) => {
     try {
       // if there is content in the target directory, use it
       const __file = readFileSync(join(__path, slot), "utf-8");
       if (__file) content[slot] = __file;
     } catch {
       // else if no global content, use empty string to clear html comments
       if (!content[slot]) content[slot] = "";
     }
   });
   return content;
 }
 
 // destructure script args
 const {
   template = null,
   base: __base = join(process.cwd(), dirname(template)),
   __template = join(process.cwd(), template),
 } = args(["base", "template"], { optional: ["base"] });
 
 // derive ssg slots
 const html = readFileSync(__template, "utf-8");
 const slots = html.match(ssgSlotSyntax);
 
 // gather global slot content from base into obj
 // keys are file names for the content, values are the content within that file
 const globalSlotContent = content(slots, __base);
 
 // iterate over relevant nested files from the base directory
 glob(
   `${__base}/**/*.html`,
   { ignore: `${__base}/*.html` },
   function (_, files) {
     const __directories = new Set(files.map((file) => dirname(file)));
     __directories.forEach((__dir) => {
       // update slots (copy the globals so local overrides don't leak into other directories)
       const localSlotContent = content(slots, __dir, { ...globalSlotContent });
       const regex = new RegExp(
         Object.keys(localSlotContent)
           .map((key) => `<!--ssg:${key}-->`)
           .join("|"),
         "gi"
       );
       let sscontent = html.replace(
         regex,
         (matched) => localSlotContent[matched.match(ssgSlotSyntax)[0]]
       );
       // update relative paths
       const relativity = (__dir.split(__base)[1].match(/\//g) || []).length + 1;
       const relativePathStr = /(?<=)("\.\.\/)/g;
       sscontent = sscontent.replace(
         relativePathStr,
         () => `"${'../'.repeat(relativity)}`
       );
       // write file
       writeFileSync(`${__dir}/index.html`, sscontent);
     });
   }
 );
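
The "update relative paths" step in the script is easy to miss, so here is a small worked example of the idea. The helper names and the sample paths are mine, and the pattern is simplified to match any quoted path that starts with "../"; treat it as a sketch of the technique rather than the exact expression used above.

```javascript
// How many levels to climb: one per nested folder below the base, plus one
// because the template's own "../" already climbs out of the base directory.
function relativeDepth(dir, base) {
  return (dir.split(base)[1].match(/\//g) || []).length + 1;
}

// Rewrite the leading "../" of each quoted path to climb `depth` levels instead.
function rewriteRelativePaths(html, depth) {
  return html.replace(/"\.\.\//g, () => `"${"../".repeat(depth)}`);
}

const depth = relativeDepth("/site/www/html/blog/2020/10", "/site/www/html/blog");
console.log(depth); // 3

const rewritten = rewriteRelativePaths('<link href="../css/site.css">', depth);
console.log(rewritten); // <link href="../../../css/site.css">
```

A template that lives in the base directory reaches shared assets with a single "../", while a page generated two folders deeper needs three, which is where the "+ 1" comes from.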

Wrapping Up

It is not much but it'll do. What I have described here is my high-level approach to designing a custom SSG tool. Credit owed to this article that describes SSG in node, which I drew a little bit of influence from when creating my solution.

There are certainly opportunities to improve my solution, some ideas of mine include:

  • Single Page SSG: rather than recursively generating files from a base directory, it would be nice to target just a single directory for cases where there may be many subdirectories (e.g., generation on the fly).
  • SSG from a fountain: pull data from a remote source and plug the returned content into a template.
  • Hybrid SSG: pull some data from a remote source (e.g., define an endpoint within a file to ping, parse, and print).
  • SSR On Top: If the above 3 tasks are implemented then it would be interesting to respond to unfound pages within a known SSG target with an SSG runner, for SSG on-the-fly. This might be helpful for cases where there are hundreds or more entries to generate. (Is this when SSG becomes SSR??)
  • CMS layer, but let's not get too ahead of ourselves.