Chapter 7: HTML part two


Mar 11, 2025 - 11:57

Cover image by Blue Bird

Recap

Last time, we installed PostHTML plus a few of its close, personal friends. Then we started cutting up our web page into modules. We also learned:

- Currently, MVC is the dominant paradigm for application development on the web
- JAMSTACK is MVC but with an added static website
- How to handle Node security concerns
- How to configure PostHTML
- Splitting HTML into modules can save us effort

Let's improve how images are rendered next.

Images in HTML5

When we build web pages as developers, we're used to the idea of making the same content fit wildly different devices, from phones to wide-screens. But what about images? Images have an inherent width and height, and while modern browsers are good at downscaling and even stretching images to fit, this feels less than ideal. What other options do we have?

The `srcset` attribute

HTML5 brought some additional attributes for the img tag which maintain backward compatibility but also allow us to smuggle in additional paths to larger or smaller images in the same tag. Here's an example of the markup:


<img
  src="example-image-default.jpg"
  sizes="(max-width: 600px) 480px, 800px"
  srcset="example-image-480w.jpg 480w, example-image-default.jpg 800w"
  alt="This is an example image.">

Let's go through those attributes:

- src - this is just your common-or-garden path to the default image, which browsers without srcset support fall back to
- sizes - this borrows the syntax of CSS media queries to define thresholds for switching between images, based on the width of the browser's viewport. In this example, there are two ranges:
  - Viewports from zero up to 600px wide (max-width: 600px)
  - Viewports wider than 600px
- sizes (again) - so what do the 480px and 800px mean? Each is the width the image will occupy within that range, once all the padding and so on around the image is taken into account.
- srcset - this echoes a lot of the above but gives us paths to images. Each image and width pair is separated from the next with a comma. So we have two here: example-image-480w.jpg, which is 480 pixels wide, and example-image-default.jpg, which is 800 pixels wide. Browsers know nothing about an image's width before they load it, so stating it here lets the browser pick the most suitable file before downloading anything.
- alt - the alternative text which stands in for the image if it can't be displayed, and which screen readers announce in its place
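
If the interplay between sizes and srcset still feels murky, here's a rough sketch of the decision in JavaScript. It's a deliberate simplification, not how any browser is actually written: it assumes a device pixel ratio of 1, and the function name pickImage and the two data structures are made up for illustration.

```javascript
// Simplified sketch of srcset/sizes selection. Real browsers also weigh
// device pixel ratio, cached images and more. All names here are hypothetical.

// The `sizes` value from the markup, modelled as data: the first matching rule wins.
const sizes = [
  { maxViewportWidth: 600, slotWidth: 480 },
  { maxViewportWidth: Infinity, slotWidth: 800 }, // the fallback, `800px`
];

// The `srcset` value from the markup: each candidate declares its own pixel width.
const candidates = [
  { path: "example-image-480w.jpg", width: 480 },
  { path: "example-image-default.jpg", width: 800 },
];

function pickImage(viewportWidth) {
  // Step 1: use `sizes` to work out how wide the image slot will be.
  const { slotWidth } = sizes.find((rule) => viewportWidth <= rule.maxViewportWidth);

  // Step 2: choose the smallest candidate at least as wide as the slot,
  // falling back to the largest candidate if none is big enough.
  const sorted = [...candidates].sort((a, b) => a.width - b.width);
  const match = sorted.find((candidate) => candidate.width >= slotWidth);
  return (match ?? sorted[sorted.length - 1]).path;
}

console.log(pickImage(375));  // example-image-480w.jpg
console.log(pickImage(1280)); // example-image-default.jpg
```

The point to notice is that nothing gets downloaded in order to make the choice: the widths written into the markup are enough.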

I ... do not like this syntax. It's complicated and ugly and very difficult for humans to type out. However, we have an alternative!

The `picture` tag

The picture tag in HTML5 splits this out into multiple source tags, inside a wrapper tag of picture. It looks like this:


<picture>
  <source media="(max-width: 799px)" srcset="example-image-small.jpg">
  <source media="(min-width: 800px)" srcset="example-image-large.jpg">
  <img src="example-image-small.jpg" alt="Just imagine a really cool image.">
</picture>

I find this much easier to read. But that's not all it can do! It can also give us a top-to-bottom way to check if a browser supports a particular image format. For example:


  
<picture>
  <source srcset="example-image.jxl" type="image/jxl">
  <source srcset="example-image.avif" type="image/avif">
  <source srcset="example-image.webp" type="image/webp">
  <img src="example-image.jpg" alt="Just imagine a really cool image.">
</picture>

When the web browser parses this picture tag, it starts from the top and loads the first image which it has support for. Then, like an if/else statement, it skips the rest of the code and continues.
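
To make the if/else comparison concrete, here's a small sketch of that top-to-bottom check. It only illustrates the idea; the chooseImage function and the list of supported types are hypothetical, and a real browser does this internally.

```javascript
// The source tags from the markup, in document order.
const sources = [
  { srcset: "example-image.jxl", type: "image/jxl" },
  { srcset: "example-image.avif", type: "image/avif" },
  { srcset: "example-image.webp", type: "image/webp" },
];

// The src of the img tag, used when no source matches.
const fallback = "example-image.jpg";

// Hypothetical helper: `supportedTypes` stands in for what the browser can decode.
function chooseImage(supportedTypes) {
  // find() stops at the first match, just like an if/else chain.
  const firstMatch = sources.find((source) => supportedTypes.includes(source.type));
  return firstMatch ? firstMatch.srcset : fallback;
}

console.log(chooseImage(["image/avif", "image/webp"])); // example-image.avif
console.log(chooseImage([])); // example-image.jpg
```

Because order decides the winner, the formats we'd most like to serve go at the top.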

Legacy browsers will render the img tag. But modern browsers will still honour its alt text in the final markup.

This finally gives us a use for all the additional Sharp image pipelines we created in chapter 4 and never did anything with the output files.

Revisiting our markup

Currently, inside index.html, we reference a single image:

<img src="/img/example-01.webp" alt="An animal, yesterday">

Let's use a picture tag, but let's store it in a new fragment. This is cool, I promise!

Create a new file in src/fragments called picture.html. It should look like this:


<picture>
  <source srcset="{{ path }}.avif" type="image/avif">
  <source srcset="{{ path }}.webp" type="image/webp">
  <img src="{{ path }}.jpg" alt="{{ alt_text }}">
</picture>

What's with all the moustaches? The double curly braces are a little like escape characters. But instead of telling the parser to ignore the contents, they indicate that the language has changed: whatever sits between them is a variable name to be swapped for its value.

Now go back to your index.html page and replace the paragraph with the image tag inside it with the following:


<module
  href="src/fragments/picture.html"
  locals='{
    "path": "/img/example-01",
    "alt_text": "An animal, yesterday"
  }'
></module>

This works exactly like the method we used to import the fragment into the head tag, but this time, we're calling the picture.html module and passing data along with the request.

Specifically, we're passing some JSON via the locals attribute. Note that because JSON has to have both the names and the values surrounded by double-quotation marks (and HTML isn't fussy if we use single or double quotes), the locals attribute has a single quote mark surrounding the JSON data.

Once these variables reach the picture.html fragment, we extract them and add them to the markup.
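
If you'd like to see that extraction step spelled out, here's a minimal sketch. To be clear, this is not PostHTML's code: fillPlaceholders is a made-up function which assumes a placeholder is just a variable name between double curly braces.

```javascript
// A hypothetical stand-in for what happens to `locals`; not PostHTML's real code.
const localsAttribute = '{ "path": "/img/example-01", "alt_text": "An animal, yesterday" }';
const fragment = '<img src="{{ path }}.jpg" alt="{{ alt_text }}">';

function fillPlaceholders(template, locals) {
  // Match `{{ name }}` (spaces optional) and swap in the value with that name.
  // Unknown names are left untouched so mistakes are easy to spot.
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (placeholder, name) =>
    Object.hasOwn(locals, name) ? String(locals[name]) : placeholder
  );
}

// JSON.parse only accepts double-quoted names and values,
// which is why the attribute itself is wrapped in single quotes.
const locals = JSON.parse(localsAttribute);

console.log(fillPlaceholders(fragment, locals));
// <img src="/img/example-01.jpg" alt="An animal, yesterday">
```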

We can't use hyphens to separate words for these variable names ("kebab case"), so we've used underscores (this is known as "snake case").
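
One plausible reason, assuming the names between the curly braces follow the same rules as JavaScript variable names: a hyphen reads as a minus sign, so alt-text would look like "alt minus text". The check below is a simplified sketch (ASCII names only) and isValidName is a hypothetical helper, not part of PostHTML.

```javascript
// Simplified rule for a JavaScript-style name (ASCII only): it starts with a
// letter, underscore or dollar sign, followed by letters, digits, _ or $.
const isValidName = (name) => /^[A-Za-z_$][A-Za-z0-9_$]*$/.test(name);

console.log(isValidName("alt_text")); // true
console.log(isValidName("alt-text")); // false
```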

Limitations of this approach

While this allows us to put our markup into components and pass data between them, it doesn't allow us to use logic to change the markup depending upon what input data is fed.

For a more robust approach, we'd probably need to install a templating language of some kind, such as Twig, EJS, Handlebars, Pug or Mustache (this is not a complete list!). Reading the documentation for posthtml-modules, you'll notice it doesn't mention package.json or any of the approaches we've used in this guide. Instead, the examples are in JavaScript and we're advised to add this to our Node application.

Side quest: Node apps

The approach we've used up until now is to avoid task runners as much as possible and string together commands until we have a site which meets our needs. And this works! But another approach is to write code in JavaScript to be run by Node which produces our site. This works in a similar way to how we created image-compress.js and ran that in package.json using the command node tools/image-compress.js rather than npm run ..., like we did for the other commands.

We could also use a dedicated static site generator, of which there is no shortage. But this course was intended to give you an introduction to some of the tools which are used to make these packages and how you might string them together.

Adding a second page

Sadly, PostHTML and its pals don't work in the same way as the sass package does: we call an instance of PostHTML and it processes one file at a time. This means we need to add a couple of new requirements:

- The watch task needs to know when any source page has changed, and to update the corresponding distribution page
- We need a task which rebuilds all of the HTML pages, for Our Hypothetical Second Developer, on first-run

Hey! This feels familiar. Isn't this what we had to do with sharp too? Perhaps we can reuse code!

Calling PostHTML

There are two different ways to call PostHTML (strictly speaking, we're calling posthtml-cli, but whatever):

- Calling it and pointing it at a configuration file (posthtml.json)
- Calling it and pointing it at a configuration file, but specifying the input and output file at the same time

Currently, the input and output files are hard-coded into posthtml.json, here:

{
  "input": "src/views/**/*.html",
  "output": "dist",
  "plugins": {
      "posthtml-modules": {
          "root": "./src/views",
          "initial": true
      },
      "htmlnano": {}
  }
}

Let's get rid of the input and output nodes from this file, so all it does is establish the defaults for the plugins. It should look like this:

{
  "plugins": {
      "posthtml-modules": {
          "root": "./src/views",
          "initial": true
      },
      "htmlnano": {}
  }
}

Now let's write a script which calls PostHTML with the right paths.

The page has updated

The first requirement we discovered was to update a file in the dist directory once the corresponding src/views file changes.

Inside your tools directory, create a new file called html-update.js. It should look like this:

import { argv } from "node:process";

// Destructuring the Array from Node which includes data we need
const [node, thisFile, srcPath, fileEvent] = argv;

// White-list of events which should cause PostHTML to rebuild pages
const triggerEvents = ['add', 'change'];

// If the wrong kind of event triggers this script, do nothing
if (triggerEvents.includes(fileEvent)) {

  console.log("HTML change detected", srcPath, fileEvent);

}

This is basically image-compress.js, but with some of the guts removed.
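
Here's the white-list on its own, so you can see what it lets through. The list of possible events below is an assumption on my part about what the watcher can send; only add and change come from our script.

```javascript
// The white-list from html-update.js.
const triggerEvents = ["add", "change"];

// Assumed for illustration: the kinds of event a file watcher might report.
const possibleEvents = ["add", "addDir", "change", "unlink", "unlinkDir"];

// Only events on the white-list would cause a rebuild.
const rebuildEvents = possibleEvents.filter((event) => triggerEvents.includes(event));

console.log(rebuildEvents); // [ 'add', 'change' ]
```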

Edit your package.json watch-html task so it now reads:

"watch-html": "onchange \"src/views\" \"src/fragments\" -- node tools/html-update.js {{file}} {{event}}"

This should ring a bell as well - this is very similar to the watch-images command, even down to the {{file}} and {{event}} arguments. Now run this task from the terminal:

npm run watch-html

It won't open a browser window (because serve isn't involved) but we can fiddle around with files and see what happens.

Side quest: Renaming variables and arguments

You might notice that we're passing an argument called {{file}} via package.json and it's being called srcPath inside html-update.js. This is normal - as we're destructuring argv into different variables, they are given names which are valid within the scope of this file. The JavaScript doesn't care what they were called before they arrived, it'll use whatever name you want.

Because we're dealing with input (source) and output (distribution) filenames here, I've tweaked the variable names to reflect this.
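
A quick sketch to prove the point. The exampleArgv array is invented (the real one comes from Node) and the silly variable names are there on purpose.

```javascript
// A made-up argv, shaped like the one Node hands to html-update.js.
const exampleArgv = ["/usr/bin/node", "tools/html-update.js", "src/views/index.html", "change"];

// Position matters, names don't: both lines read the same two values.
// The empty slots skip the first two items, which we don't use here.
const [, , srcPath, fileEvent] = exampleArgv;
const [, , whateverWeLike, somethingElse] = exampleArgv;

console.log(srcPath === whateverWeLike);  // true
console.log(fileEvent === somethingElse); // true
```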

Adding a new folder

Add a new folder inside src/views called about. Nothing will appear in the terminal because it's looking out for new HTML files. Take a copy of your index.html and put it inside the about directory.

The terminal should report this:

HTML change detected src\views\about\index.html add

Default files and directories

Perhaps you're wondering why we created a new folder, rather than just a file called about.html. Here's why: directories on web servers can contain multiple different files inside but the server can be configured to look for a default file. index.html is often one of these default file names. This means that you can specify the directory, but you don't have to specify the file name inside it.

In terms of URLs, this is the difference between:

http://www.mycoolsite.com/about.html

... and:

http://www.mycoolsite.com/about/

The second one looks better, is easier to say if someone is talking about your site at a party and (to some degree) disguises the technology you used to create the site.

It also means that if you change the technology in the future, the URLs can stay the same, which will save you a lot of headaches because you won't need to set up redirects.
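
As a rough sketch of what the server is doing on our behalf (real servers are configured in their own ways, and resolveRequest is a hypothetical function rather than anything from a server's API):

```javascript
// If the URL points at a directory, quietly append the default file name.
function resolveRequest(urlPath, defaultFile = "index.html") {
  return urlPath.endsWith("/") ? `${urlPath}${defaultFile}` : urlPath;
}

console.log(resolveRequest("/about/"));     // /about/index.html
console.log(resolveRequest("/about.html")); // /about.html
```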

The srcPath part of the console.log() looks wrong, as usual. Hey - we can reuse get-dist-path.js and use that to sort it! Import it at the top of html-update.js:

import getDistPath from "./get-dist-path.js";

Now let's use it like we did before. Replace your console.log() with this:

const { distPath, fileName } = getDistPath(srcPath);
console.log("HTML change detected", srcPath, distPath, fileName);

Remember how we destructured the object before? Now we'll see what we get back from getDistPath(). Try renaming src/views/about/index.html to src/views/about/index2.html. You should see this:

HTML change detected src\views\about\index2.html ./dist/views/about index2

We have most of the information here, but not the file extension. This wasn't important when we were dealing with images, but let's change /tools/get-dist-path.js so it sends it through. Luckily, it already exists as a variable in getDistPath(), so we just need to update the return statement from:

return {
  distPath,
  fileName
}

...to:

return {
  distPath,
  fileName,
  extName
}

Ignoring passed data

Even though we've changed what getDistPath() returns, we don't need to alter image-compress.js. It's cherry-picking data from the object which getDistPath() returns and doesn't care that we've stuffed even more data inside.
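
Here's that cherry-picking in miniature. getDistPathStandIn is a hypothetical stand-in which returns fixed values (the real getDistPath() works them out from the path), and the two caller functions are only loosely modelled on our scripts.

```javascript
// Hypothetical stand-in: returns the same shape of object as getDistPath().
function getDistPathStandIn() {
  return { distPath: "./dist/views/about", fileName: "index", extName: ".html" };
}

// Like image-compress.js: takes two values and never notices the third.
function imageStyleCaller() {
  const { distPath, fileName } = getDistPathStandIn();
  return `${distPath}/${fileName}`;
}

// Like html-update.js: takes all three.
function htmlStyleCaller() {
  const { distPath, fileName, extName } = getDistPathStandIn();
  return `${distPath}/${fileName}${extName}`;
}

console.log(imageStyleCaller()); // ./dist/views/about/index
console.log(htmlStyleCaller());  // ./dist/views/about/index.html
```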

Back in html-update.js, update the code inside your if statement to:

const { distPath, fileName, extName } = getDistPath(srcPath);
console.log("HTML change detected", srcPath, distPath, fileName, extName);

Now rename src/views/about/index2.html back to src/views/about/index.html and look in your terminal. You should see:

HTML change detected src\views\about\index.html ./dist/views/about index .html

Couple of fixes to those paths:

1) The source path needs \ replaced with /
2) The distribution path needs views/ removed from it (this directory was useful within src so we could keep all the HTML files in one place, but we need to mix things up on the live site)

Let's create new variables with those alterations. After your destructuring of getDistPath(), add a couple of new variables:

const { distPath, fileName, extName } = getDistPath(srcPath);
const editedSrcPath = srcPath.replaceAll('\\', '/');
const editedDistPath = distPath.replace('/views', '');
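
To check those two replacements do what we want, here they are run against the values we saw in the terminal a moment ago. The values are hard-coded for the sake of the example; in the real script they come from argv and getDistPath().

```javascript
// Hard-coded copies of the values from the terminal output.
const srcPath = "src\\views\\about\\index.html"; // each \\ is one backslash
const distPath = "./dist/views/about";

// Swap every backslash for a forward slash.
const editedSrcPath = srcPath.replaceAll("\\", "/");

// Remove the first "/views" from the distribution path.
const editedDistPath = distPath.replace("/views", "");

console.log(editedSrcPath);  // src/views/about/index.html
console.log(editedDistPath); // ./dist/about
```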

Why are we creating new variables?

You might reasonably look at the above code and wonder why we don't just change the original variables, rather than creating a new variable. You know, like this:

let { distPath, fileName, extName } = getDistPath(srcPath);
srcPath = srcPath.replaceAll('\\', '/');
distPath = distPath.replace('/views', '');

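(we need to use let rather than const because the values are changing - and that includes srcPath, so its const declaration at the top of the file would have to become let as well)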

This code is frowned upon in some corners of The Internet because we're never quite sure what the value of srcPath might be at any particular moment. srcPath is initialised right at the top of html-update.js and here, more than half way down, becomes something different.

Call PostHTML

We need a new script which does for HTML what write-images.js does for images (reminder: write-images.js calls Sharp multiple times and outputs different images). Create a new file called call-posthtml.js inside the tools directory. It should look like this:

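import { exec } from 'node:child_process';

// Runs posthtml-cli on a single page, using the plugins set up in posthtml.json
export default function callPostHTML(inputFilePath, outputFilePath) {
  exec(`npx posthtml ${inputFilePath} -o ${outputFilePath} -c posthtml.json`, (err) => {
    // If the command failed, log the error and carry on
    if (err) {
      console.error(`exec error: ${err}`);
      return;
    }
  });
}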

The exec function allows us to run commands in the terminal from within JavaScript. It's like reaching outside of the script and interacting directly with the terminal. Needless to say, JavaScript running in a web browser can't do this.

This is the line which runs in the terminal:

npx posthtml ${inputFilePath} -o ${outputFilePath} -c posthtml.json

Quick reminder of what this means:

- npx - runs commands which belong to packages installed in node_modules
- posthtml - calls PostHTML from within node_modules
- ${inputFilePath} - the JavaScript within call-posthtml.js will replace this with a string we pass it, which represents the path to the input file
- -o - this flag means that the following path represents the output file
- ${outputFilePath} - the JavaScript within call-posthtml.js will replace this with a string we pass it, which represents the path to the output file
- -c - this flag means the next value represents the configuration file we'd like to use with PostHTML
- posthtml.json - this is the configuration file we've added already, which sets up the plugins used by PostHTML
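
If you want to see the finished command without running anything, you can build the string on its own. buildPostHtmlCommand is a hypothetical helper for this demonstration only (our real script hands the string straight to exec), and the example paths are just the ones from earlier.

```javascript
// Builds the same command string as call-posthtml.js, without executing it.
function buildPostHtmlCommand(inputFilePath, outputFilePath) {
  return `npx posthtml ${inputFilePath} -o ${outputFilePath} -c posthtml.json`;
}

console.log(buildPostHtmlCommand("src/views/about/index.html", "./dist/about/index.html"));
// npx posthtml src/views/about/index.html -o ./dist/about/index.html -c posthtml.json
```

Logging the string like this is a handy way to debug: you can paste it into the terminal and see what posthtml makes of it.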

Lowering a flag

The documentation for posthtml-cli tells us that we can pass a path to posthtml and it will be assumed to be the input. An output needs a -o flag before it. And a -c flag points to a configuration file.

Perhaps you're wondering why I'm spelling this out. Actually, I don't care if you're wondering or not. Because I spent a couple of hours baffled as to why my input wasn't being found when I used a -i flag, as specified in the documentation. And it turns out I just needed to omit the -i flag.

Thanks for listening. I needed to get that off my chest. Real talk: sometimes documentation isn't accurate.

The exec function takes a callback function which runs once the command has finished, and which is handed an error if something went wrong. But let's not dwell on past mistakes. We're just going to log the error to the terminal and move on with our lives:

(err) => {
  if (err) {
    console.error(`exec error: ${err}`);
    return;
  }
}

Finally, we export callPostHTML so we can use it elsewhere.

Calling `callPostHTML` from inside `html-update.js`

Update html-update.js so that it looks like this:

import { argv } from "node:process";
import getDistPath from "./get-dist-path.js";
import callPostHTML from "./call-posthtml.js";

// Destructuring the Array from Node which includes data we need
const [node, thisFile, srcPath, fileEvent] = argv;

// White-list of events which should cause PostHTML to rebuild pages
const triggerEvents = ['add', 'change'];

// If the wrong kind of event triggers this script, do nothing
if (triggerEvents.includes(fileEvent)) {

  const { distPath, fileName, extName } = getDistPath(srcPath);
  const editedSrcPath = srcPath.replaceAll('\\', '/');
  const editedDistPath = distPath.replace('/views', '');

  // Pass `callPostHTML()` all our paths
  callPostHTML(editedSrcPath, `${editedDistPath}/${fileName}${extName}`);
}

The new bit is we import callPostHTML then call it with the correct paths, rather than just logging them to the console.

Cancel, then re-run the npm run watch-html task in the terminal. Now delete the contents of dist and rename src/views/index.html to src/views/index2.html. You should see index2.html appear inside your dist directory.

Rename it back to index.html. You should now see both index.html and index2.html. This is working as expected!

There's still a little bit more to do here: