About the website

This website is created with a Haskell program which compiles pages from Markdown to HTML using the Hakyll static website generator and the Pandoc document converter. The program, presented below in literate form, assembles a navigation menu from YAML metadata headers, resolves relative references, translates mathematical equations from LaTeX to MathML, and optionally includes a bibliography and table of contents.

The source of the website is published at https://git.colberg.org/peter/peter.colberg.org.

1 Prerequisites

{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative (Alternative (empty))
import Data.Char (isLower, toLower)
import Hakyll
import Skylighting (styleToCss, pygments)
import System.FilePath
import Text.Pandoc.Citeproc (processCitations)
import Text.Pandoc.Definition
import Text.Pandoc.Options
import Text.Pandoc.Walk (walkM)
import qualified Data.Aeson.KeyMap as KeyMap
import qualified Data.Scientific as Scientific
import qualified Data.Text as T
import qualified Data.Yaml as Yaml
import qualified Network.URI as URI
import qualified Text.Pandoc as Pandoc

On Debian 12 “Bookworm”, the following packages are required:

apt install libghc-hakyll-dev libghc-io-streams-dev pandoc-data

The Makefile compiles the site program and deploys the website to a separate git branch.

If building the site fails with the error invalid byte sequence, ensure the locale supports Unicode, e.g.,

export LANG=C.UTF-8

2 Pandoc compiler

A custom Pandoc compiler converts pages from Markdown to HTML.

main :: IO ()
main = do
  let pageCompiler = getResourceBody
        >>= readPandoc
        >>= withItemBody (walkM $ transformRelativeReference)
        >>= withItemBody (unsafeCompiler . Pandoc.runIOorExplode . processCitations)
        >>= writePandocWithToc

The Markdown is parsed into a Pandoc AST, which is processed with a chain of Pandoc filters. A bibliography is created with Pandoc citeproc from references given in a separate YAML metadata block that is parsed by Pandoc (see, e.g., nano-dimer/README.md). The AST is rendered as HTML, optionally prepending a table of contents.

3 Hakyll rules

Hakyll is invoked with a set of routes and compilation rules for the site’s resources, where a route is the path of a resource in the output directory. Hakyll tracks dependencies between resources and recompiles pages when the underlying sources are updated, which is particularly useful when previewing drafts with the built-in webserver.

  hakyll $ do
    match ("*.asc" .||. "*.js" .||. "fonts/*") $ do
      route idRoute
      compile copyFileCompiler

The style sheet is created by concatenating and minimizing all CSS source files.

    create ["site.css"] $ do
      route idRoute
      compile $ makeItem =<< compressCss . concatMap itemBody <$> loadAll "css/*.css"

    match "css/*.css" $ compile getResourceBody

    create ["css/highlight.css"] $
      compile $ makeItem =<< return (styleToCss pygments)

After Pandoc compilation, the page template adds the title, navigation menu and footer.

    match ("*.md" .||. "*.lhs") $ do
      route $ cleanRoute
      compile $ do
        identifier <- getUnderlying
        let context = bodyField "body" <>
                      homeField "home" identifier "index.md" <>
                      pageContext

        pageCompiler >>= loadAndApplyTemplate "templates/page.html" context

    match "templates/*" $ compile templateBodyCompiler

Projects have their own home page and navigation menu.

    match ("*/*.md" .||. "*/doc/*.md") $ do
      route $ gsubRoute "/doc/" (const "/") `composeRoutes` cleanRoute
      compile $ do
        identifier <- getUnderlying
        let project = fromFilePath . (</> "README.md") . head . splitPath . toFilePath
            context = bodyField "body" <>
                      homeField "home" identifier "index.md" <>
                      homeField "project" identifier (project identifier) <>
                      pageContext

        pageCompiler >>= loadAndApplyTemplate "templates/page.html" context

    match ("*/doc/*.svg" .||. "*/doc/*.png") $ do
      route $ gsubRoute "/doc/" (const "/")
      compile copyFileCompiler

4 Clean routes

For user-friendly URLs, each page is stored in a separate directory with the filename index.html.

cleanRoute :: Routes
cleanRoute = customRoute $
  joinPath . map allUpperToLower . splitPath . addIndexHtml . toFilePath

addIndexHtml :: FilePath -> FilePath
addIndexHtml path
  | takeBaseName path `elem` ["index", "README"] = replaceFileName path "index.html"
  | otherwise = dropExtension path </> "index.html"

All-uppercase path elements are transformed to lowercase, e.g., NEWS.md becomes news/index.html.

allUpperToLower :: String -> String
allUpperToLower s
  | any isLower s = s
  | otherwise = map toLower s

The filename index.html is stripped when referencing pages within the website.

cleanUrl :: String -> String
cleanUrl url
  | takeFileName url == "index.html" = takeDirectory url
  | otherwise = url

cleanUrlField :: String -> Context a
cleanUrlField = mapContext cleanUrl . urlField

6 Relative references

Images and links to internal resources are given as relative references. A Pandoc filter transforms the relative references to the corresponding routes, preserving the fragment identifier that links to a section of a page.

transformRelativeReference :: Inline -> Compiler Inline
transformRelativeReference (Image attr text (url, title)) = do
  url' <- T.pack <$> (relativeReference $ T.unpack url)
  return $ Image attr text (url', title)

transformRelativeReference (Link attr text (url, title)) = do
  url' <- T.pack <$> (relativeReference $ T.unpack url)
  return $ Link attr text (url', title)

transformRelativeReference x = return x

relativeReference :: String -> Compiler String
relativeReference url = case URI.parseRelativeReference url of
  Just uri@URI.URI { URI.uriPath = path@(_:_) } -> do
    identifier <- getUnderlying
    value <- getRoute $ relativeIdentifier path identifier
    case value of
      Just path' -> return $ show $ uri { URI.uriPath = cleanUrl $ toUrl path' }
      Nothing -> fail $ "invalid reference " ++ url
  _ -> return url

relativeIdentifier :: FilePath -> Identifier -> Identifier
relativeIdentifier path =
  fromFilePath . dropDrive . URI.uriPath . URI.relativeTo (uri path) . uri . toFilePath
  where uri p = URI.nullURI { URI.uriPath = p }

7 Table of contents

If the metadata contains a toc field, a table of contents is generated from headings down to the given depth.

writePandocWithToc :: Item Pandoc -> Compiler (Item String)
writePandocWithToc item = do
  metadata <- getMetadata (itemIdentifier item)
  (`writePandocWith` item) <$>
    case KeyMap.lookup "toc" metadata of
      Just (Yaml.Number value) -> case Scientific.toBoundedInteger value of
        Just depth -> return $ writerOptionsWithToc depth
        Nothing -> fail $ "invalid toc depth " ++ show value
      _ -> return writerOptions

writerOptionsWithToc :: Int -> WriterOptions
writerOptionsWithToc depth = writerOptions
  { writerTableOfContents = True
  , writerTOCDepth = depth
  , writerTemplate = Just tocTemplate
  , writerNumberSections = True
  , writerHTMLMathMethod = MathML
  }
  where
    tocTemplate = either error id $ either (error . show) id $
      Pandoc.runPure $ Pandoc.runWithDefaultPartials $
      Pandoc.compileTemplate "" "$if(toc)$<nav class=\"toc\">\n$toc$\n</nav>\n$endif$$body$"

writerOptions :: WriterOptions
writerOptions = def
  { writerSectionDivs = True
  }