Quarto integrated website system

Academic blog publishing that is easy on me, albeit hard on my computer

December 1, 2023 — October 7, 2024

academe
faster pussycat
how do science
javascript
julia
language
making things
plain text
premature optimization
python
R
writing
Figure 1

Quarto includes its own website system, which supplements pandoc’s inbuilt toolchain with a javascript-based build system using standard HTML tools such as Bootstrap, Sass and EJS.

Notably, the site you are reading right now is built using quarto’s native website system. I have no way of confirming this, but I suspect that my blog is the largest quarto site on the internet in terms of number of pages and words, and probably frequency of updates. As such, it is a stress-test of the quarto website system. As the maybe-biggest quarto user on earth, I can report that

  1. Quarto is capable of handling a million-word website like this, but
  2. Not smoothly.

Let us break down the good and the bad.

1 Vibes

tl;dr

Does enough of what I want that I use it, despite some qualms. I can ignore most of the complexity involved in delivering what it delivers, because per default it mostly does what I want. It has an active, friendly developer community. It was probably not really designed for websites as hefty as this one and has performance issues.

After nearly one year of full-time, daily quarto use, I have feelings and thoughts about quarto.

Quarto’s strong ecosystem is a vote in its favour. There are vibrant discussion boards, many active developers, many active users, and good integrations into various IDEs (VS Code, RStudio etc). It has a corporate sponsor, Posit FKA RStudio, who sell some extra add-ons, some of which I am a fan of (Shiny, Posit Connect).

I have said on-record that a vibrant community is a better test of usefulness and predictor of future support for some software than my feeble, biased personal, aesthetic judgement. But please, read on, if you nonetheless desire to know my aesthetic judgement.

The quarto system is more tightly integrated than hugo is with blogdown, which, not to get into the weeds, was the annoyance I was trying to salve. OTOH quarto is not substantially simpler than the mess it replaces. I count that as a marginal win.

Quarto is more opinionated than blogdown if I use the built-in website system (although in principle I could build my own different website system). If I happen to like a 2-3 column layout blog with standard features (search, overview by date) everything is easy. OTOH, deviating from this layout is difficult, poorly documented, and surprisingly complicated. For example, it is not trivial to vary the CSS framework from the default Bootstrap.

Opinionated is not bad per se. There are, however, some worrying signs of code chaos. Quarto websites will not win the Grug Brained Developer seal of approval. On the forums we learn that the code has some band-aid bits, e.g. there are two colliding template systems in use whose relationship is under-documented. A core developer has left the project and would like a minimalist holiday. The code chaos is, however, not yet worse than other systems I have tried.

The overall theming and site structuring is somewhat less flexible than hugo, the backend used by blogdown, but the integration with said backend is better. Quarto leverages many more features of pandoc than was possible with blogdown, which leads to many well-supported advanced typographical features. That means things like citations and cross-references work without much pain. The overall experience is somewhat better on net, since much of the flexibility of hugo was useless to me in any case, hidden behind feature mismatch.

If one wished to use the quarto engine to experiment with quirky, alternate features (such as the content ranking, recommendation or the “constantly updated” indexing systems as seen on this site) then, AFAICT, one can use custom listings. That seems to do about 80% of what I want, albeit buggily. YOLO! Let us 80/20 it.

Quarto websites are hefty, and slow to build. Since I am not a web developer but rather an academic, this price seems acceptable to me for my specific use case — the opinionated default is pretty close to what I want — but this might not be the optimal trade-off if your own needs differ.

The fact that I am mentioning these things on my (Quarto) blog rather than fixing them should be taken as a sign that I still think quarto is a net win over the alternatives. These friction points are annoying, but it would take a week or two in expectation to make substantial progress on fixing any of these problems, and the fix in each case is not valuable enough for me to do that.

There were some things that were too annoying, and I got fixes for those already either by my own efforts or from the very helpful community.

2 Community support

3 Quarto websites are slow to load

Figure 2: Quarto on this website produced a comically large front page download per default. UPDATE: much better now.

Quarto websites can be enormous compared to the equivalent blogdown site, in various senses. For the reader, browser memory usage by all the javascript wizardry etc is substantial. Even though they look like small, efficient static sites, the actual cost of all the bells and whistles behaviour adds up.

Some progress has been made on making quarto sites smaller to download.

The listings, in particular, can be huge. When I migrated this website to quarto the front-page download went from less than 1MB to 135MB, which is, for reference, comically huge for 3 paragraphs and a list of the titles of the last 10 blog posts.

I assume that this is because the listing on that front page, in order to provide dynamic sorting etc, loads essentially all the posts on the blog, no matter how old, and their associated images, at full resolution. AFAICT quarto does not generate image thumbnails, and also AFAICT the listing system was not built around lazy loading. Background:

None of these extensions worked for me, however, so I wrote a custom script that postprocesses the site, which does work.
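For illustration only, here is a minimal sketch of what a postprocessing pass of this general kind might look like. It is not the script mentioned above. It assumes the rendered site lives in `_site` and that the goal is simply to add the standard HTML `loading="lazy"` attribute to images; the function names `add_lazy_loading` and `postprocess_site` are hypothetical, and the regex is deliberately crude (it will trip over a `>` inside an attribute value, for instance).

```python
import re
from pathlib import Path

# Matches an opening <img ...> tag; group 1 is the raw attribute text.
IMG_TAG = re.compile(r"<img\b([^>]*)>", re.IGNORECASE)


def add_lazy_loading(html: str) -> str:
    """Add loading="lazy" to <img> tags that do not already set `loading`."""

    def fix(match: re.Match) -> str:
        attrs = match.group(1)
        if re.search(r"\bloading\s*=", attrs, re.IGNORECASE):
            return match.group(0)  # an explicit loading policy is kept as-is
        stripped = attrs.rstrip()
        if stripped.endswith("/"):
            # Preserve the self-closing slash of <img ... />.
            body = stripped[:-1].rstrip()
            return f'<img{body} loading="lazy" />'
        return f'<img{attrs} loading="lazy">'

    return IMG_TAG.sub(fix, html)


def postprocess_site(site_dir: str = "_site") -> int:
    """Rewrite every HTML file under site_dir in place; return files changed."""
    changed = 0
    for path in Path(site_dir).rglob("*.html"):
        original = path.read_text(encoding="utf-8")
        updated = add_lazy_loading(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Running `postprocess_site()` after a render would touch only the generated HTML, leaving the source documents alone. Lazy loading alone does not shrink the images themselves; generating thumbnails would be a separate step.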

4 Quarto websites are slow to build

tl;dr a typical CLI invocation of blogdown took about 1 second. A typical CLI invocation of quarto render for this site takes about 17 minutes and rising. Uploading the files incrementally to the server takes an extra 5 minutes on top of that.

I am surprised how much I miss the speed and efficiency of blogdown, with its smugly high-speed hugo backend. I honestly thought the speed was not a thing I cared about until I did not have it. Switching from blogdown to quarto website made my site muuuuuuuuch slower and the friction of the slow build process became a constant annoyance. To build the 1000+ posts on this site typically took hugo a few seconds, and I miss that now that I do not have that, and spend a lot more time coaxing results out of a stubborn build process. For reference, I probably update this site 10 times a day. My computer is constantly grinding away trying to get this thing on the internet.

There are various tricks to make rendering go faster, such as caching the code execution, but ultimately quarto render is still slow compared to hugo, even with the most aggressive cache settings, and sometimes the cache gets corrupted and I have to start over anyway.

AFAICT the problem is partly that the quarto website engine is slower than hugo, and partly that quarto is re-rendering too many pages every time (in the sense of converting markdown to HTML, not of executing code inside the markdown, which we can avoid by using the cache and freeze facilities). Blogdown+hugo was smart enough to re-render only the things it needed to, and reused the HTML from before, so there were few things it needed to re-render. I think? Or maybe hugo is just much faster because it is a compiled binary that doesn’t arse around with javascript and pandoc and stuff. Or both. Knowing which will not change much for me so I will not investigate further for now.

UPDATE: according to a core dev:

In generalities, our runtime is roughly spent 2/3s inside Pandoc, and 1/3 in Deno. Our Pandoc filtering infrastructure is pretty extensive, and some of our early decisions there have performance consequences that we didn’t foresee: we’re now working on them. In Deno, our performance profile is relatively flat, and so the work is going to be more of the “continued small fixes” kind.

I suspect the friction might be that the default quarto workflow favours a small number of immaculate, unique snowflake documents, whereas I am more of a sit-on-the-snow-machine-and-make-a-blizzard kind of guy.

The simplest workaround for a quick incremental build seems to be to use quarto preview --no-serve, which only renders the recently changed things, and so is much faster than rendering 1000+ things. quarto preview is still not that fast. On a typical invocation it takes 32 seconds on this machine to decide which part of this blog to incrementally render, which is already 5 or 6 times longer than blogdown took to finish an incremental render and build of the site.
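The idea of rendering only what changed recently can also be sketched by hand. The following is a different technique from the `quarto preview --no-serve` approach above, not a description of what quarto does internally: it picks `.qmd` files by modification time and passes each one to `quarto render <file>`. The function names, the one-hour default window and the injectable `run` parameter are all assumptions made for illustration.

```python
import subprocess
import time
from pathlib import Path
from typing import List, Optional


def recently_changed(root: str, max_age_seconds: float,
                     now: Optional[float] = None) -> List[Path]:
    """Return .qmd files under root modified within the last max_age_seconds."""
    now = time.time() if now is None else now
    return sorted(
        path for path in Path(root).rglob("*.qmd")
        if now - path.stat().st_mtime <= max_age_seconds
    )


def render_changed(root: str = ".", max_age_seconds: float = 3600,
                   run=subprocess.run) -> List[Path]:
    """Render each recently modified document individually; return them."""
    files = recently_changed(root, max_age_seconds)
    for path in files:
        # `quarto render <file>` renders a single document.
        run(["quarto", "render", str(path)], check=True)
    return files
```

This skips anything that depends on the changed pages, such as listings, so it is only a stopgap between full renders.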

I can leave quarto preview running, which amortises the start-up time but has its own problems; see next.

5 Preview server

This section got so long I gifted it its own page: quarto website preview server.

tl;dr the quarto preview server is slow, memory-hungry, and unreliable. Avoid it and use a normal file server.

6 Accelerate deployment

When deploying from the git repository to a static website host such as netlify, the time it takes their server to build the site from source is unfortunately prohibitive, for the slow-build reasons above. I only get a few free build hours per month, which would restrict me to a weekly publication schedule.

We can economise on build-time by not requiring their server to do any rendering work.

There are two options in that case.

Firstly (not recommended): committing the raw site HTML/JS and serving that. This leads to a huge repository and horrible diffs and also tends to crash the preview server during merges. Merging can be made easier via the git merge theirs trick, but it is still a pain.

Alternatively, don’t publish from a git provider but rather use the publish command to upload files directly from the local machine to their server.

quarto publish netlify

Except that way of invoking it is needy and asks for confirmation and messes with the browser and so on. Meh. This is better:

quarto publish netlify --no-render --no-browser --no-prompt

This seemingly copies (in my case) thousands of files to the server every time I deploy, which feels like a waste of network time, but at least it is saving me human time managing the merge failures and server compute budget.

I have noticed that quarto render often fails, either crashing early (ERROR: Directory not empty (os error 66): remove 'livingthing/_site etc) or producing malformed HTML, and it is safest to 1) only run it for deployment without the cache and 2) only upload if the render was successful, otherwise the entire website will be broken. Putting this all together, here is the command I use to publish this blog. Fish shell:

killall deno
rm -rf _site
quarto render --cache-refresh ; and quarto publish --no-render --no-prompt --no-browser

The whole thing takes about 22 minutes at the moment.
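The "only upload if the render was successful" logic can be written out explicitly. This is a sketch of the same sequence in Python rather than fish, using the flags shown above; the `deploy` function and its injectable `run` parameter are hypothetical, and the `killall deno` step is omitted.

```python
import shutil
import subprocess

# The same commands as in the fish snippet, as argument lists.
RENDER = ["quarto", "render", "--cache-refresh"]
PUBLISH = ["quarto", "publish", "--no-render", "--no-prompt", "--no-browser"]


def deploy(site_dir: str = "_site", run=subprocess.call) -> bool:
    """Render from scratch, then publish only if the render exited with 0."""
    # Equivalent of `rm -rf _site`: start from a clean output directory.
    shutil.rmtree(site_dir, ignore_errors=True)
    if run(RENDER) != 0:
        return False  # render failed, so leave the live site untouched
    return run(PUBLISH) == 0
```

An exit code of 0 does not catch the malformed-HTML failure mode, which would need a separate check of the output before publishing.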

7 Theming

Custom HTML theming is not too bad for simple CSS tweaks. Although the documentation is brusque, this part mostly “just works” in the sense that if I guess what to do, my guess usually ends up being correct.

See

7.1 Template mechanics

General notes: there are two parallel template systems, Pandoc Templates and EJS Templates, which have a confusing and AFAICT undocumented separation of responsibilities.

  • although the pandoc templates are mentioned under the journal format, they are universal and apply to all formats. (Bigger lesson: The journal format documentation seems to function as the “generic advanced quarto” documentation and is much more general than you might assume)
  • EJS templates are website format specific.
  • although both EJS and pandoc formats include partial templates, these partials are not compatible or connected and have a different syntax. I suspect that means that if I wish to customise the metadata in a listing, and in a specific page, I will end up implementing it in two different syntaxes, in two different template systems
  • The relationship can be complicated; for example even though the HTML templates are rendered by pandoc, the website system performs major surgery on them by a combination of EJS templating and javascript post-hoc modification. Discovering which line of HTML output is generated by which system is a forensic operation.

Gotchas:

For some reason I do not understand, in EJS it is best to wrap even HTML templates in raw-HTML markup:

```{=html}
<table>
<tbody>
<tr>
<th scope="row">Hello</th>
<td><strong style="background-color:purple; border-radius: 9px; padding: 5px;">text</strong></td>
<td>1</td>
</tr>
</tbody>
</table>
```

Symptoms of not doing that include batshit crazy bananas fuckery of an unpredictable nature, except when sometimes it just totally works as expected.

7.2 Listings

The next level of sophistication after customising CSS is customising content overviews.

Index pages are called “listings”, and customisation of listings is supported, and reasonably powerful, but fragile; the errors that I get if I do something wrong are utterly baffling. See Document Listings for the basics and Custom Listings to get fancy.

tl;dr:

title: "Listing Example"
listing:
  contents:
    - "reports/*.qmd"
    - "lab-notes/*reports.qmd"
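To make the two `contents` patterns concrete, this sketch lists which files they would select in a project directory. It uses Python's `pathlib` globbing as a stand-in, which is an assumption: quarto's own glob handling may differ in edge cases. The function name `matching_documents` is hypothetical.

```python
from pathlib import Path
from typing import List, Sequence

# The two patterns from the listing example above.
PATTERNS = ["reports/*.qmd", "lab-notes/*reports.qmd"]


def matching_documents(project_dir: str,
                       patterns: Sequence[str] = PATTERNS) -> List[str]:
    """Return the project-relative paths selected by the glob patterns."""
    root = Path(project_dir)
    found = set()
    for pattern in patterns:
        found.update(
            path.relative_to(root).as_posix() for path in root.glob(pattern)
        )
    return sorted(found)
```

Under these assumptions, `reports/a.qmd` and `lab-notes/weekly-reports.qmd` would be listed, while `reports/notes.md` and `lab-notes/scratch.qmd` would not.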

Various things about them are not obvious to me. Here are some discussions I am having about them:

If you can set up what you want using just front matter YAML config, things are simple. OTOH, for this blog I needed to use custom listing templates, and that got complicated.

Before v1.5, the template development workflow was stilted, since “resource files” such as custom templates were not watched in preview mode. That necessitated a lot of restarting the preview server to display updates. As of v1.5 they are actually watched.

7.3 Individual pages

OK, what if we do not want to change CSS style, or a custom listing, but do something more complicated, like change the layout of a single page?

At the single page level we need to know about the (at least) two interacting template systems involved in the websites per default, EJS and the pandoc template system. Poking around the code reveals that their interaction is messy and non-obvious to an outsider. Some stuff is generated by the lower level pandoc templates, but these are then thoroughly transformed by the EJS website-mashing system. It isn’t really clear what to update to accomplish what goal.

There is a system of template-partials which should allow us to override small bits of the page for minor adjustments, but documentation is incomplete. Custom templates are mentioned under HTML Options, and there is some incomplete documentation at Template partials, but it seems that the best reference of how to use them is the source code or perhaps user forums. Templates for individual pages are complex; AFAICT the default HTML page for a single post is the pandoc HTML template but then there is a whole bunch of EJS stuff that gets smushed into that granddaddy pandoc template in a non-trivial manner. AFAICS, you can override the pandoc stuff by defining a custom template or template-partial, but the EJS stuff is more of a look-but-do-not-touch thing that we modify through settings, unless we are talking about a listings page, in which case there is an EJS API which we are invited to fiddle with using a different syntax. Got it?

Gotcha: pandoc templates seem to include the similar-looking html.template and template.html. Which to use? AFAICT it is html.template; the other one is, I think, a copy of the pandoc default template, kept around for reference.

I am currently tracking the following forum discussions for help trying to improve the display of metadata on this blog:

After a while I settled on the following for title-block.html:

<header id="title-block-header">
$if(title)$<h1 class="title">$title$</h1>$endif$
$if(subtitle)$
<p class="subtitle">$subtitle$</p>
$endif$
$for(author)$
<p class="author">$author$</p>
$endfor$

$if(date)$
<p class="date"><span class="created">$date$ </span>$if(date-modified)$
<span class="modified">— $date-modified$</span>
$endif$</p>
$endif$
<span class="ratings">
    <span class="rating rating-usefulness-${if(usefulness)}${ usefulness  }${else}0${endif}"></span>
    <span class="rating rating-certainty-${if(certainty)}${ certainty  }${else}0${endif}"></span>
    <span class="rating rating-novelty-${if(novelty)}${ novelty  }${else}0${endif}"></span>
    <span class="rating rating-polish-${if(polish)}${ polish  }${else}0${endif}"></span>
</span>
$if(audience)$
<div class="audience">
    <span class="notification-title">Assumed audience:</span>
    <p>$audience$</p>
</div>
$endif$
$if(content-warning)$
<div class="content-warning">
    <span class="notification-title">Content warning:</span>
    <p>$content-warning$</p>
</div>
$endif$
$if(abstract)$
<div class="abstract">
<div class="abstract-title">$abstract-title$</div>
$abstract$
</div>
$endif$
</header>

7.4 Bootstrap, bootswatch, dark mode

There is a hairball of tangled theming and variable systems involved in choosing the styling of the page. I am trying desperately not to understand it, but unfortunately it is obtrusive. The key thing to realise is that there are SCSS variables that are used to set the theme, and also CSS variables that are used to set the theme. Working out which one to use to change what, or whose variables get propagated where, is a kind of specialist engineering, in which the “bootstrap” CSS themes clash with the underlying CSS technology. I am not a neophyte to CSS; I have been doing it reluctantly for decades. This must be pure torture for people who do not have that background.

For one example, the navigation headers, for some unknowable reason, are not controlled by the SCSS variables. They decided that they were in “dark mode” and made themselves illegibly pale, even though I do not mention dark mode anywhere on the site and all the relevant colours in my stylesheet are dark. After trying to change many variable names to fix them, I settled upon this SCSS:

.navbar {
    // --bs-navbar-color: #050505;
    // --bs-nav-link-color: #050505;
    // --bs-navbar-color: $body-color;
    // --bs-nav-link-color: $body-color;
    font-family: $headings-font-family;
    background-image: $bg-shaded-image, $bg-image;
    background-color: $bg-color;
    color: $body-color;
    .navbar-brand{
        // Cannot fucking work out where the header color gets set to something dumb
        color: $body-color;
    }
    // Trying to eliminate that fucking pale header color die die die
    .navbar-nav .nav-link {
        // color: var(--bs-body-color) !important;
        color: $body-color !important;
    }
}

I have a vague suspicion that this leaves a half-digested bolus of undigestible CSS rules clogging the browser, but I have run out of care.

I switched different lines of this declaration on and off mindlessly until it worked. Key point: I will never support “light” and “dark” modes for this site. If that is your passion, write your own stylesheet.

8 Tips

10 Code matters

10.1 Supporting javascript

The keyword for injecting extra HTML, such as scripts, into the page is include; for example, include-in-header or include-after-body.

10.2 Migrating from blogdown

A few people found it easy. See

My blog, as I keep on mentioning, is sprawling and chaotic, and for me it was not easy. I found it best to script a migration.

hacky migration scripts

The script is on github. You are free to use it under the MIT license.

10.3 Example _quarto.yml

Putting all that together, for this site, we get

project:
  type: website
  output-dir: _site
  resources:
    - keybase.txt
    - "*.bib"
    - notebook/*.yaml
    - post/*.yaml
website:
  title: "The Dan MacKinlay family of variably-well-considered enterprises"
  site-url: https://danmackinlay.name
  favicon: _theme/logo.png
  twitter-card:
    creator: "@dan_mackinlay"
  open-graph: true
  navbar:
    right:
      - text: About
        file: about.qmd
      - text: Currently
        file: notebook/currently.qmd
      - text: Incoming
        file: notebook/incoming.qmd
      - text: Blogroll
        file: notebook/blogroll
      - text: Blog
        file: post.qmd
      - text: Notebook
        file: notebook.qmd
      - text: Everything
        file: everything.qmd
  search:
    algolia:
      index-name: danmackinlay_quarto
      application-id: LNWYJ42WO6
      search-only-api-key: a038347e5450a6426f008faf22c1a4c4
      show-logo: true
format:
  html:
    template-partials:
    - /_theme/metadata.html
    theme:
    - cosmo
    - style.scss
    html-math-method: mathjax
    strip-comments: true
    max-width: 1400px
    code-fold: true
    code-line-numbers: true
toc: true
number-sections: true
execute:
  freeze: true
  cache: true