Improving page reload times in astro dev by ~40% via profiling
I spend a lot of my time working on components, rather than content, in our developer documentation. That side of Astro is more than fast enough, with HMR taking care of the reload before I’ve even clicked back over to the browser! Whilst components are a large part of how the site looks, they are almost inconsequential compared to the amount of content we write:
```
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
MDX                       5398    359641   104334         0   255307          0
SVG                        352     27310        0         0    27310          0
YAML                       239     11100     1345        98     9657          0
JSON                        89    244378      372         0   244006          0
Markdown                    72      4898     2104         0     2794          0
TypeScript                  51      2754      334        66     2354        263
CSS                         11     13959     2151         3    11805          0
JavaScript                  10       459       54        43      362         89
Jupyter                      6      8666        0         0     8666          0
Plain Text                   5      1610       78         0     1532          0
JSX                          4       350       32         0      318         31
TypeScript Typings           4        22        1         3       18          0
Go                           2       184       37         6      141         30
TOML                         2        25        4         2       19          0
CSV                          1         2        0         0        2          0
INI                          1        21        4         1       16          0
License                      1       395       90         0      305          0
Patch                        1        49        0         0       49          0
Python                       1       176        8         1      167          3
───────────────────────────────────────────────────────────────────────────────
Total                     6250    675999   110948       223   564828        416
───────────────────────────────────────────────────────────────────────────────
```
86% of the files and 53% of the lines in our repository are MDX. That’s what the majority of our contributors work with on a daily basis, so how do we make them more productive?
Tighter feedback loops!
Whether you’re experimenting with different ways to present data, actioning feedback from code review or pairing up with someone, no-one wants to wait a long time for the page to load again.
Anatomy of a page reload in Astro
All of our content is organised with content collections, an Astro feature that allows you to query, reference and validate Markdown and JSON / YAML content. Types are generated for entry slugs, frontmatter and data are validated with Zod, and these collections are the building blocks of any grid, list or RSS feed that we offer on our documentation.
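For readers who haven’t used the feature, a collection definition looks roughly like this. This is a minimal sketch in the Astro 5 style — the `docs` collection name matches ours, but the schema fields here are illustrative, not our actual schema:

```javascript
// src/content.config.ts — illustrative sketch, not our real config
import { defineCollection, z } from "astro:content";
import { glob } from "astro/loaders";

const docs = defineCollection({
  // Load every MDX file under src/content/docs into the collection.
  loader: glob({ pattern: "**/*.mdx", base: "./src/content/docs" }),
  // Zod validates each entry's frontmatter and generates types from it.
  schema: z.object({
    title: z.string(),
    description: z.string().optional(),
  }),
});

export const collections = { docs };
```

Every entry that passes validation becomes queryable with `getCollection("docs")`, which is what powers those grids, lists and feeds.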
Astro 4
Here is a profile of a single visit page → edit page → new content appears on page cycle. Does anything stand out? Not really!
Content collections hold everything in memory, and editing a single entry leads to a reload of everything else! Adding one more file slows down every other file by just a tiny bit. If you think this is a bad approach, then you’ll be glad to know that Astro 5 released in December 2024 & overhauled how Content Collections work.
We’re not on Astro 5 just yet but we will be soon, so let’s focus our efforts there!
Astro 5
If you’re interested in a deep dive into how the Content Layer works, Astro themselves published a blog post doing just that! The general gist is that the backing store now lives on disk and takes the shape of a Map stringified with devalue, and this is important to remember as we dig deeper!
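To make that concrete: conceptually, the store is one big Map of entry IDs to parsed entries, and the whole Map is serialised into a single string that becomes a single (very large) module in dev. Here’s a rough sketch using plain JSON over the Map’s entries for illustration — devalue can serialise a Map directly, which is why Astro uses it instead:

```javascript
// Sketch: the data store is conceptually one Map of id -> parsed entry.
const store = new Map([
  ["docs/index.mdx", { title: "Overview", body: "…rendered content…" }],
  // One unusually large entry is enough to bloat the whole module:
  ["docs/magic-wan/legal/3rdparty.mdx", { title: "Licenses", body: "x".repeat(5_000_000) }],
]);

// devalue.stringify(store) produces ONE string for the entire store, so
// editing any single entry means re-serialising (and re-transforming) everything.
const serialized = JSON.stringify(Array.from(store.entries()));
console.log(`module size: ~${(serialized.length / 1_000_000).toFixed(1)} MB`);
```

That “everything is one module” shape is the key detail behind the slowdowns below.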
This profile is an identical visit page → edit page → new content appears on page scenario. It’s a little faster, at 6 seconds, but you can tell just from the profile that it’s being held back by a few slow things rather than Content Collections suffering from death by a thousand cuts.
generateMap
generateMap is responsible for 3.17 of our 5.94 seconds - more than half - so why is it so slow!?
generateMap is part of the magic-string library, used to manipulate strings and generate sourcemaps. It’s insanely popular - used by 2,680 dependents with more than 30 million weekly downloads - so at least we’re not getting bitten by some unloved 7-year-old library in our hot path.
Sourcemaps are helpful, but they aren’t really relevant to the experience of authoring content in Astro. It’s extraordinarily rare that you’ll hit a JavaScript error, and we’d much rather save these seconds, but can generating a sourcemap really take this long?
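For some intuition about why it might: with `hires: "boundary"`, magic-string emits a sourcemap segment for every word boundary in the input, rather than one per line, so the mappings string scales with input size. A back-of-the-envelope model — the bytes-per-boundary and bytes-per-segment constants below are assumptions for illustration, not measurements of magic-string internals:

```javascript
// Rough model: hires "boundary" emits one VLQ-encoded segment per word boundary.
const sourceBytes = 25_497_908;  // size of the module we'll meet shortly
const bytesPerBoundary = 5;      // assumption: avg source bytes per word boundary
const bytesPerSegment = 10;      // assumption: avg encoded bytes per segment

const segments = sourceBytes / bytesPerBoundary;
const mappingsBytes = segments * bytesPerSegment;
console.log(`~${(mappingsBytes / 1_000_000).toFixed(0)} MB of mappings`);
```

Under those assumptions the mappings end up roughly twice the size of the input, so a big enough input makes sourcemap generation genuinely expensive.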
transform
transform is coming from one of Astro’s Vite plugins, vite-plugin-import-meta-env.js. It’s used with esbuild’s define option (which replaces variables with values in-place) to support using import.meta.env.X to access the environment variable X.
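In dev, rather than rewriting every `import.meta.env.X` occurrence, the plugin prepends an `Object.assign(import.meta.env, { ... })` shim containing the env values - that’s the `devImportMetaEnvPrepend` string visible in the patch below. A simplified sketch of that dev-mode behaviour (not the plugin’s actual code):

```javascript
// Simplified sketch of the dev-mode env injection: prepend the private env
// values once, instead of replacing each import.meta.env.X in place.
function prependEnv(source, privateEnv) {
  let prepend = "Object.assign(import.meta.env,{";
  for (const key in privateEnv) {
    prepend += `${JSON.stringify(key)}:${JSON.stringify(privateEnv[key])},`;
  }
  prepend += "});";
  return prepend + source;
}

const out = prependEnv('console.log(import.meta.env.API_KEY);', { API_KEY: "secret" });
console.log(out);
```

The real plugin does this with MagicString so it can also emit a sourcemap for the change - which is exactly the step we’re about to measure.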
It isn’t doing anything out of the ordinary - it spends 99.9% of its time in generateMap. We can patch this plugin to figure out what is taking so long, and what exactly it is receiving as input.
```diff
diff --git a/node_modules/astro/dist/env/vite-plugin-import-meta-env.js b/node_modules/astro/dist/env/vite-plugin-import-meta-env.js
index 2246279..86edda1 100644
--- a/node_modules/astro/dist/env/vite-plugin-import-meta-env.js
+++ b/node_modules/astro/dist/env/vite-plugin-import-meta-env.js
@@ -68,9 +68,17 @@ function importMetaEnv({ envLoader }) {
       if (!options?.ssr || !source.includes("import.meta.env")) {
         return;
       }
+
+      const path = id.replace(process.cwd(), "");
+
+      console.log(path + `: source ${source.length / 1_000_000} MB`)
+
       privateEnv ??= envLoader.getPrivateEnv();
       if (isDev) {
+        console.time(path + ": new MagicString(source)");
         const s = new MagicString(source);
+        console.timeEnd(path + ": new MagicString(source)");
+
         if (!devImportMetaEnvPrepend) {
           devImportMetaEnvPrepend = `Object.assign(import.meta.env,{`;
           for (const key in privateEnv) {
@@ -78,10 +86,24 @@ function importMetaEnv({ envLoader }) {
           }
           devImportMetaEnvPrepend += "});";
         }
+
+        console.time(path + ": s.prepend(devImportMetaEnvPrepend)");
         s.prepend(devImportMetaEnvPrepend);
+        console.timeEnd(path + ": s.prepend(devImportMetaEnvPrepend)");
+
+        console.time(path + ": s.toString()");
+        const code = s.toString();
+        console.timeEnd(path + ": s.toString()");
+
+        console.time(path + ': s.generateMap({ hires: "boundary" })');
+        const map = s.generateMap({ hires: "boundary" });
+        console.timeEnd(path + ': s.generateMap({ hires: "boundary" })');
+
+        console.log(path + `: map ${map.mappings.length / 1_000_000} MB`)
+
         return {
-          code: s.toString(),
-          map: s.generateMap({ hires: "boundary" })
+          code,
+          map,
         };
       }
       if (!defaultDefines) {
```
On startup, we’ll get some logs like this:
```
/node_modules/astro/dist/content/runtime.js: source 0.017651 MB
/node_modules/astro/dist/content/runtime.js: new MagicString(source): 0.075ms
/node_modules/astro/dist/content/runtime.js: s.prepend(devImportMetaEnvPrepend): 0.018ms
/node_modules/astro/dist/content/runtime.js: s.toString(): 0.017ms
/node_modules/astro/dist/content/runtime.js: s.generateMap({ hires: "boundary" }): 3.611ms
/node_modules/astro/dist/content/runtime.js: map 0.03976 MB
```
Lots of tiny files that only take a few milliseconds each to generate sourcemaps for. Keep in mind that our profile showed a single call, and that call happens when we edit a Markdown file:
```
astro:data-layer-content: source 25.497908 MB
astro:data-layer-content: new MagicString(source): 0.131ms
astro:data-layer-content: s.prepend(devImportMetaEnvPrepend): 0.016ms
astro:data-layer-content: s.toString(): 0.018ms
astro:data-layer-content: s.generateMap({ hires: "boundary" }): 1.322s
astro:data-layer-content: map 54.23153 MB
```
A 25.49 MB file that takes 1.3 seconds to generate a sourcemap for! Remember that data store that is created with devalue? Yeah, it’s that.
The astro:data-layer-content module
This is a virtual module which wraps around the on-disk data store, using devalue to stringify and later parse it again when needed. In our case, .astro/data-store.json is 26.08 MB so we’re pretty close to that 25.49 MB which appears when parsed back into a JavaScript Map.
What is it made up of?
Everything! All of our Markdown frontmatter and content, our changelog releases, the JSON files we fetch from the Workers AI API - all of our content collections are synchronised into this store.
Our src/content folder totals up to 37 MB, and the majority of that is our Markdown content in the main docs collection.
```
$ du -hd1 src/content | sort -hr
 37M    src/content
 31M    src/content/docs
4.1M    src/content/partials
1.1M    src/content/workers-ai-models
428K    src/content/products
428K    src/content/changelogs
160K    src/content/compatibility-flags
156K    src/content/glossary
 76K    src/content/plans
 48K    src/content/learning-paths
 44K    src/content/notifications
 20K    src/content/videos
 16K    src/content/apps
8.0K    src/content/pages-framework-presets
8.0K    src/content/pages-build-environment
4.0K    src/content/i18n
```
Anyway, now that we know why it takes so long (this is a big file!), we can start investigating whether this Vite plugin should even be generating a sourcemap for it. After all, it exists for import.meta.env usage, and this isn’t code that you can use environment variables in!
Can we avoid generating sourcemaps?
In the patch for one of Astro’s Vite plugins, you might have spotted this condition:
```js
if (!options?.ssr || !source.includes("import.meta.env")) {
  return;
}
```
If we’re not in SSR (local development) or if our source file doesn’t contain the string import.meta.env, then we return early. Does our data store contain this string because some of our content does?
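One quick way to check, without profiling at all, is to search the serialised store for the string. The helper below is something I’m assuming for illustration, not part of Astro:

```javascript
// True when the serialized data store would defeat the plugin's early return,
// forcing a MagicString transform (and sourcemap) for the whole store.
function storeTriggersEnvPlugin(storeText) {
  return storeText.includes("import.meta.env");
}

// Against the on-disk store (Astro 5's default location):
// const { readFileSync } = await import("node:fs");
// console.log(storeTriggersEnvPlugin(readFileSync(".astro/data-store.json", "utf8")));
```

Note that it only takes one entry mentioning `import.meta.env` anywhere in its content to trip this check for the entire store.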
In one of our tutorials, we use this functionality in a code example and so the string appears in our “source” file. Let’s just remove that for testing, and see what our profile looks like now.
We’ve saved a considerable amount of time!
Can we reduce the size of this file?
Normally, I’d say no - we just have a lot of content. However, there is an outlier:
```
$ find . -type f -print0 | xargs -0 ls -l | sort -k5,5rn | head -n 2
-rw-r--r--  1 kian  staff  5078035 Jan 17 04:29 ./docs/magic-wan/legal/3rdparty.mdx
-rw-r--r--  1 kian  staff   605951 Jan 17 04:29 ./docs/warp-client/legal/3rdparty.mdx
```
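A quick sanity check on how dominant that first file is, using the byte sizes from the listing above and the store size from earlier:

```javascript
const largest = 5_078_035;      // ./docs/magic-wan/legal/3rdparty.mdx, bytes
const secondLargest = 605_951;  // ./docs/warp-client/legal/3rdparty.mdx, bytes
const storeBytes = 26_080_000;  // ~26.08 MB data store on disk

console.log((largest / secondLargest).toFixed(2) + "x");  // "8.38x"
console.log((100 * largest / storeBytes).toFixed(1) + "%"); // "19.5%"
```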
./docs/magic-wan/legal/3rdparty.mdx is 8.38 times larger than the second-largest file in our content collections. Considering our data store file is 26.08 MB, this one Markdown file (out of roughly 5,500) makes up nearly 20% of it. Let’s add back in that import.meta.env snippet, remove ./docs/magic-wan/legal/3rdparty.mdx and see how the stats we’re logging change:
Before
```
astro:data-layer-content: source 25.497907 MB
astro:data-layer-content: new MagicString(source): 0.04ms
astro:data-layer-content: s.prepend(devImportMetaEnvPrepend): 0.005ms
astro:data-layer-content: s.toString(): 0.003ms
astro:data-layer-content: s.generateMap({ hires: "boundary" }): 1.306s
astro:data-layer-content: map 54.23153 MB
```
After
```
astro:data-layer-content: source 20.30652 MB
astro:data-layer-content: new MagicString(source): 0.041ms
astro:data-layer-content: s.prepend(devImportMetaEnvPrepend): 0.004ms
astro:data-layer-content: s.toString(): 0.005ms
astro:data-layer-content: s.generateMap({ hires: "boundary" }): 853.841ms
astro:data-layer-content: map 44.207613 MB
```
It’s 1.52x faster to generate a sourcemap for that file now, so even if we can only upstream a quick win into that Astro plugin and Vite keeps on generating sourcemaps, we should still save a nice amount of time!
We still want this file, but we can store it as plain text next to the Markdown file and import it, keeping it out of our data store.
```mdx
import text from "./3rdparty.txt?raw";
import { Markdown } from "~/components";

<Markdown text={text} inline={false} />
```
Conclusion
For something that seems as simple as a static site for some Markdown, we have a lot of moving parts and enough scale that all of these optimisations matter. I did notice something slowing down builds in Astro 5, so I’ll be back soon with those findings!
If you think something is slow, profile it - you might not find anything that is worth optimising but you’ll learn more about what you’re working with! It’s easy to discount “saved two seconds per edit” but that adds up very quickly when you have as many contributors as we do.