Add Pagefind backend indexing and search route #40
Open
ritorhymes wants to merge 20 commits into
This was referenced May 18, 2026
Add the .build-eips.repo.toml schema, loader, validation rules, and manifest tests for active proposal repositories and declared sibling repositories. Introduce ActiveRepoIdentity so later workspace lifecycle and execution layers can select manifest-backed repository metadata while the legacy EIPs/ERCs fallback continues to operate.
Add the .build-eips.toml schema, starter config text, upward discovery, and loaded workspace config accessors. Define server/site defaults, workspace build-root paths, local theme and repo paths, and strict parsing for unsupported config fields. Leave init, doctor, runtime consumption, and render-only filtering to the later workspace, execution, and targeted rendering PRs.
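The upward discovery step could be sketched as follows. This is a minimal illustration, not the loader from the PR: the function name `discover_config` is hypothetical, and it assumes discovery stops at the first ancestor directory that contains the file.

```rust
use std::path::{Path, PathBuf};

/// Config file name taken from the description above.
const CONFIG_FILE: &str = ".build-eips.toml";

/// Walk from `start` up through its ancestors and return the first
/// `.build-eips.toml` found, if any. (Hypothetical helper.)
pub fn discover_config(start: &Path) -> Option<PathBuf> {
    start
        .ancestors() // yields `start` itself first, then each parent
        .map(|dir| dir.join(CONFIG_FILE))
        .find(|candidate| candidate.is_file())
}
```

Calling `discover_config` from a nested proposal directory returns the workspace-level file when no closer one exists.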
Add source materialization modes, reshape RepositoryUse around resolved repository endpoints, and pass source mode explicitly into Fresh. Add dirty working-tree copying, tracked-path sync, dirty rejection errors, and sibling merge behavior that follows local file sibling HEADs. Route existing build and changed-file flows through clean source materialization so current behavior stays unchanged.
Add build-eips init to create a workspace root, clone declared sibling repos and the shared theme, optionally clone template, and create the local build root. Write starter .build-eips.toml only when missing and regenerate a base WORKSPACE.md guide for the initialized workspace. Use active repository identity and staging repo metadata for workspace bootstrap while leaving doctor, platform-dev repos, and runtime behavior to later PRs.
Add build-eips doctor for checking workspace config discovery, active repo identity, repository layout, sibling manifests, theme checkout, and required local tools. Report ok/warn/fail diagnostics and fail the command when any check records a failure. Keep this focused on workspace diagnostics; execution policy, runtime commands, and platform-dev setup land in later PRs.
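The ok/warn/fail reporting rule can be illustrated with a small sketch. The type and function names here are assumptions made for the example; only the rule itself (any recorded failure fails the command) comes from the description.

```rust
/// Outcome of one doctor check (names are illustrative).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warn,
    Fail,
}

pub struct Diagnostic {
    pub check: &'static str,
    pub status: Status,
    pub detail: String,
}

/// The command fails when any check records a failure; warnings alone pass.
pub fn command_succeeds(report: &[Diagnostic]) -> bool {
    !report.iter().any(|d| d.status == Status::Fail)
}

/// Render one line per check, prefixed with its status label.
pub fn render(report: &[Diagnostic]) -> String {
    report
        .iter()
        .map(|d| {
            let label = match d.status {
                Status::Ok => "ok",
                Status::Warn => "warn",
                Status::Fail => "fail",
            };
            format!("[{label}] {}: {}", d.check, d.detail)
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```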
Add ResolvedExecution for command source policy, build roots, base URL overrides, staging/production selection, and clean versus dirty source materialization. Add CLI execution controls for production, remote siblings, build roots, build/serve base URL resolution, plain build/serve/check clean mode, and parity commands, with tests for the command matrix. Route build, check, serve, clean, and changed-file listing through the resolved policy while leaving targeted --only behavior to later PRs.
Resolve a workspace-local theme for Zola runtime commands and remove the remote theme cache path. Mount the selected theme under repo/themes/eips-theme for Zola, load Zola config from the mounted theme, and load eipw config from the local theme checkout. Require workspace theme setup for build, check, serve, and parity commands while leaving the prepared runtime pipeline to the next PR.
Move the Prepared runtime pipeline out of main.rs into pipeline.rs and keep main.rs focused on dispatching resolved runtime operations. Prepare runtime inputs from ResolvedExecution by cloning and fetching sources, force-refreshing prepared git scratch refs, merging sibling proposal content while keeping the active homepage, running eipw lint, preprocessing markdown, and materializing the local theme for Zola. Keep the existing minimal Prepared::serve method with the type, while leaving serve watcher and sync behavior to the serve runtime PR.
Add server binding resolution and serve-only host/port flags for local Zola serve commands. Run Zola serve with the resolved server binding, optional base URL override, fast/force serve flags, and generated output directory. Add dirty serve watching for dirty active-repo paths and local theme changes. Clean mode disables active-repo sync but keeps theme sync.
Add build-eips preview for serving the existing resolved output directory without rebuilding or starting dirty sync. Reuse server binding resolution and preview-only host/port flags, and report missing output before binding the local server. Add a tiny_http static file server with safe path resolution, index-file fallback, basic content types, and preview path tests.
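A rough sketch of safe path resolution with an index-file fallback is shown below. It is lexical only and makes several assumptions: the helper name is hypothetical, the index file is assumed to be `index.html`, and percent-decoding and symlink checks are left out even though a real server would need to consider them.

```rust
use std::path::{Component, Path, PathBuf};

/// Map a request path onto a file under `root`, or return `None` when the
/// path would leave the root. (Hypothetical helper, lexical checks only.)
pub fn resolve_request_path(root: &Path, request: &str) -> Option<PathBuf> {
    // Drop the query string and fragment before building a filesystem path.
    let path = request
        .split(|c: char| c == '?' || c == '#')
        .next()
        .unwrap_or("");
    let mut resolved = root.to_path_buf();
    for component in Path::new(path.trim_start_matches('/')).components() {
        match component {
            Component::Normal(part) => resolved.push(part),
            Component::CurDir => {}
            // `..`, a root, or a platform prefix could escape `root`.
            _ => return None,
        }
    }
    // Directory-style requests fall back to the index file.
    if path.is_empty() || path.ends_with('/') {
        resolved.push("index.html");
    }
    Some(resolved)
}
```

Rejecting every non-normal component is stricter than normalizing `..` segments, which keeps the traversal rule easy to test.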
Add proposal number parsing, editorial selector classification, and content-path helpers for flat and directory proposal layouts. Add OnlyRenderPlan to index proposal content for selected rendering, derive EIP/ERC public URLs, choose internal versus external required references, filter/prune content, and gate dirty path sync. Keep this as internal foundation for editorial integration and targeted build/serve rendering; no user-facing --only config lands here.
Add `build-eips editorial lint` and `build-eips editorial check` as the first user-facing proposal-selection commands. Keep eipw options scoped to editorial lint/check. Normal build, check, and serve prepare runtime sources, preprocess markdown, and run Zola without carrying eipw source-selection flags. Run editorial-selected eipw lint against the prepared merged source tree so cross-repo EIP/ERC references resolve through the same content layout used by local builds. Prepare runtime sources from the local active checkout, merge sibling repositories, and keep active-upstream fetches in changed-file comparison and editorial `--against-upstream` target selection.
Add build-only render selection from --only and workspace [render].only, deduping proposal numbers into ResolvedExecution. Build OnlyRenderPlan during prepared runtime setup, rewrite omitted proposal links and requires entries to public URLs, and prune unselected proposal content before Zola runs. Restrict targeted rendering to local dirty build mode for now, leaving targeted serve sync to the next PR. Remove the eipw lint step from the prepared runtime pipeline so linting is reached only through editorial commands.
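Deduping the selected proposal numbers might look like the sketch below. The description does not say whether `--only` replaces or extends the workspace `[render].only` list, nor which order is kept; this example assumes the two lists are combined and that first-seen order is preserved. The function name is hypothetical.

```rust
use std::collections::HashSet;

/// Combine proposal numbers from the CLI and the workspace config, dropping
/// repeats while keeping first-seen order. (Assumed merge policy.)
pub fn dedupe_only(cli: &[u32], workspace: &[u32]) -> Vec<u32> {
    let mut seen = HashSet::new();
    cli.iter()
        .chain(workspace)
        .copied()
        // `insert` returns false for a number that was already seen.
        .filter(|number| seen.insert(*number))
        .collect()
}
```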
Resolve cross-proposal asset links before Zola sees prepared markdown. Add proposal asset path resolution, rendered URL builders, and an OnlyRenderPlan asset inventory so links can be validated before targeted pruning removes omitted proposal content. Rewrite static asset links to rendered relative URLs when targets are available locally, and to public EIP/ERC asset URLs when targeted rendering omits the target proposal. Keep selected asset markdown links on the existing Zola @/... path, while omitted asset markdown links use public page URLs. Preserve query strings and fragments, leave fragment-only and raw HTML links untouched, and skip already-generated Zola markdown so repeated preprocessing remains idempotent.
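The rule about preserving query strings and fragments while leaving fragment-only links alone can be sketched as below. The helper name and the callback shape are assumptions for illustration; the actual rewrite logic in the PR is not shown here.

```rust
/// Rewrite only the path part of a link target, keeping any `?query` and
/// `#fragment` suffix. Fragment-only links are returned unchanged.
/// (Hypothetical helper; `rewrite` stands in for the real URL builder.)
pub fn rewrite_link_path(target: &str, rewrite: impl Fn(&str) -> String) -> String {
    if target.starts_with('#') {
        return target.to_string();
    }
    // The suffix starts at the first `?` or `#`, whichever comes first.
    let split_at = target
        .find(|c: char| c == '?' || c == '#')
        .unwrap_or(target.len());
    let (path, suffix) = target.split_at(split_at);
    format!("{}{}", rewrite(path), suffix)
}
```

Splitting once at the first `?` or `#` means the suffix is copied through byte for byte, so repeated preprocessing does not alter it.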
Extend targeted rendering to local dirty serve by accepting --only on serve and applying workspace [render].only to local serve runs. Pass OnlyRenderPlan into dirty serve sync and filter active-repo dirty paths to selected proposal content. Avoid reintroducing omitted proposal markdown or assets into the materialized repo. Add incremental targeted markdown preprocessing for dirty serve updates, including selected asset markdown, retained non-proposal pages, selected deletions, and filesystem timestamp fallback for new dirty files.
Add --platform-dev workspace initialization for cloning optional preprocessor and eipw repos alongside proposal repos and theme. Add POSIX and PowerShell dev-setup scripts that build the local build-eips binary, ensure a supported Zola is available, and install 0.22.1 when needed while reusing existing 0.22.1-or-newer installs. Add setup documentation, release archive checksum sidecars, doctor helper checks, and focused setup tests for contributor workspace setup.
Finalize the preprocessor integration docs and command help. Describe staging, production, and parity as clean local-active runtime modes that use remote sibling sources and selected environment metadata. Keep CLI help, workspace guide, and architecture source-policy wording aligned around the local active checkout source model.
Parse section index front matter directly as YAML before writing the generated Zola TOML front matter. This preserves structured section metadata such as extra.homepage_badges instead of flattening nested YAML through the proposal preamble parser. Keep proposal markdown on the existing proposal preamble path, and keep body link rewriting active for section index pages.
Generate static/assets/data/proposals.json from prepared proposal sources during the runtime build pipeline. Collect proposal metadata through a shared catalog so JSON writing and future prepared-runtime data passes use the same path validation, preamble parsing, duplicate detection, targeted URL policy, and stable proposal ordering. Preserve the existing JSON shape, active repository prefix selection, pretty formatting, omitted optional fields, and output-collision protection.
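Duplicate detection and stable ordering in a shared catalog could be approached as in this sketch. The struct fields and function name are hypothetical, and ordering by ascending proposal number is an assumption; the description only says the ordering is stable.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for proposal metadata (fields are illustrative).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ProposalEntry {
    pub number: u32,
    pub title: String,
}

/// Collect entries keyed by proposal number. A `BTreeMap` iterates in
/// ascending key order, so the result does not depend on input order.
pub fn build_catalog(entries: Vec<ProposalEntry>) -> Result<Vec<ProposalEntry>, String> {
    let mut by_number = BTreeMap::new();
    for entry in entries {
        let number = entry.number;
        // `insert` returns the previous value when the key was already present.
        if by_number.insert(number, entry).is_some() {
            return Err(format!("duplicate proposal number {number}"));
        }
    }
    Ok(by_number.into_values().collect())
}
```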
Add build-only Pagefind indexing behind the search module boundary. Use the Rust Pagefind crate from build-eips so indexing stays inside the Rust preprocessor. This avoids requiring a separate non-Rust search toolchain, such as Node.js or Python, and avoids shelling out to a Pagefind binary that contributors would need to install, version, and manage on PATH. Add search config and build CLI controls: a workspace [search] block with pagefind = true by default, and build --no-search on build and parity build only. Indexing runs only on build, after Zola has produced rendered HTML, and writes assets under output/pagefind/. Running Pagefind from serve or check would require separate lifecycle support and is out of scope here. Generate the search route page and search state data used by the theme. The route page is written into the prepared repo, not the source worktree, for build, serve, and check so the theme always has a stable shell; its state is marked disabled on serve, check, and --no-search builds, and only an enabled build also writes the Pagefind bundle. Refuse to write the route if user-authored content already occupies content/search.md or content/search/, so the generated route never silently overwrites user content. Keep Pagefind crate imports isolated to src/search/pagefind.rs and expose the rest through build-eips-owned search types. This keeps the integration compartmentalized enough to review, maintain, or remove without intermingling Pagefind API usage through the preprocessor. Add targeted validation for the route and state contract: route collision detection against user content, route placement in the prepared repo under targeted --only builds, resolved base path behavior including --base-url overrides, the disabled state for non-build modes and --no-search builds, and the policy that disabling search does not silently delete previously written Pagefind assets.
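The route and state contract described above can be summarized in a short sketch. All type and function names are hypothetical, and parity builds are assumed to count as `Mode::Build`. It shows two rules: search is enabled only for a build with `pagefind = true` and no `--no-search`, and a disabled run leaves any earlier bundle in place; separately, the route is refused when user content already occupies it.

```rust
use std::path::{Path, PathBuf};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Mode {
    Build,
    Serve,
    Check,
}

/// What happens to `output/pagefind/` in this run (illustrative type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BundleAction {
    /// Enabled build: index rendered HTML and write the bundle.
    WriteBundle,
    /// Search disabled: previously written assets are not deleted.
    LeaveUntouched,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SearchPlan {
    pub enabled: bool,
    pub bundle: BundleAction,
}

pub fn plan_search(mode: Mode, workspace_pagefind: bool, no_search: bool) -> SearchPlan {
    let enabled = mode == Mode::Build && workspace_pagefind && !no_search;
    let bundle = if enabled {
        BundleAction::WriteBundle
    } else {
        BundleAction::LeaveUntouched
    };
    SearchPlan { enabled, bundle }
}

/// Return the first user-authored path that already occupies the route.
/// An existing `content/search` directory also covers index files inside it.
pub fn route_collision(source_root: &Path) -> Option<PathBuf> {
    ["content/search.md", "content/search"]
        .iter()
        .map(|rel| source_root.join(rel))
        .find(|path| path.exists())
}
```

In this shape the route page can be written for every mode, with `enabled` carried into its state, while only `WriteBundle` triggers indexing.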
abfdc35 to cd7f53d
Description
Adds build-time Pagefind indexing and the generated `/search/` route contract needed by the theme search UI.
This resolves the investigation in #5 into search options beyond Fuse.js by adopting Pagefind and moving search indexing into the build output instead of relying on a browser-side Fuse corpus as the primary search backend.
What Changes
- Writes Pagefind assets under `output/pagefind/` for enabled build output.
- Adds `[search] pagefind = true` workspace configuration, with search enabled by default for builds.
- Adds `build --no-search` and `parity build --no-search` to skip Pagefind indexing for a build, overriding the workspace `pagefind` setting for that run.
- Generates `content/search.md` in the prepared Zola repo with `template = "search.html"` and the same `extra.search` state used by `build_eips_search.toml`.
- Generates `data/build_eips_search.toml` with the search enabled state, base path, and Pagefind bundle path for theme consumption.
- Marks the search state disabled for serve, check, and `--no-search` builds.
- Resolves the search base path, including `--base-url` overrides.
- Refuses to write the route when user-authored content already occupies `content/search.md`, `content/search/`, `content/search/index.md`, or `content/search/_index.md`.
- Clears the `output/pagefind/` directory before re-indexing on enabled builds, and keeps stale Pagefind assets untouched when search is disabled instead of silently deleting prior build output.
- Adds tests for route collisions, route placement under `--only` builds, base-path behavior, disabled route state, stale asset handling, and the Pagefind import boundary.

Design Rationale
Why Pagefind instead of Fuse.js?
Pagefind is built for static sites: it indexes rendered HTML at build time, writes static search assets, and lets the browser load the search runtime and index data when search is used. That is a better fit than making the client parse and search a large generated corpus directly.
Why run Pagefind from `build-eips`?
The preprocessor can call the Rust Pagefind crate directly, so contributors do not need a separate non-Rust search toolchain such as Node.js or Python, and do not need to install, version, or manage a separate Pagefind binary on `PATH`.
Why index rendered HTML?
The rendered site is the search source of truth. Pagefind can use the same rendered pages and indexing hooks that the theme exposes, rather than treating feed or metadata output as a search corpus.
Why keep the integration behind a search module boundary?
Pagefind API usage is isolated in `src/search/pagefind.rs`, while the rest of the preprocessor talks through build-eips-owned search types. That keeps the integration reviewable and easier to maintain or remove without intermingling Pagefind API usage through the preprocessor.
This PR does not add theme templates, browser search runtime behavior, full search filters, search pagination, Created Date filtering, search restore, or the rendered search corpus artifact. Those land in later stacked PRs.
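The module boundary from the rationale above could look roughly like the sketch below. Every name here is hypothetical, and the Pagefind crate API is deliberately not shown: `FakeIndexer` only stands in for whatever adapter lives in `src/search/pagefind.rs`, so that callers depend on build-eips-owned types alone.

```rust
use std::path::{Path, PathBuf};

/// build-eips-owned request type; no Pagefind types appear in it.
pub struct IndexRequest {
    pub html_root: PathBuf,
    pub bundle_dir: PathBuf,
}

#[derive(Debug, PartialEq, Eq)]
pub struct IndexSummary {
    pub pages_indexed: usize,
}

/// The only surface the rest of the preprocessor would see.
pub trait SearchIndexer {
    fn index(&self, request: &IndexRequest) -> Result<IndexSummary, String>;
}

/// Stand-in adapter used for illustration and tests.
pub struct FakeIndexer {
    pub pages: usize,
}

impl SearchIndexer for FakeIndexer {
    fn index(&self, request: &IndexRequest) -> Result<IndexSummary, String> {
        // A real adapter would read rendered HTML and write the bundle here.
        if !request.bundle_dir.starts_with(&request.html_root) {
            return Err("bundle directory must live under the output root".to_string());
        }
        Ok(IndexSummary { pages_indexed: self.pages })
    }
}

/// Pipeline-side caller: builds the request and delegates through the trait.
pub fn run_search_step(indexer: &dyn SearchIndexer, output: &Path) -> Result<IndexSummary, String> {
    let request = IndexRequest {
        html_root: output.to_path_buf(),
        bundle_dir: output.join("pagefind"),
    };
    indexer.index(&request)
}
```

With this shape, swapping or removing the search backend would touch one adapter rather than the pipeline code that calls `run_search_step`.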
Closes #5