Hosting Operations
Find HTML Pages Missing from the Sitemap
Compare the HTML files generated under public/ against the URLs listed in public/sitemap.xml.
Command
find public -name '*.html' -print | sed 's#^public#https://example.com#' | while read -r url; do grep -q "$url" public/sitemap.xml || echo "$url"; done
What changed
Nothing changes on disk. The command rewrites each file path under public/ to its https://example.com URL and prints every URL that grep does not find in public/sitemap.xml.
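The one-liner can be read as three stages: list files, map paths to URLs, check each URL against the sitemap. The sketch below spells those stages out as a function. The name list_missing_urls and its three arguments are hypothetical, chosen for illustration. It also swaps in a stricter check than the one-liner: grep -F matches the URL as literal text rather than as a regular expression, and the <loc> wrapper avoids substring hits. That wrapper assumes each sitemap entry is written as <loc>URL</loc> with no whitespace inside the tags.

```shell
#!/bin/sh
# Sketch only: list_missing_urls is a hypothetical helper, not part of any tool.
# Usage: list_missing_urls BUILD_DIR BASE_URL SITEMAP
list_missing_urls() {
  build_dir=$1   # e.g. public
  base_url=$2    # e.g. https://example.com (no trailing slash)
  sitemap=$3     # e.g. public/sitemap.xml

  # Stage 1: find lists every generated HTML file under the build directory.
  # Stage 2: sed swaps the leading build directory for the base URL.
  #          Assumes neither value contains '#', '&', or a backslash.
  # Stage 3: the loop prints each URL with no exact <loc> entry in the sitemap.
  #          -F makes grep treat the URL as literal text, so '.' is not a wildcard.
  find "$build_dir" -name '*.html' -print |
    sed "s#^$build_dir#$base_url#" |
    while IFS= read -r url; do
      grep -qF -- "<loc>$url</loc>" "$sitemap" || printf '%s\n' "$url"
    done
}
```

Called as list_missing_urls public https://example.com public/sitemap.xml, it reads the same inputs as the command above and is equally read-only.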
Danger
safe
When to use it
Use when new routes are not appearing in sitemap output.
When not to use it
Do not treat every result as an error; drafts and private pages may be intentionally omitted.
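One way to keep intentional omissions out of the report is to filter the results through a list of URLs that are expected to be absent. The sketch below assumes such a list exists as a plain text file with one URL per line. Both the file and the function name drop_expected_omissions are hypothetical.

```shell
#!/bin/sh
# Sketch only: drop_expected_omissions and the ignore file are hypothetical.
# Reads URLs on stdin and prints those NOT listed in the ignore file.
drop_expected_omissions() {
  ignore_file=$1   # one URL per line, e.g. drafts and private pages
  # -v inverts the match, -x requires a whole-line match, -F treats each
  # pattern as literal text, -f reads the patterns from the file.
  # grep exits 1 when nothing is left to print; treat that as success.
  grep -vxFf "$ignore_file" || [ "$?" -eq 1 ]
}
```

Piping the command's output into drop_expected_omissions leaves only the URLs that still need a decision.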
Undo or recovery
No undo needed because this command is read-only.
Expected output
One URL per line for each generated page whose URL does not appear in public/sitemap.xml. No output means every page was found.
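A page such as public/index.html can be reported even though it is covered, if the sitemap lists the directory URL (https://example.com/) rather than the file name. Whether that applies depends on how the site is served, so treat it as an assumption. The sketch below shows a path-to-URL mapping that also rewrites a trailing /index.html to /. The function name path_to_url is hypothetical.

```shell
#!/bin/sh
# Sketch only: path_to_url is a hypothetical helper.
# Reads file paths on stdin and prints the URL each one would be served at,
# assuming the server exposes DIR/index.html as DIR/.
path_to_url() {
  build_dir=$1   # e.g. public
  base_url=$2    # e.g. https://example.com (no trailing slash)
  # First expression: swap the build directory prefix for the base URL.
  # Second expression: collapse a trailing /index.html to a bare slash.
  # Assumes neither argument contains '#', '&', or a backslash.
  sed -e "s#^$build_dir#$base_url#" -e 's#/index\.html$#/#'
}
```

It can stand in for the sed stage of the command, for example find public -name '*.html' -print | path_to_url public https://example.com.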
demo script
Disposable terminal steps
find public -name '*.html' -print
find public -name '*.html' -print | sed 's#^public#https://example.com#' | while read -r url; do grep -q "$url" public/sitemap.xml || echo "$url"; done
simulated output
What it looks like
::fixture-ready::
$ find public -name '*.html' -print
public/about.html
public/draft.html
public/blog/post.html
public/index.html
::exit-code::0
$ find public -name '*.html' -print | sed 's#^public#https://example.com#' | while read -r url; do grep -q "$url" public/sitemap.xml || echo "$url"; done
https://example.com/draft.html
https://example.com/index.html
::exit-code::0
YouTube Short
Which pages missed the sitemap?
Map generated HTML paths to public URLs, then print any URL missing from sitemap.xml.
LinkedIn hook
A page can exist in the build but never make it into the sitemap.
Question: Do you compare generated pages against sitemap output?
experiments
A/B tests to run
Metric: save_rate
A: Built but not in sitemap.
B: Compare files to sitemap URLs.