Find Broken Internal Links in Built HTML

You need to list root-relative href targets in built HTML that have no matching file or directory in the static build.

Command

grep -Rho --include='*.html' 'href="/[^"]*"' public | sed 's#href="##;s#"##' | while read -r path; do test -e "public${path}" || echo "$path"; done | sort -u

Before you run this

System impact: Read-only. Can create noticeable disk I/O on large build directories.

Expected output

Root-relative paths linked from HTML that do not exist in the public directory, one per line, sorted and de-duplicated. No output means every extracted target exists.

System impact

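Read-only; nothing changes. The command extracts root-relative href values from every .html file under public and checks whether each target path exists locally. It can be slow on large builds because every link occurrence gets its own existence check.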
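Read stage by stage, the pipeline is easier to audit. The sketch below wraps the same four stages in a function. The name find_broken_links and the optional root argument are illustrative assumptions, not part of the original command.

```shell
#!/bin/sh
# find_broken_links is a hypothetical helper name; the root defaults to "public".
find_broken_links() {
  root="${1:-public}"
  # 1. Print every double-quoted, root-relative href found in *.html files.
  grep -Rho --include='*.html' 'href="/[^"]*"' "$root" |
    # 2. Strip the leading href=" and the closing quote, leaving only the path.
    sed 's#href="##;s#"##' |
    # 3. Echo the path only when nothing exists at root + path.
    while read -r path; do
      test -e "${root}${path}" || echo "$path"
    done |
    # 4. Sort and de-duplicate the report.
    sort -u
}
```

Called with no argument, it should behave like the one-liner run from the project root.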

On busy systems, narrow the grep search path to the smallest useful subdirectory (for example public/blog) and leave the public${path} existence check unchanged.

Recovery / rollback: no state is changed.

When to use it

Use before deploys, after URL changes, or after moving content between sections.
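For the pre-deploy case, the report can be turned into a pass/fail gate. This is a minimal sketch, assuming a POSIX shell; check_links is a hypothetical wrapper name and the root argument is an assumption added for illustration.

```shell
#!/bin/sh
# check_links is a hypothetical wrapper name; the root defaults to "public".
check_links() {
  root="${1:-public}"
  # Collect the same report the one-liner prints.
  broken=$(
    grep -Rho --include='*.html' 'href="/[^"]*"' "$root" |
      sed 's#href="##;s#"##' |
      while read -r path; do
        test -e "${root}${path}" || echo "$path"
      done |
      sort -u
  )
  # An empty report means every root-relative target exists.
  if [ -n "$broken" ]; then
    echo "Broken internal links:" >&2
    echo "$broken" >&2
    return 1
  fi
  return 0
}
```

In a deploy script, a line such as `check_links public || exit 1` would stop the run before anything is published.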

When not to use it

Do not use it for JavaScript-routed apps or remote URLs without adapting the path logic.
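One small adaptation is to normalize each link before the existence check. The original pipeline tests every href verbatim, so a link such as /page.html#top or /page.html?v=2 is reported as missing even when page.html exists, and a protocol-relative URL starting with // is treated as a local path. The sketch below handles those three cases only; the function name is hypothetical.

```shell
#!/bin/sh
# find_broken_links_normalized is a hypothetical name; the root defaults to "public".
find_broken_links_normalized() {
  root="${1:-public}"
  grep -Rho --include='*.html' 'href="/[^"]*"' "$root" |
    sed 's#href="##;s#"##' |
    # Skip protocol-relative URLs such as //cdn.example.com/app.css.
    grep -v '^//' |
    # Drop everything from the first # or ? so only the file path remains.
    sed 's/[#?].*$//' |
    while read -r path; do
      test -e "${root}${path}" || echo "$path"
    done |
    sort -u
}
```

This still does not cover JavaScript-routed apps, where a path can be valid without any file on disk.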

Command transcript

This sanitized transcript shows the commands and output shape without exposing host details.

$ grep -Rho --include='*.html' 'href="/[^"]*"' public | sed 's#href="##;s#"##' | sort -u

/
/assets/site.css
/blog/post.html
/missing.html

$ grep -Rho --include='*.html' 'href="/[^"]*"' public | sed 's#href="##;s#"##' | while read -r path; do test -e "public${path}" || echo "$path"; done | sort -u

/missing.html
