
Hosting Operations

Find Top 404 URLs

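You need to see which URLs are producing the most 404 responses. The command assumes nginx's default combined log format, in which field 7 is the request path and field 9 is the status code.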

Command

awk '$9==404 {print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head

What changed

Nothing changes. The command only reads the access log and counts how often each missing path was requested.
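
The pipeline can be read one stage at a time. The following is a minimal sketch rather than a definitive tool: top_404 is a hypothetical helper name, the optional second argument (how many rows to show) is an addition for illustration, and the field positions assume nginx's default combined log format.

```shell
# top_404 is a hypothetical helper name, not part of nginx.
# Assumes the default "combined" log format: $7 = request path, $9 = status code.
# Usage: top_404 [LOGFILE] [LIMIT]
top_404() {
  awk '$9 == 404 { print $7 }' "${1:-/var/log/nginx/access.log}" |  # path of each 404 request
    sort |                  # put identical paths next to each other
    uniq -c |               # collapse each run into "count path"
    sort -nr |              # highest count first
    head -n "${2:-10}"      # keep the top rows (plain head also shows 10)
}
```

Calling top_404 with no arguments should behave like the command above; top_404 /var/log/nginx/access.log 3 would show only the three most frequent paths.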

Danger

safe

When to use it

Use this after deploys, migrations, or SEO cleanup, when renamed or removed paths are most likely to leave broken links behind.

When not to use it

Do not assume every 404 is a bug; scanners and bots create noise.
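
One rough way to separate scanner noise from real broken links is to look at who is asking. The sketch below is a heuristic, not a bot detector: both function names are hypothetical, and it assumes the default combined log format, where field 1 is the client address. A single address with many 404s across unrelated paths often points to a scanner, while a path requested by several distinct clients is more likely a genuine missing resource.

```shell
# Both helper names are hypothetical. Assumes the default "combined" log format:
# $1 = client address, $7 = request path, $9 = status code.

# Rank client addresses by the number of 404 responses they received.
clients_by_404() {
  awk '$9 == 404 { print $1 }' "${1:-/var/log/nginx/access.log}" |
    sort | uniq -c | sort -nr | head
}

# Rank 404 paths by how many distinct clients requested them.
paths_404_by_distinct_clients() {
  awk '$9 == 404 { print $7, $1 }' "${1:-/var/log/nginx/access.log}" |
    sort -u |               # one line per (path, client) pair
    awk '{ print $1 }' |    # drop the client, keep the path
    sort | uniq -c | sort -nr | head
}
```

If the server sits behind a proxy or load balancer, field 1 may hold the proxy's address rather than the real client, so the grouping would need a different field.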

Undo or recovery

No state is changed.

Expected output

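A ranked list of up to ten 404 paths, each prefixed with its request count, most frequent first. Paths that differ only in their query string are counted separately.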

demo script

Disposable terminal steps

  1. awk '{print $9, $7}' /var/log/nginx/access.log
  2. awk '$9==404 {print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head
  3. awk '$9 ~ /^5/ {print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head
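
Steps 2 and 3 differ only in the status filter, so the same pipeline can take the status as a parameter. This is a sketch under the same combined-format assumption; top_paths_by_status is a hypothetical name, and its first argument is treated as a prefix of the status code.

```shell
# top_paths_by_status is a hypothetical helper name.
# Assumes the default "combined" log format: $7 = request path, $9 = status code.
# Usage: top_paths_by_status STATUS_PREFIX [LOGFILE]
#   "404" matches only 404; "5" matches every 5xx status.
top_paths_by_status() {
  awk -v prefix="$1" 'index($9, prefix) == 1 { print $7 }' \
    "${2:-/var/log/nginx/access.log}" |
    sort | uniq -c | sort -nr | head
}
```

Here top_paths_by_status 404 corresponds to step 2 and top_paths_by_status 5 to step 3.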

simulated output

What it looks like

disposable vessel
::fixture-ready::
$ awk '{print $9, $7}' /var/log/nginx/access.log
200 /
404 /missing.css
502 /api
::exit-code::0
$ awk '$9==404 {print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head
      1 /missing.css
::exit-code::0
$ awk '$9 ~ /^5/ {print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head
      1 /api
::exit-code::0

YouTube Short

Find the URLs throwing 404s.

One missing asset can show up again and again. Count 404 paths before guessing.

LinkedIn hook

The missing file was not random. The access log had a pattern.

Question: Do you check 404s after deploys?

experiments

A/B tests to run

Metric: search_click_rate

A: Find top 404 URLs.

B: The access log had a pattern.