Cybersecurity Triage
Read-only
Find Paths Repeatedly Returning 404
You need to identify missing paths that are being requested repeatedly.
Command
awk '$9==404 {count[$7]++} END {for (path in count) if (count[path] >= 3) print count[path], path}' ./fixtures/nginx/access.log | sort -nr | head
Before you run this
System impact: Read-only. Risk is low when the command is scoped to the log file shown.
When not to use it: Do not treat every repeated 404 as malicious; it may be a stale link, deployment mistake, or crawler behavior.
Expected output
A descending list of 404 counts followed by requested paths.
System impact
Read-only. Nothing changes. The command counts 404 responses per requested path and hides any path with fewer than three.
Recovery / rollback: no state is changed.
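The cutoff of 3 is hard-coded in the command. A minimal sketch of making it adjustable, assuming the log uses nginx's combined format: the sample log lines, the temporary file, and the `threshold` variable are made up for illustration and are not part of the original fixture.

```shell
# Hypothetical sample log in nginx combined format (not the real fixture).
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.44 - - [10/Oct/2024:13:55:01 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2024:13:55:02 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2024:13:55:03 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
198.51.100.7 - - [10/Oct/2024:13:56:10 +0000] "GET /old-page HTTP/1.1" 404 153 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2024:13:56:40 +0000] "GET /old-page HTTP/1.1" 404 153 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2024:13:57:05 +0000] "GET /favicon.ico HTTP/1.1" 404 153 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2024:13:57:06 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0"
EOF

# Assumed variable name; paths with fewer 404s than this are hidden.
threshold=2

# Same counting logic as the main command, with the cutoff passed in via -v.
result=$(awk -v threshold="$threshold" '$9==404 {count[$7]++} END {for (path in count) if (count[path] >= threshold) print count[path], path}' "$log" | sort -nr | head)

printf '%s\n' "$result"

# Remove the temporary sample file; the real log is never touched.
rm -f "$log"
```

On a busy log, raising the threshold cuts noise; setting it to 1 lists every missing path.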
When to use it
Use this to distinguish harmless broken links from suspicious or noisy repeated requests.
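One way to make that distinction, sketched here under the assumption that the log uses nginx's combined format (which records the referer in a quoted field), is to group 404s by path and referer. The sample lines and URLs below are hypothetical. The referer is supplied by the client, so treat it as a hint rather than proof.

```shell
# Hypothetical sample log in nginx combined format (not the real fixture).
log=$(mktemp)
cat > "$log" <<'EOF'
198.51.100.7 - - [10/Oct/2024:13:55:01 +0000] "GET /old-page HTTP/1.1" 404 153 "https://example.com/blog" "Mozilla/5.0"
198.51.100.8 - - [10/Oct/2024:13:55:09 +0000] "GET /old-page HTTP/1.1" 404 153 "https://example.com/blog" "Mozilla/5.0"
198.51.100.9 - - [10/Oct/2024:13:55:20 +0000] "GET /old-page HTTP/1.1" 404 153 "https://example.com/blog" "Mozilla/5.0"
203.0.113.44 - - [10/Oct/2024:13:56:01 +0000] "GET /wp-login.php HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2024:13:56:02 +0000] "GET /wp-login.php HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2024:13:56:03 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
EOF

# Split on double quotes: $2 is the request line, $3 holds status and bytes,
# $4 is the referer. This field layout is an assumption about the log format.
referers=$(awk -F'"' '{
  split($3, s, " ")            # s[1] = status code
  if (s[1] == 404) {
    split($2, r, " ")          # r[2] = requested path
    count[r[2] " " $4]++
  }
} END {for (k in count) print count[k], k}' "$log" | sort -nr | head)

printf '%s\n' "$referers"
rm -f "$log"
```

A missing path whose referer is one of your own pages often points to a stale link. A path with no referer ("-") may be a probe, but direct visits and many crawlers also send none.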
Watch this command run
Command transcript
This sanitized transcript shows the commands and output shape without exposing host details.
$ awk '$9==404 {print $1, $7}' ./fixtures/nginx/access.log | head
203.0.113.44 /missing
203.0.113.44 /missing
203.0.113.44 /missing
203.0.113.44 /wp-login.php
203.0.113.44 /wp-admin
$ awk '$9==404 {count[$7]++} END {for (path in count) if (count[path] >= 3) print count[path], path}' ./fixtures/nginx/access.log | sort -nr | head
3 /missing
$ awk '$9==404 {count[$1]++} END {for (ip in count) print count[ip], ip}' ./fixtures/nginx/access.log | sort -nr | head
5 203.0.113.44
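In the transcript, one address accounts for all five 404s across three different paths. A rough sketch of surfacing that pattern directly is to count how many distinct missing paths each client requested. The sample log below is hypothetical: it mirrors the transcript's 404 lines and adds one made-up second client for contrast.

```shell
# Hypothetical sample log (not the real fixture); second client is invented.
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.44 - - [10/Oct/2024:13:55:01 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2024:13:55:02 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2024:13:55:03 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2024:13:55:04 +0000] "GET /wp-login.php HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2024:13:55:05 +0000] "GET /wp-admin HTTP/1.1" 404 153 "-" "curl/8.0"
198.51.100.7 - - [10/Oct/2024:13:56:10 +0000] "GET /old-page HTTP/1.1" 404 153 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2024:13:56:40 +0000] "GET /old-page HTTP/1.1" 404 153 "-" "Mozilla/5.0"
EOF

# seen[ip, path] is zero the first time a pair appears, so each distinct
# missing path is counted once per client.
distinct=$(awk '$9==404 && !seen[$1, $7]++ {paths[$1]++} END {for (ip in paths) print paths[ip], ip}' "$log" | sort -nr | head)

printf '%s\n' "$distinct"
rm -f "$log"
```

A client missing on many different paths looks more like scanning than a client retrying one broken link, though a crawler following stale links can produce the same shape.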
Commands shown
These are the commands shown in the sanitized transcript.
awk '$9==404 {print $1, $7}' ./fixtures/nginx/access.log | head
awk '$9==404 {count[$7]++} END {for (path in count) if (count[path] >= 3) print count[path], path}' ./fixtures/nginx/access.log | sort -nr | head
awk '$9==404 {count[$1]++} END {for (ip in count) print count[ip], ip}' ./fixtures/nginx/access.log | sort -nr | head
Next steps
Related commands
Find Clients Repeating the Same Path
The suspicious pattern is sometimes one client hammering one URL.
awk '{key=$1 " " $7; count[key]++} END {for (k in count) if (count[k] >= 5) print count[k], k}' ./fixtures/nginx/access.log | sort -nr | head
Find the IPs Creating the Most 4xx Noise
One address can turn a normal access log into a wall of failed requests.
awk '$9 ~ /^4/ {count[$1]++} END {for (ip in count) print count[ip], ip}' ./fixtures/nginx/access.log | sort -nr | head
Spot Request Bursts by Minute
Traffic spikes are easier to read when you bucket them by time.
awk '{minute=substr($4,2,17); count[minute]++} END {for (m in count) print count[m], m}' ./fixtures/nginx/access.log | sort -nr | head
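To see what the bucket key looks like, here is a small sketch that runs the same `substr` call on a single line. The log line is hypothetical and assumes the default nginx timestamp layout, where field 4 starts with `[` followed by a two-digit day.

```shell
# Hypothetical log line in nginx combined format.
line='203.0.113.44 - - [10/Oct/2024:13:55:36 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"'

# Field 4 is "[10/Oct/2024:13:55:36"; skipping the bracket and taking
# 17 characters keeps the date, hour, and minute and drops the seconds.
minute=$(printf '%s\n' "$line" | awk '{print substr($4,2,17)}')

printf '%s\n' "$minute"
```

For this line the key is 10/Oct/2024:13:55. Because the full command sorts by count, the busiest minutes come first rather than appearing in chronological order.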
Group Server Errors by URL Path
A 500 spike is easier to triage when the broken path is obvious.
awk '$9 ~ /^5/ {count[$7]++} END {for (path in count) print count[path], path}' ./fixtures/nginx/access.log | sort -nr | head
Find Common Admin Probe Paths
A site does not need WordPress to receive WordPress-looking probes.
awk '$7 ~ /(admin|login|wp-|phpmyadmin)/ {print $1, $7, $9}' ./fixtures/nginx/access.log | sort | uniq -c | sort -nr | head
Study mapping
Use this as independent command practice: read the notes, predict the output, then compare it with the example before using a real shell.
Useful for
- LPIC-1 style command-line practice
- LFCS style performance tasks
- Linux+ style troubleshooting review
Independent study support only. No affiliation, endorsement, exam dumps, or real exam questions.