Cybersecurity Triage
Read-only
Find the IPs Creating the Most 4xx Noise
You need to identify which client IPs are generating the most client-side errors in a web access log.
Command
awk '$9 ~ /^4/ {count[$1]++} END {for (ip in count) print count[ip], ip}' ./fixtures/nginx/access.log | sort -nr | head
Before you run this
System impact: Read-only. Low risk when scoped to the log file shown.
Expected output
Up to ten lines, each showing a 4xx response count followed by the client IP address, sorted from the highest count to the lowest.
System impact
Read-only. Nothing changes. The command counts matching log lines and prints the busiest 4xx sources.
Recovery / rollback: no state is changed.
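If the fixture file is not available, the counting step can be tried against a throwaway log. This is a minimal sketch, assuming the combined log format (client IP in field 1, status code in field 9); the sample lines and the `top_4xx` variable are made up for illustration.

```shell
# Build a tiny throwaway log (hypothetical sample data, combined log format).
log="$(mktemp)"
cat > "$log" <<'EOF'
203.0.113.44 - - [10/Oct/2023:13:55:01 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2023:13:55:02 +0000] "GET /wp-login.php HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.45 - - [10/Oct/2023:13:55:03 +0000] "GET /admin HTTP/1.1" 403 153 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2023:13:55:04 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"
EOF

# Same counting logic as the main command, pointed at the temp file.
top_4xx="$(awk '$9 ~ /^4/ {count[$1]++} END {for (ip in count) print count[ip], ip}' "$log" | sort -nr | head)"
printf '%s\n' "$top_4xx"

# Remove the temp file; nothing else on the system was touched.
rm -f "$log"
```

The 200 response from 198.51.100.7 is ignored, so only the two addresses with 4xx responses appear in the output.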
When to use it
Use this during defensive web-log triage when you want to separate normal missing pages from noisy clients.
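One way to tell a noisy client from a busy but normal one is to compare its 4xx count with its total request count. The sketch below is an assumption-laden variant, not part of the original command: the sample data and the `share` variable are hypothetical, and the combined log format is assumed.

```shell
# Hypothetical sample data: one client that only produces 404s,
# one client with mostly successful requests and a single 404.
log="$(mktemp)"
cat > "$log" <<'EOF'
203.0.113.44 - - [10/Oct/2023:13:55:01 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2023:13:55:02 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2023:13:55:03 +0000] "GET /wp-admin HTTP/1.1" 404 153 "-" "curl/8.0"
198.51.100.7 - - [10/Oct/2023:13:55:04 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2023:13:55:05 +0000] "GET /about HTTP/1.1" 200 512 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2023:13:55:06 +0000] "GET /contact HTTP/1.1" 200 512 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2023:13:55:07 +0000] "GET /old-page HTTP/1.1" 404 153 "-" "Mozilla/5.0"
EOF

# Print: 4xx count, total requests, 4xx share, client IP.
share="$(awk '
  { total[$1]++ }                 # every request counts toward the total
  $9 ~ /^4/ { errs[$1]++ }        # only 4xx responses count as errors
  END {
    for (ip in errs)
      printf "%d %d %.0f%% %s\n", errs[ip], total[ip], 100 * errs[ip] / total[ip], ip
  }
' "$log" | sort -nr | head)"
printf '%s\n' "$share"

rm -f "$log"
```

A client whose requests are all errors stands out more clearly than one with a single stale link among otherwise successful requests.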
When not to use it
Do not block an IP from this count alone; review paths, time windows, and business context first.
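Reviewing paths and time windows for a single address can be sketched as follows. The `ip` value, the sample lines, and the `detail` variable are hypothetical; the minute bucket reuses the `substr($4,2,17)` idea from the burst command on this page and assumes the combined log format.

```shell
# Hypothetical sample data for the drill-down.
log="$(mktemp)"
cat > "$log" <<'EOF'
203.0.113.44 - - [10/Oct/2023:13:55:01 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2023:13:55:02 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2023:13:55:03 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2023:13:56:10 +0000] "GET /wp-login.php HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.45 - - [10/Oct/2023:13:56:11 +0000] "GET /admin HTTP/1.1" 403 153 "-" "Mozilla/5.0"
EOF

ip="203.0.113.44"   # hypothetical address taken from the count output

# For that one client, count 4xx responses per minute, status, and path.
detail="$(awk -v ip="$ip" '
  $1 == ip && $9 ~ /^4/ {
    key = substr($4, 2, 17) " " $9 " " $7   # e.g. "10/Oct/2023:13:55 404 /missing"
    count[key]++
  }
  END { for (k in count) print count[k], k }
' "$log" | sort -nr)"
printf '%s\n' "$detail"

rm -f "$log"
```

Each output line reads as count, minute, status, path, which shows whether the errors are one stale link hit repeatedly or a spread of probe paths.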
Command transcript
This sanitized transcript shows the commands and output shape without exposing host details.
$ awk '$9 ~ /^4/ {print $1, $7, $9}' ./sample-files/nginx/access.log | head
203.0.113.44 /missing 404
203.0.113.44 /missing 404
203.0.113.44 /missing 404
203.0.113.44 /wp-login.php 404
203.0.113.44 /wp-admin 404
203.0.113.45 /admin 403
203.0.113.45 /login 403
203.0.113.46 /api/profile 405
203.0.113.46 /api/profile 405
$ awk '$9 ~ /^4/ {count[$1]++} END {for (ip in count) print count[ip], ip}' ./sample-files/nginx/access.log | sort -nr | head
5 203.0.113.44
2 203.0.113.46
2 203.0.113.45
$ awk '$9 ~ /^4/ {print $1, $7, $9}' ./sample-files/nginx/access.log
203.0.113.44 /missing 404
203.0.113.44 /missing 404
203.0.113.44 /missing 404
203.0.113.44 /wp-login.php 404
203.0.113.44 /wp-admin 404
203.0.113.45 /admin 403
203.0.113.45 /login 403
203.0.113.46 /api/profile 405
203.0.113.46 /api/profile 405
Commands shown
These are the commands shown in the sanitized transcript.
awk '$9 ~ /^4/ {print $1, $7, $9}' ./fixtures/nginx/access.log | head
awk '$9 ~ /^4/ {count[$1]++} END {for (ip in count) print count[ip], ip}' ./fixtures/nginx/access.log | sort -nr | head
awk '$9 ~ /^4/ {print $1, $7, $9}' ./fixtures/nginx/access.log
Next steps
Related commands
Find Paths Repeatedly Returning 404
One missing URL is normal. A repeated missing URL is a signal.
awk '$9==404 {count[$7]++} END {for (path in count) if (count[path] >= 3) print count[path], path}' ./fixtures/nginx/access.log | sort -nr | head
Find Clients Repeating the Same Path
The suspicious pattern is sometimes one client hammering one URL.
awk '{key=$1 " " $7; count[key]++} END {for (k in count) if (count[k] >= 5) print count[k], k}' ./fixtures/nginx/access.log | sort -nr | head
Spot Request Bursts by Minute
Traffic spikes are easier to read when you bucket them by time.
awk '{minute=substr($4,2,17); count[minute]++} END {for (m in count) print count[m], m}' ./fixtures/nginx/access.log | sort -nr | head
Find Common Admin Probe Paths
A site does not need WordPress to receive WordPress-looking probes.
awk '$7 ~ /(admin|login|wp-|phpmyadmin)/ {print $1, $7, $9}' ./fixtures/nginx/access.log | sort | uniq -c | sort -nr | head
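The pattern above is case-sensitive, so a path such as /phpMyAdmin/ would not match `phpmyadmin`. A possible variant, sketched here with hypothetical sample data and a hypothetical `probes` variable, lowercases the path with awk's `tolower()` before matching and does the counting inside awk.

```shell
# Hypothetical sample data with mixed-case probe paths.
log="$(mktemp)"
cat > "$log" <<'EOF'
203.0.113.44 - - [10/Oct/2023:13:55:01 +0000] "GET /phpMyAdmin/ HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2023:13:55:02 +0000] "GET /phpMyAdmin/ HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2023:13:55:03 +0000] "GET /WP-Login.php HTTP/1.1" 404 153 "-" "curl/8.0"
198.51.100.7 - - [10/Oct/2023:13:55:04 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"
EOF

# Match on a lowercased copy of the path, but report the path as logged.
probes="$(awk '
  tolower($7) ~ /(admin|login|wp-|phpmyadmin)/ {
    key = $1 " " $7 " " $9
    count[key]++
  }
  END { for (k in count) print count[k], k }
' "$log" | sort -nr | head)"
printf '%s\n' "$probes"

rm -f "$log"
```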
Count the Most Common User Agents
A strange traffic spike often has a strange user agent.
awk -F'"' '{print $6}' ./fixtures/nginx/access.log | sort | uniq -c | sort -nr | head
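To tie user agents back to the 4xx triage, the same quote-delimited split can be limited to error responses. This is a sketch under assumptions: with `-F'"'` and the combined log format, field 3 holds the status and byte count and field 6 holds the user agent. The sample lines and the `agents` variable are hypothetical.

```shell
# Hypothetical sample data: two 404s from curl, one 403 and one 200 from a browser.
log="$(mktemp)"
cat > "$log" <<'EOF'
203.0.113.44 - - [10/Oct/2023:13:55:01 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.44 - - [10/Oct/2023:13:55:02 +0000] "GET /wp-admin HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.45 - - [10/Oct/2023:13:55:03 +0000] "GET /admin HTTP/1.1" 403 153 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2023:13:55:04 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"
EOF

# Split on double quotes; $3 looks like " 404 153 ", $6 is the user agent.
agents="$(awk -F'"' '
  { split($3, f, " ") }            # f[1] is the status code
  f[1] ~ /^4/ { count[$6]++ }      # count user agents on 4xx responses only
  END { for (ua in count) print count[ua], ua }
' "$log" | sort -nr | head)"
printf '%s\n' "$agents"

rm -f "$log"
```

The successful browser request is excluded, so the counts reflect only the agents behind error responses.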
Study mapping
Use this as independent command practice: read the notes, predict the output, then compare it with the example before using a real shell.
Useful for
- LPIC-1 style command-line practice
- LFCS style performance tasks
- Linux+ style troubleshooting review
Independent study support only. No affiliation, endorsement, exam dumps, or real exam questions.