Cybersecurity Triage
Find Clients Repeating the Same Path
You need to find IP and path pairs that appear repeatedly in a web access log.
Command
awk '{key=$1 " " $7; count[key]++} END {for (k in count) if (count[k] >= 5) print count[k], k}' ./fixtures/nginx/access.log | sort -nr | head
What changed
Nothing changes. The command counts repeated source-IP and path combinations.
Danger
safe
When to use it
Use this when looking for repeated polling, scraping, broken clients, or suspicious request loops.
When not to use it
Do not use this alone to decide intent; repeated requests can come from health checks, retries, or caches.
Undo or recovery
No undo needed because the command is read-only.
Expected output
Counts followed by source IP and requested path.
demo script
Disposable terminal steps
awk '{print $1, $7}' ./fixtures/nginx/access.log | headawk '{key=$1 " " $7; count[key]++} END {for (k in count) if (count[k] >= 5) print count[k], k}' ./fixtures/nginx/access.log | sort -nr | headawk '{count[$1]++} END {for (ip in count) print count[ip], ip}' ./fixtures/nginx/access.log | sort -nr | head
simulated output
What it looks like
::fixture-ready::
$ awk '{print $1, $7}' ./fixtures/nginx/access.log | head
198.51.100.10 /
198.51.100.11 /docs
198.51.100.12 /api/search
203.0.113.44 /missing
203.0.113.44 /missing
203.0.113.44 /missing
203.0.113.44 /wp-login.php
203.0.113.44 /wp-admin
203.0.113.45 /admin
203.0.113.45 /login
::exit-code::0
$ awk '{key=$1 " " $7; count[key]++} END {for (k in count) if (count[k] >= 5) print count[k], k}' ./fixtures/nginx/access.log | sort -nr | head
5 198.51.100.30 /health
::exit-code::0
$ awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' ./fixtures/nginx/access.log | sort -nr | head
5 203.0.113.44
5 198.51.100.30
3 198.51.100.25
2 203.0.113.46
2 203.0.113.45
2 198.51.100.24
1 198.51.100.23
1 198.51.100.22
1 198.51.100.21
1 198.51.100.12
::exit-code::0
YouTube Short
Find IP-path repeaters.
Counting requests by IP is useful, but pairing IP with path shows a tighter pattern: one client repeating one URL again and again.
LinkedIn hook
The suspicious pattern is sometimes one client hammering one URL.
Question: Do you group web traffic by IP alone, or by IP plus path?
experiments
A/B tests to run
Metric: youtube_retention_15s
A: One IP hammering one URL is easy to miss.
B: Pair the client with the path.