Cybersecurity Triage

Read-only

Find Clients Repeating the Same Path

You need to find client IP and request path pairs that appear five or more times in a web access log.

Command

awk '{key=$1 " " $7; count[key]++} END {for (k in count) if (count[k] >= 5) print count[k], k}' ./fixtures/nginx/access.log | sort -nr | head

Before you run this

System impact: Read-only. Low impact when scoped to the shown log file.

When not to use it: Do not use this alone to decide intent; repeated requests can come from health checks, retries, or caches.
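One way to reduce that noise is to skip paths you already know are benign before counting. The following is a minimal sketch, not part of the original command: the sample log lines are made up, and the `skip` variable and the choice of `/health` as the benign path are assumptions for illustration.

```shell
# Hypothetical sample log: a health-check poller and a client looping on a missing path.
log=$(mktemp)
printf '198.51.100.30 - - [10/Oct/2025:13:55:00 +0000] "GET %s HTTP/1.1" 200 2 "-" "probe/1.0"\n' \
  /health /health /health /health /health > "$log"
printf '203.0.113.44 - - [10/Oct/2025:13:56:00 +0000] "GET %s HTTP/1.1" 404 153 "-" "curl/8.0"\n' \
  /missing /missing /missing /missing /missing >> "$log"

# Skip the assumed-benign path, then count IP and path pairs as before.
filtered=$(awk -v skip="/health" '$7 != skip {count[$1 " " $7]++} END {for (k in count) if (count[k] >= 5) print count[k], k}' "$log" | sort -nr | head)
echo "$filtered"   # prints: 5 203.0.113.44 /missing
rm -f "$log"
```

Filtering only hides the path from the report. It does not prove the remaining pairs are hostile.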

Expected output

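One line per pair seen at least five times: the count, then the source IP and requested path, sorted with the highest count first. head limits the output to ten lines.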
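To predict the output before touching a real log, you can run the same logic against a throwaway file. This sketch assumes the default nginx combined log format, where whitespace splitting puts the client IP in `$1` and the request path in `$7`. The sample lines are made up for illustration.

```shell
# Hypothetical sample data: five /health requests from one client, one unrelated request.
log=$(mktemp)
printf '198.51.100.30 - - [10/Oct/2025:13:55:00 +0000] "GET %s HTTP/1.1" 200 2 "-" "probe/1.0"\n' \
  /health /health /health /health /health > "$log"
printf '203.0.113.44 - - [10/Oct/2025:13:56:00 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"\n' >> "$log"

# Same counting logic as the command above: $1 is the client IP, $7 the request path.
result=$(awk '{key=$1 " " $7; count[key]++} END {for (k in count) if (count[k] >= 5) print count[k], k}' "$log" | sort -nr | head)
echo "$result"   # prints: 5 198.51.100.30 /health
rm -f "$log"
```

The single `/missing` request falls below the threshold of five, so it does not appear.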

System impact

Read-only. Nothing changes. The command counts repeated source-IP and path combinations.

Recovery / rollback: none needed, because no state is changed.

When to use it

Use this when looking for repeated polling, scraping, broken clients, or suspicious request loops.
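These cases look alike in a plain count. Adding the status code to the key can help separate them, since a steady 200 suggests polling while a repeated 404 suggests a broken client or probe. The sketch below is a variation, not the original command: it assumes the status code is in `$9` (combined log format), uses a hypothetical `min` variable for the threshold, and runs on made-up sample lines.

```shell
# Hypothetical sample log: six successful polls and five failing requests.
log=$(mktemp)
printf '198.51.100.30 - - [10/Oct/2025:13:55:00 +0000] "GET %s HTTP/1.1" 200 2 "-" "probe/1.0"\n' \
  /health /health /health /health /health /health > "$log"
printf '203.0.113.44 - - [10/Oct/2025:13:56:00 +0000] "GET %s HTTP/1.1" 404 153 "-" "curl/8.0"\n' \
  /missing /missing /missing /missing /missing >> "$log"

# Key on IP, path, and status; pass the threshold in with -v instead of hard-coding it.
min=5
with_status=$(awk -v min="$min" '{count[$1 " " $7 " " $9]++} END {for (k in count) if (count[k] >= min) print count[k], k}' "$log" | sort -nr | head)
echo "$with_status"
# prints:
# 6 198.51.100.30 /health 200
# 5 203.0.113.44 /missing 404
rm -f "$log"
```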

Command transcript

This sanitized transcript shows the commands and output shape without exposing host details.

$ awk '{print $1, $7}' ./sample-files/nginx/access.log | head

198.51.100.10 /
198.51.100.11 /docs
198.51.100.12 /api/search
203.0.113.44 /missing
203.0.113.44 /missing
203.0.113.44 /missing
203.0.113.44 /wp-login.php
203.0.113.44 /wp-admin
203.0.113.45 /admin
203.0.113.45 /login

$ awk '{key=$1 " " $7; count[key]++} END {for (k in count) if (count[k] >= 5) print count[k], k}' ./sample-files/nginx/access.log | sort -nr | head

5 198.51.100.30 /health

$ awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' ./sample-files/nginx/access.log | sort -nr | head

5 203.0.113.44
5 198.51.100.30
3 198.51.100.25
2 203.0.113.46
2 203.0.113.45
2 198.51.100.24
1 198.51.100.23
1 198.51.100.22
1 198.51.100.21
1 198.51.100.12
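In the transcript, 203.0.113.44 made five requests yet is absent from the pair count, because no single path reached five. A complementary view is to count distinct paths per IP, which surfaces clients that spread requests across many URLs. This is a minimal sketch on made-up sample lines, not one of the commands from the transcript. The `seen` and `paths` array names are illustrative.

```shell
# Hypothetical sample log: one client repeating a path, one client trying several paths.
log=$(mktemp)
printf '198.51.100.30 - - [10/Oct/2025:13:55:00 +0000] "GET %s HTTP/1.1" 200 2 "-" "probe/1.0"\n' \
  /health /health /health /health /health > "$log"
printf '203.0.113.44 - - [10/Oct/2025:13:56:00 +0000] "GET %s HTTP/1.1" 404 153 "-" "curl/8.0"\n' \
  /missing /missing /missing /wp-login.php /wp-admin >> "$log"

# Count each IP and path pair only the first time it is seen, then report paths per IP.
distinct=$(awk '!seen[$1 " " $7]++ {paths[$1]++} END {for (ip in paths) print paths[ip], ip}' "$log" | sort -nr | head)
echo "$distinct"
# prints:
# 3 203.0.113.44
# 1 198.51.100.30
rm -f "$log"
```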

Next steps

Related commands

Cybersecurity Triage Read-only

Find Paths Repeatedly Returning 404

One missing URL is normal. A repeated missing URL is a signal.

awk '$9==404 {count[$7]++} END {for (path in count) if (count[path] >= 3) print count[path], path}' ./fixtures/nginx/access.log | sort -nr | head

Cybersecurity Triage Read-only

Find the IPs Creating the Most 4xx Noise

One address can turn a normal access log into a wall of failed requests.

awk '$9 ~ /^4/ {count[$1]++} END {for (ip in count) print count[ip], ip}' ./fixtures/nginx/access.log | sort -nr | head

Cybersecurity Triage Read-only

Spot Request Bursts by Minute

Traffic spikes are easier to read when you bucket them by time.

awk '{minute=substr($4,2,17); count[minute]++} END {for (m in count) print count[m], m}' ./fixtures/nginx/access.log | sort -nr | head

Cybersecurity Triage Read-only

Find Common Admin Probe Paths

A site does not need WordPress to receive WordPress-looking probes.

awk '$7 ~ /(admin|login|wp-|phpmyadmin)/ {print $1, $7, $9}' ./fixtures/nginx/access.log | sort | uniq -c | sort -nr | head

Cybersecurity Triage Read-only

Count the Most Common User Agents

A strange traffic spike often has a strange user agent.

awk -F'"' '{print $6}' ./fixtures/nginx/access.log | sort | uniq -c | sort -nr | head

Study mapping

Use this as independent command practice: read the notes, predict the output, then compare it with the example before using a real shell.

  • lpic1:103-gnu-unix-commands
  • lpic1:110-security
  • lfcs:essential-commands
  • lfcs:security-hygiene
  • linuxplus:automation-scripting
  • linuxplus:provisional
  • risk:read-only

Useful for

  • LPIC-1 style command-line practice
  • LFCS style performance tasks
  • Linux+ style troubleshooting review

Independent study support only. No affiliation, endorsement, exam dumps, or real exam questions.