Your Site Is Not Down. DNS Might Be Lying.
The browser said the site was gone. The server was answering fine.
curl --resolve example.com:443:203.0.113.10 https://example.com/
linux fixes you can inspect first
Short, copyable commands for common Linux, hosting, server, and terminal problems. Every published fix includes safety notes, when not to use it, and simulated output from a disposable demo environment.
Current library
215 checked one-liners across Linux basics, web hosting, dangerous commands, and server triage.
215/215 commands have disposable demo output.
verified fixes
The disk was full, but guessing at folders was the slow part.
find /var -type f -printf '%s %p\n' | sort -nr | head -20
One rsync flag can save you. Another can erase the wrong side.
rsync -avhn --delete ./source/ ./backup/
The app was failing now. Opening a giant log file was the wrong move.
tail -n 80 -f /var/log/nginx/error.log
The error was in the log. The problem was finding it without reading noise.
grep -iE 'error|failed|denied|timeout' /var/log/nginx/error.log | tail -40
The app was running. The port was not listening.
ss -tulpn | grep -E ':(80|443)[[:space:]]'
The permission fix was easy. Knowing what not to chmod was the hard part.
namei -l /var/www/example/index.html
The error was there. The useful part was knowing exactly where it was.
grep -inE 'error|failed|denied|timeout' /var/log/nginx/error.log
The disk was full. The fastest clue was the folder, not the file.
du -sh /var/* 2>/dev/null | sort -h
The log had old failures too. I only cared about the newest ones.
grep -iE 'error|failed|denied|timeout' /var/log/nginx/error.log | tail -10
`rsync --delete` is useful. It is also how people erase the wrong side.
rsync -avhn --delete ./source/ ./backup/ | grep '^deleting'
The file existed. The owner and mode explained why it still failed.
stat -c '%A %U:%G %n' /var/www/example/index.html
The server felt slow. Memory pressure was the first thing to rule out.
ps -eo pid,comm,%mem,%cpu --sort=-%mem | head
Byte counts are precise. Human units are faster under pressure.
find /var -type f -printf '%s %p\n' | sort -nr | head -10 | awk '{size=$1; sub(/^[0-9]+ /, ""); printf "%.1f MB %s\n", size/1024/1024, $0}'
Your dev server says port 3000 is busy. Ask macOS who is holding it.
lsof -nP -iTCP:3000 -sTCP:LISTEN
Free a stuck dev port without hunting through Activity Monitor.
lsof -ti tcp:3000 | xargs kill
Wrong Node, Python, or FFmpeg? Start by reading your PATH clearly.
echo "$PATH" | tr ':' '\n' | nl -ba
Before blaming npm, Python, or Git, check the binary your shell actually found.
command -v node && node -v
Before committing, check whether a huge video, build artifact, or export slipped into your repo.
find . -type f -size +100M -print
When your Mac is full, start with the biggest folders in the current directory.
du -sh ./* 2>/dev/null | sort -h
Changed DNS but your Mac still visits the old place? Flush the resolver cache.
sudo dscacheutil -flushcache; sudo killall -HUP mDNSResponder
Need to see whether a file is still changing? Let tail follow it live.
tail -f ./app.log
A wall of logs is useless until you pull the error and the lines around it.
grep -n -C 2 'ERROR' ./app.log
Before opening a broken page in five browsers, ask the server for headers.
curl -I https://example.com
Before trusting a backup, know which files changed most recently.
find source -type f -printf '%TY-%Tm-%Td %TH:%TM %p\n' | sort
A file list says what exists; checksums say whether bytes match.
sha256sum source/app/config.yml source/content/index.md source/content/about.md source/assets/logo.svg
A checksum file is only useful if you actually verify it.
sha256sum -c checksums.sha256
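A minimal sketch of why the verify step matters, run in a throwaway directory. The file name and contents are hypothetical; it assumes GNU coreutils `sha256sum` and `mktemp`.

```shell
# Hypothetical fixture: one config file and its recorded checksum.
demo=$(mktemp -d)
printf 'port: 8080\n' > "$demo/config.yml"
sha256sum "$demo/config.yml" > "$demo/checksums.sha256"

# Untouched file: sha256sum -c exits 0.
if sha256sum -c "$demo/checksums.sha256" >/dev/null 2>&1; then first=ok; else first=changed; fi

# Simulate a bad copy, then verify again: sha256sum -c exits non-zero.
printf 'port: 9090\n' > "$demo/config.yml"
if sha256sum -c "$demo/checksums.sha256" >/dev/null 2>&1; then second=ok; else second=changed; fi

echo "first=$first second=$second"
rm -rf "$demo"
```

The exit status is the part worth wiring into a script; the per-file OK or FAILED lines are for humans.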
A backup can be missing files and still look plausible at a glance.
comm -3 <(find source -type f | sed 's#^source/##' | sort) <(find backup -type f | sed 's#^backup/##' | sort)
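To see how to read the two columns, here is a small sketch on throwaway directories. The file names are hypothetical, and it writes the sorted lists to temp files instead of using process substitution so it also runs in a plain POSIX shell.

```shell
# Hypothetical fixture: about.md is missing from the backup, stale.txt is extra.
demo=$(mktemp -d)
mkdir -p "$demo/source/content" "$demo/backup/content"
touch "$demo/source/content/index.md" "$demo/source/content/about.md"
touch "$demo/backup/content/index.md" "$demo/backup/stale.txt"

# Relative paths on each side, sorted the same way so comm can compare them.
(cd "$demo/source" && find . -type f | sed 's#^\./##' | LC_ALL=C sort) > "$demo/source.list"
(cd "$demo/backup" && find . -type f | sed 's#^\./##' | LC_ALL=C sort) > "$demo/backup.list"

# comm -3 hides lines present in both lists.
# Unindented lines exist only in source; tab-indented lines exist only in backup.
diff_out=$(LC_ALL=C comm -3 "$demo/source.list" "$demo/backup.list")
printf '%s\n' "$diff_out"

rm -rf "$demo"
```

With this fixture the output is `content/about.md` unindented and `stale.txt` behind a tab.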
Rsync can tell you what would change before it changes anything.
rsync -ain --delete source/ backup/
Zero-byte files can be normal, or they can be failed writes.
find backup -type f -size 0 -print
Large backup files are where storage surprises usually start.
find backup -type f -printf '%s %p\n' | sort -nr | head
You can inspect an archive without extracting it.
tar -tf archives/site-backup.tar | sort | head
A quick extension count can show whether expected content made it into the source tree.
find source -type f -printf '%f\n' | sed -n 's/.*\.//p' | sort | uniq -c | sort -nr
Files newer than the last snapshot are the ones most likely missing from it.
find source -type f -newer backup/.snapshot -print | sort
A restore drill starts by proving which backups actually exist.
cd restore-dr && find backups -maxdepth 2 -type f -name MANIFEST.txt -printf '%TY-%Tm-%Td %TH:%TM %h\n' | sort -r
The manifest should say what backup you are about to trust.
cd restore-dr && cat backups/2026-06-25/MANIFEST.txt
You can inspect a tar backup before it writes a single file.
cd restore-dr && tar -tf backups/2026-06-25/site.tar | sed 's#^\./##' | sort
The fastest failed restore drill is the one that finds missing critical files early.
cd restore-dr && tar -tf backups/2026-06-24/site.tar | sed 's#^\./##' | sort | comm -23 required-files.txt -
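A sketch of the required-file check on a disposable archive. The archive contents and the `required-files.txt` entries are hypothetical, and the list must already be sorted the same way as the tar listing for `comm` to work.

```shell
# Hypothetical fixture: the archive holds two files, the required list names three.
demo=$(mktemp -d)
mkdir -p "$demo/site/app"
touch "$demo/site/index.html" "$demo/site/app/config.yml"
tar -cf "$demo/site.tar" -C "$demo/site" .
printf '%s\n' app/config.yml app/secrets.env index.html > "$demo/required-files.txt"

# comm -23 keeps lines that appear only in the first input:
# required paths that the archive listing does not contain.
missing=$(tar -tf "$demo/site.tar" | sed 's#^\./##' | LC_ALL=C sort | LC_ALL=C comm -23 "$demo/required-files.txt" -)
printf 'missing: %s\n' "$missing"

rm -rf "$demo"
```

Nothing is extracted here, so the check is safe to run before any restore step.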
A restore drill should write to a sandbox, not production.
cd restore-dr && rm -rf restore-sandbox/full && mkdir -p restore-sandbox/full && tar -xf backups/2026-06-25/site.tar -C restore-sandbox/full
A restore is not validated until the bytes match.
cd restore-dr && rm -rf restore-sandbox/full && mkdir -p restore-sandbox/full && tar -xf backups/2026-06-25/site.tar -C restore-sandbox/full && (cd restore-sandbox/full && sha256sum -c CHECKSUMS.sha256)
A restored config can exist and still be the wrong config.
cd restore-dr && rm -rf restore-sandbox/full && mkdir -p restore-sandbox/full && tar -xf backups/2026-06-25/site.tar -C restore-sandbox/full && diff -u expected/app/config.yml restore-sandbox/full/app/config.yml
A successful extraction still needs a required-file check.
cd restore-dr && rm -rf restore-sandbox/full && mkdir -p restore-sandbox/full && tar -xf backups/2026-06-25/site.tar -C restore-sandbox/full && find restore-sandbox/full -type f | sed 's#^restore-sandbox/full/##' | sort | comm -23 required-files.txt -
Permissions are part of the restore, not decoration.
cd restore-dr && tar -tvf backups/2026-06-25/site.tar | awk '/secrets.env|deploy.sh/ {print $1, $6}'
A restore drill that leaves no evidence is hard to trust later.
cd restore-dr && grep -E 'status=|rpo_minutes=|rto_seconds=|checksum=|file_count=' reports/restore-dr-2026-06-25.txt
The failing file is usually one of the newest artifacts.
find artifacts logs -type f \( -name '*.log' -o -name '*.txt' \) -printf '%TY-%Tm-%Td %TH:%TM %p\n' | sort -r | head
One grep pass can turn a log pile into a failure list.
grep -RInE 'error|failed|failure|exception|traceback' artifacts logs | head -50
The line before the error often explains the error.
grep -RInC 3 -m 1 'ERROR' artifacts logs
The XML report already knows which tests failed.
grep -RIn '
Before debugging a test failure, measure the blast radius.
grep -RhoE 'tests="[0-9]+"|failures="[0-9]+"|errors="[0-9]+"|skipped="[0-9]+"' artifacts/test/*.xml | sort | uniq -c
Coverage failures usually say the threshold out loud.
grep -RInE 'coverage|threshold|minimum|below' artifacts logs
A bloated artifact can explain a slow or failed pipeline.
find artifacts -type f -printf '%s %p\n' | sort -nr | head -10
The deploy failed because the build never produced the file.
find artifacts/dist -maxdepth 2 -type f | sort
Artifacts are public more often than you think.
grep -RInE 'AWS_ACCESS_KEY|SECRET|TOKEN|PRIVATE KEY|PASSWORD' artifacts logs | head -50
A green retry can still hide a flaky test.
grep -RInE 'rerun|retry|flaky|passed on retry|failed attempt' artifacts logs
The database was running, but it was not ready.
pg_isready -h 127.0.0.1 -p 5432
The database was not down. It was out of free connections.
psql -X -A -F '|' -c "select pid,usename,datname,state,client_addr from pg_stat_activity order by state, pid;"
One query can make the whole app look broken.
psql -X -c "select pid, now() - query_start as age, state, left(query, 80) as query from pg_stat_activity where query_start is not null order by age desc limit 10;"
The outage was a queue, not a crash.
psql -X -c "select pid, wait_event_type, wait_event, state, left(query, 80) as query from pg_stat_activity where wait_event_type is not null order by pid;"
Disk pressure starts with knowing what grew.
psql -X -c "select datname, pg_size_pretty(pg_database_size(datname)) as size from pg_database order by pg_database_size(datname) desc;"
The port was open. MySQL still had to answer.
mysqladmin ping -h 127.0.0.1 -P 3306
The app was waiting behind busy sessions.
mysql -e "show full processlist;"
One old query explained the whole slowdown.
mysql -e "select id,user,host,db,command,time,state,left(info,80) as info from information_schema.processlist where command <> 'Sleep' order by time desc limit 10;"
The storage alert needed a database name.
mysql -e "select table_schema, round(sum(data_length + index_length)/1024/1024, 1) as mb from information_schema.tables group by table_schema order by mb desc;"
The fastest database security check is the listening address.
ss -ltnp | awk '$4 ~ /:(5432|3306)$/ {print}'
Before package triage, prove what OS family and release you are actually on.
. /etc/os-release && printf '%s %s %s\n' "$ID" "$VERSION_ID" "$VERSION_CODENAME"
The distro version and kernel version answer different questions.
printf 'kernel=%s arch=%s distro=%s\n' "$(uname -r)" "$(uname -m)" "$(lsb_release -ds)"
A package inventory beats memory when a server is drifting.
dpkg-query -W -f='${Package}\t${Version}\t${Architecture}\n' | sort
Before you upgrade anything, list what would move.
apt list --upgradable
apt policy explains where the next version would come from.
apt policy nginx
For one package, dpkg-query gives a clean status line.
dpkg-query -W -f='${Status} ${Version}\n' openssl
That binary came from somewhere. dpkg can tell you where.
dpkg-query -S /usr/sbin/nginx
Not every package row is cleanly installed.
dpkg-query -W -f='${db:Status-Abbrev}\t${Package}\n' | awk '$1 !~ /^ii$/'
Disk cleanup starts with evidence, not random package removal.
dpkg-query -W -f='${Installed-Size}\t${Package}\n' | sort -nr | head -20
One unexpected architecture can explain confusing dependency output.
dpkg-query -W -f='${Architecture}\t${Package}\n' | awk '$1 != "amd64" && $1 != "all"'
Skip the full CI log and jump straight to lines that usually explain the failure.
grep -RInE 'error|failed|exception|traceback|fatal' logs/ | tail -50
Confirm what your pipeline actually produced before you deploy it.
find artifacts/ -type f -printf '%TY-%Tm-%Td %TH:%TM %10s %p\n' | sort | tail -20
One glance tells you which release directory production is pointing at.
readlink -f releases/current && ls -ld releases/current
Huge logs often point to loops, noisy tests, or runaway debug output.
find logs/ -type f -printf '%s %p\n' | sort -nr | head -10
See your newest release directories without opening a dashboard.
find releases/ -mindepth 1 -maxdepth 1 -type d -printf '%T@ %TY-%Tm-%Td %TH:%TM %p\n' | sort -nr | head -10 | cut -d' ' -f2-
Audit environment labels without printing secret values.
grep -RhoE 'ENVIRONMENT|NODE_ENV|APP_ENV|RAILS_ENV' config deploy | sort -u
A deploy is not done until the endpoint answers.
curl -fsS -o /dev/null -w '%{http_code} %{time_total}s\n' https://example.com/health
Verify two artifact copies match before blaming deployment code.
sha256sum artifacts/app.tar.gz releases/current/app.tar.gz
Turn noisy test logs into a ranked failure list.
grep -RhoE '[A-Za-z0-9_./-]+\.(test|spec)\.(js|ts|py|rb)' logs/ | sort | uniq -c | sort -nr | head
Disk pressure during deploys often starts in old release directories.
du -sh releases/* 2>/dev/null | sort -h | tail -10
Find the image tags your deployment files reference without printing env values.
grep -RhoE 'image:[[:space:]]*[^[:space:]]+' deploy/ | sort -u
Turn noisy docker ps output into the few fields operators scan first.
docker ps -a --format 'table {{.Names}}\t{{.Status}}\t{{.Image}}\t{{.Ports}}'
Restart loops hide in plain sight unless you filter for them.
docker ps -a --filter status=restarting --format 'table {{.Names}}\t{{.Status}}\t{{.Image}}'
Docker may say a container is running while its health check says otherwise.
docker inspect --format '{{.Name}} health={{if .State.Health}}{{.State.Health.Status}}{{else}}none{{end}} status={{.State.Status}}' web
Skip the million-line log scroll and read only the recent failure window.
docker logs --since 10m --tail 100 api
Get Docker resource usage once, without leaving a live dashboard running.
docker stats --no-stream --format 'table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}\t{{.BlockIO}}'
When a service is unreachable, confirm Docker is publishing the port you think it is.
docker port web
See how Docker storage is split across images, containers, volumes, and cache.
docker system df -v
Check what environment variables exist without printing their secret values.
docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' api | sed 's/=.*$/=/'
A container can be healthy and still attached to the wrong network.
docker inspect --format '{{.Name}} {{range $name, $net := .NetworkSettings.Networks}}{{$name}} {{$net.IPAddress}} {{end}}' api
Docker keeps a recent event trail for starts, stops, pulls, and health changes.
docker events --since 30m --until 0s
The firewall was active, but the defaults mattered more than the rule list.
ufw status verbose
Numbered rules make firewall review less ambiguous.
ufw status numbered
The packet path was hiding below UFW.
nft list ruleset | sed -n '/chain input/,/}/p'
Legacy firewall state can still explain live exposure.
iptables -S INPUT
Firewall rules matter after you know what is listening.
ss -ltnp
Localhost services are different from public listeners.
ss -ltnp | awk 'NR==1 || $4 ~ /^(0[.]0[.]0[.]0|[[]::[]]|[*]):/'
An open firewall rule can outlive the service it was created for.
comm -23 <(ufw status numbered | awk '/ALLOW/ {print}' | grep -Eo '[0-9]+/(tcp|udp)' | cut -d/ -f1 | sort -u) <(ss -ltnp | awk '/LISTEN/ {n=split($4,a,":"); print a[n]}' | sort -u)
The process was public, but the firewall did not mention it.
comm -13 <(ufw status numbered | awk '/ALLOW/ {print}' | grep -Eo '[0-9]+/(tcp|udp)' | cut -d/ -f1 | sort -u) <(ss -ltnp | awk '$4 ~ /^(0[.]0[.]0[.]0|[[]::[]]|[*]):/ {n=split($4,a,":"); print a[n]}' | sort -u)
SSH can be locked down by source and still bind publicly.
ss -ltnp | awk '$4 ~ /:22$/ && $4 !~ /^127[.]/ {print}'
The database was listening, but only on localhost.
ss -ltnp | awk '$4 ~ /^127[.]0[.]0[.]1:(5432|3306|6379)$/ {print}'
The config looked fine. Nginx disagreed before reload broke anything.
nginx -t
The config existed, but it was not enabled.
ls -l /etc/nginx/sites-enabled/
The wrong server block was answering the domain.
grep -R "server_name" /etc/nginx/sites-enabled/
HTTPS worked. The plain HTTP redirect still mattered.
curl -I http://example.com
The page loaded, but the headers told the operational story.
curl -sI https://example.com
The site was fine. The domain was pointed somewhere else.
dig +short example.com A
The certificate existed. The question was which domains it covered.
certbot certificates
The deploy finished. The symlink told me what was actually live.
readlink -f /srv/www/example.com/current
The missing file was not random. The access log had a pattern.
awk '$9==404 {print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head
LinkedIn traffic was not a guess. The referrer field showed it.
awk -F'"' '{print $4}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head
Start with severity counts before opening every log line.
journalctl -p warning..alert --since "2 hours ago" --no-pager -o short-iso | awk '{count[$4]++} END {for (level in count) print count[level], level}' | sort -nr
A noisy incident usually has a noisy source.
journalctl -p err..alert --since "2 hours ago" --no-pager -o short-iso | awk '{split($3,a,"["); unit=a[1]; count[unit]++} END {for (u in count) print count[u], u}' | sort -nr
Timeline beats guesswork when several failures happen close together.
journalctl -p err..alert --since "2 hours ago" --no-pager -o short-iso | awk '{print $1, $3, $4, substr($0,index($0,$5))}'
The first error often explains more than the last one.
awk '{buf[NR%5]=$0} tolower($0) ~ /(error|exception|fatal)/ {for (i=NR-4;i<=NR;i++) if (i>0) print buf[i%5]; exit}' fixtures/incidents/app.log
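The awk program keeps the last five lines in a small ring buffer and stops at the first match. A sketch against a made-up six-line log (the log lines are hypothetical) shows what that prints:

```shell
# Hypothetical log: the first error is on line 4, a repeat follows on line 5.
log=$(mktemp)
printf '%s\n' \
  '10:00:01 INFO boot ok' \
  '10:00:02 INFO cache warm' \
  '10:00:03 WARN pool at 90%' \
  '10:00:04 ERROR db timeout' \
  '10:00:05 ERROR db timeout' \
  '10:00:06 INFO retrying' > "$log"

# buf[NR%5] overwrites the oldest slot; on the first match, replay the
# buffered lines in order (skipping indexes before line 1) and exit.
context=$(awk '{buf[NR%5]=$0} tolower($0) ~ /(error|exception|fatal)/ {for (i=NR-4;i<=NR;i++) if (i>0) print buf[i%5]; exit}' "$log")
printf '%s\n' "$context"

rm -f "$log"
```

Because of the `exit`, the repeated error on line 5 never prints; the output ends at the first failing line with up to four lines of lead-up.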
A minute-by-minute count shows whether an incident is a spike or a drip.
awk 'tolower($0) ~ /(error|fatal|timeout|exception)/ {minute=substr($1,1,16); count[minute]++} END {for (m in count) print count[m], m}' fixtures/incidents/app.log | sort -nr
Repeated request IDs can connect separate error lines to one failing path.
grep -Ei 'error|timeout|fatal|exception' fixtures/incidents/app.log | awk '{for (i=1;i<=NF;i++) if ($i ~ /^request_id=/) print $i}' | sort | uniq -c | sort -nr
Deploys and restarts are incident landmarks.
grep -Eh 'deploy|release|restart|started|stopped|rolled back' fixtures/incidents/*.log | sort
Exit code 137 often means the kernel has something to say.
journalctl -k --since "2 hours ago" --no-pager -o short-iso | grep -Ei 'out of memory|oom|killed process'
The biggest log is not always right, but it is worth knowing.
wc -l fixtures/incidents/*.log | sort -nr
Incident notes should not copy secrets forward.
grep -RInE '(password=|token=|secret=|Authorization: Bearer)' fixtures/incidents | awk '{gsub(/password=[^ ]+/, "password=REDACTED"); gsub(/token=[^ ]+/, "token=REDACTED"); gsub(/secret=[^ ]+/, "secret=REDACTED"); gsub(/Authorization: Bearer [A-Za-z0-9._-]+/, "Authorization: Bearer REDACTED"); print}'
A writable log directory is not the same thing as a safe shared directory.
find fixtures/perm-audit -type d -perm -0002 ! -perm -1000 -printf '%m %u:%g %p\n' | sort
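The two `-perm` tests together mean "world-writable and missing the sticky bit". A sketch with three hypothetical directories, assuming GNU find for `-printf`:

```shell
# Hypothetical fixture directories with explicit modes.
demo=$(mktemp -d)
mkdir "$demo/logs" "$demo/tmp" "$demo/private"
chmod 0777 "$demo/logs"     # world-writable, no sticky bit: should be flagged
chmod 1777 "$demo/tmp"      # world-writable with sticky bit: not flagged
chmod 0750 "$demo/private"  # not world-writable: not flagged

# -perm -0002 requires the other-write bit; ! -perm -1000 excludes sticky dirs.
flagged=$(find "$demo" -mindepth 1 -type d -perm -0002 ! -perm -1000 -printf '%m %f\n' | sort)
printf '%s\n' "$flagged"   # prints: 777 logs

rm -rf "$demo"
```

The sticky bit is what stops users from deleting each other's files in a shared directory, which is why `tmp` passes and `logs` does not.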
A release file that someone besides the owner can modify deserves a second look.
find fixtures/perm-audit/releases/2026-06-25 -type f -perm /0022 -printf '%M %u:%g %p\n' | sort
Runtime directories often need writes, but the write boundary should be visible.
find fixtures/perm-audit/releases/2026-06-25/storage fixtures/perm-audit/releases/2026-06-25/uploads -type d -perm /0022 -printf '%M %u:%g %p\n' | sort
Special bits are easy to miss in a long ls listing.
find fixtures/perm-audit -perm /7000 -printf '%M %m %u:%g %p\n' | sort
Group-writable files are not automatically wrong, but the owning group decides the risk.
find fixtures/perm-audit -type f -perm -0020 -printf '%g %M %p\n' | sort
The file mode can look fine while a parent directory blocks the whole path.
namei -l fixtures/perm-audit/current/app/config/prod.token
The fastest secret audit starts with readable files that look like secrets.
find fixtures/perm-audit -type f -perm -0004 \( -iname '*secret*' -o -iname '*.env' -o -iname '*token*' -o -iname '*key*' \) -printf '%M %u:%g %p\n' | sort
Config files do not usually need to be executable.
find fixtures/perm-audit -type f -perm /111 \( -path '*/config/*' -o -name '*.env' -o -name '*.conf' \) -printf '%M %u:%g %p\n' | sort
A symlink can make the path you audited different from the file the app opens.
find fixtures/perm-audit -type l -printf '%p -> %l\n' -exec namei -l {} \;
Uploads are supposed to be writable at the edge, not writable forever by everyone.
find fixtures/perm-audit/releases/2026-06-25/uploads -type f -perm /0022 -printf '%M %u:%g %p\n' | sort
A server feels slow, but you need proof before restarting anything.
ps -eo pid,ppid,stat,pcpu,pmem,comm,args --sort=-pcpu | head -n 10
Memory pressure can look like a slow app, a stuck deploy, or random crashes.
ps -eo pid,ppid,stat,pcpu,pmem,rss,comm,args --sort=-pmem | head -n 10
Linux memory numbers look scary until you know which column matters.
free -h
A high load number is a clue, not a diagnosis.
uptime
A full disk can break logins, uploads, databases, and deploys.
df -h
Sometimes the disk has free bytes but still cannot create files.
df -ih
Once you know a filesystem is full, the next question is where.
du -xh --max-depth=1 /var 2>/dev/null | sort -h
Before blaming the firewall, check whether anything is actually listening.
ss -ltnp
A file can be deleted but still occupy disk while a process holds it open.
lsof +L1
Unexpected connections are easier to reason about when you can see them directly.
ss -tan state established
Cron problems often hide behind comments, blank lines, and copied folklore.
crontab -l | sed -n '/^[[:space:]]*#/d;/^[[:space:]]*$/d;p'
A job can be nowhere in your crontab and still run every night.
find /etc/cron.d /etc/cron.hourly /etc/cron.daily /etc/cron.weekly /etc/cron.monthly -maxdepth 1 -type f -print 2>/dev/null | sort
Cron is easier to debug when the schedule and command stop blending together.
crontab -l | awk 'NF && $1 !~ /^#/ {printf "%-16s %s\n", $1" "$2" "$3" "$4" "$5, substr($0,index($0,$6))}'
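A sketch of the same awk program fed from a hypothetical crontab held in a variable, so nothing touches a real crontab. The two jobs and their paths are made up.

```shell
# Hypothetical crontab text: one comment, one blank line, two jobs.
crontab_text='# nightly jobs
15 2 * * * /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1

*/5 * * * * /usr/local/bin/healthcheck.sh'

# Skip blank lines and comments, pad the five schedule fields to 16 columns,
# then print the rest of the line starting at the sixth field.
table=$(printf '%s\n' "$crontab_text" | awk 'NF && $1 !~ /^#/ {printf "%-16s %s\n", $1" "$2" "$3" "$4" "$5, substr($0,index($0,$6))}')
printf '%s\n' "$table"
```

This assumes every remaining line is a five-field schedule. Variable assignments such as `MAILTO=...` and `@daily`-style shortcuts would be misaligned and need their own handling.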
A silent cron job is a future incident with no witness.
crontab -l | awk 'NF && $1 !~ /^#/ && $0 !~ /(>>|2>|logger|mail)/ {print}'
A dot in a filename can keep a cron.daily script from running.
run-parts --test /etc/cron.daily
A timer is only half the scheduled job. The service is the payload.
systemctl list-timers --all --no-pager --plain | awk 'NR==1 || /\.timer/ {print $(NF-1), "->", $NF}'
The suspicious timer is the one with no next run.
systemctl list-timers --all --no-pager --plain | awk 'NR==1 || $1=="n/a" || /backup\.timer|logrotate\.timer/'
When a timer fires, the useful logs are usually on the service.
journalctl -u backup.service -n 20 --no-pager
Logrotate can explain its plan without rotating anything.
logrotate -d /etc/logrotate.conf 2>&1 | sed -n '/rotating pattern/p;/considering log/p;/error:/p'
The biggest log risk is often the file no policy mentions.
find /var/log -type f -name '*.log' -printf '%p\n' | while read -r log; do grep -Rqs -- "$log" /etc/logrotate.conf /etc/logrotate.d || grep -Rqs -- "$(dirname "$log")/[*].log" /etc/logrotate.conf /etc/logrotate.d || printf '%s\n' "$log"; done
Before querying a database file, see what tables are actually inside it.
sqlite3 app.db ".tables"
A failed query is often just a wrong assumption about column names.
sqlite3 app.db ".schema users"
When a SQLite-backed app behaves strangely, first rule out file corruption.
sqlite3 app.db "PRAGMA integrity_check;"
System metadata tables can distract from the app tables you care about.
sqlite3 app.db "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name;"
A quick row count can reveal empty imports, runaway events, or missing data.
sqlite3 app.db "SELECT 'users', count(*) FROM users UNION ALL SELECT 'orders', count(*) FROM orders UNION ALL SELECT 'events', count(*) FROM events;"
Slow lookups often start with missing or misunderstood indexes.
sqlite3 app.db "PRAGMA index_list('orders');"
For small apps, the quickest timeline may be inside the SQLite file.
sqlite3 app.db "SELECT created_at, event_type FROM events ORDER BY created_at DESC LIMIT 5;"
A noisy event type stands out faster when you group it.
sqlite3 app.db "SELECT event_type, count(*) FROM events GROUP BY event_type ORDER BY count(*) DESC;"
Duplicate account data is easier to spot with one grouped query.
sqlite3 app.db "SELECT email, count(*) FROM users GROUP BY email HAVING count(*) > 1;"
Copying a live SQLite file blindly can produce a bad backup.
sqlite3 app.db ".backup backup/app.db"
Duplicate titles make a static site harder to scan in search results and browser tabs.
grep -Rho --include='*.html' '<title>[^<]*</title>' public | sed 's#<title>##;s#</title>##' | sort | uniq -c | sort -nr
Canonical tags are easy to drop when templates branch.
find public -name '*.html' -print | while read -r f; do grep -qi 'rel="canonical"' "$f" || echo "$f"; done
A leftover noindex can hide a page after launch.
grep -Rni --include='*.html' 'noindex' public
Missing descriptions are usually a content template problem, not a mystery.
find public -name '*.html' -print | while read -r f; do grep -qi 'name="description"' "$f" || echo "$f"; done
Before comparing sitemap coverage, print the URLs plainly.
grep -o '<loc>[^<]*</loc>' public/sitemap.xml | sed 's#<loc>##;s#</loc>##'
A sitemap can exist and still be hard to discover.
grep -n '^Sitemap:' public/robots.txt
A page can exist in the build but never make it into the sitemap.
find public -name '*.html' -print | sed 's#^public#https://example.com#' | while read -r url; do grep -q "$url" public/sitemap.xml || echo "$url"; done
A broken internal link is easiest to catch before it becomes a 404.
grep -Rho --include='*.html' 'href="/[^"]*"' public | sed 's#href="##;s#"##' | while read -r path; do test -e "public${path}" || echo "$path"; done | sort -u
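A sketch of the link check on a two-page throwaway site. The pages and links are hypothetical, and it assumes GNU grep for `-R` with `--include`.

```shell
# Hypothetical site: index.html links to /docs/ (exists) and /pricing.html (missing).
demo=$(mktemp -d)
mkdir -p "$demo/public/docs"
printf '<a href="/docs/">Docs</a> <a href="/pricing.html">Pricing</a>\n' > "$demo/public/index.html"
printf '<a href="/">Home</a>\n' > "$demo/public/docs/index.html"

# Extract root-relative hrefs, strip the attribute wrapper, and report
# every path that has no matching file or directory under public/.
broken=$(cd "$demo" && grep -Rho --include='*.html' 'href="/[^"]*"' public | sed 's#href="##;s#"##' | while read -r path; do test -e "public${path}" || echo "$path"; done | sort -u)
printf '%s\n' "$broken"   # prints: /pricing.html

rm -rf "$demo"
```

The check only covers root-relative links that map directly to files; query strings, anchors, and server-side rewrites would show up as false positives.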
Social previews often fail because one template missed Open Graph tags.
find public -name '*.html' -print | while read -r f; do grep -qi 'property="og:title"' "$f" || echo "$f"; done
Your feed can advertise URLs that the sitemap never lists.
grep -o 'https://example.com/[^<]*' public/feed.xml | while read -r url; do grep -q "$url" public/sitemap.xml || echo "$url"; done
One command tells you which services systemd already knows are broken.
systemctl --failed --no-pager
Make systemctl status safe for scripts, screenshots, and quick incident notes.
systemctl status nginx --no-pager --lines=30
Ignore stale logs and inspect only what happened since this boot.
journalctl -u nginx -b --no-pager -n 80
Before deleting random logs, ask journald how much disk it owns.
journalctl --disk-usage
Find which units made your VPS boot slowly.
systemd-analyze blame | head -20
Running now does not mean it will survive the next reboot.
systemctl is-enabled nginx
Get a clean yes-or-no service state without the full status page.
systemctl is-active nginx
Confirm whether the server actually rebooted and when.
last -x reboot | head -5
See whether memory is actually tight before restarting services.
free -h
Cron is not the only scheduler on modern Linux servers.
systemctl list-timers --all --no-pager
Failed SSH attempts are noisy; grouping users makes the pattern readable.
sed -n 's/.*Failed password for \(invalid user \)\?\([^ ]*\) from .*/\2/p' logs/auth.log | sort | uniq -c | sort -nr
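A sketch of the grouping against four hypothetical auth.log lines. It assumes GNU sed, since `\?` is a GNU extension, and adds a final awk only to trim the padding `uniq -c` puts in front of each count.

```shell
# Hypothetical auth.log excerpt: three failures, one successful login.
log=$(mktemp)
cat > "$log" <<'EOF'
Jun 25 10:00:01 host sshd[101]: Failed password for invalid user admin from 203.0.113.7 port 50122 ssh2
Jun 25 10:00:03 host sshd[102]: Failed password for root from 203.0.113.7 port 50124 ssh2
Jun 25 10:00:05 host sshd[103]: Failed password for invalid user admin from 198.51.100.4 port 40110 ssh2
Jun 25 10:00:09 host sshd[104]: Accepted publickey for deploy from 192.0.2.10 port 51000 ssh2
EOF

# The optional "invalid user " group is skipped over; \2 is the attempted user name.
users=$(sed -n 's/.*Failed password for \(invalid user \)\?\([^ ]*\) from .*/\2/p' "$log" | sort | uniq -c | sort -nr | awk '{print $1, $2}')
printf '%s\n' "$users"

rm -f "$log"
```

With this input the result is `2 admin` followed by `1 root`; the accepted login is ignored because the pattern never matches it.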
The loudest SSH source is usually visible with one count.
sed -n 's/.*Failed password .* from \([0-9.]*\) port.*/\1/p' logs/auth.log | sort | uniq -c | sort -nr
During first response, successful logins matter more than background noise.
grep 'Accepted publickey' logs/auth.log
Privilege use is one of the fastest first-response signals.
grep 'sudo:' logs/auth.log | tail -n 10
Unexpected network listeners are first-response evidence.
ss -ltnp
Not every local account should be able to log in.
awk -F: '$7 ~ /sh$/ {print $1, $7}' etc/passwd
SSH policy should be visible before you change it.
grep -nE '^(PasswordAuthentication|PermitRootLogin|PubkeyAuthentication|AllowUsers)' etc/ssh/sshd_config
World-writable web paths deserve immediate review.
find srv/www -type d -perm -0002 -print
SSH private keys should not be readable like ordinary files.
find home -type f -name 'id_*' ! -name '*.pub' -perm /077 -printf '%m %p\n'
Authorized keys are the server's practical access list.
find home -path '*/.ssh/authorized_keys' -printf '%m %p\n'
The site was configured, but the port was not.
grep -RInE '^[[:space:]]*listen[[:space:]]' fixtures/nginx/conf.d fixtures/nginx/sites-enabled
The wrong site answered because it was the fallback.
grep -RIn 'default_server' fixtures/nginx/conf.d fixtures/nginx/sites-enabled
The config was valid; it just was not included.
grep -RInE '^[[:space:]]*include[[:space:]]' fixtures/nginx/nginx.conf fixtures/nginx/conf.d fixtures/nginx/sites-enabled
The URL was right. The filesystem path was not.
grep -RInE '^[[:space:]]*(root|alias)[[:space:]]' fixtures/nginx/conf.d fixtures/nginx/sites-enabled
Nginx was healthy. It was proxying to the wrong place.
grep -RInE '^[[:space:]]*proxy_pass[[:space:]]' fixtures/nginx/conf.d fixtures/nginx/sites-enabled
The Apache config existed. The enabled symlink did not.
find fixtures/apache/sites-enabled -maxdepth 1 -type l -printf '%f -> %l\n' | sort
Apache chose a virtual host. You need to know which one.
grep -RInE '
Apache was serving files from a different directory than expected.
grep -RInE '^[[:space:]]*DocumentRoot[[:space:]]' fixtures/apache/sites-enabled
Apache was up. The reverse proxy target was wrong.
grep -RInE '^[[:space:]]*(ProxyPass|ProxyPassReverse)[[:space:]]' fixtures/apache/sites-enabled
The redirect loop was hiding in plain text.
grep -RInE 'return[[:space:]]+30[18]|rewrite[[:space:]]|Redirect[[:space:]]|RewriteRule|RewriteCond' fixtures/nginx fixtures/apache
Before chasing individual lines, get the shape of the whole log.
awk '{count[$9]++} END {for (code in count) print count[code], code}' ./fixtures/nginx/access.log | sort -nr
One address can turn a normal access log into a wall of failed requests.
awk '$9 ~ /^4/ {count[$1]++} END {for (ip in count) print count[ip], ip}' ./fixtures/nginx/access.log | sort -nr | head
A 500 spike is easier to triage when the broken path is obvious.
awk '$9 ~ /^5/ {count[$7]++} END {for (path in count) print count[path], path}' ./fixtures/nginx/access.log | sort -nr | head
Most site traffic is boring. The weird methods are worth a look.
awk '$6 !~ /^"(GET|POST|HEAD|OPTIONS)$/ {print $1, $6, $7, $9}' ./fixtures/nginx/access.log | sort | uniq -c | sort -nr
A strange traffic spike often has a strange user agent.
awk -F'"' '{print $6}' ./fixtures/nginx/access.log | sort | uniq -c | sort -nr | head
A site does not need WordPress to receive WordPress-looking probes.
awk '$7 ~ /(admin|login|wp-|phpmyadmin)/ {print $1, $7, $9}' ./fixtures/nginx/access.log | sort | uniq -c | sort -nr | head
One missing URL is normal. A repeated missing URL is a signal.
awk '$9==404 {count[$7]++} END {for (path in count) if (count[path] >= 3) print count[path], path}' ./fixtures/nginx/access.log | sort -nr | head
Traffic spikes are easier to read when you bucket them by time.
awk '{minute=substr($4,2,17); count[minute]++} END {for (m in count) print count[m], m}' ./fixtures/nginx/access.log | sort -nr | head
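A sketch of the minute bucketing on four hypothetical access-log lines in the common combined format. The `substr($4,2,17)` call skips the leading `[` and keeps the timestamp up to the minute.

```shell
# Hypothetical access log: three requests at 10:01, one at 10:02.
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.5 - - [25/Jun/2026:10:01:03 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/8.0"
203.0.113.5 - - [25/Jun/2026:10:01:40 +0000] "GET /a HTTP/1.1" 200 512 "-" "curl/8.0"
203.0.113.9 - - [25/Jun/2026:10:01:59 +0000] "GET /b HTTP/1.1" 404 128 "-" "curl/8.0"
203.0.113.9 - - [25/Jun/2026:10:02:10 +0000] "GET /b HTTP/1.1" 404 128 "-" "curl/8.0"
EOF

# Count requests per minute, busiest minute first.
buckets=$(awk '{minute=substr($4,2,17); count[minute]++} END {for (m in count) print count[m], m}' "$log" | sort -nr)
printf '%s\n' "$buckets"

rm -f "$log"
```

This prints `3 25/Jun/2026:10:01` and then `1 25/Jun/2026:10:02`. A custom `log_format` that moves the timestamp out of field 4 would need a different field number.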
A few huge responses can explain bandwidth, latency, and suspicious download patterns.
awk '$10 ~ /^[0-9]+$/ && $10 > 1000000 {print $10, $1, $7, $9}' ./fixtures/nginx/access.log | sort -nr | head
The suspicious pattern is sometimes one client hammering one URL.
awk '{key=$1 " " $7; count[key]++} END {for (k in count) if (count[k] >= 5) print count[k], k}' ./fixtures/nginx/access.log | sort -nr | head
problem areas
Use the category that matches the problem: disk pressure, logs, permissions, Nginx, DNS, defensive checks, or Mac terminal work.
The commands to inspect a machine before guessing.
Small checks that separate DNS, TLS, Nginx, and app failures.
Commands worth understanding before copying from the internet.
Defensive commands for checking exposure, logs, permissions, and suspicious activity.
Web hosting, SSL, DNS, Nginx, deployment, backups, and VPS management.
macOS and Apple-adjacent terminal workflows for developers and power users.
how to read a fix
demo vessel
Each demo uses fake files, fake services, and safe fixtures so you can see what the command does before trying it on a real machine.
quality bar
Every fix needs a clear use case, safety notes, recovery guidance, and a reproducible example.
Problem, command, danger rating, when not to use it, undo or recovery, expected output, and demo commands.
Short demos and transcripts are generated from the same checked lesson data.
Useful pages get expanded into deeper guides, videos, and related troubleshooting paths.
project feeds
These feeds support videos, transcripts, and future tooling. Most readers can ignore them.
Titles, captions, target duration, playlist, and voiceover seed text.
Hooks, command snippets, questions, and calls to action for engagement testing.
Privacy-light events to measure copy clicks, export clicks, lesson depth, and outbound video intent.
roadmap
The library is expanding by problem area: Linux basics, hosting, security triage, macOS, CI/CD, and data workflows.