Your Site Is Not Down. DNS Might Be Lying.
The browser said the site was gone. The server was answering fine.
curl --resolve example.com:443:203.0.113.10 https://example.com/
linux fixes you can inspect first
Short, copyable commands for common Linux, hosting, server, and terminal problems. Every published fix includes safety notes, guidance on when not to use it, and simulated output from a disposable demo environment.
Current library
155 checked one-liners across Linux basics, web hosting, dangerous commands, and server triage.
155/155 commands have disposable demo output.
verified fixes
The browser said the site was gone. The server was answering fine.
curl --resolve example.com:443:203.0.113.10 https://example.com/
The disk was full, but guessing at folders was the slow part.
find /var -type f -printf '%s %p\n' | sort -nr | head -20
One rsync flag can save you. Another can erase the wrong side.
rsync -avhn --delete ./source/ ./backup/
The app was failing now. Opening a giant log file was the wrong move.
tail -n 80 -f /var/log/nginx/error.log
The error was in the log. The problem was finding it without reading noise.
grep -iE 'error|failed|denied|timeout' /var/log/nginx/error.log | tail -40
The app was running. The port was not listening.
ss -tulpn | grep -E ':(80|443)[[:space:]]'
The permission fix was easy. Knowing what not to chmod was the hard part.
namei -l /var/www/example/index.html
The error was there. The useful part was knowing exactly where it was.
grep -inE 'error|failed|denied|timeout' /var/log/nginx/error.log
The disk was full. The fastest clue was the folder, not the file.
du -sh /var/* 2>/dev/null | sort -h
The log had old failures too. I only cared about the newest ones.
grep -iE 'error|failed|denied|timeout' /var/log/nginx/error.log | tail -10
`rsync --delete` is useful. It is also how people erase the wrong side.
rsync -avhn --delete ./source/ ./backup/ | grep '^deleting'
The file existed. The owner and mode explained why it still failed.
stat -c '%A %U:%G %n' /var/www/example/index.html
The server felt slow. Memory pressure was the first thing to rule out.
ps -eo pid,comm,%mem,%cpu --sort=-%mem | head
Byte counts are precise. Human units are faster under pressure.
find /var -type f -printf '%s %p\n' | sort -nr | head -10 | awk '{printf "%.1f MB %s\n", $1/1024/1024, $2}'
Your dev server says port 3000 is busy. Ask macOS who is holding it.
lsof -nP -iTCP:3000 -sTCP:LISTEN
Free a stuck dev port without hunting through Activity Monitor.
lsof -ti tcp:3000 | xargs kill
Wrong Node, Python, or FFmpeg? Start by reading your PATH clearly.
echo "$PATH" | tr ':' '\n' | nl -ba
Before blaming npm, Python, or Git, check the binary your shell actually found.
command -v node && node -v
Before committing, check whether a huge video, build artifact, or export slipped into your repo.
find . -type f -size +100M -print
When your Mac is full, start with the biggest folders in the current directory.
du -sh ./* 2>/dev/null | sort -h
Changed DNS but your Mac still visits the old place? Flush the resolver cache.
sudo dscacheutil -flushcache; sudo killall -HUP mDNSResponder
Need to see whether a file is still changing? Let tail follow it live.
tail -f ./app.log
A wall of logs is useless until you pull the error and the lines around it.
grep -n -C 2 'ERROR' ./app.log
Before opening a broken page in five browsers, ask the server for headers.
curl -I https://example.com
Before trusting a backup, know which files changed most recently.
find source -type f -printf '%TY-%Tm-%Td %TH:%TM %p\n' | sort
A file list says what exists; checksums say whether bytes match.
sha256sum source/app/config.yml source/content/index.md source/content/about.md source/assets/logo.svg
A checksum file is only useful if you actually verify it.
sha256sum -c checksums.sha256
A backup can be missing files and still look plausible at a glance.
comm -3 <(find source -type f | sed 's#^source/##' | sort) <(find backup -type f | sed 's#^backup/##' | sort)
Rsync can tell you what would change before it changes anything.
rsync -ain --delete source/ backup/
Zero-byte files can be normal, or they can be failed writes.
find backup -type f -size 0 -print
Large backup files are where storage surprises usually start.
find backup -type f -printf '%s %p\n' | sort -nr | head
You can inspect an archive without extracting it.
tar -tf archives/site-backup.tar | sort | head
A quick extension count can show whether expected content made it into the source tree.
find source -type f -printf '%f\n' | sed -n 's/.*\.//p' | sort | uniq -c | sort -nr
Files newer than the last snapshot are the ones most likely missing from it.
find source -type f -newer backup/.snapshot -print | sort
The database was running, but it was not ready.
pg_isready -h 127.0.0.1 -p 5432
The database was not down. It was full.
psql -X -A -F '|' -c "select pid,usename,datname,state,client_addr from pg_stat_activity order by state, pid;"
One query can make the whole app look broken.
psql -X -c "select pid, now() - query_start as age, state, left(query, 80) as query from pg_stat_activity where query_start is not null order by age desc limit 10;"
The outage was a queue, not a crash.
psql -X -c "select pid, wait_event_type, wait_event, state, left(query, 80) as query from pg_stat_activity where wait_event_type is not null order by pid;"
Disk pressure starts with knowing what grew.
psql -X -c "select datname, pg_size_pretty(pg_database_size(datname)) as size from pg_database order by pg_database_size(datname) desc;"
The port was open. MySQL still had to answer.
mysqladmin ping -h 127.0.0.1 -P 3306
The app was waiting behind busy sessions.
mysql -e "show full processlist;"
One old query explained the whole slowdown.
mysql -e "select id,user,host,db,command,time,state,left(info,80) as info from information_schema.processlist where command <> 'Sleep' order by time desc limit 10;"
The storage alert needed a database name.
mysql -e "select table_schema, round(sum(data_length + index_length)/1024/1024, 1) as mb from information_schema.tables group by table_schema order by mb desc;"
The fastest database security check is the listening address.
ss -ltnp | awk '$4 ~ /:(5432|3306)$/ {print}'
Skip the full CI log and jump straight to lines that usually explain the failure.
grep -RInE 'error|failed|exception|traceback|fatal' logs/ | tail -50
Confirm what your pipeline actually produced before you deploy it.
find artifacts/ -type f -printf '%TY-%Tm-%Td %TH:%TM %10s %p\n' | sort | tail -20
One glance tells you which release directory production is pointing at.
readlink -f releases/current && ls -ld releases/current
Huge logs often point to loops, noisy tests, or runaway debug output.
find logs/ -type f -printf '%s %p\n' | sort -nr | head -10
See your newest release directories without opening a dashboard.
find releases/ -mindepth 1 -maxdepth 1 -type d -printf '%T@ %TY-%Tm-%Td %TH:%TM %p\n' | sort -nr | head -10 | cut -d' ' -f2-
Audit environment labels without printing secret values.
grep -RhoE 'ENVIRONMENT|NODE_ENV|APP_ENV|RAILS_ENV' config deploy | sort -u
A deploy is not done until the endpoint answers.
curl -fsS -o /dev/null -w '%{http_code} %{time_total}s\n' https://example.com/health
Verify two artifact copies match before blaming deployment code.
sha256sum artifacts/app.tar.gz releases/current/app.tar.gz
Turn noisy test logs into a ranked failure list.
grep -RhoE '[A-Za-z0-9_./-]+\.(test|spec)\.(js|ts|py|rb)' logs/ | sort | uniq -c | sort -nr | head
Disk pressure during deploys often starts in old release directories.
du -sh releases/* 2>/dev/null | sort -h | tail -10
Find the image tags your deployment files reference without printing env values.
grep -RhoE 'image:[[:space:]]*[^[:space:]]+' deploy/ | sort -u
Turn noisy docker ps output into the few fields operators scan first.
docker ps -a --format 'table {{.Names}}\t{{.Status}}\t{{.Image}}\t{{.Ports}}'
Restart loops hide in plain sight unless you filter for them.
docker ps -a --filter status=restarting --format 'table {{.Names}}\t{{.Status}}\t{{.Image}}'
Docker may say a container is running while its health check says otherwise.
docker inspect --format '{{.Name}} health={{if .State.Health}}{{.State.Health.Status}}{{else}}none{{end}} status={{.State.Status}}' web
Skip the million-line log scroll and read only the recent failure window.
docker logs --since 10m --tail 100 api
Get Docker resource usage once, without leaving a live dashboard running.
docker stats --no-stream --format 'table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}\t{{.BlockIO}}'
When a service is unreachable, confirm Docker is publishing the port you think it is.
docker port web
See how Docker storage is split across images, containers, volumes, and cache.
docker system df -v
Check what environment variables exist without printing their secret values.
docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' api | sed 's/=.*$/=/'
A container can be healthy and still attached to the wrong network.
docker inspect --format '{{.Name}} {{range $name, $net := .NetworkSettings.Networks}}{{$name}} {{$net.IPAddress}} {{end}}' api
Docker keeps a recent event trail for starts, stops, pulls, and health changes.
docker events --since 30m --until 0s
The firewall was active, but the defaults mattered more than the rule list.
ufw status verbose
Numbered rules make firewall review less ambiguous.
ufw status numbered
The packet path was hiding below UFW.
nft list ruleset | sed -n '/chain input/,/}/p'
Legacy firewall state can still explain live exposure.
iptables -S INPUT
Firewall rules matter after you know what is listening.
ss -ltnp
Localhost services are different from public listeners.
ss -ltnp | awk 'NR==1 || $4 ~ /^(0[.]0[.]0[.]0|[[]::[]]|[*]):/'
An open firewall rule can outlive the service it was created for.
comm -23 <(ufw status numbered | awk '/ALLOW/ {print}' | grep -Eo '[0-9]+/(tcp|udp)' | cut -d/ -f1 | sort -u) <(ss -ltnp | awk '/LISTEN/ {n=split($4,a,":"); print a[n]}' | sort -u)
The process was public, but the firewall did not mention it.
comm -13 <(ufw status numbered | awk '/ALLOW/ {print}' | grep -Eo '[0-9]+/(tcp|udp)' | cut -d/ -f1 | sort -u) <(ss -ltnp | awk '$4 ~ /^(0[.]0[.]0[.]0|[[]::[]]|[*]):/ {n=split($4,a,":"); print a[n]}' | sort -u)
SSH can be locked down by source and still bind publicly.
ss -ltnp | awk '$4 ~ /:22$/ && $4 !~ /^127[.]/ {print}'
The database was listening, but only on localhost.
ss -ltnp | awk '$4 ~ /^127[.]0[.]0[.]1:(5432|3306|6379)$/ {print}'
The config looked fine. Nginx disagreed before reload broke anything.
nginx -t
The config existed, but it was not enabled.
ls -l /etc/nginx/sites-enabled/
The wrong server block was answering the domain.
grep -R "server_name" /etc/nginx/sites-enabled/
HTTPS worked. The plain HTTP redirect still mattered.
curl -I http://example.com
The page loaded, but the headers told the operational story.
curl -sI https://example.com
The site was fine. The domain was pointed somewhere else.
dig +short example.com A
The certificate existed. The question was which domains it covered.
certbot certificates
The deploy finished. The symlink told me what was actually live.
readlink -f /srv/www/example.com/current
The missing file was not random. The access log had a pattern.
awk '$9==404 {print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head
LinkedIn traffic was not a guess. The referrer field showed it.
awk -F'"' '{print $4}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head
A server feels slow, but you need proof before restarting anything.
ps -eo pid,ppid,stat,pcpu,pmem,comm,args --sort=-pcpu | head -n 10
Memory pressure can look like a slow app, a stuck deploy, or random crashes.
ps -eo pid,ppid,stat,pcpu,pmem,rss,comm,args --sort=-pmem | head -n 10
Linux memory numbers look scary until you know which column matters.
free -h
A high load number is a clue, not a diagnosis.
uptime
A full disk can break logins, uploads, databases, and deploys.
df -h
Sometimes the disk has free bytes but still cannot create files.
df -ih
Once you know a filesystem is full, the next question is where.
du -xh --max-depth=1 /var 2>/dev/null | sort -h
Before blaming the firewall, check whether anything is actually listening.
ss -ltnp
A file can be deleted but still occupy disk while a process holds it open.
lsof +L1
Unexpected connections are easier to reason about when you can see them directly.
ss -tan state established
Before querying a database file, see what tables are actually inside it.
sqlite3 app.db ".tables"
A failed query is often just a wrong assumption about column names.
sqlite3 app.db ".schema users"
When a SQLite-backed app behaves strangely, first rule out file corruption.
sqlite3 app.db "PRAGMA integrity_check;"
System metadata tables can distract from the app tables you care about.
sqlite3 app.db "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name;"
A quick row count can reveal empty imports, runaway events, or missing data.
sqlite3 app.db "SELECT 'users', count(*) FROM users UNION ALL SELECT 'orders', count(*) FROM orders UNION ALL SELECT 'events', count(*) FROM events;"
Slow lookups often start with missing or misunderstood indexes.
sqlite3 app.db "PRAGMA index_list('orders');"
For small apps, the quickest timeline may be inside the SQLite file.
sqlite3 app.db "SELECT created_at, event_type FROM events ORDER BY created_at DESC LIMIT 5;"
A noisy event type stands out faster when you group it.
sqlite3 app.db "SELECT event_type, count(*) FROM events GROUP BY event_type ORDER BY count(*) DESC;"
Duplicate account data is easier to spot with one grouped query.
sqlite3 app.db "SELECT email, count(*) FROM users GROUP BY email HAVING count(*) > 1;"
Copying a live SQLite file blindly can produce a bad backup.
sqlite3 app.db ".backup backup/app.db"
Duplicate titles make a static site harder to scan in search results and browser tabs.
grep -Rho --include='*.html' '<title>[^<]*</title>' public | sed 's#<title>##;s#</title>##' | sort | uniq -c | sort -nr
Canonical tags are easy to drop when templates branch.
find public -name '*.html' -print | while read -r f; do grep -qi 'rel="canonical"' "$f" || echo "$f"; done
A leftover noindex can hide a page after launch.
grep -Rni --include='*.html' 'noindex' public
Missing descriptions are usually a content template problem, not a mystery.
find public -name '*.html' -print | while read -r f; do grep -qi 'name="description"' "$f" || echo "$f"; done
Before comparing sitemap coverage, print the URLs plainly.
grep -o '<loc>[^<]*</loc>' public/sitemap.xml | sed 's#<loc>##;s#</loc>##'
A sitemap can exist and still be hard to discover.
grep -n '^Sitemap:' public/robots.txt
A page can exist in the build but never make it into the sitemap.
find public -name '*.html' -print | sed 's#^public#https://example.com#' | while read -r url; do grep -q "$url" public/sitemap.xml || echo "$url"; done
A broken internal link is easiest to catch before it becomes a 404.
grep -Rho --include='*.html' 'href="/[^"]*"' public | sed 's#href="##;s#"##' | while read -r path; do test -e "public${path}" || echo "$path"; done | sort -u
Social previews often fail because one template missed Open Graph tags.
find public -name '*.html' -print | while read -r f; do grep -qi 'property="og:title"' "$f" || echo "$f"; done
Your feed can advertise URLs that the sitemap never lists.
grep -o 'https://example.com/[^<]*' public/feed.xml | while read -r url; do grep -q "$url" public/sitemap.xml || echo "$url"; done
One command tells you which services systemd already knows are broken.
systemctl --failed --no-pager
Make systemctl status safe for scripts, screenshots, and quick incident notes.
systemctl status nginx --no-pager --lines=30
Ignore stale logs and inspect only what happened since this boot.
journalctl -u nginx -b --no-pager -n 80
Before deleting random logs, ask journald how much disk it owns.
journalctl --disk-usage
Find which units made your VPS boot slowly.
systemd-analyze blame | head -20
Running now does not mean it will survive the next reboot.
systemctl is-enabled nginx
Get a clean yes-or-no service state without the full status page.
systemctl is-active nginx
Confirm whether the server actually rebooted and when.
last -x reboot | head -5
See whether memory is actually tight before restarting services.
free -h
Cron is not the only scheduler on modern Linux servers.
systemctl list-timers --all --no-pager
Failed SSH attempts are noisy; grouping users makes the pattern readable.
sed -n 's/.*Failed password for \(invalid user \)\?\([^ ]*\) from .*/\2/p' logs/auth.log | sort | uniq -c | sort -nr
The loudest SSH source is usually visible with one count.
sed -n 's/.*Failed password .* from \([0-9.]*\) port.*/\1/p' logs/auth.log | sort | uniq -c | sort -nr
During first response, successful logins matter more than background noise.
grep 'Accepted publickey' logs/auth.log
Privilege use is one of the fastest first-response signals.
grep 'sudo:' logs/auth.log | tail -n 10
Unexpected network listeners are first-response evidence.
ss -ltnp
Not every local account should be able to log in.
awk -F: '$7 ~ /sh$/ {print $1, $7}' etc/passwd
SSH policy should be visible before you change it.
grep -nE '^(PasswordAuthentication|PermitRootLogin|PubkeyAuthentication|AllowUsers)' etc/ssh/sshd_config
World-writable web paths deserve immediate review.
find srv/www -type d -perm -0002 -print
SSH private keys should not be readable like ordinary files.
find home -type f -name 'id_*' ! -name '*.pub' -perm /077 -printf '%m %p\n'
Authorized keys are the server's practical access list.
find home -path '*/.ssh/authorized_keys' -printf '%m %p\n'
The site was configured, but the port was not.
grep -RInE '^[[:space:]]*listen[[:space:]]' fixtures/nginx/conf.d fixtures/nginx/sites-enabled
The wrong site answered because it was the fallback.
grep -RIn 'default_server' fixtures/nginx/conf.d fixtures/nginx/sites-enabled
The config was valid; it just was not included.
grep -RInE '^[[:space:]]*include[[:space:]]' fixtures/nginx/nginx.conf fixtures/nginx/conf.d fixtures/nginx/sites-enabled
The URL was right. The filesystem path was not.
grep -RInE '^[[:space:]]*(root|alias)[[:space:]]' fixtures/nginx/conf.d fixtures/nginx/sites-enabled
Nginx was healthy. It was proxying to the wrong place.
grep -RInE '^[[:space:]]*proxy_pass[[:space:]]' fixtures/nginx/conf.d fixtures/nginx/sites-enabled
The Apache config existed. The enabled symlink did not.
find fixtures/apache/sites-enabled -maxdepth 1 -type l -printf '%f -> %l\n' | sort
Apache chose a virtual host. You need to know which one.
grep -RInE '
Apache was serving files from a different directory than expected.
grep -RInE '^[[:space:]]*DocumentRoot[[:space:]]' fixtures/apache/sites-enabled
Apache was up. The reverse proxy target was wrong.
grep -RInE '^[[:space:]]*(ProxyPass|ProxyPassReverse)[[:space:]]' fixtures/apache/sites-enabled
The redirect loop was hiding in plain text.
grep -RInE 'return[[:space:]]+30[18]|rewrite[[:space:]]|Redirect[[:space:]]|RewriteRule|RewriteCond' fixtures/nginx fixtures/apache
Before chasing individual lines, get the shape of the whole log.
awk '{count[$9]++} END {for (code in count) print count[code], code}' ./fixtures/nginx/access.log | sort -nr
One address can turn a normal access log into a wall of failed requests.
awk '$9 ~ /^4/ {count[$1]++} END {for (ip in count) print count[ip], ip}' ./fixtures/nginx/access.log | sort -nr | head
A 500 spike is easier to triage when the broken path is obvious.
awk '$9 ~ /^5/ {count[$7]++} END {for (path in count) print count[path], path}' ./fixtures/nginx/access.log | sort -nr | head
Most site traffic is boring. The weird methods are worth a look.
awk '$6 !~ /^"(GET|POST|HEAD|OPTIONS)$/ {print $1, $6, $7, $9}' ./fixtures/nginx/access.log | sort | uniq -c | sort -nr
A strange traffic spike often has a strange user agent.
awk -F'"' '{print $6}' ./fixtures/nginx/access.log | sort | uniq -c | sort -nr | head
A site does not need WordPress to receive WordPress-looking probes.
awk '$7 ~ /(admin|login|wp-|phpmyadmin)/ {print $1, $7, $9}' ./fixtures/nginx/access.log | sort | uniq -c | sort -nr | head
One missing URL is normal. A repeated missing URL is a signal.
awk '$9==404 {count[$7]++} END {for (path in count) if (count[path] >= 3) print count[path], path}' ./fixtures/nginx/access.log | sort -nr | head
Traffic spikes are easier to read when you bucket them by time.
awk '{minute=substr($4,2,17); count[minute]++} END {for (m in count) print count[m], m}' ./fixtures/nginx/access.log | sort -nr | head
A few huge responses can explain bandwidth, latency, and suspicious download patterns.
awk '$10 ~ /^[0-9]+$/ && $10 > 1000000 {print $10, $1, $7, $9}' ./fixtures/nginx/access.log | sort -nr | head
The suspicious pattern is sometimes one client hammering one URL.
awk '{key=$1 " " $7; count[key]++} END {for (k in count) if (count[k] >= 5) print count[k], k}' ./fixtures/nginx/access.log | sort -nr | head
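The access-log one-liners above depend on awk's default whitespace splitting, which only lines up if the log uses the combined log format. The sketch below builds a throwaway log and prints the fields those commands read. The temp directory, log entries, and variable names are made up for illustration, and a custom `log_format` would move the field positions.

```shell
#!/bin/sh
# Build a throwaway access log in combined log format (fake data only).
demo_dir=$(mktemp -d)
cat > "$demo_dir/access.log" <<'EOF'
203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 1024 "-" "curl/8.0"
203.0.113.5 - - [10/Oct/2024:13:55:40 +0000] "GET /missing.png HTTP/1.1" 404 153 "-" "curl/8.0"
198.51.100.7 - - [10/Oct/2024:13:56:02 +0000] "POST /login HTTP/1.1" 500 87 "https://example.com/" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2024:13:56:09 +0000] "GET /missing.png HTTP/1.1" 404 153 "-" "Mozilla/5.0"
EOF

# With whitespace splitting: $1 client, $6 method (with a leading quote),
# $7 path, $9 status, $10 response bytes.
fields=$(awk 'NR==3 {print $1, $6, $7, $9, $10}' "$demo_dir/access.log")
echo "$fields"

# The same status-code summary as in the list, pointed at the fake log.
status_counts=$(awk '{count[$9]++} END {for (code in count) print count[code], code}' "$demo_dir/access.log" | sort -nr)
echo "$status_counts"

# Remove the fixture so nothing is left behind.
rm -rf "$demo_dir"
```

The method field keeps its leading quote (`"POST`), which is why the method filter in the list anchors its pattern on `^"`.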
problem areas
Use the category that matches the problem: disk pressure, logs, permissions, Nginx, DNS, defensive checks, or Mac terminal work.
The commands to inspect a machine before guessing.
Small checks that separate DNS, TLS, Nginx, and app failures.
Commands worth understanding before copying from the internet.
Defensive commands for checking exposure, logs, permissions, and suspicious activity.
Web hosting, SSL, DNS, Nginx, deployment, backups, and VPS management.
macOS and Apple-adjacent terminal workflows for developers and power users.
how to read a fix
demo vessel
Each demo uses fake files, fake services, and safe fixtures so you can see what the command does before trying it on a real machine.
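As a rough illustration of the idea, a disposable fixture can be as small as a temp directory with a fake source tree and a deliberately damaged backup. This is a minimal sketch, not the project's actual demo tooling: the directory layout, file contents, and variable names are assumptions, and it expects GNU coreutils (`sha256sum`).

```shell
#!/bin/sh
# Create a throwaway working directory; nothing here touches real data.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Fake source tree and a backup where one file was "written" as zero bytes.
mkdir -p source/content backup/content
printf 'hello\n' > source/content/index.md
printf 'about\n' > source/content/about.md
cp source/content/index.md backup/content/index.md
: > backup/content/about.md   # simulate a failed write

# Zero-byte check from the backup fixes: lists the suspicious file.
zero_byte=$(find backup -type f -size 0 -print)
echo "$zero_byte"

# Record checksums from the source, then verify the backup against them.
(cd source && sha256sum content/index.md content/about.md) > checksums.sha256
verify=$(cd backup && sha256sum -c ../checksums.sha256 2>/dev/null)
echo "$verify"

# Clean up the fixture.
cd / && rm -rf "$demo_dir"
```

Because the damaged file exists only inside the temp directory, the failure mode can be observed safely before running the same checks against a real backup.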
quality bar
Every fix needs a clear use case, safety notes, recovery guidance, and a reproducible example.
Problem, command, danger rating, when not to use it, undo or recovery, expected output, and demo commands.
Short demos and transcripts are generated from the same checked lesson data.
Useful pages get expanded into deeper guides, videos, and related troubleshooting paths.
project feeds
These feeds support videos, transcripts, and future tooling. Most readers can ignore them.
Titles, captions, target duration, playlist, and voiceover seed text.
Hooks, command snippets, questions, and calls to action for engagement testing.
Privacy-light events to measure copy clicks, export clicks, lesson depth, and outbound video intent.
roadmap
The library is expanding by problem area: Linux basics, hosting, security triage, macOS, CI/CD, and data workflows.