
linux troubleshooting guide

Find Large Files on Linux Without Making the Problem Worse

Use read-only ranking commands to find candidates, then decide whether each file should be rotated, compressed, archived, truncated, or left alone.

Problem

A directory is suspiciously large, but a folder total does not tell you which file caused the growth or whether it is safe to remove.

First rule

Finding a large file is not permission to delete it. First identify what owns it and whether a process is actively using it.

Audience

developers, self-hosters, Linux beginners, and anyone cleaning a server under pressure

Cert context

Useful practice in command-line file search, sorting, redirection, and filesystem awareness for independent Linux certification study.

quick start

Safe first commands

  1. find /var -xdev -type f -printf '%s %p\n' | sort -nr | head -20
  2. find /var -xdev -type f -printf '%s %p\n' | sort -nr | head -10 | awk '{size=$1; sub(/^[0-9]+ /, ""); printf "%.1f MiB %s\n", size/1024/1024, $0}'
  3. du -xhd1 /var 2>/dev/null | sort -h

Start broad, then narrow

`du` is useful for folder-level direction. `find` is better when you need exact file paths. Use both: first locate the hot directory, then rank files inside it.

  1. du -xhd1 /var 2>/dev/null | sort -h
  2. du -xhd1 /var/log 2>/dev/null | sort -h
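The broad-then-narrow pass above can be sketched as a small helper. This is a minimal sketch that assumes GNU `du`, `sed`, and `sort`; `largest_subdir` is a hypothetical name used for illustration, not a standard command.

```shell
# Hypothetical helper: print the largest immediate subdirectory of a directory.
# Assumes GNU du; sizes are compared in KiB (-k) so plain `sort -n` works.
largest_subdir() {
    # -x stays on one filesystem, -d1 reports only direct children.
    # du prints the starting directory itself as the last line, so drop it.
    du -xkd1 -- "$1" 2>/dev/null \
        | sed '$d' \
        | sort -nr \
        | head -n 1 \
        | cut -f2-
}
```

For example, `hot=$(largest_subdir /var)` followed by `find "$hot" -xdev -type f -printf '%s %p\n' | sort -nr | head -20` ranks the files inside the hottest directory. Repeat the helper on its own output to go one level deeper.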

Sort by bytes before formatting

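Sort raw byte counts first, then format them for reading. Human-readable values are easier to scan, but plain integers sort correctly with `sort -n`, while sizes such as `900K` and `2M` need `sort -h`, which POSIX does not specify.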

  1. find /var/log -xdev -type f -printf '%s %p\n' | sort -nr | head -20
  2. find /var/log -xdev -type f -printf '%s %p\n' | sort -nr | head -20 | numfmt --field=1 --to=iec
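The sort-then-format order can be wrapped in a function so the size column is converted only after ranking. This is a sketch assuming GNU `find`, `sort`, and `numfmt`; `rank_files_human` is a hypothetical name. It uses a tab between size and path so that paths containing spaces stay intact.

```shell
# Hypothetical helper: rank files by raw bytes, then format the size column last.
# Assumes GNU find (-printf) and numfmt from GNU coreutils.
rank_files_human() {
    dir=$1
    count=${2:-20}   # number of rows to show; 20 is an arbitrary default
    tab=$(printf '\t')
    # Sorting happens on plain integers; numfmt only touches field 1 afterwards.
    find "$dir" -xdev -type f -printf '%s\t%p\n' 2>/dev/null \
        | sort -nr \
        | head -n "$count" \
        | numfmt --field=1 --delimiter="$tab" --to=iec
}
```

Usage: `rank_files_human /var/log 10`. File names that contain newlines would still break the line-based pipeline, so treat odd-looking rows with suspicion.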

Treat logs, backups, and databases differently

A giant old archive may be movable. A giant active log may need rotation. A giant database file is probably normal and dangerous to delete. The path tells you what cleanup method belongs next.

  1. stat -c '%A %U:%G %s %n' /path/to/file
  2. sudo lsof /path/to/file
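The idea that a path hints at the right cleanup method can be sketched as a simple lookup. The patterns below are illustrative assumptions about a typical layout, and `suggest_category` is a hypothetical helper, not part of any tool.

```shell
# Hypothetical helper: guess a cleanup category from a path.
# The patterns are illustrative assumptions; adjust them to your own layout.
suggest_category() {
    case $1 in
        /var/lib/mysql/*|/var/lib/postgresql/*|*.sqlite|*.db)
            echo database ;;   # probably normal; dangerous to delete
        *.tar|*.tar.gz|*.tgz|*.zip|*.bak)
            echo archive ;;    # movable only if retention rules allow
        /var/log/*|*.log)
            echo log ;;        # rotate rather than remove
        *)
            echo unknown ;;    # identify the owner first
    esac
}
```

A category is only a hint. The `stat` and `lsof` checks above still decide who owns the file and whether a process is using it.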

triage logic

How to read the result

a log file is huge and currently open

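If you delete it, the writing process keeps its file descriptor open, so the disk blocks stay allocated and free space does not return until the process closes the file or exits.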

Next: Use log rotation or a service-aware truncate, then verify free space.
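The difference between deleting and truncating an open log can be sketched with two small helpers. This is a sketch assuming Linux `/proc`, GNU `find`, and GNU coreutils; both function names are hypothetical. Without root, `list_deleted_open` only sees your own processes, and it may also show unrelated entries such as in-memory files.

```shell
# Sketch only: assumes Linux /proc, GNU find, and GNU coreutils.

# Hypothetical helper: list files that were deleted while still open,
# printed as "PID<TAB>path (deleted)". Their blocks stay allocated.
list_deleted_open() {
    find /proc/[0-9]*/fd -type l -lname '* (deleted)' -printf '%h\t%l\n' 2>/dev/null \
        | sed 's|^/proc/\([0-9]*\)/fd|\1|'
}

# Hypothetical helper: empty a file in place instead of deleting it.
# The inode is unchanged, so a writer's open descriptor stays valid.
empty_in_place() {
    truncate -s 0 -- "$1"
}
```

Truncation frees space cleanly only when the writer opened the file in append mode. A writer that tracks its own offset keeps writing at the old position and leaves a sparse file, which is one reason to prefer the service's own rotation mechanism. Afterwards, `df -h` on the affected mount confirms whether space actually came back.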

a backup or archive is huge

It may be safe to move only if retention rules allow it.

Next: Confirm ownership, date, and restore requirements before moving.
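The ownership and date check can be sketched as a read-only listing. This assumes GNU `find`; `list_old_files` is a hypothetical name, and the 90-day default is an arbitrary placeholder, not a recommended retention period.

```shell
# Hypothetical helper: list files older than N days with owner, date, and size.
# Read-only: it prints candidates and never moves or deletes anything.
list_old_files() {
    dir=$1
    days=${2:-90}   # placeholder threshold; use your real retention rule
    tab=$(printf '\t')
    # Columns: owner, modification date, size in bytes, path. Largest first.
    find "$dir" -xdev -type f -mtime +"$days" \
        -printf '%u\t%TY-%Tm-%Td\t%s\t%p\n' 2>/dev/null \
        | sort -t "$tab" -k3,3nr
}
```

Usage: `list_old_files /srv/backups 180`. An old modification date does not prove the file is unneeded, so restore requirements still need a human answer.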

a cache directory dominates usage

There may be an application-supported cleanup command.

Next: Prefer the application cleanup path over manual deletion.

safety notes

Slow down here

  • Run large searches during lower traffic when possible because they can add IO load.
  • Keep output reviewable with `head` before expanding to larger result sets.
  • Do not use `sudo rm` just because a file appears near the top.
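The first two notes can be combined in one sketch: run the scan at reduced priority and cap the output. This assumes GNU `find` plus `nice`, and uses `ionice` from util-linux only when it is available and permitted. `low_priority_scan` is a hypothetical name, and the idle IO class has an effect only with IO schedulers that honor it.

```shell
# Hypothetical helper: rank large files while yielding CPU and IO to other work.
low_priority_scan() {
    dir=$1
    count=${2:-20}   # keep the result set small and reviewable
    # Use the idle IO class when ionice exists and works here; otherwise
    # fall back to lowering CPU priority only.
    if ionice -c 3 true 2>/dev/null; then
        set -- ionice -c 3 nice -n 19
    else
        set -- nice -n 19
    fi
    "$@" find "$dir" -xdev -type f -printf '%s\t%p\n' 2>/dev/null \
        | sort -nr \
        | head -n "$count"
}
```

Usage: `low_priority_scan /var 20`. The output is a list of candidates to investigate, not a deletion list.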

Independent study support

These guides are cert-adjacent practice material, not official training, endorsement, exam dumps, or real exam questions.
