Read-only first

Too many open files on Linux

Check system file-handle pressure, process open files, and service limits before raising limits or restarting services.

Safest first command

cat /proc/sys/fs/file-nr

Before you run this

Expected output: Three numbers showing allocated file handles, unused allocated handles, and the system-wide maximum.

When not to use it: Do not raise limits blindly on a production host before identifying the process, service unit, and whether file handles are leaking.

Expected output example

18304	0	9223372036854775807

How to read the result

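The first number is the count of allocated file handles, the second is allocated-but-unused handles (0 on modern kernels), and the third is the system-wide maximum. A fast-rising first number, or one close to the maximum, points to system-wide pressure. In the example above the maximum is 9223372036854775807 (2^63 - 1), which is effectively unlimited, so the host-wide table is not the bottleneck there; a single service can still hit its own per-process limit first.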
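As a rough illustration of reading these three numbers, the sketch below turns them into a usage percentage. The function name file_nr_usage is a hypothetical helper, not part of any tool; it only reads standard input and changes nothing on the host.

```shell
# Hypothetical helper: reads "allocated unused max" on stdin and prints
# the share of the system-wide maximum that is currently allocated.
file_nr_usage() {
    awk '{
        # $1 = allocated handles, $3 = system-wide maximum
        if ($3 > 0)
            printf "allocated=%s max=%s used=%.2f%%\n", $1, $3, ($1 / $3) * 100
    }'
}

# Read-only: this only reads /proc and is skipped if the file is absent.
if [ -r /proc/sys/fs/file-nr ]; then
    file_nr_usage < /proc/sys/fs/file-nr
fi
```

When the maximum is as large as in the example output, the percentage stays near zero, and the per-process or per-service limit is usually the more useful thing to check next.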

What to check next

System file handles are near the max

Means: The host may be approaching the kernel-wide file table limit.

Next step: Count open files by process, then inspect only the target PID before changing limits.

Related: Count Open Files for One Process
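One way to count open files by process without lsof is to count the entries under /proc/<pid>/fd. The sketch below is only an illustration: count_fds_by_pid is a hypothetical helper name, and its optional root argument exists purely so the function can be tried against a test directory. Without root privileges it can only read the descriptors of processes you own, so other processes show a count of 0.

```shell
# Hypothetical helper: prints "<fd count> <pid> <command name>" for each
# process directory found, busiest first.
count_fds_by_pid() {
    proc_root="${1:-/proc}"   # overridable only to make the sketch testable
    for dir in "$proc_root"/[0-9]*; do
        [ -d "$dir/fd" ] || continue
        # Unreadable fd directories (other users' processes) count as 0.
        count=$(ls "$dir/fd" 2>/dev/null | wc -l | tr -d ' ')
        name=$(cat "$dir/comm" 2>/dev/null)
        printf '%s %s %s\n' "$count" "${dir##*/}" "${name:-?}"
    done | sort -rn
}

# Read-only: show the ten processes with the most open descriptors.
count_fds_by_pid | head -n 10
```

The PID at the top of this list is a candidate for the process-scoped checks, not proof of a leak; a busy server can legitimately hold many descriptors.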

One service reports EMFILE or too many open files

Means: The service may have a lower per-process or systemd unit limit than the host.

Next step: Check the service LimitNOFILE value and process ownership.

Related: Show a Service LimitNOFILE

Open files are mostly sockets or logs

Means: The issue may be connection churn, a leak, or log handles rather than normal disk files.

Next step: Inspect the process and logs before restarting or raising limits.

Related: Find Listening Ports with ss
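To see whether a process's descriptors are mostly sockets, pipes, or regular files, the link targets under /proc/<pid>/fd can be grouped by kind. This is a minimal sketch under stated assumptions: classify_fd_targets is a hypothetical helper, the categories are a simplification, and 1234 is the same placeholder PID used elsewhere on this page.

```shell
# Hypothetical helper: reads fd link targets (one per line) on stdin and
# prints "<count> <category>" lines, largest category first.
classify_fd_targets() {
    awk '
        /^socket:/      { n["socket"]++;     next }
        /^pipe:/        { n["pipe"]++;       next }
        /^anon_inode:/  { n["anon_inode"]++; next }
        / \(deleted\)$/ { n["deleted"]++;    next }  # file removed but still held open
                        { n["file"]++ }
        END { for (k in n) print n[k], k }
    ' | sort -rn
}

pid=1234  # placeholder: replace with the target PID
if [ -r "/proc/$pid/fd" ]; then
    for fd in "/proc/$pid/fd"/*; do
        readlink "$fd" 2>/dev/null
    done | classify_fd_targets
fi
```

A large socket count suggests looking at connections, while a growing deleted count suggests log or temporary files that were removed but are still held open.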

Too-many-open-files decision tree

Start with system file-handle pressure, then identify whether the failure is host-wide, service-specific, or process-specific. Before running the process-scoped lsof commands, replace 1234 with the target PID from ps, systemctl status, or application logs, and replace nginx with the affected unit name.

  1. cat /proc/sys/fs/file-nr
  2. ulimit -n
  3. systemctl show nginx -p LimitNOFILE --no-pager
  4. sudo lsof -p 1234 | wc -l
  5. sudo lsof -p 1234 | head
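The branching described above can be sketched as a small read-only check. Everything named here is an assumption for illustration: classify_pressure is a hypothetical helper, and the 90% threshold is an arbitrary cut-off chosen for the sketch, not a kernel or systemd rule.

```shell
# Hypothetical helper. Arguments: allocated handles, system-wide maximum,
# open fd count of the target PID, soft "Max open files" of that PID.
# The 0.9 threshold is an assumption, not a standard value.
classify_pressure() {
    awk -v a="$1" -v m="$2" -v f="$3" -v s="$4" 'BEGIN {
        # Adding 0 forces numeric values; "unlimited" becomes 0 and is skipped.
        if ((m + 0) > 0 && a / m >= 0.9)      print "host-wide"
        else if ((s + 0) > 0 && f / s >= 0.9) print "process-limit"
        else                                  print "no-obvious-pressure"
    }'
}

pid=1234  # placeholder: replace with the target PID
if [ -r /proc/sys/fs/file-nr ] && [ -r "/proc/$pid/limits" ]; then
    read -r allocated unused max < /proc/sys/fs/file-nr
    fds=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l | tr -d ' ')
    soft=$(awk '/^Max open files/ { print $4 }' "/proc/$pid/limits")
    classify_pressure "$allocated" "$max" "$fds" "$soft"
fi
```

The result is only a hint about which branch to follow first; a leak can still be present when neither ratio is near the threshold yet.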

Service limit branch

If only one daemon fails, compare the shell ulimit with the systemd unit limit. The value reported by ulimit -n applies to your login shell and says nothing about the limit systemd applies to the service. Inspect the unit before editing drop-ins.

  1. systemctl show nginx -p LimitNOFILE --no-pager
  2. systemctl cat nginx
  3. systemctl status nginx --no-pager --lines=30
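To make the difference between the shell limit and the service limit concrete, the limits of a running process can be read from /proc/<pid>/limits and set next to the shell's own values. This is a sketch only: max_open_files is a hypothetical helper name and 1234 is a placeholder PID.

```shell
# Hypothetical helper: prints the soft and hard "Max open files" values
# from a /proc/<pid>/limits style file supplied on stdin.
max_open_files() {
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }'
}

# Limits of the current shell (these do not apply to systemd services).
echo "this shell: soft=$(ulimit -S -n) hard=$(ulimit -H -n)"

pid=1234  # placeholder: replace with the service's main PID
if [ -r "/proc/$pid/limits" ]; then
    echo "pid $pid: $(max_open_files < "/proc/$pid/limits")"
fi
```

If the two lines differ, the process value is the one that matters for the failing daemon.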

Bad fixes to avoid

Do not raise fs.file-max or LimitNOFILE as the first move. Do not restart a leaking service before capturing process, unit, and log evidence. Do not paste lsof output publicly without redacting paths, sockets, usernames, and hostnames.
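Capturing evidence before a restart could look roughly like the sketch below, which saves the output of the read-only commands from this page into a private temporary directory. The capture function, the directory name, and the file labels are all hypothetical; nginx and 1234 are the same placeholders used in the decision tree.

```shell
# Hypothetical helper: run one read-only command and save stdout and
# stderr under $evidence_dir, in a file named after the label.
capture() {
    label="$1"; shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" > "$evidence_dir/$label.txt" 2>&1 || true
    else
        echo "skipped: $1 not found" > "$evidence_dir/$label.txt"
    fi
}

# mktemp -d creates a directory readable only by the current user.
evidence_dir=$(mktemp -d "${TMPDIR:-/tmp}/emfile-evidence.XXXXXX")
unit=nginx  # placeholder unit name
pid=1234    # placeholder PID

capture file-nr     cat /proc/sys/fs/file-nr
capture unit-limit  systemctl show "$unit" -p LimitNOFILE --no-pager
capture unit-status systemctl status "$unit" --no-pager --lines=30
capture lsof        lsof -p "$pid"
echo "evidence saved in $evidence_dir"
```

The saved files can contain private paths, sockets, usernames, and hostnames, so they need the same redaction as raw lsof output before being shared.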

Common causes

  • Leaking file descriptors in a service
  • Too many concurrent sockets
  • Low systemd LimitNOFILE for a busy daemon
  • Log or deleted-file handles held open
  • Kernel-wide file table pressure

What not to change yet

  • Do not raise limits until the owner process is known.
  • Do not kill the process before checking whether systemd will restart it.
  • Do not publish full lsof output without redaction.

Stop and escalate if

  • The next step could interrupt users, remove data, or lock out access.
  • The output includes secrets, customer data, or private infrastructure details.
  • You cannot explain the blast radius of the repair command.

Distro and service notes

systemd

Service limits come from the effective unit and drop-ins, not the interactive shell.

Security

lsof output can reveal private paths, users, sockets, and hostnames.
