
Read-only first

Too many open files on Linux

Check system file-handle pressure, process open files, and service limits before raising limits or restarting services.

Safest first command

cat /proc/sys/fs/file-nr

Before you run this

Expected output: Three tab-separated numbers: allocated file handles, allocated but unused handles, and the system-wide maximum (the same value as fs.file-max).

When not to use it: Do not raise limits blindly on a production host before identifying the process, service unit, and whether file handles are leaking.

Expected output example

18304	0	9223372036854775807

How to read the result

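The first number is allocated file handles and the third is the system-wide maximum. A fast-rising first number, or one close to the third, points to system-wide pressure. When the maximum is as large as in the example above, host-wide exhaustion is unlikely, and a single service hitting its own lower limit is the more probable cause.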
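To make the first and third numbers easier to compare, the ratio can be computed directly. The following is a minimal sketch, not part of the original procedure: `file_nr_usage` is a hypothetical helper name, and it assumes the three-field layout described above. It only reads the file.

```shell
#!/bin/sh
# Hypothetical helper: print allocated handles as a percentage of the maximum.
# Optional argument: path to a file in file-nr format (defaults to the real one).
file_nr_usage() {
    awk '$3 > 0 { printf "%.2f\n", ($1 / $3) * 100 }' "${1:-/proc/sys/fs/file-nr}"
}

# Read-only check against the live system, skipped where the file is absent.
if [ -r /proc/sys/fs/file-nr ]; then
    echo "file handles in use: $(file_nr_usage)%"
fi
```

A result that stays near zero suggests the host-wide table is not the bottleneck, so the per-process and per-service checks are the next place to look.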

What to check next

System file handles are near the max

Means: The host may be approaching the kernel-wide file table limit.

Next step: Count open files by process, then inspect only the target PID before changing limits.

Count Open Files for One Process

One service reports EMFILE or too many open files

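Means: The service has probably hit its own per-process limit, which may be set by its systemd unit and can be far lower than the host-wide maximum. EMFILE is the per-process error; a full system-wide table is reported as ENFILE.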

Next step: Check the service LimitNOFILE value and process ownership.

Show a Service LimitNOFILE

Open files are mostly sockets or logs

Means: The issue may be connection churn, a leak, or log handles rather than normal disk files.

Next step: Inspect the process and logs before restarting or raising limits.

Find Listening Ports with ss
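One way to see whether a process's descriptors are mostly sockets, pipes, or files is to read the symlink targets under /proc/PID/fd. The sketch below is an illustration under assumptions: `classify_fd_target` and `summarize_fds` are hypothetical helper names, and the target prefixes (`socket:`, `pipe:`, `anon_inode:`, and the ` (deleted)` suffix) follow the usual Linux /proc conventions. It reads only and changes nothing.

```shell
#!/bin/sh
# Hypothetical helper: map one /proc/<pid>/fd symlink target to a coarse category.
classify_fd_target() {
    case "$1" in
        socket:*)      echo socket ;;
        pipe:*)        echo pipe ;;
        anon_inode:*)  echo anon_inode ;;
        *' (deleted)') echo deleted ;;   # file removed on disk but still held open
        *)             echo file ;;
    esac
}

# Hypothetical helper: count descriptors per category for one PID.
summarize_fds() {
    for fd in /proc/"$1"/fd/*; do
        target=$(readlink "$fd" 2>/dev/null) || continue
        classify_fd_target "$target"
    done | sort | uniq -c | sort -rn
}

# Usage (replace 1234 with the target PID; run as root or as the process owner):
# summarize_fds 1234
```

A summary dominated by `socket` points toward connection churn, while a large `deleted` count points toward rotated logs or temporary files that are still held open.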

Too-many-open-files decision tree

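Start with system file-handle pressure, then identify whether the failure is host-wide, service-specific, or process-specific. Replace 1234 with the target PID from ps, systemctl status, or application logs before running process-scoped lsof commands. Note that ulimit -n reports only the current shell's soft limit, and that lsof -p output includes a header line plus non-descriptor entries such as cwd, txt, and mem, so the wc -l figure is an upper bound rather than an exact descriptor count.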

cat /proc/sys/fs/file-nr
ulimit -n
systemctl show nginx -p LimitNOFILE --no-pager
sudo lsof -p 1234 | wc -l
sudo lsof -p 1234 | head
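An alternative to counting lsof lines is to count the entries under /proc/PID/fd, where each entry is one open descriptor. The following is a minimal sketch: `count_fds` is a hypothetical helper name, and its optional second argument (an alternate proc root) is an assumption added only so the function can be exercised against a test directory.

```shell
#!/bin/sh
# Hypothetical helper: count open descriptors for a PID.
# $1 = PID, $2 = proc root (optional, defaults to /proc).
# Prints 0 if the directory is missing or not readable by the current user.
count_fds() {
    ls -1 "${2:-/proc}/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Harmless demonstration on the current shell itself.
echo "this shell has $(count_fds "$$") open descriptors"

# Usage for a service process (replace 1234; needs root for other users' processes):
# count_fds 1234
```

Running the count a few times, a minute or so apart, gives a rough sense of whether the number is stable or climbing before any limit is changed.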

Service limit branch

If only one daemon fails, compare the shell ulimit with the systemd unit limit. The limit in your login shell says nothing about the limit applied to the service, because systemd sets service limits independently. Inspect the unit and its drop-ins before editing anything.

systemctl show nginx -p LimitNOFILE --no-pager
systemctl cat nginx
systemctl status nginx --no-pager --lines=30
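To confirm the limit the running service actually has, the value can be read from the main process's /proc/PID/limits file and set beside the shell's own limit. This is a sketch under assumptions: `soft_nofile` is a hypothetical helper name, nginx is carried over from the commands above as the example unit, and `systemctl show --value` is assumed to be available (it is absent from very old systemd releases).

```shell
#!/bin/sh
# Hypothetical helper: print the soft "Max open files" value from a limits file.
soft_nofile() {
    awk '/^Max open files/ { print $4 }' "$1"
}

unit=nginx   # example unit from the commands above; substitute your own

# Read-only comparison; silently skipped where systemd or the unit is not running.
if command -v systemctl >/dev/null 2>&1; then
    pid=$(systemctl show "$unit" -p MainPID --value 2>/dev/null)
    if [ "${pid:-0}" -gt 0 ] 2>/dev/null && [ -r "/proc/$pid/limits" ]; then
        echo "shell soft limit: $(ulimit -n)"
        echo "$unit soft limit: $(soft_nofile "/proc/$pid/limits")"
    fi
fi
```

If the two values differ, the shell figure is not the one that matters for the failing daemon.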

Bad fixes to avoid

Do not raise fs.file-max or LimitNOFILE as the first move. Do not restart a leaking service before capturing process, unit, and log evidence. Do not paste lsof output publicly without redacting paths, sockets, usernames, and hostnames.

Common causes

Leaking file descriptors in a service
Too many concurrent sockets
Low systemd LimitNOFILE for a busy daemon
Log or deleted-file handles held open
Kernel-wide file table pressure

What not to change yet

Do not raise limits until the owner process is known.
Do not kill the process before checking whether systemd will restart it.
Do not publish full lsof output without redaction.

Stop and escalate if

The next step could interrupt users, remove data, or lock out access.
The output includes secrets, customer data, or private infrastructure details.
You cannot explain the blast radius of the repair command.


Distro and service notes

systemd

Service limits come from the effective unit and drop-ins, not the interactive shell.

Security

lsof output can reveal private paths, users, sockets, and hostnames.
