Hosting Operations

Check robots.txt for a Sitemap Line

You need to confirm robots.txt advertises the sitemap URL.

Command

grep -n '^Sitemap:' public/robots.txt

What changed

Nothing changes. The command only reads public/robots.txt and prints each line that begins with Sitemap:, prefixed with its line number. The match is case-sensitive.

Danger

safe

When to use it

Use after changing sitemap paths, domains, or static host routing.
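After a domain or path change, the Sitemap line can still be present while pointing at the old location. The sketch below is one possible follow-up check, not part of the lesson: it strips the field name from each Sitemap line and compares the remaining URL against a prefix you supply. The helper name `sitemap_prefix_check` and the prefix argument are assumptions introduced for illustration.

```shell
# sitemap_prefix_check is a hypothetical helper, not part of the lesson.
# Usage: sitemap_prefix_check <robots-file> <expected-url-prefix>
sitemap_prefix_check() {
  file=$1
  prefix=$2
  # Drop the "Sitemap:" field name and any spaces after it, keeping only the URL.
  sed -n 's/^Sitemap:[[:space:]]*//p' "$file" | while IFS= read -r url; do
    case "$url" in
      "$prefix"*) echo "ok: $url" ;;        # URL starts with the expected prefix
      *)          echo "mismatch: $url" ;;  # URL points somewhere else
    esac
  done
}
```

For example, `sitemap_prefix_check public/robots.txt https://example.com/` reports each advertised sitemap URL as ok or mismatch. Like the lesson command, it only reads the file.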

When not to use it

Do not use it to validate every robots rule; it only checks for the Sitemap directive, and it does not confirm that the sitemap URL itself resolves.

Undo or recovery

No undo needed because this command is read-only.

Expected output

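The line number and the matching Sitemap directive from robots.txt, such as 3:Sitemap: https://example.com/sitemap.xml. No output means no line starts with Sitemap:.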
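If the check runs in a script rather than by eye, grep's exit status can stand in for reading the output: 0 when a line matched, 1 when none did, and a value above 1 on an error such as a missing file. A minimal sketch, assuming a hypothetical wrapper named `check_sitemap_line` that is not part of the lesson:

```shell
# check_sitemap_line is a hypothetical helper, not part of the lesson.
# Usage: check_sitemap_line [path]   (path defaults to public/robots.txt)
check_sitemap_line() {
  file="${1:-public/robots.txt}"
  grep_rc=0
  # Same pattern as the lesson command; keep grep's exit status for the case below.
  grep -n '^Sitemap:' "$file" || grep_rc=$?
  case "$grep_rc" in
    0) return 0 ;;                                         # match found and already printed
    1) echo "no Sitemap line in $file" >&2; return 1 ;;    # file was read, nothing matched
    *) echo "could not search $file" >&2; return 2 ;;      # e.g. missing or unreadable file
  esac
}
```

Calling `check_sitemap_line` with no argument repeats the lesson command and returns nonzero when the directive is missing, which makes it usable as a step in a deploy check.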

demo script

Disposable terminal steps

  1. cat public/robots.txt
  2. grep -n '^Sitemap:' public/robots.txt

simulated output

What it looks like

disposable vessel
::fixture-ready::
$ cat public/robots.txt
User-agent: *
Disallow: /admin
Sitemap: https://example.com/sitemap.xml
::exit-code::0
$ grep -n '^Sitemap:' public/robots.txt
3:Sitemap: https://example.com/sitemap.xml
::exit-code::0
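To reproduce the transcript above without touching a real project, the fixture can be rebuilt in a temporary directory. This is a sketch under stated assumptions: the helper name `demo_fixture` is hypothetical, and the three-line robots.txt is copied from the simulated output rather than from any real site.

```shell
# demo_fixture is a hypothetical helper; it rebuilds the lesson's fixture in a temp directory.
demo_fixture() {
  workdir=$(mktemp -d) || return 1
  mkdir -p "$workdir/public"
  # Fixture content copied from the simulated output in the lesson.
  printf '%s\n' \
    'User-agent: *' \
    'Disallow: /admin' \
    'Sitemap: https://example.com/sitemap.xml' > "$workdir/public/robots.txt"
  demo_rc=0
  # Run both demo steps in a subshell so the caller's working directory is untouched.
  ( cd "$workdir" && cat public/robots.txt && grep -n '^Sitemap:' public/robots.txt ) || demo_rc=$?
  rm -rf "$workdir"   # throw the fixture away again
  return "$demo_rc"
}
```

Running `demo_fixture` prints the file followed by the grep match, then deletes the temporary directory.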

YouTube Short

Does robots.txt advertise the sitemap?

Check robots.txt for the Sitemap line so crawlers can discover the sitemap directly.

LinkedIn hook

A sitemap can exist and still be hard to discover.

Question: Do you check robots.txt after moving a static site between domains?

experiments

A/B tests to run

Metric: watch_time

A: Check robots.txt.

B: Sitemap exists but is not advertised.