Discover
See the current public robots and sitemap state.
Check whether robots.txt is blocking crawlers, whether sitemap declarations point crawlers to an unexpected host, and whether sitemaps expose suspicious spam-looking paths.
The outside-in integrity check follows redirects, reads public signals, and checks safe exposure paths.
Public domains only. Internal IPs, localhost, custom ports, and unsupported schemes are rejected.
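The input rule above could be sketched roughly as follows. This is an illustration only, not the checker's actual implementation: the function name `is_allowed_target` and the exact rejection rules are assumptions, and the sketch inspects the URL text without resolving DNS, so a hostname that resolves to a private address would need a separate check.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumed policy: only http/https on their default ports are accepted.
DEFAULT_PORTS = {"http": 80, "https": 443}


def is_allowed_target(url: str) -> bool:
    """Return True for http(s) URLs on default ports with a public-looking host."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
        host = (parts.hostname or "").rstrip(".")
    except ValueError:
        return False

    scheme = parts.scheme.lower()
    if scheme not in DEFAULT_PORTS:
        return False  # unsupported scheme such as ftp:// or file://
    if port is not None and port != DEFAULT_PORTS[scheme]:
        return False  # custom port
    if not host or host == "localhost" or host.endswith(".localhost"):
        return False

    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: require a dotted name (no bare intranet hostnames).
        return "." in host
    # IP literal: reject loopback, private, link-local and other non-global ranges.
    return ip.is_global
```

For example, `is_allowed_target("https://example.com/")` is accepted, while `http://localhost/`, `http://10.0.0.5/`, and `https://example.com:8443/` are rejected.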
robots.txt
Disallow rules and sitemap declarations
sitemaps
Common XML locations and suspicious URLs
Crawler control files
This checker looks for the public signals that matter most: unavailable robots.txt, Disallow: /, external sitemap declarations, external URLs, and spam-looking sitemap entries.
robots.txt
User-agent: *
Disallow: /
Sitemap: https://unknown-host.test/sitemap.xml
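A minimal sketch of how the two robots.txt signals in the example above (a site-wide Disallow for all crawlers and a sitemap declared on another host) might be detected. The function name `analyze_robots` and the `expected_host` parameter are hypothetical, and the parsing is deliberately simplified: it does not weigh Allow rules against Disallow rules.

```python
from urllib.parse import urlsplit


def analyze_robots(text: str, expected_host: str) -> dict:
    """Flag 'Disallow: /' for User-agent: * and Sitemap lines on other hosts."""
    findings = {"blocks_all": False, "external_sitemaps": []}
    agents = []        # user agents of the group currently being read
    in_rules = False   # True once the current group has seen a rule line

    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()

        if field == "user-agent":
            if in_rules:  # a user-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value == "/" and "*" in agents:
                findings["blocks_all"] = True
        elif field == "sitemap":
            host = (urlsplit(value).hostname or "").lower()
            if host and host != expected_host.lower():
                findings["external_sitemaps"].append(value)
    return findings
```

Run against the example file with `expected_host="example.com"` (a placeholder), this would report `blocks_all` as true and list `https://unknown-host.test/sitemap.xml` as an external sitemap.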
sitemap.xml
/normal-page
/brand-wallet-replica
https://external-host.test/page
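The sitemap entries above suggest two checks: URLs on a host other than the expected one, and paths containing spam-looking terms. The sketch below shows one way to do that with Python's standard library. The `SPAM_TERMS` list, the function name, and the `expected_host` parameter are assumptions for illustration; a real checker would need its own term list and a hardened XML parser for untrusted input.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Hypothetical term list, for illustration only.
SPAM_TERMS = ("replica", "casino", "pills")


def analyze_sitemap(xml_text: str, expected_host: str) -> dict:
    """Collect <loc> URLs that are off-host or contain a spam-looking term."""
    findings = {"external_urls": [], "suspicious_urls": []}
    root = ET.fromstring(xml_text)

    for el in root.iter():
        # Match <loc> regardless of the XML namespace prefix.
        if not isinstance(el.tag, str) or el.tag.rsplit("}", 1)[-1] != "loc":
            continue
        url = (el.text or "").strip()
        if not url:
            continue
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
        if host and host != expected_host.lower():
            findings["external_urls"].append(url)
        if any(term in parts.path.lower() for term in SPAM_TERMS):
            findings["suspicious_urls"].append(url)
    return findings
```

With the three example entries expressed as `<loc>` elements under a placeholder host, the `/brand-wallet-replica` URL would land in `suspicious_urls` and the `external-host.test` URL in `external_urls`.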
One-time check vs monitoring
See the current public robots and sitemap state.
Find external hosts, broad disallow rules, and spam terms.
Create a baseline and get alerts when these public files change.
Ambastly tracks these public files as part of external website integrity monitoring.
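The baseline-and-alert idea can be illustrated with content hashes: store a fingerprint of each public file once, then compare later fetches against it. This is only a sketch of the general technique, not a description of how Ambastly works; the function names and the dictionary layout are assumptions.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a stable SHA-256 fingerprint of a fetched file body."""
    return hashlib.sha256(body).hexdigest()


def detect_changes(baseline: dict[str, str], current: dict[str, bytes]) -> list[str]:
    """Return paths whose content differs from the baseline, is new, or is missing."""
    changed = set()
    for path, body in current.items():
        if baseline.get(path) != fingerprint(body):
            changed.add(path)  # new file or changed content
    for path in baseline:
        if path not in current:
            changed.add(path)  # file was in the baseline but is now unavailable
    return sorted(changed)
```

A monitoring loop would call `detect_changes` after each fetch and raise an alert whenever the returned list is non-empty.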
Why it matters
A broad disallow rule can hide pages from search. A compromised sitemap can invite crawlers to spam URLs. An external sitemap declaration can send attention to a host you do not control.
A Disallow: / rule can tell compliant crawlers not to crawl the whole site.
An external sitemap declaration can point crawlers away from the expected website.
A compromised sitemap can expose spam paths to search engines before humans notice.
Problems this catches
A visitor may never open robots.txt or sitemap.xml, but search engines and crawlers use those files to decide what to discover, ignore, or index.
A broad Disallow: / rule can create search visibility issues if it appears unexpectedly.
An external sitemap declaration can send crawlers away from the expected website.
Suspicious paths in sitemap XML can expose injected pages to search engines.
A one-time check shows the current state; monitoring helps catch future drift.
Robots and sitemap checker FAQ
robots.txt tells crawlers which paths they may or may not crawl. A sudden broad disallow rule can affect search visibility.
Sitemaps help crawlers discover URLs. If a sitemap contains spam URLs, external hosts, or unexpected paths, those signals can create search reputation problems.
robots.txt and sitemap files can be publicly visible compromise signals, especially when they point to external hosts, include spam terms, or differ from the expected website state.
These files can change after a deployment, CMS plugin issue, or compromise. Continuous monitoring helps detect suspicious changes instead of relying on occasional manual checks.