Question 1

What does this sitemap checker actually validate?

Accepted Answer

It runs 10 checks based on the Sitemap Protocol 0.9 spec (sitemaps.org) and Google's published constraints: (1) XML root element (<urlset> or <sitemapindex>), (2) Content-Type header, (3) Sitemap namespace, (4) entry count vs Google's 50,000 cap, (5) uncompressed size vs the 50 MB cap, (6) absolute URLs, (7) single-host path-match, (8) W3C Datetime validity, (9) <changefreq>/<priority> usage, flagged as informational because Google ignores these, (10) XML entity escaping. Each check links to the underlying spec or Google Search Central documentation.
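As a rough illustration of checks (1), (3), and (4), here is a minimal sketch using Python's standard library. It is not the checker's actual implementation; the function name `inspect_sitemap` and the shape of the returned dictionary are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_ENTRIES = 50_000  # per-file entry cap cited in the answer above

def inspect_sitemap(xml_bytes: bytes) -> dict:
    """Hypothetical helper: report root kind, namespace, and entry count."""
    root = ET.fromstring(xml_bytes)
    # ElementTree renders namespaced tags as "{namespace}localname".
    if root.tag.startswith("{"):
        namespace, _, local = root.tag[1:].partition("}")
    else:
        namespace, local = "", root.tag
    # <urlset> holds <url> entries; <sitemapindex> holds <sitemap> entries.
    child = {"urlset": "url", "sitemapindex": "sitemap"}.get(local)
    entries = 0
    if child is not None:
        wanted = f"{{{namespace}}}{child}" if namespace else child
        entries = sum(1 for element in root if element.tag == wanted)
    return {
        "kind": local if child is not None else None,
        "namespace_ok": namespace == SITEMAP_NS,
        "entries": entries,
        "within_cap": entries <= MAX_ENTRIES,
    }
```

A root element other than <urlset> or <sitemapindex> yields `kind` of `None`, which a caller could report as a failed root-element check.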

Question 2

Does Google still use <changefreq> and <priority>?

Accepted Answer

No. Google Search Central explicitly states that it ignores both <changefreq> and <priority>. Many 2026 SEO guides still recommend setting these values; that advice is stale. Our checker flags their usage as informational (not an error) so you can clean them up at your next sitemap regeneration. <lastmod> IS still used by Google, but only when Google considers it 'consistently and verifiably accurate'; stale <lastmod> values are discounted.
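One way to picture this behaviour is the sketch below, which reports the two ignored tags as informational notes and checks that each lastmod value has the shape of a W3C Datetime. The function name `review_optional_tags` and the note format are assumptions for illustration, and the regular expression checks only the shape of the value (it does not reject a month of 13, for example).

```python
import re
import xml.etree.ElementTree as ET

# Shape of W3C Datetime: YYYY, YYYY-MM, YYYY-MM-DD, or a full timestamp
# whose time part must carry a zone designator (Z or +hh:mm / -hh:mm).
W3C_DATETIME = re.compile(
    r"^\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?)?)?$"
)

def review_optional_tags(xml_bytes: bytes) -> list:
    """Hypothetical helper: return (level, message) notes in document order."""
    notes = []
    for element in ET.fromstring(xml_bytes).iter():
        local = element.tag.rsplit("}", 1)[-1]  # drop the "{namespace}" part
        text = (element.text or "").strip()
        if local in ("changefreq", "priority"):
            notes.append(("info", f"<{local}> is ignored by Google"))
        elif local == "lastmod" and not W3C_DATETIME.match(text):
            notes.append(("error", f"<lastmod> is not W3C Datetime: {text!r}"))
    return notes
```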

Question 3

What are Google's actual sitemap limits?

Accepted Answer

50,000 URLs per file, 50 MB uncompressed per file (Google Search Central, verified June 2026). A sitemap index can reference up to 50,000 child sitemaps, each with its own 50K/50MB cap. News sitemaps have a separate hard cap of 1,000 URLs. Gzip compression is allowed; the 50 MB limit still applies to the uncompressed size. UTF-8 encoding is required. URLs must be fully-qualified absolute URLs with all special characters entity-escaped (& → &amp;).
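The size and escaping rules can be sketched as follows. This is only an illustration under stated assumptions: the helper names `uncompressed_size`, `within_size_cap`, and `loc_element` are made up for this example, and the 50 MB cap is taken as 52,428,800 bytes, the figure given on sitemaps.org.

```python
import gzip
from xml.sax.saxutils import escape

MAX_BYTES = 50 * 1024 * 1024  # 52,428,800 bytes, measured after decompression

def uncompressed_size(payload: bytes) -> int:
    """Return the payload size in bytes, decompressing gzip input first."""
    if payload[:2] == b"\x1f\x8b":  # gzip magic number
        payload = gzip.decompress(payload)
    return len(payload)

def within_size_cap(payload: bytes) -> bool:
    """True when the uncompressed sitemap fits within the 50 MB cap."""
    return uncompressed_size(payload) <= MAX_BYTES

def loc_element(url: str) -> str:
    """Wrap a URL in <loc>, escaping &, <, > and both quote characters."""
    return f"<loc>{escape(url, {chr(39): '&apos;', chr(34): '&quot;'})}</loc>"

print(loc_element("https://example.com/view?a=1&b=2"))
# prints <loc>https://example.com/view?a=1&amp;b=2</loc>
```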

Question 4

Do AI crawlers like GPTBot and ClaudeBot use sitemap.xml?

Accepted Answer

Not officially documented. GPTBot (OpenAI), ClaudeBot (Anthropic), and PerplexityBot all obey robots.txt — and robots.txt can include a Sitemap: directive. That's the only documented hook for AI crawlers to discover your sitemap. Whether each AI crawler actually fetches and uses sitemap.xml the way Googlebot does is not publicly confirmed by OpenAI, Anthropic, or Perplexity. The safest practice is to declare Sitemap: in robots.txt and keep the sitemap reachable — if AI crawlers want to use it, they can. Check whether AI crawlers can access your site with our AI Bot Checker.
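To confirm that a robots.txt file actually declares a sitemap, Python's standard `urllib.robotparser` module can read the Sitemap: lines (its `site_maps()` method requires Python 3.8 or later). The wrapper name `declared_sitemaps` and the sample robots.txt content are assumptions for this sketch, which says nothing about whether any given crawler acts on the declaration.

```python
from urllib.robotparser import RobotFileParser

def declared_sitemaps(robots_txt: str) -> list:
    """Hypothetical helper: list the Sitemap: URLs found in robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap: line is present.
    return parser.site_maps() or []

# Illustrative robots.txt content, not taken from any real site.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

print(declared_sitemaps(SAMPLE_ROBOTS))
# prints ['https://example.com/sitemap.xml']
```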

Question 5

What's the difference between a sitemap and a sitemap index?

Accepted Answer

A regular sitemap (<urlset>) lists individual page URLs. A sitemap index (<sitemapindex>) lists OTHER sitemaps, which is useful for large sites with more than 50,000 URLs (the per-file Google cap). For example: an ecommerce site with 500,000 products would need at least 10 child sitemaps, each with up to 50,000 URLs, referenced from one sitemap-index.xml at the root. This checker auto-detects which kind it found and reports accordingly.
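The split into child sitemaps plus one index can be sketched like this. The function name `build_sitemaps`, the `sitemap-N.xml` naming scheme, and the `base` parameter are assumptions for illustration; a production generator would also add <lastmod> values and write the files to disk.

```python
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemaps(urls: list, base: str, chunk_size: int = 50_000) -> dict:
    """Hypothetical helper: map file names to XML for children plus an index."""
    files = {}
    for start in range(0, len(urls), chunk_size):
        name = f"sitemap-{start // chunk_size + 1}.xml"
        entries = "".join(
            f"<url><loc>{escape(url)}</loc></url>"
            for url in urls[start:start + chunk_size]
        )
        files[name] = f'<urlset xmlns="{SITEMAP_NS}">{entries}</urlset>'
    # The index lists each child sitemap by its absolute URL.
    index_entries = "".join(
        f"<sitemap><loc>{escape(base + name)}</loc></sitemap>" for name in files
    )
    files["sitemap-index.xml"] = (
        f'<sitemapindex xmlns="{SITEMAP_NS}">{index_entries}</sitemapindex>'
    )
    return files
```

With 500,000 URLs and the default chunk size, this produces 10 child files and one index.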

Question 6

Why does the checker say my sitemap is on the wrong host?

Accepted Answer

Google's path-match rule: a sitemap can only list URLs at the same protocol + host as the sitemap itself, UNLESS you've set up Search Console cross-site verification. A sitemap at https://example.com/sitemap.xml listing URLs at https://blog.example.com/* will trigger this warning, because Google is likely to ignore the cross-host entries in that sitemap. Fix: either move the sitemap to the same host as the URLs, or set up cross-domain sitemap verification in Search Console.
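A minimal sketch of the protocol + host comparison, assuming a hypothetical helper named `off_host_urls`. It compares only scheme and host, and it does not model the Search Console verification exception.

```python
from urllib.parse import urlsplit

def off_host_urls(sitemap_url: str, listed_urls: list) -> list:
    """Return listed URLs whose scheme or host differs from the sitemap's."""
    origin = urlsplit(sitemap_url)
    expected = (origin.scheme, origin.netloc.lower())
    mismatched = []
    for url in listed_urls:
        parts = urlsplit(url)
        if (parts.scheme, parts.netloc.lower()) != expected:
            mismatched.append(url)
    return mismatched

print(off_host_urls(
    "https://example.com/sitemap.xml",
    ["https://example.com/a", "https://blog.example.com/b"],
))
# prints ['https://blog.example.com/b']
```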

Question 7

What schema namespace should my sitemap use?

Accepted Answer

Use the Sitemap Protocol 0.9 namespace: xmlns="http://www.sitemaps.org/schemas/sitemap/0.9". The protocol was last updated November 21, 2016, and no newer version exists. The same namespace applies to both <urlset> and <sitemapindex> roots. For specialized sitemaps (image, video, news), additional namespaces are layered on top: xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" for image sitemaps, for example.
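To show how the two namespaces sit together on one root element, here is a small sketch that builds a single-entry image sitemap with Python's `xml.etree.ElementTree` (the `xml_declaration` argument needs Python 3.8 or later). The function name `image_sitemap` and the example URLs are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Serialize the sitemap namespace as the default (unprefixed) namespace
# and the image extension under the "image" prefix.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)

def image_sitemap(page_url: str, image_url: str) -> bytes:
    """Hypothetical helper: build a one-entry image sitemap as UTF-8 bytes."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
    ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
    image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
    ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)

print(image_sitemap("https://example.com/", "https://example.com/hero.png").decode("utf-8"))
```

The output declares both xmlns and xmlns:image on the <urlset> root, with the image entry written as <image:image> containing <image:loc>.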

Question 8

How often should I update my sitemap?

Accepted Answer

Automatically, every time a page is published or significantly updated. Static or quarterly-regenerated sitemaps with stale <lastmod> values get discounted by Google. Most modern CMSes and site frameworks regenerate sitemaps on each publish or build (WordPress, Next.js, Shopify, Webflow, Squarespace). If you're hand-rolling a sitemap, set up an automated build that runs at deploy time. Stale sitemaps are a common cause of indexability issues.

