Schema Markup for AI Search (2026): What Works, What's Deprecated, What to Implement
Schema is not an AI ranking signal — it's extractability infrastructure. John Mueller has stated this multiple times; Ahrefs replicated with a difference-in-differences study (1,885 pages, May 2026) showing AI citation lift indistinguishable from zero. But five schema types still earn their keep in 2026: Organization (Knowledge Graph feed for Gemini), Article (datePublished/dateModified freshness signal), BreadcrumbList (still gets rich results), Product (AI shopping eligibility), and Speakable BETA (US news only). FAQPage rich results retired May 7, 2026 — schema still parsed, no SERP feature. HowTo retired September 13, 2023. Implementation order, six common mistakes, and a per-engine honesty table below.
1,885: Ahrefs DiD treated pages (AI citation lift ≈ 0)
May 7, 2026: FAQPage rich results retired
v30.0: current Schema.org vocabulary
BETA: Speakable still beta (US news, EN only)
What schema markup actually does (and doesn't do) for AI
Schema doesn't lift rankings. Google's John Mueller has stated this multiple times — most recently quoted across SEO industry coverage in 2024–2026: “Structured data won't make your site rank better.” The claim isn't new and isn't controversial inside Google. The Ahrefs May 2026 difference-in-differences study replicated this with empirical data: 1,885 treated pages (added JSON-LD between Aug 2025 and Mar 2026) vs ~4,000 matched control pages. Result: AI Mode +2.4%, ChatGPT +2.2% (both statistically indistinguishable from zero), AI Overviews −4.6% (significant but in the opposite direction). The study's conclusion: adding schema did NOT increase AI citations.
Schema DOES expose entity relationships. Organization JSON-LD with sameAs links to Wikipedia, Wikidata, LinkedIn, X, and Crunchbase feeds Google's Knowledge Graph — which Gemini grounds on. This is the strongest indirect path between schema and AI citation. Not a ranking signal in the SERP sense; an identity signal in the entity-graph sense.
Schema DOES expose freshness. Article dateModified is the freshness lever AI crawlers and Googlebot both use. Pages without it default to first-crawl date; pages with stale dateModified that doesn't reflect actual changes get deprioritized as “fake fresh.” Update only when content actually changes.
Schema DOES gate rich results that still exist. BreadcrumbList, Product, Review, Recipe, JobPosting, Event, and a few others still earn SERP features. FAQPage and HowTo no longer do (retired May 7, 2026 and September 13, 2023, respectively).
Schema DOES gate AI shopping eligibility. Product schema with offers, aggregateRating, brand, and gtin/mpn is required for Google Merchant Center and the emerging AI shopping surfaces (AI Overviews shopping panels, ChatGPT Shopping). Not optional if you sell products.
Note on the Princeton GEO paper. The Princeton GEO benchmark (Aggarwal et al., KDD 2024, arXiv:2311.09735) tested nine content-level methods for AI citation lift — Quotation Addition (+42.6%), Statistics Addition (+32.8%), Cite Sources (+27.7%), and others. Schema was not among the nine methods tested. All Princeton levers are text-level content manipulations. If you're looking for content levers with measured citation lift, that's where they are — not in JSON-LD.
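The difference-in-differences design behind the Ahrefs numbers above can be sketched in a few lines. This is a minimal illustration of the general technique, not Ahrefs' actual estimator (their model and matching procedure are not described here), and every number in it is hypothetical.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    return (treated_post - treated_pre) - (control_post - control_pre)

# Hypothetical citation rates (citations per 100 tracked prompts).
# Treated pages added JSON-LD; control pages did not.
naive_lift = 12.0 - 10.0
estimate = did_estimate(
    treated_pre=10.0, treated_post=12.0,
    control_pre=10.0, control_post=11.5,
)

print(f"naive before/after change: {naive_lift:+.1f}")
print(f"difference-in-differences: {estimate:+.1f}")
```

The point of the control group is visible in the two printed values: most of the naive before/after change would have happened anyway, so only the remainder can be attributed to adding schema.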
The 5 schema types that still earn their keep in 2026
Most of Schema.org's 823 types are noise for most sites. These five do real work.
Organization
Knowledge Graph feed
Gemini and Google Search rely on Knowledge Graph for entity disambiguation. Organization schema with logo, sameAs links to Wikipedia/Wikidata/LinkedIn/X, and contactPoint feeds that graph. The strongest single sitewide schema investment.
Where to put it: Once, sitewide, in <head> via root layout. Not duplicated per page.
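A minimal sketch of generating that single sitewide block, assuming a Python build step or server template. The company name, URLs, and email are hypothetical placeholders, and `organization_jsonld` and `to_script_tag` are illustrative helper names, not part of any library.

```python
import json

def organization_jsonld(name, url, logo, same_as, contact_email=None):
    """Build one Organization object for the whole site."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo,  # absolute URL
        "sameAs": list(same_as),
    }
    if contact_email:
        data["contactPoint"] = {
            "@type": "ContactPoint",
            "contactType": "customer support",
            "email": contact_email,
        }
    return data

def to_script_tag(data):
    """Serialize for embedding in <head>. Escaping '<' keeps a value that
    contains '</script>' from closing the tag early."""
    payload = json.dumps(data, ensure_ascii=False, indent=2).replace("<", "\\u003c")
    return f'<script type="application/ld+json">{payload}</script>'

# Hypothetical values for illustration only.
org = organization_jsonld(
    name="Example Co",
    url="https://www.example.com/",
    logo="https://www.example.com/logo.png",
    same_as=[
        "https://www.wikidata.org/wiki/Q0000000",
        "https://www.linkedin.com/company/example-co",
    ],
    contact_email="support@example.com",
)
print(to_script_tag(org))
```

Emitting the tag from one place in the root layout is what keeps the block from being duplicated per page.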
Article (or NewsArticle / BlogPosting)
datePublished + dateModified freshness signal + author attribution
Editorial content needs to expose datePublished, dateModified, headline, and author Person. AI crawlers use dateModified as a freshness heuristic. Person authority via sameAs graph compounds with E-E-A-T.
Where to put it: Every editorial page. Match @type to content (NewsArticle for news, BlogPosting for blog, Article for evergreen).
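One way to keep dateModified honest is to bump it only when the visible body actually changes. The sketch below assumes you store a content hash next to each page; the function names, the stored-hash approach, and the sample values are assumptions for illustration.

```python
import hashlib
from datetime import datetime, timezone

ARTICLE_TYPES = {"Article", "NewsArticle", "BlogPosting"}

def content_fingerprint(body_text):
    """Hash the visible body with whitespace collapsed, so reflowing the
    markup alone does not count as a content change."""
    normalized = " ".join(body_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def next_date_modified(body_text, stored_hash, stored_date, now=None):
    """Return (hash, dateModified); the date moves only if the body changed."""
    new_hash = content_fingerprint(body_text)
    if new_hash == stored_hash:
        return stored_hash, stored_date
    stamp = (now or datetime.now(timezone.utc)).isoformat(timespec="seconds")
    return new_hash, stamp

def article_jsonld(kind, headline, date_published, date_modified,
                   author_name, author_same_as):
    """Build the editorial block; kind must match the content type."""
    if kind not in ARTICLE_TYPES:
        raise ValueError(f"unsupported @type: {kind}")
    return {
        "@context": "https://schema.org",
        "@type": kind,
        "headline": headline,
        "datePublished": date_published,
        "dateModified": date_modified,
        "author": {
            "@type": "Person",
            "name": author_name,
            "sameAs": list(author_same_as),
        },
    }

# Hypothetical page: unchanged body, so dateModified stays put.
old_hash = content_fingerprint("Schema is extractability infrastructure.")
_, modified = next_date_modified(
    "Schema is  extractability infrastructure.", old_hash, "2026-01-10T09:00:00+00:00"
)
post = article_jsonld(
    "BlogPosting", "Schema Markup for AI Search",
    "2026-01-10T09:00:00+00:00", modified,
    "Jane Doe", ["https://www.linkedin.com/in/jane-doe-example"],
)
print(post["dateModified"])
```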
BreadcrumbList
Site structure for AI crawler navigation + Google breadcrumb rich results
BreadcrumbList still earns Google rich results (the only one of these five types that consistently does on most templates) and gives AI crawlers a topical hierarchy. Cheap to ship, compounding return.
Where to put it: Every page with a breadcrumb trail. Match visible breadcrumb exactly.
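A small sketch of deriving the JSON-LD from the same data that renders the visible trail, which is the simplest way to keep the two in sync. The trail contents and the helper name are hypothetical.

```python
import json

def breadcrumb_jsonld(trail):
    """trail: ordered (name, absolute_url) pairs, exactly as rendered on the page.
    Positions are assigned 1..N in that order."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)
        ],
    }

# Hypothetical trail for illustration.
trail = [
    ("Home", "https://www.example.com/"),
    ("Guides", "https://www.example.com/guides/"),
    ("Schema Markup", "https://www.example.com/guides/schema-markup/"),
]
crumbs = breadcrumb_jsonld(trail)
print(json.dumps(crumbs, indent=2))
```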
Product
Merchant Center + AI shopping surface eligibility (AI Overviews shopping, ChatGPT Shopping)
Product schema with offers, aggregateRating, brand, and gtin/mpn feeds both Google Merchant Center and the emerging AI shopping surfaces. Required for shopping results in AI Overviews.
Where to put it: Every product page. Match visible price, availability, ratings exactly.
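The sketch below builds a Product block and then checks it against what the page shows. The product data is invented, and `visible_mismatches` is a hypothetical helper that assumes you can read the rendered price and stock state from your own templates.

```python
import json

IN_STOCK = "https://schema.org/InStock"
OUT_OF_STOCK = "https://schema.org/OutOfStock"

def product_jsonld(name, brand, gtin, price, currency, in_stock,
                   rating_value=None, review_count=None):
    """price is passed as a string so it matches the rendered value exactly."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "brand": {"@type": "Brand", "name": brand},
        "gtin": gtin,
        "offers": {
            "@type": "Offer",
            "price": price,
            "priceCurrency": currency,
            "availability": IN_STOCK if in_stock else OUT_OF_STOCK,
        },
    }
    # Only emit ratings that exist on the page.
    if rating_value is not None and review_count:
        data["aggregateRating"] = {
            "@type": "AggregateRating",
            "ratingValue": rating_value,
            "reviewCount": review_count,
        }
    return data

def visible_mismatches(data, visible_price, visible_in_stock):
    """List the fields where JSON-LD disagrees with the rendered page."""
    problems = []
    offer = data["offers"]
    if offer["price"] != visible_price:
        problems.append("price")
    if (offer["availability"] == IN_STOCK) != visible_in_stock:
        problems.append("availability")
    return problems

# Hypothetical product for illustration.
item = product_jsonld("Trail Shoe", "ExampleBrand", "00012345678905",
                      "89.00", "USD", True, rating_value=4.6, review_count=212)
print(json.dumps(item, indent=2))
print(visible_mismatches(item, visible_price="79.00", visible_in_stock=True))
```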
Speakable (BETA)
Voice-AI signal for Google Assistant
Still BETA in 2026, US news publishers only, English only, onboarding submission required. Worth implementing on TL;DR blocks if you publish news and care about voice surfaces. Skip if you're not in the eligible category.
Where to put it: On answer-first TL;DR sections only. Mark the speakable elements in the HTML (for example with a data-speakable=true attribute) and point the JSON-LD speakable selectors (cssSelector or xpath) at them.
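As a rough sketch, the speakable property holds a SpeakableSpecification whose cssSelector values point at the marked sections. The attribute selector below assumes your templates add data-speakable="true" to TL;DR blocks; the headline and URL are placeholders.

```python
import json

def speakable_news_jsonld(headline, url, css_selectors):
    """NewsArticle with a speakable block targeting the given CSS selectors."""
    return {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        "url": url,
        "speakable": {
            "@type": "SpeakableSpecification",
            "cssSelector": list(css_selectors),
        },
    }

# Hypothetical page; the selector matches elements such as
# <section data-speakable="true">...</section> in the rendered HTML.
news = speakable_news_jsonld(
    headline="FAQPage rich results retired",
    url="https://www.example.com/news/faqpage-retired",
    css_selectors=['[data-speakable="true"]'],
)
print(json.dumps(news, indent=2))
```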
Deprecated, retired, and dead schema (don't waste time)
Schema advice from 2019–2022 hasn't aged well. These are the four things still appearing in stale “SEO checklists” that aren't worth the implementation cost.
FAQPage rich results
Rich results retired May 7, 2026
Google removed FAQPage from the rich results gallery on May 7, 2026. The schema is still parsed for understanding, and AI crawlers extract from it, but no SERP rich result is rendered. Verdict: still worth shipping for AI extraction; do NOT pitch it as a 'get rich results' tactic.
HowTo rich results
Deprecated September 13, 2023
Google announced the change in August 2023 and fully removed mobile and desktop HowTo rich results by September 13, 2023. The schema is still valid Schema.org vocabulary but earns no SERP feature. Skip new HowTo implementations unless you have a separate reason.
Meta keywords tag
Dead since September 2009
Google publicly confirmed it stopped using the meta keywords tag for ranking in September 2009. It still appears in 'SEO checklists' written by people who haven't updated since 2008. Do not implement.
Schema as a ranking signal
Not a ranking signal: confirmed by Mueller, replicated by Ahrefs
John Mueller has stated multiple times that structured data won't make your site rank better. Ahrefs ran a difference-in-differences study (1,885 treated pages vs ~4,000 matched controls, May 2026) and found AI Mode +2.4%, ChatGPT +2.2% (both statistically indistinguishable from zero), and AI Overviews −4.6% (significant in the opposite direction). Adding schema did not increase AI citations.
Which AI crawlers parse schema (honest table)
The honest answer: primary documentation from OpenAI, Anthropic, and Perplexity does NOT explicitly confirm JSON-LD parsing. Many SEO sources assert these bots parse structured data; the official docs cover user agents and robots.txt behavior but stay silent on JSON-LD specifically. Don't treat schema parsing as a verified AI-side capability — treat it as extractability infrastructure that compounds with rendered HTML.
| Crawler | Operator | Purpose | Parses JSON-LD? |
|---|---|---|---|
| GPTBot | OpenAI | Training data crawl | Not confirmed in primary docs. OpenAI's official bot docs cover user agents, IP ranges, and robots.txt behavior, but do NOT explicitly confirm JSON-LD parsing. Third-party SEO sources assert they do; the official docs are silent. |
| OAI-SearchBot | OpenAI | ChatGPT Search live retrieval | Not confirmed in primary docs. Same as GPTBot. OAI-SearchBot fetches live web content for ChatGPT Search responses; whether it specifically parses JSON-LD vs rendered HTML is not documented. |
| ChatGPT-User | OpenAI | User-triggered browse (links the user opens) | Not confirmed in primary docs. Fires only when a ChatGPT user explicitly clicks a link. Behavior similar to a browser. |
| ClaudeBot | Anthropic | Training + content fetch | Not confirmed in primary docs. Anthropic's crawler documentation does not explicitly address structured data parsing. There are three Claude crawlers (claude-web, ClaudeBot, anthropic-ai). |
| PerplexityBot | Perplexity AI | Live web search index | Not confirmed in primary docs. Perplexity's documentation covers robots.txt compliance and user agents; JSON-LD parsing is not explicitly addressed. Perplexity sources responses from live web retrieval. |
| Google-Extended | Google | Bard/Gemini training opt-out token (not a separate crawler) | Same as Googlebot. Google-Extended is a robots.txt token that controls whether Googlebot-crawled content is used for Bard/Gemini training. Schema parsing follows Googlebot behavior, which explicitly does parse JSON-LD per Google's own Rich Results documentation. |
| Applebot-Extended | Apple | Apple Intelligence training opt-out | Not documented. Apple introduced Applebot-Extended as a separate opt-out for Apple Intelligence. JSON-LD parsing behavior is not publicly documented. |
Verification status. Googlebot explicitly parses JSON-LD per Google's own Rich Results documentation. Google-Extended follows Googlebot. For OpenAI, Anthropic, Perplexity, and Apple bots, JSON-LD parsing is asserted in third-party SEO blogs but is not documented in primary operator docs. We mark these as “not confirmed” rather than confabulating.
Implementation order (do this, in this order)
Seven steps from sitewide foundations to per-page additions. Stop when the next step doesn't apply to your content.
Organization (sitewide, once)
Add Organization JSON-LD once in your root layout. Include name, url, logo (absolute URL, square), sameAs (Wikipedia/Wikidata if eligible, LinkedIn, X, Crunchbase, your verified Mastodon/Bluesky), and contactPoint. Do NOT duplicate per page.
WebSite (sitewide, once)
Add WebSite schema with name and url; Google uses it for the site name shown in results. A potentialAction for site search (SearchAction) is harmless to include but no longer earns a Sitelinks Search Box, which Google retired in November 2024.
BreadcrumbList (every page with a trail)
Match visible breadcrumb to JSON-LD position values 1..N exactly. This is the cheapest schema with the most reliable rich-result return in 2026.
Article / BlogPosting / NewsArticle (every editorial page)
Include headline (≤110 chars to be safe), datePublished, dateModified, author (Person with sameAs), publisher (Organization), image (1200×675 minimum), and inLanguage. dateModified is the freshness lever AI crawlers use.
Product (every commerce page)
Match visible price, availability, currency, brand, and aggregateRating exactly. Ahrefs DiD study showed schema doesn't lift AI citations — but Product schema IS gating for Merchant Center and AI shopping surfaces. Required, not optional.
Speakable (TL;DR blocks on news pages — optional)
Only if you publish news, English-only, and care about voice AI surfaces. Mark TL;DR or answer-first sections with data-speakable=true and JSON-LD speakable selectors. Skip for non-news sites.
FAQPage (Q&A blocks — for AI extraction, not rich results)
Rich results retired May 7, 2026. Still parsed for understanding. Implement FAQPage on Q&A blocks ONLY when the answers are visible on-page and match. Do not duplicate visible content into JSON-LD answers without rendering it.
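For the FAQPage step, a build-time guard can refuse to emit JSON-LD for any question or answer that is not present in the rendered page. This is a sketch under the assumption that you can obtain the page's visible text as a string; the helper names and sample Q&A are hypothetical, and the substring match is deliberately simple.

```python
import json

def _norm(text):
    """Collapse whitespace and case so trivial formatting differences pass."""
    return " ".join(text.split()).lower()

def faq_jsonld(pairs, visible_text):
    """pairs: (question, answer) tuples. Raises if any pair is not on the page."""
    page = _norm(visible_text)
    hidden = [q for q, a in pairs if _norm(q) not in page or _norm(a) not in page]
    if hidden:
        raise ValueError(f"not visible on page: {hidden}")
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": q,
                "acceptedAnswer": {"@type": "Answer", "text": a},
            }
            for q, a in pairs
        ],
    }

# Hypothetical rendered text and Q&A for illustration.
visible = """
Should I still use FAQPage schema in 2026?
Yes, for extraction. It no longer earns a rich result.
"""
faq = faq_jsonld(
    [("Should I still use FAQPage schema in 2026?",
      "Yes, for extraction. It no longer earns a rich result.")],
    visible,
)
print(json.dumps(faq, indent=2))
```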
Need to generate the JSON-LD? Our free JSON-LD Schema Generator builds Article, FAQ, Product, Organization, and Breadcrumb schemas — spec-compliant, paste-ready. No signup, no rate limit.
6 schema mistakes that hurt AI extraction
What goes wrong in practice. Audit your own pages against this list before believing the plugin output is clean.
JSON-LD content not matching visible content
Ahrefs May 2026 DiD study explicitly flagged content-match: schema with hidden content (FAQ answers in JSON-LD but not on the page) underperformed even baseline. AI engines and Google both treat content-match violations as a quality signal.
Multiple conflicting Organization blocks
Some CMSes emit Organization schema from theme, plugin, AND manual JSON-LD. AI crawlers ingesting conflicting name/logo/sameAs values lose entity confidence. Audit for duplicates with Google Rich Results Test.
sameAs pointing to dead profiles
sameAs is the entity-graph link. Pointing to a defunct Mastodon instance, retired Crunchbase URL, or hijacked old Twitter handle pollutes your entity. Audit annually.
Missing or stale dateModified
AI crawlers use dateModified as a freshness heuristic. Pages without it default to first-crawl date. Pages with dateModified that doesn't actually change get deprioritized as 'fake fresh'. Update dateModified only when content actually changes.
Schema inside <noscript> or rendered client-side only
AI crawlers (and Googlebot's first pass) read the initial HTML. JSON-LD injected after hydration by client-side JS may be missed. Server-render schema in <head> or before </body>.
Validating with the retired Structured Data Testing Tool
The classic Structured Data Testing Tool was retired in 2023. Use Google's Rich Results Test for SERP eligibility checks and Schema.org's Schema Markup Validator for spec compliance. Both are linked in the validate section below.
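Several of the mistakes above can be caught by inspecting the initial HTML response before any JavaScript runs. The sketch below uses only the Python standard library; it is a rough audit, not a replacement for the official validators, and it assumes top-level JSON-LD objects (it does not unpack @graph arrays). The class and function names are illustrative.

```python
import json
from html.parser import HTMLParser

class JsonLdCollector(HTMLParser):
    """Collect JSON-LD blocks present in server-rendered HTML."""

    def __init__(self):
        super().__init__()
        self.blocks = []           # (parsed_json, inside_noscript) pairs
        self._buffer = None        # list of text chunks while inside a JSON-LD script
        self._noscript_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "noscript":
            self._noscript_depth += 1
        elif tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "noscript":
            self._noscript_depth = max(0, self._noscript_depth - 1)
        elif tag == "script" and self._buffer is not None:
            raw, self._buffer = "".join(self._buffer), None
            try:
                parsed = json.loads(raw)
            except json.JSONDecodeError:
                return  # malformed block; skipped in this sketch
            self.blocks.append((parsed, self._noscript_depth > 0))

def audit_initial_html(html):
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    orgs = [b for b, _ in collector.blocks
            if isinstance(b, dict) and b.get("@type") == "Organization"]
    return {
        "jsonld_blocks": len(collector.blocks),
        "in_noscript": sum(1 for _, ns in collector.blocks if ns),
        "organization_blocks": len(orgs),
        "conflicting_org_names": len({o.get("name") for o in orgs}) > 1,
    }

# Hypothetical page with two conflicting Organization blocks, one in <noscript>.
sample = (
    '<html><head><script type="application/ld+json">'
    '{"@type": "Organization", "name": "Example Co"}</script></head>'
    '<body><noscript><script type="application/ld+json">'
    '{"@type": "Organization", "name": "Example Company Inc"}</script></noscript>'
    '</body></html>'
)
print(audit_initial_html(sample))
```

A page whose JSON-LD is injected only after hydration would report zero blocks here, since the script never appears in the fetched HTML.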
How to validate your schema
Two official validators plus our free generator. Use them in this order before shipping.
Google Rich Results Test
Checks SERP eligibility for Google rich results. Use first if you care about SERP features.
Schema.org Schema Markup Validator
Checks Schema.org spec compliance regardless of Google eligibility. Use for non-Google AI crawlers.
TurboAudit Schema Generator
Free generator for Article, FAQ, Product, Organization, BreadcrumbList JSON-LD. Paste-ready output.
Note on retired tools. The classic Structured Data Testing Tool at search.google.com/structured-data/testing-tool was retired in 2023. If your CI uses it, migrate to Rich Results Test (Google-specific) or Schema.org Markup Validator (spec-compliant).
Does schema help me rank in AI Overviews / ChatGPT / Perplexity?
Per-engine honest answer with the evidence that supports it.
Google AI Overviews
No correlation with citation lift
Ahrefs May 2026 DiD study measured AI Overviews citation change at −4.6% (statistically significant, but in the OPPOSITE direction from a lift). Mueller's repeated public statements align. AIO selection appears to be driven by SERP ranking + extractability of visible content, not JSON-LD presence.
ChatGPT (ChatGPT Search, GPT-5.5)
No measured correlation; OpenAI docs silent
Ahrefs DiD study: ChatGPT +2.2%, indistinguishable from zero. OpenAI's official bot docs do not confirm JSON-LD parsing. ChatGPT citation correlates with Bing index presence + Wikipedia/Reddit authority, not schema.
Google AI Mode
No measured correlation
Ahrefs DiD study: AI Mode +2.4%, indistinguishable from zero. AI Mode uses Gemini's query fan-out across Google's index; same retrieval substrate as AIO.
Perplexity
Not specifically studied; live-web retrieval
Perplexity uses live-web retrieval with mandatory citations. The Ahrefs study didn't isolate Perplexity. Reasonable hypothesis: schema doesn't lift Perplexity citation directly, but Organization sameAs to Wikipedia compounds with Perplexity's strong Wikipedia/Reddit citation pattern.
Gemini (standalone app)
Knowledge Graph dependency means Organization schema indirectly matters
Gemini grounds on Google's Knowledge Graph. Organization + sameAs to Wikipedia/Wikidata is the strongest indirect path. This is the one engine where Organization schema has a defensible (if indirect) connection to citation.
Frequently asked questions
Is schema markup dead for SEO in 2026?
Should I still use FAQPage schema in 2026?
Does ChatGPT read JSON-LD?
Does llms.txt replace schema markup?
Is Speakable schema production-ready in 2026?
Article vs NewsArticle vs BlogPosting: which should I use?
Do I need both Organization and LocalBusiness schema?
Will AI engines fix my broken schema?
Does Yoast / Rank Math / Schema App output AI-friendly schema?
How many schema types should I use per page?
Sources
- Google Search Central — FAQPage rich results deprecation (notice added May 8, 2026; effective May 7, 2026; documentation removed June 15, 2026). developers.google.com/search/docs/appearance/structured-data/faqpage
- Google Search Central — HowTo and FAQ changes (announced August 2023, HowTo fully deprecated September 13, 2023). developers.google.com/search/blog/2023/08/howto-faq-changes
- Ahrefs — “We Tracked 1,885 Pages Adding Schema. AI Citations Barely Moved.” (May 2026 difference-in-differences study: AI Mode +2.4%, ChatGPT +2.2%, AI Overviews −4.6%). ahrefs.com/blog/schema-ai-citations
- John Mueller — “Structured data won't make your site rank better.” (Bluesky, quoted in industry coverage). searchenginejournal.com
- Princeton GEO paper — Aggarwal et al., KDD 2024. The 9 tested methods do NOT include schema markup. arXiv:2311.09735
- Google Search Central — Speakable (BETA) structured data documentation. developers.google.com/search/docs/appearance/structured-data/speakable
- Schema.org — Releases history. v30.0 published March 19, 2026 (823 Types, 1,529 Properties, 19 Datatypes). schema.org/docs/releases.html
- OpenAI — Bot documentation (GPTBot, OAI-SearchBot, ChatGPT-User, OAI-AdsBot). Documents user agents, IP ranges, robots.txt behavior; does NOT confirm JSON-LD parsing. developers.openai.com/api/docs/bots
- Anthropic — Crawler documentation. Three Claude crawlers (claude-web, ClaudeBot, anthropic-ai); JSON-LD parsing not explicitly addressed. support.anthropic.com
- Perplexity — Crawler documentation. PerplexityBot covers live web retrieval; JSON-LD parsing not explicitly addressed. docs.perplexity.ai/guides/bots
- Google — Google-Extended robots.txt token (opt-out for Bard/Gemini training; not a separate crawler). developers.google.com/search/docs/crawling-indexing/overview-google-crawlers
- Google — Rich Results Test (current SERP eligibility validator). search.google.com/test/rich-results
- Schema.org — Schema Markup Validator (spec-compliance validator). validator.schema.org
Audit your schema content-match
Run TurboAudit to flag schema-without-content-match, missing Organization sameAs, stale dateModified, and the six common mistakes above. Free tier: 5 audits per month, no credit card.