Schema · Verified June 2026

Schema Markup for AI Search (2026): What Works, What's Deprecated, What to Implement


Schema is not an AI ranking signal — it's extractability infrastructure. John Mueller has stated this multiple times; Ahrefs replicated with a difference-in-differences study (1,885 pages, May 2026) showing AI citation lift indistinguishable from zero. But five schema types still earn their keep in 2026: Organization (Knowledge Graph feed for Gemini), Article (datePublished/dateModified freshness signal), BreadcrumbList (still gets rich results), Product (AI shopping eligibility), and Speakable BETA (US news only). FAQPage rich results retired May 7, 2026 — schema still parsed, no SERP feature. HowTo retired September 13, 2023. Implementation order, six common mistakes, and a per-engine honesty table below.

1,885: Ahrefs DiD treated pages (AI citation lift ≈ 0)

May 7, 2026: FAQPage rich results retired

v30.0: current Schema.org vocabulary

BETA: Speakable still beta (US news, EN only)

Sources: Ahrefs, Google Search Central, Schema.org

What schema markup actually does (and doesn't do) for AI

Schema doesn't lift rankings. Google's John Mueller has stated this multiple times — most recently quoted across SEO industry coverage in 2024–2026: “Structured data won't make your site rank better.” The claim isn't new and isn't controversial inside Google. The Ahrefs May 2026 difference-in-differences study replicated this with empirical data: 1,885 treated pages (added JSON-LD between Aug 2025 and Mar 2026) vs ~4,000 matched control pages. Result: AI Mode +2.4%, ChatGPT +2.2% (both statistically indistinguishable from zero), AI Overviews −4.6% (significant but in the opposite direction). The study's conclusion: adding schema did NOT increase AI citations.
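The difference-in-differences logic behind that result can be sketched in a few lines. The citation rates here are hypothetical placeholders, not figures from the Ahrefs study, and the function name is an assumption made for illustration.

```python
def did_estimate(treated_pre: float, treated_post: float,
                 control_pre: float, control_post: float) -> float:
    """Difference-in-differences: change in treated minus change in control."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical citation rates (share of tracked prompts that cite the page).
# Illustrative placeholders only, NOT data from the Ahrefs study.
lift = did_estimate(
    treated_pre=0.100, treated_post=0.112,   # pages that added JSON-LD
    control_pre=0.100, control_post=0.110,   # matched pages that did not
)
print(f"estimated lift attributable to schema: {lift:+.3f}")
```

The treated pages improved on their own, but so did the controls. Subtracting the control trend is what keeps a general rise in AI citations from being credited to schema.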

Schema DOES expose entity relationships. Organization JSON-LD with sameAs links to Wikipedia, Wikidata, LinkedIn, X, and Crunchbase feeds Google's Knowledge Graph — which Gemini grounds on. This is the strongest indirect path between schema and AI citation. Not a ranking signal in the SERP sense; an identity signal in the entity-graph sense.

Schema DOES expose freshness. Article dateModified is the freshness lever AI crawlers and Googlebot both use. Pages without it default to first-crawl date; pages with stale dateModified that doesn't reflect actual changes get deprioritized as “fake fresh.” Update only when content actually changes.

Schema DOES gate rich results that still exist. BreadcrumbList, Product, Review, Recipe, JobPosting, Event, and a few others still earn SERP features. FAQPage and HowTo no longer do (retired May 7, 2026 and September 13, 2023, respectively).

Schema DOES gate AI shopping eligibility. Product schema with offers, aggregateRating, brand, and gtin/mpn is required for Google Merchant Center and the emerging AI shopping surfaces (AI Overviews shopping panels, ChatGPT Shopping). Not optional if you sell products.

Note on the Princeton GEO paper. The Princeton GEO benchmark (Aggarwal et al., KDD 2024, arXiv:2311.09735) tested nine content-level methods for AI citation lift — Quotation Addition (+42.6%), Statistics Addition (+32.8%), Cite Sources (+27.7%), and others. Schema was not among the nine methods tested. All Princeton levers are text-level content manipulations. If you're looking for content levers with measured citation lift, that's where they are — not in JSON-LD.

The 5 schema types that still earn their keep in 2026

Most of Schema.org's 823 types are noise for most sites. These five do real work.

Organization

Knowledge Graph feed

Gemini and Google Search rely on Knowledge Graph for entity disambiguation. Organization schema with logo, sameAs links to Wikipedia/Wikidata/LinkedIn/X, and contactPoint feeds that graph. The strongest single sitewide schema investment.

Where to put it: Once, sitewide, in <head> via root layout. Not duplicated per page.
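A minimal sketch of a sitewide Organization block, assuming it is rendered once by a server-side template. The helper name, company details, and placeholder Wikidata ID are all hypothetical; contactPoint is left out for brevity.

```python
import json


def organization_jsonld(name: str, url: str, logo: str, same_as: list[str]) -> str:
    """Return a <script> tag holding Organization JSON-LD for the root layout."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo,          # absolute URL
        "sameAs": same_as,     # profiles that identify the same entity
    }
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical values; replace with your real entity data.
tag = organization_jsonld(
    name="Example Co",
    url="https://www.example.com",
    logo="https://www.example.com/logo.png",
    same_as=[
        "https://www.wikidata.org/wiki/Q0000000",  # placeholder ID
        "https://www.linkedin.com/company/example-co",
    ],
)
print(tag)
```

Emitting the tag from one function called by the root layout is one way to avoid the per-page duplication warned about above.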

Article (or NewsArticle / BlogPosting)

datePublished + dateModified freshness signal + author attribution

Editorial content needs to expose datePublished, dateModified, headline, and author Person. AI crawlers use dateModified as a freshness heuristic. Person authority via sameAs graph compounds with E-E-A-T.

Where to put it: Every editorial page. Match @type to content (NewsArticle for news, BlogPosting for blog, Article for evergreen).
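As a rough illustration, the type choice and the two date fields could be produced like this. The `kind` labels, function name, and author details are assumptions made for the example; publisher and image are omitted to keep it short.

```python
import json
from datetime import datetime, timezone

# Hypothetical mapping from an internal content label to a Schema.org type.
ARTICLE_TYPES = {"news": "NewsArticle", "blog": "BlogPosting", "evergreen": "Article"}


def article_jsonld(kind, headline, published, modified, author_name, author_same_as):
    """Build Article-family JSON-LD with ISO 8601 publish and modify dates."""
    if modified < published:
        raise ValueError("dateModified cannot precede datePublished")
    return {
        "@context": "https://schema.org",
        "@type": ARTICLE_TYPES[kind],
        "headline": headline,
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
        "author": {"@type": "Person", "name": author_name, "sameAs": author_same_as},
    }


doc = article_jsonld(
    kind="blog",
    headline="Schema markup for AI search",
    published=datetime(2026, 1, 10, 9, 0, tzinfo=timezone.utc),
    modified=datetime(2026, 6, 2, 14, 30, tzinfo=timezone.utc),
    author_name="Jane Doe",  # hypothetical author
    author_same_as=["https://www.linkedin.com/in/jane-doe-example"],
)
print(json.dumps(doc, indent=2))
```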

BreadcrumbList

Site structure for AI crawler navigation + Google breadcrumb rich results

BreadcrumbList still earns Google rich results (the only type in this list that consistently does on most templates) and gives AI crawlers a topical hierarchy. Cheap to ship, compounding return.

Where to put it: Every page with a breadcrumb trail. Match visible breadcrumb exactly.
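One way to keep the markup and the visible trail in sync is to generate both from the same list. A sketch, with a hypothetical helper name and example URLs:

```python
import json


def breadcrumb_jsonld(trail):
    """trail: list of (name, url) pairs in the same order as the visible breadcrumb."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)  # positions run 1..N
        ],
    }


# Hypothetical trail; feed the same list to the template that renders the breadcrumb.
crumbs = breadcrumb_jsonld([
    ("Home", "https://www.example.com/"),
    ("Guides", "https://www.example.com/guides/"),
    ("Schema for AI search", "https://www.example.com/guides/schema-ai-search/"),
])
print(json.dumps(crumbs, indent=2))
```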

Product

Merchant Center + AI shopping surface eligibility (AI Overviews shopping, ChatGPT Shopping)

Product schema with offers, aggregateRating, brand, and gtin/mpn feeds both Google Merchant Center and the emerging AI shopping surfaces. Required for shopping results in AI Overviews.

Where to put it: Every product page. Match visible price, availability, ratings exactly.
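A sketch of a Product block built from the same values the page template displays, so price, availability, and ratings cannot drift apart. The function name and all product data are hypothetical.

```python
import json


def product_jsonld(name, brand, gtin, price, currency, in_stock, rating, review_count):
    """Build Product JSON-LD from the values the product page displays."""
    availability = ("https://schema.org/InStock" if in_stock
                    else "https://schema.org/OutOfStock")
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "brand": {"@type": "Brand", "name": brand},
        "gtin": gtin,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",       # same figure the page displays
            "priceCurrency": currency,
            "availability": availability,
        },
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating,
            "reviewCount": review_count,
        },
    }


# Hypothetical product data.
product = product_jsonld(
    name="Trail Bottle 750 ml", brand="Example Co", gtin="00012345678905",
    price=49.9, currency="USD", in_stock=True, rating=4.6, review_count=128,
)
print(json.dumps(product, indent=2))
```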

Speakable (BETA)

Voice-AI signal for Google Assistant

Still BETA in 2026, US news publishers only, English only, onboarding submission required. Worth implementing on TL;DR blocks if you publish news and care about voice surfaces. Skip if you're not in the eligible category.

Where to put it: On answer-first TL;DR sections only. Mark the speakable selectors (data-speakable=true + JSON-LD selectors).
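A minimal sketch of the JSON-LD side, assuming the TL;DR block carries a data-speakable="true" attribute. That attribute is treated here as an ordinary CSS hook chosen by the page author, not as anything Schema.org defines; the helper name and page details are hypothetical.

```python
import json


def speakable_jsonld(page_name, page_url, css_selectors):
    """Build WebPage JSON-LD whose speakable property points at CSS selectors."""
    return {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "name": page_name,
        "url": page_url,
        "speakable": {
            "@type": "SpeakableSpecification",
            "cssSelector": css_selectors,
        },
    }


# Hypothetical page; the selector matches <section data-speakable="true">.
page = speakable_jsonld(
    page_name="City council approves transit plan",
    page_url="https://news.example.com/transit-plan",
    css_selectors=['[data-speakable="true"]'],
)
print(json.dumps(page, indent=2))
```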

Deprecated, retired, and dead schema (don't waste time)

Schema advice from 2019–2022 hasn't aged well. These are the four things still appearing in stale “SEO checklists” that aren't worth the implementation cost.

FAQPage rich results

Rich results retired May 7, 2026

Google removed FAQPage from the rich results gallery on May 7, 2026. The schema is still parsed for understanding — and AI crawlers extract from it — but no SERP rich result is rendered. Verdict: still worth shipping for AI extraction; do NOT pitch it as a 'get rich results' tactic.

HowTo rich results

Deprecated September 13, 2023

Google announced the change in August 2023 and had fully removed HowTo rich results on both mobile and desktop by September 13, 2023. The schema is still valid Schema.org vocabulary but earns no SERP feature. Skip new HowTo implementations unless you have a separate reason.

Meta keywords tag

Dead since September 2009

Google publicly confirmed it stopped using the meta keywords tag for ranking in September 2009. It still appears in 'SEO checklists' written by people who haven't updated since 2008. Do not implement.

Schema as a ranking signal

Not a ranking signal — confirmed by Mueller, replicated by Ahrefs

John Mueller has stated multiple times that structured data won't make your site rank better. Ahrefs ran a difference-in-differences study (1,885 treated pages vs ~4,000 matched controls, May 2026) and found AI Mode +2.4%, ChatGPT +2.2% (both statistically indistinguishable from zero), and AI Overviews −4.6% (significant in the opposite direction). Adding schema did not increase AI citations.

Which AI crawlers parse schema (honest table)

The honest answer: primary documentation from OpenAI, Anthropic, and Perplexity does NOT explicitly confirm JSON-LD parsing. Many SEO sources assert these bots parse structured data; the official docs cover user agents and robots.txt behavior but stay silent on JSON-LD specifically. Don't treat schema parsing as a verified AI-side capability — treat it as extractability infrastructure that compounds with rendered HTML.

Crawler | Operator | Purpose | Parses JSON-LD?

GPTBot | OpenAI | Training data crawl | Not confirmed in primary docs

OpenAI's official bot docs document user agents, IP ranges, and robots.txt behavior, but do NOT explicitly confirm JSON-LD parsing. Third-party SEO sources assert they do; the official docs are silent.

OAI-SearchBot | OpenAI | ChatGPT Search live retrieval | Not confirmed in primary docs

Same as GPTBot. OAI-SearchBot fetches live web content for ChatGPT Search responses; whether it specifically parses JSON-LD vs rendered HTML is not documented.

ChatGPT-User | OpenAI | User-triggered browse (links the user opens) | Not confirmed in primary docs

Fires only when a ChatGPT user explicitly clicks a link. Behavior similar to a browser.

ClaudeBot | Anthropic | Training + content fetch | Not confirmed in primary docs

Anthropic's crawler documentation does not explicitly address structured data parsing. There are three Claude crawlers (claude-web, ClaudeBot, anthropic-ai).

PerplexityBot | Perplexity AI | Live web search index | Not confirmed in primary docs

Perplexity's documentation covers robots.txt compliance and user agents; JSON-LD parsing is not explicitly addressed. Perplexity sources responses from live web retrieval.

Google-Extended | Google | Bard/Gemini training opt-out token (not a separate crawler) | Same as Googlebot

Google-Extended is a robots.txt token that controls whether Googlebot-crawled content is used for Bard/Gemini training. Schema parsing follows Googlebot behavior, and Googlebot explicitly does parse JSON-LD per Google's own Rich Results documentation.

Applebot-Extended | Apple | Apple Intelligence training opt-out | Not documented

Apple introduced Applebot-Extended as a separate opt-out for Apple Intelligence. JSON-LD parsing behavior is not publicly documented.

Verification status. Googlebot explicitly parses JSON-LD per Google's own Rich Results documentation. Google-Extended follows Googlebot. For OpenAI, Anthropic, Perplexity, and Apple bots, JSON-LD parsing is asserted in third-party SEO blogs but is not documented in primary operator docs. We mark these as “not confirmed” rather than confabulating.

Implementation order (do this, in this order)

Seven steps from sitewide foundations to per-page additions. Stop when the next step doesn't apply to your content.

1. Organization (sitewide, once)

Add Organization JSON-LD once in your root layout. Include name, url, logo (absolute URL, square), sameAs (Wikipedia/Wikidata if eligible, LinkedIn, X, Crunchbase, your verified Mastodon/Bluesky), and contactPoint. Do NOT duplicate per page.

2. WebSite (sitewide, once)

Add WebSite schema with name and url; Google reads it as a site-name signal. A potentialAction for site search (SearchAction) is still valid markup, but it no longer earns a Sitelinks Search Box: Google retired that feature in November 2024.

3. BreadcrumbList (every page with a trail)

Match visible breadcrumb to JSON-LD position values 1..N exactly. This is the cheapest schema with the most reliable rich-result return in 2026.

4. Article / BlogPosting / NewsArticle (every editorial page)

Include headline (≤110 chars to be safe), datePublished, dateModified, author (Person with sameAs), publisher (Organization), image (1200×675 minimum), and inLanguage. dateModified is the freshness lever AI crawlers use.

5. Product (every commerce page)

Match visible price, availability, currency, brand, and aggregateRating exactly. Ahrefs DiD study showed schema doesn't lift AI citations — but Product schema IS gating for Merchant Center and AI shopping surfaces. Required, not optional.

6. Speakable (TL;DR blocks on news pages, optional)

Only if you publish news, English-only, and care about voice AI surfaces. Mark TL;DR or answer-first sections with data-speakable=true and JSON-LD speakable selectors. Skip for non-news sites.

7. FAQPage (Q&A blocks: for AI extraction, not rich results)

Rich results retired May 7, 2026. Still parsed for understanding. Implement FAQPage on Q&A blocks ONLY when the answers are visible on-page and match the markup. Do not put answers in JSON-LD that are not rendered on the page.
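The content-match rule in step 7 can be enforced at build time. The sketch below assumes you can obtain the page's visible text as a plain string; the function name, the simple substring check, and the sample Q&A are all illustrative choices, not a standard.

```python
import json


def faq_jsonld(qa_pairs, visible_text):
    """Build FAQPage JSON-LD, keeping only Q&A pairs whose answer is on the page."""
    normalized_page = " ".join(visible_text.split()).lower()
    entities = []
    for question, answer in qa_pairs:
        normalized_answer = " ".join(answer.split()).lower()
        if normalized_answer not in normalized_page:
            continue  # answer is not rendered on the page, so keep it out of the markup
        entities.append({
            "@type": "Question",
            "name": question,
            "acceptedAnswer": {"@type": "Answer", "text": answer},
        })
    return {"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": entities}


# Hypothetical page text and Q&A pairs; the second answer is not on the page.
page_text = "Does schema lift rankings? No. Schema is extractability infrastructure."
faq = faq_jsonld(
    [
        ("Does schema lift rankings?", "No. Schema is extractability infrastructure."),
        ("Is schema a secret ranking hack?", "Yes, add it everywhere."),
    ],
    page_text,
)
print(json.dumps(faq, indent=2))
```

Only the first pair survives, because the second answer never appears in the rendered text.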

Need to generate the JSON-LD? Our free JSON-LD Schema Generator builds Article, FAQ, Product, Organization, and Breadcrumb schemas — spec-compliant, paste-ready. No signup, no rate limit.

6 schema mistakes that hurt AI extraction

What goes wrong in practice. Audit your own pages against this list before believing the plugin output is clean.

JSON-LD content not matching visible content

Ahrefs May 2026 DiD study explicitly flagged content-match: schema with hidden content (FAQ answers in JSON-LD but not on the page) underperformed even baseline. AI engines and Google both treat content-match violations as a quality signal.

Multiple conflicting Organization blocks

Some CMSes emit Organization schema from theme, plugin, AND manual JSON-LD. AI crawlers ingesting conflicting name/logo/sameAs values lose entity confidence. Audit for duplicates with Google Rich Results Test.
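For a quick local audit, duplicate Organization blocks can be found with the standard library alone. This is a sketch under simplifying assumptions: it only looks at top-level objects with "@type": "Organization" (no @graph or type arrays), and the class name, function names, and sample HTML are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def organization_blocks(html):
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    return [b for b in collector.blocks
            if isinstance(b, dict) and b.get("@type") == "Organization"]


def conflicting_fields(orgs, fields=("name", "logo", "sameAs")):
    """Return the fields whose values differ across Organization blocks."""
    return [f for f in fields
            if len({json.dumps(o.get(f), sort_keys=True) for o in orgs}) > 1]


# Hypothetical page where a theme and a plugin each emit an Organization block.
html = """
<head>
<script type="application/ld+json">{"@type": "Organization", "name": "Example Co", "logo": "https://www.example.com/logo.png"}</script>
<script type="application/ld+json">{"@type": "Organization", "name": "Example Company Inc", "logo": "https://www.example.com/logo.png"}</script>
</head>
"""
orgs = organization_blocks(html)
print(len(orgs), conflicting_fields(orgs))
```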

sameAs pointing to dead profiles

sameAs is the entity-graph link. Pointing to a defunct Mastodon instance, retired Crunchbase URL, or hijacked old Twitter handle pollutes your entity. Audit annually.

Missing or stale dateModified

AI crawlers use dateModified as a freshness heuristic. Pages without it default to first-crawl date. Pages with dateModified that doesn't actually change get deprioritized as 'fake fresh'. Update dateModified only when content actually changes.
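One way to keep dateModified honest is to tie it to a fingerprint of the content, so that a redeploy or template change does not move the date. A minimal sketch, assuming the previous hash and date are stored wherever your CMS keeps page metadata; the function names are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def content_fingerprint(body_text: str) -> str:
    """Hash the body text after collapsing whitespace, so reflows don't count as edits."""
    normalized = " ".join(body_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def next_date_modified(body_text, stored_hash, stored_modified, now=None):
    """Return (hash, dateModified); the date moves only if the content changed."""
    new_hash = content_fingerprint(body_text)
    if new_hash == stored_hash:
        return stored_hash, stored_modified
    now = now or datetime.now(timezone.utc)
    return new_hash, now.isoformat()


# Hypothetical edit history for one page.
h1, d1 = next_date_modified("Schema is infrastructure.", None, None,
                            now=datetime(2026, 1, 10, tzinfo=timezone.utc))
h2, d2 = next_date_modified("Schema   is infrastructure.", h1, d1,   # whitespace only
                            now=datetime(2026, 6, 1, tzinfo=timezone.utc))
h3, d3 = next_date_modified("Schema is extractability infrastructure.", h1, d1,
                            now=datetime(2026, 6, 2, tzinfo=timezone.utc))
print(d1, d2, d3)
```

The whitespace-only change keeps the January date, while the real edit moves dateModified to June 2.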

Schema inside <noscript> or rendered client-side only

AI crawlers (and Googlebot's first pass) read the initial HTML. JSON-LD injected after hydration by client-side JS may be missed. Server-render schema in <head> or before </body>.
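A rough check for this mistake is to parse the raw server response, with no JavaScript executed, and count where the JSON-LD sits. The sketch assumes you already have the initial HTML as a string (for example from a plain HTTP fetch); the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class JsonLdPlacement(HTMLParser):
    """Count JSON-LD script tags in the raw HTML, split by whether they sit in <noscript>."""

    def __init__(self):
        super().__init__()
        self._noscript_depth = 0
        self.in_initial_html = 0
        self.inside_noscript = 0

    def handle_starttag(self, tag, attrs):
        if tag == "noscript":
            self._noscript_depth += 1
        elif tag == "script" and dict(attrs).get("type") == "application/ld+json":
            if self._noscript_depth:
                self.inside_noscript += 1
            else:
                self.in_initial_html += 1

    def handle_endtag(self, tag):
        if tag == "noscript" and self._noscript_depth:
            self._noscript_depth -= 1


def jsonld_placement(raw_html):
    """Return (blocks in initial HTML, blocks nested inside <noscript>)."""
    parser = JsonLdPlacement()
    parser.feed(raw_html)
    parser.close()
    return parser.in_initial_html, parser.inside_noscript


# Hypothetical server response: one block in <head>, one buried in <noscript>.
raw = """
<html><head>
<script type="application/ld+json">{"@type": "Organization"}</script>
</head><body>
<noscript><script type="application/ld+json">{"@type": "Article"}</script></noscript>
<div id="app"></div>
</body></html>
"""
print(jsonld_placement(raw))
```

If the first count is zero for a page that shows schema in the browser's inspector, the markup is probably being injected after hydration.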

Validating with the retired Structured Data Testing Tool

The classic Structured Data Testing Tool was retired in 2021, when Google handed it over to Schema.org as the Schema Markup Validator. Use Google's Rich Results Test for SERP eligibility checks and Schema.org's Schema Markup Validator for spec compliance. Both are linked in the validate section below.

How to validate your schema

Two official validators plus our free generator. Use them in this order before shipping.

Google Rich Results Test

Checks SERP eligibility for Google rich results. Use first if you care about SERP features.

search.google.com/test/rich-results

Schema.org Schema Markup Validator

Checks Schema.org spec compliance regardless of Google eligibility. Use for non-Google AI crawlers.

validator.schema.org

TurboAudit Schema Generator

Free generator for Article, FAQ, Product, Organization, BreadcrumbList JSON-LD. Paste-ready output.


Note on retired tools. The classic Structured Data Testing Tool at search.google.com/structured-data/testing-tool was retired in 2021, when Google handed it over to Schema.org. If your CI uses it, migrate to Rich Results Test (Google-specific) or Schema.org Markup Validator (spec-compliant).

Does schema help me rank in AI Overviews / ChatGPT / Perplexity?

Per-engine honest answer with the evidence that supports it.

Google AI Overviews

No correlation with citation lift

Ahrefs May 2026 DiD study measured AI Overviews citation change at −4.6% (statistically significant, but in the OPPOSITE direction from a lift). Mueller's repeated public statements align. AIO selection appears to be driven by SERP ranking + extractability of visible content, not JSON-LD presence.

ChatGPT (ChatGPT Search, GPT-5.5)

No measured correlation; OpenAI docs silent

Ahrefs DiD study: ChatGPT +2.2%, indistinguishable from zero. OpenAI's official bot docs do not confirm JSON-LD parsing. ChatGPT citation correlates with Bing index presence + Wikipedia/Reddit authority, not schema.

Google AI Mode

No measured correlation

Ahrefs DiD study: AI Mode +2.4%, indistinguishable from zero. AI Mode uses Gemini's query fan-out across Google's index; same retrieval substrate as AIO.

Perplexity

Not specifically studied; live-web retrieval

Perplexity uses live-web retrieval with mandatory citations. The Ahrefs study didn't isolate Perplexity. Reasonable hypothesis: schema doesn't lift Perplexity citation directly, but Organization sameAs to Wikipedia compounds with Perplexity's strong Wikipedia/Reddit citation pattern.

Gemini (standalone app)

Knowledge Graph dependency means Organization schema indirectly matters

Gemini grounds on Google's Knowledge Graph. Organization + sameAs to Wikipedia/Wikidata is the strongest indirect path. This is the one engine where Organization schema has a defensible (if indirect) connection to citation.

Frequently asked questions

Is schema markup dead for SEO in 2026?
No, but the way it gets pitched is dead. Schema is not a ranking signal (Mueller has stated this multiple times; Ahrefs replicated with a 1,885-page difference-in-differences study in May 2026 showing AI citation lift indistinguishable from zero). What schema DOES do in 2026: feeds entity graphs (Organization → Knowledge Graph → Gemini), exposes freshness (Article dateModified), gates rich results that still exist (BreadcrumbList, Product, Review, Recipe, JobPosting), and gates AI shopping eligibility (Product). Implement it for extractability and rich result eligibility, not for rankings.
Should I still use FAQPage schema in 2026?
Yes, but with realistic expectations. Google retired FAQPage rich results on May 7, 2026, so no SERP feature is rendered anymore. The JSON-LD is still parsed for understanding, and AI crawlers can extract from it for citation. Implement FAQPage on Q&A blocks where the answers are visible on-page and match exactly. Do NOT add JSON-LD answers that are not rendered on the page as a 'hack'; Ahrefs' content-match finding penalizes this.
Does ChatGPT read JSON-LD?
OpenAI's official bot documentation does not explicitly confirm JSON-LD parsing. Third-party SEO sources assert that GPTBot and OAI-SearchBot parse structured data; the official docs cover user agents, IP ranges, and robots.txt behavior but stay silent on JSON-LD. Honest framing: assume the rendered HTML is the primary surface; don't bank on JSON-LD being the parsing path.
Does llms.txt replace schema markup?
No — different jobs. llms.txt is a proposed file at /llms.txt that tells AI crawlers which content to prefer. Schema is in-page structured metadata. The 2026 adoption status of llms.txt is mixed (no major LLM has formally committed to honoring it). Schema is a 12+ year standard with documented Google use. They complement, not replace. See our /tools/llms-txt-generator for the llms.txt side.
Is Speakable schema production-ready in 2026?
Still BETA. Google's Speakable documentation has carried the BETA label since launch and still does in 2026. Eligibility: US news publishers only, English only, onboarding submission required. For most sites: skip Speakable. For US news publishers: it's the only standardized voice-AI signal worth implementing on TL;DR blocks.
Article vs NewsArticle vs BlogPosting: which should I use?
Match the content. NewsArticle for news with a strict publishing date (use this if you have a newsroom). BlogPosting for blog-style editorial content. Article for general evergreen content. NewsArticle and BlogPosting are both subtypes of Article, and all three are valid for Google rich results. The choice does not affect ranking; it affects how AI crawlers and Knowledge Graph classify the content type.
Do I need both Organization and LocalBusiness schema?
If you're a single-location business with a physical address: LocalBusiness only (LocalBusiness inherits from Organization). If you're a multi-location business: Organization sitewide, LocalBusiness per location page. If you're online-only: Organization, no LocalBusiness. Mixing both for the same entity confuses crawlers — pick one based on whether you serve customers at a physical address.
Will AI engines fix my broken schema?
No. AI engines either parse the schema as-given or ignore it. Validation tools (Google Rich Results Test, Schema.org Schema Markup Validator) report errors; AI engines simply skip malformed JSON-LD. Validate every page that ships new schema; don't assume the engine will infer your intent.
Does Yoast / Rank Math / Schema App output AI-friendly schema?
They output spec-compliant schema, which is the minimum bar. Whether it's optimal depends on configuration: many default-installed plugins emit duplicate Organization blocks (one from theme, one from plugin), missing sameAs, or schema-only-content (FAQs in JSON-LD that aren't visible). Audit your plugin output with Google Rich Results Test before assuming it's clean.
How many schema types should I use per page?
Use as many as accurately describe the visible page content. A blog post might reasonably emit BreadcrumbList + Article + FAQPage + Person + Organization. A product page might emit BreadcrumbList + Product + Review + Organization. The cap isn't quantity — it's content-match. Every JSON-LD claim must match visible content. Schema you can't back up with rendered HTML hurts more than it helps.

Sources

  1. Google Search Central — FAQPage rich results deprecation (notice added May 8, 2026; effective May 7, 2026; documentation removed June 15, 2026). developers.google.com/search/docs/appearance/structured-data/faqpage
  2. Google Search Central — HowTo and FAQ changes (announced August 2023, HowTo fully deprecated September 13, 2023). developers.google.com/search/blog/2023/08/howto-faq-changes
  3. Ahrefs — “We Tracked 1,885 Pages Adding Schema. AI Citations Barely Moved.” (May 2026 difference-in-differences study: AI Mode +2.4%, ChatGPT +2.2%, AI Overviews −4.6%). ahrefs.com/blog/schema-ai-citations
  4. John Mueller — “Structured data won't make your site rank better.” (Bluesky, quoted in industry coverage). searchenginejournal.com
  5. Princeton GEO paper — Aggarwal et al., KDD 2024. The 9 tested methods do NOT include schema markup. arXiv:2311.09735
  6. Google Search Central — Speakable (BETA) structured data documentation. developers.google.com/search/docs/appearance/structured-data/speakable
  7. Schema.org — Releases history. v30.0 published March 19, 2026 (823 Types, 1,529 Properties, 19 Datatypes). schema.org/docs/releases.html
  8. OpenAI — Bot documentation (GPTBot, OAI-SearchBot, ChatGPT-User, OAI-AdsBot). Documents user agents, IP ranges, robots.txt behavior; does NOT confirm JSON-LD parsing. developers.openai.com/api/docs/bots
  9. Anthropic — Crawler documentation. Three Claude crawlers (claude-web, ClaudeBot, anthropic-ai); JSON-LD parsing not explicitly addressed. support.anthropic.com
  10. Perplexity — Crawler documentation. PerplexityBot covers live web retrieval; JSON-LD parsing not explicitly addressed. docs.perplexity.ai/guides/bots
  11. Google — Google-Extended robots.txt token (opt-out for Bard/Gemini training; not a separate crawler). developers.google.com/search/docs/crawling-indexing/overview-google-crawlers
  12. Google — Rich Results Test (current SERP eligibility validator). search.google.com/test/rich-results
  13. Schema.org — Schema Markup Validator (spec-compliance validator). validator.schema.org

Audit your schema content-match

Run TurboAudit to flag schema-without-content-match, missing Organization sameAs, stale dateModified, and the six common mistakes above. Free tier: 5 audits per month, no credit card.
