AI search engines like ChatGPT, Perplexity, and Google AI Mode rank content differently from traditional Google. Here are the 7 factors that matter most in 2026.

For two decades, SEO was largely a game of backlinks, keyword density, and technical crawlability. Google's PageRank algorithm rewarded pages that other pages trusted. Build enough authority, optimise your title tags, and you ranked.
That model is fracturing.
In 2026, a growing share of search queries never reach a traditional results page at all. Google AI Overviews now appear on an estimated 25–30% of all searches, up from roughly 7% at launch in mid-2024. AI-referred traffic — sessions originating from ChatGPT, Perplexity, Google AI Mode, and similar platforms — grew 527% year-over-year between Q1 2025 and Q1 2026, according to data aggregated across large-scale analytics providers. Perplexity alone processes over 100 million queries per day.
These engines don't rank pages. They synthesise answers — and they choose which sources to cite. The criteria they apply are meaningfully different from classic ranking signals, and a new set of controls has grown up around them: llms.txt, updated robots.txt directives, and noai meta tags.

This is not a minor update to SEO. It is a different discipline, now commonly called Generative Engine Optimisation (GEO). If you want to understand the full scope of GEO and how it relates to traditional SEO, see our deep-dive: What Is GEO — Generative Engine Optimisation?
The good news: the factors that drive AI citation are auditable, measurable, and largely fixable. We have distilled them into 7 ranked signals — the same 7 categories that seo.yatna.ai scores every site on. Here is what each one means in practice.
E-E-A-T (Experience, Expertise, Authoritativeness, and Trustworthiness) is not a new concept — Google's Quality Rater Guidelines have referenced it for years. But in 2026, E-E-A-T has become the dominant signal for AI citation selection, accounting for a full quarter of what our audit model measures.
Why so heavy? Because AI engines are fundamentally in the business of reputation arbitrage. When Perplexity or ChatGPT cites a source, it is implicitly vouching for it. The last thing OpenAI wants is for ChatGPT to recommend a page that later turns out to be written by an anonymous contributor with no track record. So these engines have developed increasingly sophisticated methods of assessing whether a source is genuinely authoritative.
Author identity signals. Is there a byline? Does that author have a verifiable web presence — LinkedIn profile, academic publications, industry conference speaker pages, or a detailed author bio with credentials? Pages with no identifiable human author are systematically downweighted by AI citation engines.
First-hand experience markers. Google's addition of the first "E" (Experience) to the original EAT framework was deliberate. AI engines now look for content that demonstrates direct, personal experience: case study data from the author's own work, specific examples, screenshots, original research, and language patterns that signal lived expertise rather than aggregated secondary sources.
Institutional credibility. Does your domain have a history of being cited by authoritative sources? Do you have a clear About page, a physical address or registered business identity, a public contact email (not just a form), and a privacy policy? These signals collectively tell AI engines that a real, accountable organisation operates this site.
Trust signals. HTTPS (non-negotiable), clear editorial policies, dated and maintained content, no pattern of thin or spammy pages in the crawl history.
Make sure your About section is comprehensive: team bios, company history, contact details, registered address.

Technical SEO has always mattered for discoverability. In the AI search era it matters for a different reason: if an AI crawler cannot read your page, your content does not exist for that engine. Full stop.
This factor carries the same 25% weight as E-E-A-T because there is simply no path to AI citation if the technical foundations are broken.
Traditional Google's Googlebot renders JavaScript using a full headless Chrome instance, then indexes the rendered DOM. This is slow, resource-intensive, and happens on a delay — but it works for JavaScript-heavy sites.
Most AI crawlers do not render JavaScript at all. OpenAI's GPTBot, Anthropic's ClaudeBot, Perplexity's PerplexityBot, and Meta's Meta-ExternalAgent are all HTML-only crawlers. If your Next.js app returns a blank <div id="root"> from the server and populates it client-side, these bots see an empty page. This is a catastrophic and extremely common problem on modern web stacks.
The fix is server-side rendering (SSR) or static generation (SSG) — ensuring that full, complete HTML is returned in the initial HTTP response, before any JavaScript runs.
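As a quick sanity check before reaching for a full SSR migration, a rough heuristic can flag pages that ship an empty client-side shell. The sketch below is illustrative, not a standard: the mount-point IDs (root, app, __next) and the 200-character visible-text floor are assumptions you would tune for your own stack.

```python
import re

def looks_client_rendered(html: str) -> bool:
    """Heuristic: flag raw HTML that is likely an empty SPA shell.

    A page that ships only an empty mount point (e.g. <div id="root"></div>)
    and almost no visible text is invisible to HTML-only AI crawlers.
    """
    # Strip scripts and styles, then measure the remaining visible text.
    stripped = re.sub(r"<(script|style)[^>]*>.*?</\1>", "", html,
                      flags=re.DOTALL | re.IGNORECASE)
    text = re.sub(r"<[^>]+>", " ", stripped)
    visible_chars = len(" ".join(text.split()))
    # Common SPA mount-point IDs (illustrative list, not exhaustive).
    empty_mount = re.search(
        r'<div[^>]*id=["\'](root|app|__next)["\'][^>]*>\s*</div>',
        html, flags=re.IGNORECASE)
    return bool(empty_mount) and visible_chars < 200

# What an HTML-only crawler sees from a client-rendered app:
shell = '<html><body><div id="root"></div><script src="/bundle.js"></script></body></html>'
# A server-rendered equivalent with real content in the response:
ssr = ('<html><body><main><h1>Technical SEO</h1><p>'
       + 'Full content here. ' * 20 + '</p></main></body></html>')
```

Running the raw HTML you fetch with an AI bot's user agent through a check like this tells you immediately whether those crawlers are seeing anything at all.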
Crawl budget efficiency. Keep your robots.txt clean and intentional. Avoid blocking important content paths. Submit a well-structured sitemap that prioritises your most valuable pages.
Clean HTML structure. Semantic HTML5 elements — <article>, <main>, <section>, <header>, <nav> — help AI parsers understand content hierarchy. Div-soup makes extraction harder and lowers citation probability.
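In practice, the difference is between a skeleton like the following (a minimal illustration) and the same content buried in anonymous divs:

```html
<body>
  <header>
    <nav><!-- primary navigation --></nav>
  </header>
  <main>
    <article>
      <h1>Technical SEO Audits</h1>
      <section>
        <h2>How do AI crawlers index JavaScript sites?</h2>
        <p>Answer-first paragraph that a parser can lift verbatim.</p>
      </section>
    </article>
  </main>
</body>
```

A parser can map this tree to "navigation vs. main content vs. discrete answer block" without guesswork; div-only markup forces it to infer that structure, and inference fails more often than extraction.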
Internal link architecture. A clear, shallow link structure means crawlers reach all your content efficiently. Orphaned pages — those with no internal links pointing to them — are frequently missed.
HTTPS and Core Web Vitals. Beyond speed (which we cover in Factor 5), basic connectivity and security matter. Redirect chains, broken links, and mixed-content warnings create crawl failures that silently remove pages from the AI index.
Use curl -A "GPTBot" <url> to test what AI bots actually see, and review robots.txt to ensure no important sections are inadvertently blocked.

This is where traditional SEO wisdom still applies — but the bar has been raised significantly by AI engines that are, themselves, expert content synthesisers.
AI engines cite content that answers questions clearly and directly. They are particularly drawn to content that is structured in a way that matches the question-answer format of conversational queries — because extracting a citation-ready quote is easier when the answer is contained in a discrete, scannable paragraph.
Answer-first structure. The classic journalist's "inverted pyramid" — most important information first — is ideal for AI citation. If someone asks "what is the best schema markup for a recipe page?", the AI engine wants a page that opens with a direct, complete answer, not one that buries the answer under three paragraphs of preamble.
Semantic coverage without stuffing. AI language models understand topical depth. A page about "technical SEO audits" that never mentions crawlability, canonical tags, or Core Web Vitals will be scored as topically shallow. But keyword-stuffing is counterproductive — these engines understand natural language and can tell the difference between genuine expertise and keyword padding.
Long-tail, question-based headings. Structuring H2 and H3 headings as questions ("How do AI crawlers index JavaScript sites?") significantly increases the probability that your content is pulled into a featured citation for that query. This is one of the highest-leverage on-page changes you can make.
Content freshness. AI engines are actively looking for current information. A page last updated in 2022 citing 2021 statistics will rarely be cited for a query with recency intent. Review and update your most important pages on a documented schedule.
Unique insight, not aggregation. Content that synthesises information available on 20 other sites adds nothing to an AI engine's training data or citation pool. What gets cited is first-hand analysis, original data, counterintuitive positions backed by evidence, and specific how-to guidance that is visibly more detailed than competitor content.
Structured data is the communication protocol between your content and AI engines. While it accounts for 10% of the overall score, its impact on citation probability is disproportionately large for certain content types — particularly FAQ pages, how-to guides, product pages, and review content.
AI engines parse structured data to extract facts without having to interpret free-form prose. A page with FAQPage schema tells the engine exactly which text is a question and which is the answer. An Article schema with author, datePublished, dateModified, and publisher properties makes E-E-A-T signals machine-readable rather than requiring inference.
The schema types with the highest impact on AI citation in 2026:
Article / NewsArticle / BlogPosting: Every editorial page should have this, with full author (as a Person object, not just a string), publisher (as an Organization with logo), datePublished, dateModified, and headline matching the page H1 exactly.
FAQPage: If your page contains a Q&A section — even an embedded FAQ — implementing this schema dramatically increases the chance that specific Q&A pairs are cited verbatim by AI engines. The format is a direct match to how conversational AI queries are processed.
HowTo: Step-by-step instructional content with HowTo schema gives AI engines a structured representation of your process. This is particularly powerful for tools, technical guides, and product tutorials.
Organization / WebSite: Site-level schema that establishes your entity identity — name, URL, logo, social profiles, contact information. This underpins E-E-A-T signals at the domain level.
BreadcrumbList: Helps AI engines understand site hierarchy and where a specific page sits within it.
Product + Review + AggregateRating: For e-commerce and product review sites, these schema types are table stakes for AI-cited product comparisons.
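For illustration, here is roughly what the two highest-impact types look like as JSON-LD. Every name, date, and URL below is a placeholder value, not a recommendation:

```json
[
  {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "The 7 Factors AI Search Engines Use to Rank Content",
    "datePublished": "2026-01-15",
    "dateModified": "2026-02-01",
    "author": {
      "@type": "Person",
      "name": "Jane Smith",
      "url": "https://example.com/authors/jane-smith"
    },
    "publisher": {
      "@type": "Organization",
      "name": "Example Publisher",
      "logo": {
        "@type": "ImageObject",
        "url": "https://example.com/logo.png"
      }
    }
  },
  {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
      {
        "@type": "Question",
        "name": "How do AI crawlers index JavaScript sites?",
        "acceptedAnswer": {
          "@type": "Answer",
          "text": "Most AI crawlers read only the raw HTML response, so answers must be present before any JavaScript runs."
        }
      }
    ]
  }
]
```

Note that headline should mirror the page's H1 exactly, and dateModified should change only when the content genuinely changes.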
Avoid generic WebPage schema when more specific types are available, and avoid flat author strings ("author": "Jane Smith" instead of "author": {"@type": "Person", "name": "Jane Smith", "url": "..."}). Add FAQPage schema to any page that has a Q&A section — this is the single highest-ROI schema implementation for AI citation. Add Article schema with full Person objects for author attribution on all editorial content. Add Organization schema to your homepage with sameAs properties linking to all social profiles and business directory listings.

Page speed and Core Web Vitals have been a Google ranking factor since 2021. For AI search, the dynamic is different — but equally important.
AI crawlers, unlike human users, don't wait for slow pages. Most have hard timeout limits of 5–10 seconds. If your server takes 3 seconds to return the first byte (TTFB), an AI crawler may time out before receiving the full content. A slow page is a partially crawled page — or an uncrawled page.
Google's own data shows that pages in the "Good" band for all three Core Web Vitals metrics are crawled 40% more frequently than pages in the "Needs Improvement" band. More crawl frequency means fresher inclusion in AI knowledge bases and citation pools.
The three metrics to target:
Largest Contentful Paint (LCP) < 2.5s. The main content of the page must render quickly. For AI crawlers (which can't render at all), TTFB and server response time are the relevant proxy metrics — aim for TTFB < 800ms.
Interaction to Next Paint (INP) < 200ms. While AI bots don't interact with pages, poor INP typically signals poor overall frontend performance and JavaScript bloat — which correlates with poor server-side rendering quality.
Cumulative Layout Shift (CLS) < 0.1. Layout stability affects how AI parsers extract content from the DOM. Pages with high CLS often have poorly structured HTML where content positions are determined dynamically, making extraction unreliable.
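The bands themselves are simple threshold checks. This sketch encodes the Good / Needs Improvement / Poor cut-offs as published in Google's web.dev guidance:

```python
# Core Web Vitals thresholds per Google's web.dev guidance:
# (metric: (good_max, needs_improvement_max)); above the second bound is "Poor".
THRESHOLDS = {
    "LCP": (2.5, 4.0),    # seconds
    "INP": (200, 500),    # milliseconds
    "CLS": (0.1, 0.25),   # unitless layout-shift score
}

def cwv_band(metric: str, value: float) -> str:
    """Classify a measured value into Good / Needs Improvement / Poor."""
    good_max, ni_max = THRESHOLDS[metric]
    if value <= good_max:
        return "Good"
    if value <= ni_max:
        return "Needs Improvement"
    return "Poor"
```

Field data (e.g. from the Chrome UX Report) is measured at the 75th percentile of page loads, so a single fast lab run clearing these bars does not mean real users experience "Good".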
Server location and CDN coverage. AI crawlers are distributed globally. A site hosted on a single origin server in one region will have variable response times for crawlers operating from other regions. A CDN is not optional at scale.
Image optimisation. Uncompressed images are the most common cause of slow LCP. We cover image-specific factors in detail in Factor 7 — but from a performance standpoint, every uncompressed image is leaving speed points on the table.
Caching headers. Correct Cache-Control and ETag headers mean that repeat crawls by AI bots are served from cache, reducing both your server load and crawl latency.
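A response carrying sensible caching metadata might look like the following; the values are illustrative, and max-age should be tuned to your publishing cadence:

```http
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Cache-Control: public, max-age=3600, stale-while-revalidate=86400
ETag: "a1b2c3"
Last-Modified: Mon, 02 Feb 2026 10:00:00 GMT
```

With an ETag in place, a revisiting crawler can send If-None-Match and receive a cheap 304 Not Modified instead of the full page.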
This is the newest factor category and the most rapidly evolving. AI readiness is a catch-all for signals that specifically communicate to AI systems how to access, interpret, and cite your content.
While it accounts for only 5% of the overall audit score today, it is growing in importance. Sites that configure these signals correctly now are building a durable advantage as AI search matures.
The llms.txt standard

First proposed in late 2024 and now supported by an increasing number of AI platforms, llms.txt is a plain-text file placed at yourdomain.com/llms.txt. It gives AI language model training and retrieval systems a curated map of your most important content — analogous to sitemap.xml for crawlers, but specifically structured for LLM consumption.
A basic llms.txt file includes a site title, a one-line summary, and a curated list of your key URLs with brief descriptions.
This is entirely voluntary, but it signals to AI systems that you are actively managing your content for AI visibility — a positive trust signal.
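Under the llms.txt proposal, the file is plain Markdown: an H1 site name, a short blockquote summary, and sections of annotated links. A minimal illustrative sketch (all URLs are placeholders):

```markdown
# Example Site

> One-sentence summary of what this site covers and who it is for.

## Key pages

- [Technical SEO audit guide](https://example.com/guides/technical-seo): How AI crawlers read your site
- [Schema markup reference](https://example.com/guides/schema): JSON-LD patterns for AI citation

## Optional

- [Archive](https://example.com/archive): Older posts, lower priority for retrieval
```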
robots.txt directives for AI bots

The robots.txt file now needs to address a much longer list of crawlers than the traditional Googlebot and Bingbot. Major AI crawlers that you should explicitly manage include:
- GPTBot (OpenAI / ChatGPT)
- ClaudeBot (Anthropic)
- PerplexityBot
- Amazonbot
- Meta-ExternalAgent
- Applebot-Extended
- Google-Extended (Google's AI training crawler, separate from regular Googlebot)

You can allow all of them (maximising AI citation potential), block specific ones, or Disallow only the training-focused crawlers such as Google-Extended while leaving answer-engine crawlers free to cite you. The key is to be deliberate — many sites are inadvertently blocking AI crawlers with overly broad robots.txt rules written before these crawlers existed.
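A deliberate policy might look like the following sketch; the specific path rules are illustrative, not a recommendation:

```text
# Allow answer-engine crawlers full access
User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Example policy: keep the AI training crawler out of one section
User-agent: Google-Extended
Disallow: /internal-search/

# Default rules for everything else
User-agent: *
Disallow: /admin/
```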
For a comprehensive guide to AI crawler management, see: robots.txt for AI Crawlers in 2026 — The Complete Guide
AI engines use sitemap <lastmod> dates to prioritise recrawling. If your sitemap has stale or inaccurate <lastmod> values — or worse, no <lastmod> at all — you're leaving crawl frequency on the table. Accurate sitemap metadata is a low-effort, high-value signal.
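A sitemap entry with an accurate <lastmod> is trivial to emit; the URL and date below are placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/guides/technical-seo</loc>
    <lastmod>2026-02-01</lastmod>
  </url>
</urlset>
```

The important part is that the date is driven by your CMS on every real content change, not hand-edited or generated fresh on each request.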
Create an llms.txt file for your domain following the llms.txt specification and submit it to the major AI platforms that support it. Audit your robots.txt to ensure you are not accidentally blocking AI crawlers you want to allow. Maintain accurate <lastmod> dates for all URLs, updated programmatically on each content change.

Image SEO is often treated as an afterthought. In 2026's AI search landscape, it carries specific significance beyond the traditional "alt text for accessibility" guidance.
AI engines increasingly power visual search, image-based queries, and multimodal retrieval. Google Lens queries have grown over 200% in three years. ChatGPT's image analysis capabilities are now integrated into search. Images that are properly labelled and described become discoverable through entirely new query surfaces.
Alt text. The most fundamental signal. Alt text should describe the image content in natural language — what it shows, who appears in it if relevant, and the context. It should not be keyword-stuffed. An alt text like "Screenshot of seo.yatna.ai audit dashboard showing technical SEO score breakdown" is far more useful than "SEO audit tool free online best 2026".
File names. The image filename is a crawl-time signal that most sites completely ignore. img-3847.webp tells AI engines nothing. technical-seo-audit-score-dashboard.webp is descriptive, machine-readable, and keyword-relevant without being spammy.
Captions. Figure captions are among the most-read elements on a page (eye-tracking studies consistently show this). They also provide AI engines with a second, contextual description of the image — one that situates the image within the surrounding content rather than describing it in isolation.
Structured data for images. ImageObject schema within your Article or Product schema, with caption, description, contentUrl, and creator fields, gives AI engines a machine-readable image annotation that goes beyond what alt text alone provides.
File format and size. WebP is the standard in 2026 — smaller file sizes than JPEG at equivalent quality, with modern browser support across the board. Images over 200KB on a content page are a performance problem (see Factor 5) and a signal of generally low technical standards.
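The alt-text and filename checks above are easy to automate. The sketch below walks a page's <img> tags and flags missing alt text and meaningless filenames; the regex defining a "generic" filename is an illustrative assumption, not an established rule.

```python
import re
from html.parser import HTMLParser

# Illustrative pattern for camera-default / auto-generated filenames.
GENERIC_NAME = re.compile(
    r"^(img|image|photo|screenshot|dsc|untitled)?[-_]?\d+\.\w+$",
    re.IGNORECASE)

class ImageAudit(HTMLParser):
    """Collect <img> tags whose alt text or filename tells AI engines nothing."""

    def __init__(self):
        super().__init__()
        self.issues = []  # list of (src, problem) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        a = dict(attrs)
        src = a.get("src", "")
        filename = src.rsplit("/", 1)[-1]
        if not (a.get("alt") or "").strip():
            self.issues.append((src, "missing or empty alt text"))
        if GENERIC_NAME.match(filename):
            self.issues.append((src, "generic filename"))

auditor = ImageAudit()
auditor.feed('<img src="/media/img-3847.webp">'
             '<img src="/media/audit-dashboard.webp" '
             'alt="Audit dashboard showing score breakdown">')
```

In this example the first image is flagged twice (no alt text, generic name) and the properly labelled second image passes both checks.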
Serve responsive images with srcset for different viewport sizes. Add ImageObject schema to your most important editorial pages, particularly those with data visualisations, product screenshots, or infographics.

The seven factors above are not independent levers — they compound. A site with perfect E-E-A-T but broken technical accessibility will not be cited because the AI crawler cannot read the content. A site with excellent technical foundations but no schema markup will rank lower than a comparable site that gives AI engines machine-readable structured data.
The compounding effect works in your favour too. A site with strong signals across all seven categories achieves a multiplier effect: AI engines cite you not just because you are technically accessible or because you have good author credentials, but because every available signal consistently confirms your authority and reliability.
This is why a comprehensive, multi-factor audit is more valuable than optimising a single dimension. Running an audit that covers all seven factors simultaneously — and identifies the specific issues on your specific site — is the starting point for a meaningful AI search strategy.
For guidance on how to optimise for the two most important AI search platforms specifically, see:
Every site is different. The specific issues dragging down your AI citation potential — whether it's JavaScript rendering blocking AI crawlers, missing author bios undermining E-E-A-T, or a robots.txt file accidentally blocking GPTBot — depend on how your site was built and maintained.
The fastest way to get a clear picture of where you stand across all seven factors is a structured, automated audit.
seo.yatna.ai runs a full 7-factor AI search audit on your site in minutes. It crawls your pages, analyses every factor described in this post, scores each one, and produces a prioritised action list — not a generic checklist, but specific findings tied to specific pages on your specific domain.
The free tier audits up to 5 pages with no credit card required. For most sites, 5 pages is enough to surface the critical issues.
About the Author

Rejith Krishnan
Founder & CEO, lowtouch.ai
Rejith Krishnan is the Founder and CEO of lowtouch.ai and the creator of seo.yatna.ai. He built the AI agent platform that powers seo.yatna.ai's 7-agent audit engine - the same infrastructure lowtouch.ai deploys for enterprise clients across finance, legal, and operations.
Rejith's focus is AI enablement: helping businesses of all sizes - from solo founders and SMBs to enterprise teams - adopt AI agents that genuinely transform how they work. He specialises in deploying Large Language Models and building multi-agent systems that automate complex workflows, enhance discoverability, and deliver measurable outcomes without requiring engineering teams to manage the infrastructure.
He built seo.yatna.ai because AI-first SEO is a prerequisite for AI-era discoverability. Businesses that are not visible to ChatGPT, Perplexity, and Claude are already losing traffic. seo.yatna.ai gives every business - not just enterprise clients with dedicated SEO teams - the same AI-powered audit capability lowtouch.ai builds for its largest customers.