Next.js SEO with App Router: generateMetadata, Sitemap, and Robots.ts — Complete Guide

Key Takeaways

generateMetadata is the correct API for dynamic pages. Static export const metadata only works for pages where titles and descriptions never change. Use generateMetadata for any page that fetches data.
app/sitemap.ts generates a fully dynamic XML sitemap at build time — no plugin, no external service, no manual file.
app/robots.ts is the right place to allow AI crawlers. Explicitly allow GPTBot, ClaudeBot, and PerplexityBot here so their crawlers can reach the content you want cited in AI answers.
JSON-LD schema does not go in generateMetadata. The metadata API outputs <meta> tags. Schema requires a <script type="application/ld+json"> element, which belongs in the page component.
alternates.canonical is non-optional for any page that can be reached via multiple URLs. Forgetting it is one of the most common Next.js SEO errors.

Next.js App Router ships with a complete native SEO API that makes third-party packages like next-seo largely unnecessary. The metadata API, file-based sitemap generation, and typed robots configuration cover the full technical SEO surface area — when used correctly.

The problem is that the App Router API introduces several non-obvious patterns that are easy to get wrong, with consequences that are hard to debug without a dedicated audit. This guide covers every API you need, with complete working code, and explicitly calls out the mistakes practitioners make most often.

1. generateMetadata — Dynamic Metadata for Every Page

When to Use generateMetadata vs Static Metadata

Next.js offers two ways to define page metadata:

Static metadata — a named export from the page file:

export const metadata: Metadata = {
  title: 'About Us',
  description: 'Learn about Yatna AI and our mission.',
}

Use static metadata only for pages where the title, description, and OG data never change: the homepage, the about page, the pricing page.

generateMetadata — an async function that can fetch data:

export async function generateMetadata(
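  // Synchronous params as in Next.js 14 and earlier; from Next.js 15, params is a Promise and must be awaited
  { params }: { params: { slug: string } }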
): Promise<Metadata> {
  const post = await getPost(params.slug)
  return {
    title: post.title,
    description: post.excerpt,
    alternates: { canonical: `https://seo.yatna.ai/seo-academy/${params.slug}/` },
    openGraph: {
      title: post.title,
      description: post.excerpt,
      images: [{ url: post.ogImage, width: 1200, height: 630 }],
      type: 'article',
    },
  }
}

Use generateMetadata for any page where the title or description comes from data: blog posts, product pages, audit reports, user profiles. When the page component and generateMetadata both call getPost(params.slug), the work is de-duplicated only if getPost uses fetch (which Next.js memoises within a render) or is wrapped in React's cache(). A plain database or file-system call that is not wrapped runs twice.
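The idea behind that deduplication can be pictured with a small stand-in. This is a sketch, not React's implementation: memoizeBySlug, loadPostFromDb, and the Post shape are hypothetical names, and a plain Map stands in for the request-scoped store that React's cache() provides. It only shows why two callers asking for the same slug trigger a single load.

```typescript
// Hypothetical post shape used only for this illustration.
type Post = { slug: string; title: string }

let loadCount = 0

// Stand-in for a real database or CMS call; counts how often it runs.
async function loadPostFromDb(slug: string): Promise<Post> {
  loadCount += 1
  return { slug, title: `Title for ${slug}` }
}

// Cache the in-flight promise per key so concurrent callers share one load.
// Unlike React's cache(), this Map is never cleared between requests.
function memoizeBySlug<T>(loader: (slug: string) => Promise<T>) {
  const pending = new Map<string, Promise<T>>()
  return (slug: string): Promise<T> => {
    let hit = pending.get(slug)
    if (!hit) {
      hit = loader(slug)
      pending.set(slug, hit)
    }
    return hit
  }
}

// Both generateMetadata and the page component would call this wrapper.
const getPost = memoizeBySlug(loadPostFromDb)
```

In an App Router project the wrapper would normally be cache from 'react' rather than a hand-rolled Map, because React scopes the memoised results to a single server request.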

Complete generateMetadata for a Blog Post

This is the full metadata function for a blog post page, including all fields that Google and social platforms read:

import type { Metadata } from 'next'
import { getPost } from '@/lib/blog'

export async function generateMetadata(
  { params }: { params: { slug: string } }
): Promise<Metadata> {
  const post = await getPost(params.slug)

  if (!post) {
    return { title: 'Post Not Found' }
  }

  const canonicalUrl = `https://seo.yatna.ai/seo-academy/${params.slug}/`

  return {
    title: post.title,
    description: post.excerpt,
    authors: [{ name: post.authorName, url: post.authorUrl }],
    alternates: {
      canonical: canonicalUrl,
    },
    openGraph: {
      title: post.title,
      description: post.excerpt,
      url: canonicalUrl,
      siteName: 'Yatna AI SEO Academy',
      images: [
        {
          url: post.ogImage,
          width: 1200,
          height: 630,
          alt: post.title,
        },
      ],
      type: 'article',
      publishedTime: post.date,
      modifiedTime: post.modified,
      authors: [post.authorName],
    },
    twitter: {
      card: 'summary_large_image',
      title: post.title,
      description: post.excerpt,
      images: [post.ogImage],
    },
    robots: {
      index: true,
      follow: true,
    },
  }
}

Fields that are commonly missed:

alternates.canonical — the most critical field. Every blog post must declare its canonical URL. Without it, if the same post is accessible at /seo-academy/slug/, /seo-academy/slug, and via any query parameter, Google may treat these as separate pages and split their ranking signals.
openGraph.url — the OG URL should match the canonical. Platforms use this when a URL is shared to normalise the canonical identity of the page.
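openGraph.publishedTime and modifiedTime: ISO 8601 strings that state when the article was published and last updated. Include them for all article-type pages, and keep them consistent with datePublished and dateModified in your Article schema.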
twitter.card: 'summary_large_image' — required for Twitter/X to render a large image card. Without it, the default small card is used and click-through rates suffer.
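One way to keep alternates.canonical and openGraph.url identical is to derive both from a single helper. The sketch below is an assumption about how such a helper might look: toCanonicalUrl and SITE_ORIGIN are hypothetical names, and it assumes the trailing-slash form is the canonical one, as in the examples above.

```typescript
// Assumed site origin; replace with your own domain.
const SITE_ORIGIN = 'https://seo.yatna.ai'

// Build one canonical URL from any path variant: drops the query string
// and fragment, and enforces a trailing slash on the path.
function toCanonicalUrl(pathOrUrl: string, origin: string = SITE_ORIGIN): string {
  const url = new URL(pathOrUrl, origin)
  url.search = ''
  url.hash = ''
  if (!url.pathname.endsWith('/')) {
    url.pathname += '/'
  }
  return url.toString()
}
```

Calling toCanonicalUrl('/seo-academy/slug') and toCanonicalUrl('/seo-academy/slug/?ref=x#top') yields the same string, which can then be passed to both metadata fields.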

Common Mistake: Static Metadata on Dynamic Pages

The most common error is exporting a static metadata object from a dynamic page:

// WRONG — title and description are fixed regardless of which post loads
export const metadata: Metadata = {
  title: 'SEO Academy Post',
  description: 'Read our latest SEO guide.',
}

export default async function PostPage({ params }: { params: { slug: string } }) {
  const post = await getPost(params.slug)
  // ...
}

Every post at /seo-academy/[slug] gets the same title tag and meta description. This is catastrophic for SEO: Google sees dozens of pages with identical titles and has less to tell them apart, and click-through rates collapse because the SERP title is always "SEO Academy Post".

2. app/sitemap.ts — Dynamic Sitemap Generation

The File-Based Sitemap API

Next.js App Router generates a valid XML sitemap from app/sitemap.ts. The file exports a default function that returns a MetadataRoute.Sitemap value, which is an array of entry objects. The generated XML is served at /sitemap.xml automatically.

Complete sitemap.ts for a blog + landing pages site:

import type { MetadataRoute } from 'next'
import { getAllPosts } from '@/lib/blog'
import { getAllLandingPages } from '@/lib/landing-pages'

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const posts = await getAllPosts()
  const landingPages = await getAllLandingPages()

  const postEntries: MetadataRoute.Sitemap = posts.map(post => ({
    url: `https://seo.yatna.ai/seo-academy/${post.slug}/`,
    lastModified: new Date(post.modified),
    changeFrequency: 'monthly' as const,
    priority: 0.8,
  }))

  const landingPageEntries: MetadataRoute.Sitemap = landingPages.map(page => ({
    url: `https://seo.yatna.ai/lp/${page.slug}/`,
    lastModified: new Date(page.modified),
    changeFrequency: 'monthly' as const,
    priority: 0.7,
  }))

  return [
    {
      url: 'https://seo.yatna.ai/',
      lastModified: new Date(),
      changeFrequency: 'weekly',
      priority: 1.0,
    },
    {
      url: 'https://seo.yatna.ai/seo-academy/',
      lastModified: new Date(),
      changeFrequency: 'weekly',
      priority: 0.9,
    },
    {
      url: 'https://seo.yatna.ai/pricing/',
      lastModified: new Date(),
      changeFrequency: 'monthly',
      priority: 0.8,
    },
    ...postEntries,
    ...landingPageEntries,
  ]
}

priority guidelines (use sparingly; Google's documentation says it ignores priority, though other search engines may read it):

Priority	Use for
1.0	Homepage only
0.9	Category hubs, pillar pages
0.8	Individual blog posts, key landing pages
0.7	Secondary landing pages, tag pages
0.5	Utility pages (about, privacy, terms)

changeFrequency is advisory, not directive. Google's documentation says it ignores this value, so it does not influence how often Google crawls; other crawlers may read it. Set it honestly: weekly for an active blog, monthly for static pages, yearly for rarely-changing pages.
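In the sitemap above, lastModified comes from new Date(post.modified). If a post has a missing or malformed date, that expression produces an Invalid Date, which throws a RangeError when converted with toISOString(). A minimal sketch of a guard follows; SitemapEntry and PostSummary are local stand-ins (not the Next.js types), and toSitemapEntries is a hypothetical helper.

```typescript
// Local stand-in for one entry of Next.js's MetadataRoute.Sitemap array.
type SitemapEntry = { url: string; lastModified?: Date }

// Hypothetical post shape; `modified` is expected to be an ISO 8601 string.
type PostSummary = { slug: string; modified?: string }

function toSitemapEntries(posts: PostSummary[], baseUrl: string): SitemapEntry[] {
  return posts.map(post => {
    const entry: SitemapEntry = { url: `${baseUrl}/seo-academy/${post.slug}/` }
    const parsed = post.modified ? new Date(post.modified) : undefined
    // Only emit lastModified when the date parses; an Invalid Date has a NaN time.
    if (parsed && !Number.isNaN(parsed.getTime())) {
      entry.lastModified = parsed
    }
    return entry
  })
}
```

Omitting lastModified for a bad date keeps the URL in the sitemap without publishing a date that is wrong.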

Trailing Slashes

If your Next.js config has trailingSlash: true, ensure all sitemap URLs include the trailing slash. Inconsistency between your canonical URLs, sitemap URLs, and actual page URLs creates duplicate content signals. Pick one convention and apply it everywhere.
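A quick consistency check can catch mixed conventions before they ship. The sketch below assumes you can collect the sitemap URLs as strings; findSlashMismatches is a hypothetical helper, and it does not special-case file-like paths such as /feed.xml.

```typescript
// Return the URLs whose path does not follow the chosen trailing-slash convention.
function findSlashMismatches(urls: string[], trailingSlash: boolean): string[] {
  return urls.filter(raw => {
    const { pathname } = new URL(raw)
    if (pathname === '/') return false // the root path is always "/"
    return pathname.endsWith('/') !== trailingSlash
  })
}
```

Running it over the sitemap output with the same boolean as trailingSlash in your Next.js config should return an empty array.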

3. app/robots.ts — Allowing AI Crawlers

The Typed Robots API

app/robots.ts replaces the manual public/robots.txt file. It exports a typed function that Next.js serialises to /robots.txt at build time:

import type { MetadataRoute } from 'next'

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      {
        userAgent: '*',
        allow: '/',
        disallow: ['/api/', '/dashboard/', '/account/'],
      },
      {
        // Named groups replace the '*' group for these crawlers,
        // so the disallow list is repeated here.
        userAgent: ['GPTBot', 'ClaudeBot', 'PerplexityBot', 'GoogleOther'],
        allow: '/',
        disallow: ['/api/', '/dashboard/', '/account/'],
      },
    ],
    sitemap: 'https://seo.yatna.ai/sitemap.xml',
  }
}

Why explicit AI crawler rules matter: A wildcard User-agent: * rule with Allow: / already permits every compliant bot, AI crawlers included. Named entries for GPTBot, ClaudeBot, PerplexityBot, and GoogleOther make that intent explicit and leave a clear place to adjust access per crawler later. One caveat: a crawler that finds a group with its own name follows that group and ignores the * group, so any disallow rules you want it to respect must be repeated in the named group.

What to disallow: API routes (/api/), authenticated dashboards (/dashboard/), and user account pages (/account/) should be disallowed. These pages contain no indexable content and wasting crawl budget on them delays indexing of your actual content.
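One detail worth checking in a robots file with named groups: under the Robots Exclusion Protocol (RFC 9309), a crawler obeys the group that matches its own user agent and falls back to the * group only when no named group matches. The sketch below is a simplified model of that selection, not a full parser: RobotsRule is a local stand-in type, selectGroup and isAllowed are hypothetical helpers, and wildcard patterns are not handled.

```typescript
// Local stand-in for one rule object in a robots configuration.
type RobotsRule = { userAgent: string; allow?: string[]; disallow?: string[] }

// Pick the group a crawler obeys: an exact name match beats the "*" group.
function selectGroup(rules: RobotsRule[], crawler: string): RobotsRule | undefined {
  const name = crawler.toLowerCase()
  return (
    rules.find(r => r.userAgent.toLowerCase() === name) ??
    rules.find(r => r.userAgent === '*')
  )
}

// Simplified check: plain prefix matching, longest matching rule wins,
// and allow wins a tie. Wildcards (* and $) are not handled.
function isAllowed(rules: RobotsRule[], crawler: string, path: string): boolean {
  const group = selectGroup(rules, crawler)
  if (!group) return true
  const longest = (prefixes: string[] = []) =>
    Math.max(-1, ...prefixes.filter(p => path.startsWith(p)).map(p => p.length))
  return longest(group.allow) >= longest(group.disallow)
}

// A named group without its own disallow list opens paths the "*" group blocks.
const exampleRules: RobotsRule[] = [
  { userAgent: '*', allow: ['/'], disallow: ['/api/'] },
  { userAgent: 'GPTBot', allow: ['/'] },
]

const genericBotBlocked = !isAllowed(exampleRules, 'Googlebot', '/api/x') // true
const gptBotBlocked = !isAllowed(exampleRules, 'GPTBot', '/api/x') // false
```

If the named crawlers should stay out of /api/ as well, the same disallow entries need to appear in their group.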

4. JSON-LD Schema in Page Components

Why generateMetadata Does Not Handle Schema

A very common misunderstanding: developers look for a jsonLd option in the generateMetadata return type. It does not exist. generateMetadata outputs <meta> and <link> elements. JSON-LD requires a <script type="application/ld+json"> element, and the metadata API has no mechanism for producing script tags.

Schema markup belongs in the page component itself, rendered as a <script> element with the JSON payload:

// app/seo-academy/[slug]/page.tsx

import { getPost } from '@/lib/blog'

export default async function PostPage({
  params,
}: {
  params: { slug: string }
}) {
  const post = await getPost(params.slug)

  const articleSchema = {
    '@context': 'https://schema.org',
    '@type': 'Article',
    headline: post.title,
    description: post.excerpt,
    image: post.ogImage,
    datePublished: post.date,
    dateModified: post.modified,
    author: {
      '@type': 'Person',
      name: post.authorName,
      url: post.authorUrl,
      sameAs: [post.authorLinkedIn, post.authorTwitter],
    },
    publisher: {
      '@type': 'Organization',
      name: 'Yatna AI',
      url: 'https://seo.yatna.ai',
    },
    mainEntityOfPage: {
      '@type': 'WebPage',
      '@id': `https://seo.yatna.ai/seo-academy/${params.slug}/`,
    },
  }

  // Render the schema as an inline JSON-LD script alongside the page content.
  // articleSchema is built from trusted server-side data, not user input;
  // escaping "<" still guards against a stray "</script>" in a field.
  return (
    <>
      <script
        type="application/ld+json"
        suppressHydrationWarning
        dangerouslySetInnerHTML={{
          __html: JSON.stringify(articleSchema).replace(/</g, '\\u003c'),
        }}
      />
      <article>
        {/* page content */}
      </article>
    </>
  )
}

The articleSchema object is constructed entirely from application data — fetched from your own database or file system — not from user-supplied input. This pattern is safe and standard for Next.js JSON-LD injection.

For FAQPage schema on blog posts, add it alongside the Article schema. If the post has a FAQ section, map the Q&A pairs into a FAQPage schema object and include a second <script type="application/ld+json"> block in the same component.
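The mapping described above might look like the following sketch. FaqPair is a hypothetical shape for the post's Q&A data, and buildFaqSchema and toJsonLd are illustrative helper names rather than part of Next.js.

```typescript
// Hypothetical shape for one question/answer pair from a post's FAQ section.
type FaqPair = { question: string; answer: string }

// Map Q&A pairs into a schema.org FAQPage object.
function buildFaqSchema(pairs: FaqPair[]) {
  return {
    '@context': 'https://schema.org',
    '@type': 'FAQPage',
    mainEntity: pairs.map(({ question, answer }) => ({
      '@type': 'Question',
      name: question,
      acceptedAnswer: { '@type': 'Answer', text: answer },
    })),
  }
}

// Serialise for an inline <script>; escaping "<" keeps a stray "</script>"
// in the data from closing the element early.
function toJsonLd(schema: object): string {
  return JSON.stringify(schema).replace(/</g, '\\u003c')
}
```

The string returned by toJsonLd(buildFaqSchema(pairs)) is what the second script element would carry as its content.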

5. next/image Best Practices for SEO

Correctly using next/image directly affects Core Web Vitals — particularly LCP (Largest Contentful Paint) — which is a Google ranking signal.

import Image from 'next/image'

// Hero / OG image — above the fold, highest priority
<Image
  src={post.ogImage}
  alt={post.title}
  width={1200}
  height={630}
  priority={true}
  sizes="(max-width: 768px) 100vw, (max-width: 1200px) 80vw, 1200px"
/>

// Body images — below the fold, lazy loaded by default
<Image
  src={imageUrl}
  alt={imageAltText}
  width={800}
  height={450}
  sizes="(max-width: 768px) 100vw, 800px"
/>

priority={true}: Tells Next.js to add a <link rel="preload"> for this image and disable lazy loading. Use it on the single largest above-the-fold image on each page — the image most likely to be the LCP element. Do not apply it to multiple images; preloading too many images hurts performance.

sizes: The sizes attribute tells the browser which image variant to download based on viewport width. Without it, the browser may download the full 1200px image on a 375px mobile screen, wasting bandwidth and hurting LCP.

width and height: Always provide explicit dimensions unless the image is statically imported or uses the fill prop. Next.js uses them to generate the correct srcset and to reserve layout space before the image loads, preventing Cumulative Layout Shift (CLS).

alt: Not optional from an SEO perspective. Google's image search, AI image understanding, and accessibility all depend on descriptive alt text. Never use empty alt="" for content images (it is correct for decorative images only).
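Hand-written sizes strings are easy to get out of order. As a small sketch, a helper can assemble them from breakpoints; SizeRule and buildSizes are hypothetical names introduced only for this example.

```typescript
// Hypothetical breakpoint description: up to maxWidth px, the image occupies `width`.
type SizeRule = { maxWidth: number; width: string }

// Build a `sizes` string: one media condition per rule (narrowest first),
// followed by the fallback width used when no condition matches.
function buildSizes(rules: SizeRule[], fallback: string): string {
  const conditions = [...rules]
    .sort((a, b) => a.maxWidth - b.maxWidth)
    .map(r => `(max-width: ${r.maxWidth}px) ${r.width}`)
  return [...conditions, fallback].join(', ')
}
```

Sorting matters because the browser uses the first media condition that matches, so narrower breakpoints have to come first.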

Common Mistakes Summary

Mistake	Consequence	Fix
`export const metadata` on dynamic pages	Every post gets the same title tag	Replace with `generateMetadata` async function
Missing `alternates.canonical`	Duplicate content signals across trailing slash variants	Add to every `generateMetadata` return
Putting schema in `generateMetadata`	Schema silently dropped — never rendered	Move to page component as `<script>` element
`priority={true}` on all images	Defeats preload optimisation; hurts LCP	Apply only to the single hero/LCP image
Missing `sizes` on images	Browser downloads oversized images on mobile	Add responsive `sizes` string to every `next/image`
Blocking AI crawlers in robots.ts	Invisible to ChatGPT, Claude, Perplexity	Explicitly allow GPTBot, ClaudeBot, PerplexityBot
Inconsistent trailing slashes in sitemap	Crawler confusion; wasted crawl budget	Standardise on one pattern across canonical, sitemap, and internal links

Verifying Your Implementation

Check metadata output: View the page source and search for <title>, <meta name="description", and <link rel="canonical". Verify the values are dynamic and correct for each page.

Check sitemap: Fetch https://yourdomain.com/sitemap.xml in a browser. Verify your most important pages appear, URLs use HTTPS, and the format is valid XML.

Check robots.txt: Fetch https://yourdomain.com/robots.txt. Verify AI crawlers are listed explicitly and no important content directories are disallowed.

Check schema: Paste your URL into Google's Rich Results Test. Verify Article schema is detected with correct author, dates, and no errors.
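The view-source check for metadata can also be scripted. The sketch below assumes you already have the rendered HTML as a string (for example from a fetch call); findMissingHeadTags is a hypothetical helper, and its regular expressions are a rough heuristic rather than a full HTML parser.

```typescript
// Report which of the three head elements are missing from an HTML string.
function findMissingHeadTags(html: string): string[] {
  const checks: Array<[string, RegExp]> = [
    ['title', /<title[^>]*>[^<]+<\/title>/i],
    ['description', /<meta[^>]+name=["']description["'][^>]*>/i],
    ['canonical', /<link[^>]+rel=["']canonical["'][^>]*>/i],
  ]
  // Keep the labels whose pattern finds no match.
  return checks.filter(([, pattern]) => !pattern.test(html)).map(([label]) => label)
}
```

An empty result means all three elements are present; it does not confirm that their values are correct for the page, which still needs a manual look.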

FAQ

Can I use both generateMetadata and static metadata exports in the same project?

Yes. Use static metadata for pages with fixed metadata (homepage, about, pricing). Use generateMetadata for dynamic pages (blog posts, audit reports, product pages). They coexist without conflict — Next.js uses whichever export is present in each page file.

Does generateMetadata affect server render performance?

Minimally, provided the data fetch is de-duplicated. Next.js memoises identical fetch requests within a render, so if generateMetadata and the page component both request the same URL, one network call is made. If getPost reads from a database or the file system instead, wrap it in cache() from React so both callers share a single result.

Should I use next-sitemap or the native app/sitemap.ts API?

The native API is recommended for App Router projects. It is type-safe, requires zero configuration, and integrates with Next.js's build pipeline. The next-sitemap package was designed for the Pages Router era and adds unnecessary complexity for modern App Router apps.

How do I add hreflang for multilingual sites?

Use the alternates.languages field in generateMetadata:

alternates: {
  canonical: 'https://seo.yatna.ai/seo-academy/slug/',
  languages: {
    'en-US': 'https://seo.yatna.ai/seo-academy/slug/',
    'fr-FR': 'https://fr.seo.yatna.ai/seo-academy/slug/',
  },
},
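When every language lives on its own origin, as in the snippet above, the languages map can be generated instead of typed by hand. This is a sketch under one assumption that may not hold for your site: the path is identical in every locale. LocaleOrigins and buildLanguageAlternates are hypothetical names.

```typescript
// Hypothetical mapping of hreflang codes to the origin serving that language.
type LocaleOrigins = Record<string, string>

// Build the object for alternates.languages: same path on each locale's origin.
function buildLanguageAlternates(path: string, origins: LocaleOrigins): Record<string, string> {
  const languages: Record<string, string> = {}
  for (const [locale, origin] of Object.entries(origins)) {
    languages[locale] = new URL(path, origin).toString()
  }
  return languages
}
```

If slugs are translated per locale, the helper would need a per-locale path lookup instead of a single path argument.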

About the Author

Ishan Sharma

Head of SEO & AI Search Strategy

Ishan Sharma is Head of SEO & AI Search Strategy at seo.yatna.ai. With over 10 years of technical SEO experience across SaaS, e-commerce, and media brands, he specialises in schema markup, Core Web Vitals, and the emerging discipline of Generative Engine Optimisation (GEO). Ishan has audited over 2,000 websites and writes extensively about how structured data and AI readiness signals determine which sites get cited by ChatGPT, Perplexity, and Claude. He is a contributor to Search Engine Journal and speaks regularly at BrightonSEO.
