Making Sitecore Headless + Next.js Work Seamlessly Behind Azure Front Door

Authors
  • Jorge Lusar

Making Sitecore Headless + Next.js play nicely with Azure Front Door (Host, sitemap.xml, and robots.txt)

TL;DR: When you stick Azure Front Door (AFD) in front of a Sitecore Headless (JSS) app on Next.js, your app starts seeing AFD’s host instead of the original domain. Use x-forwarded-host (and x-forwarded-proto) to resolve the correct site and to generate absolute URLs for sitemap.xml and robots.txt.


Why this matters

  • Site resolution: Sitecore JSS typically chooses a site from the Host header. Behind AFD, that value is usually the origin host header configured on the AFD origin (e.g., myapp.azurewebsites.net) rather than your vanity domain (e.g., www.example.com), breaking multisite mapping.
  • SEO endpoints: sitemap.xml and robots.txt must emit the public URLs (scheme + host) or search engines index the wrong domain.

AFD forwards the original request host in x-forwarded-host and the scheme in x-forwarded-proto. We can read those to restore the “real” request context.
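
To make the failure mode concrete, here is a toy illustration of a host-keyed site lookup. It is not the JSS SiteResolver; the host map, the site names, and the header values are all made up for the example. It only shows why resolving by Host lands on the fallback site while resolving by x-forwarded-host finds the intended one.

```typescript
// Toy host-to-site lookup. Hypothetical data; JSS's real SiteResolver is richer.
const siteByHost = new Map<string, string>([
  ['www.example.com', 'corporate'],
  ['shop.example.com', 'shop'],
]);
const DEFAULT_SITE = 'default';

function resolveSiteName(host: string | undefined): string {
  // Host names are case-insensitive; drop any port before the lookup.
  const key = (host ?? '').toLowerCase().split(':')[0];
  return siteByHost.get(key) ?? DEFAULT_SITE;
}

// What the app might receive behind a proxy (example values only).
const incomingHeaders = {
  host: 'myapp.azurewebsites.net',
  'x-forwarded-host': 'shop.example.com',
};

// Resolving by Host misses the map and falls back to the default site.
const fromHost = resolveSiteName(incomingHeaders.host);
// Resolving by x-forwarded-host finds the intended site.
const fromForwarded = resolveSiteName(incomingHeaders['x-forwarded-host']);

console.log({ fromHost, fromForwarded });
```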


What I changed

I made three small but important tweaks:

  1. Middleware (multisite)
    • Read x-forwarded-host
    • Resolve the Sitecore site by that host
    • Set req.headers.host to the resolved site host before the built-in MultisiteMiddleware runs
  2. /api/sitemap (served as /sitemap.xml)
    • Resolve the site using x-forwarded-host
    • Keep GraphQL sitemap fetching intact
  3. /api/robots (served as /robots.txt)
    • Resolve the site using x-forwarded-host
    • Return site-specific robots content via GraphQL

The code

1) Multisite middleware plugin

This ensures the first middleware sets the correct host, so all subsequent middlewares (including Sitecore’s MultisiteMiddleware) “see” the right site.

src/lib/middleware/plugins/multisite.ts
import { NextRequest, NextResponse } from 'next/server';
import { MultisiteMiddleware } from '@sitecore-jss/sitecore-jss-nextjs/middleware';
import { siteResolver } from 'lib/site-resolver';
import { MiddlewarePlugin } from '..';

/**
 * Multisite: resolve site by host and rewrite for Sitecore JSS.
 */
class MultisitePlugin implements MiddlewarePlugin {
  private multisiteMiddleware: MultisiteMiddleware;

  // Run first
  order = -1;

  constructor() {
    this.multisiteMiddleware = new MultisiteMiddleware({
      excludeRoute: () => false,
      siteResolver,
      useCookieResolution: () => process.env.VERCEL_ENV === 'preview',
    });
  }

  async exec(req: NextRequest, res?: NextResponse): Promise<NextResponse> {
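    // Behind Front Door, the public host arrives in x-forwarded-host.
    // Take the first entry of a comma-separated list and drop any port.
    const xForwardedHost = req.headers.get('x-forwarded-host')?.split(',')[0].trim().split(':')[0];
    if (xForwardedHost) {
      // Point the host header at the resolved site so MultisiteMiddleware picks it up
      const siteByHost = siteResolver.getByHost(xForwardedHost);
      req.headers.set('host', siteByHost.hostName);
    }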

    return this.multisiteMiddleware.getHandler()(req, res);
  }
}

export const multisitePlugin = new MultisitePlugin();

2) src/pages/api/sitemap.ts

Two sitemap modes are supported out of the box by JSS: a “regular” single sitemap and an “index” that links to multiple partitioned sitemaps. We keep that, but compose absolute URLs with the forwarded protocol + host.

src/pages/api/sitemap.ts
import type { NextApiRequest, NextApiResponse } from 'next';
import { NativeDataFetcher, GraphQLSitemapXmlService } from '@sitecore-jss/sitecore-jss-nextjs';
import { siteResolver } from 'lib/site-resolver';
import config from 'temp/config';
import clientFactory from 'lib/graphql-client-factory';

const ABSOLUTE_URL_REGEXP = '^(?:[a-z]+:)?//';

const sitemapApi = async (
  req: NextApiRequest,
  res: NextApiResponse
): Promise<NextApiResponse | void> => {
  const { query: { id } } = req;

  // Behind Front Door, the public host arrives in x-forwarded-host; fall back to host otherwise
  const xForwardedHost = req.headers['x-forwarded-host'] as string;
  const rawHost = xForwardedHost || req.headers['host'] || 'localhost';

  // Take the first entry of a comma-separated list and strip any port before resolving the site
  const hostName = rawHost.split(',')[0].trim().split(':')[0];
  const site = siteResolver.getByHost(hostName);

  const sitemapXmlService = new GraphQLSitemapXmlService({
    clientFactory,
    siteName: site.name,
  });

  const sitemapPath = await sitemapXmlService.getSitemap(id as string);

  // Regular sitemap
  if (sitemapPath) {
    const isAbsoluteUrl = sitemapPath.match(ABSOLUTE_URL_REGEXP);
    const sitemapUrl = isAbsoluteUrl ? sitemapPath : `${config.sitecoreApiHost}${sitemapPath}`;
    res.setHeader('Content-Type', 'text/xml;charset=utf-8');

    try {
      const fetcher = new NativeDataFetcher();
      const xmlResponse = await fetcher.fetch<string>(sitemapUrl);

      return res.send(xmlResponse.data);
    } catch {
      return res.redirect('/404');
    }
  }

  // Index sitemap.xml
  const sitemaps = await sitemapXmlService.fetchSitemaps();

  if (!sitemaps.length) return res.redirect('/404');

  // Prefer the public host forwarded by Front Door when building absolute URLs
  const reqHost = xForwardedHost?.split(',')[0].trim() || req.headers.host;
  const reqProtocol = (req.headers['x-forwarded-proto'] as string) || 'https';

  const SitemapLinks = sitemaps
    .map((item: string) => {
      const parseUrl = item.split('/');
      const lastSegment = parseUrl[parseUrl.length - 1];

      return `<sitemap>
        <loc>${reqProtocol}://${reqHost}/${lastSegment}</loc>
      </sitemap>`;
    })
    .join('');

  res.setHeader('Content-Type', 'text/xml;charset=utf-8');

  return res.send(`
  <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" encoding="UTF-8">${SitemapLinks}</sitemapindex>
  `);
};

export default sitemapApi;
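
The index-building step in the route above can also be pulled out into a pure function, which makes it easy to unit test without a request object. The following is a minimal sketch under that assumption; buildSitemapIndex and escapeXml are hypothetical names (not part of JSS), and the namespace URI is the one published in the sitemaps.org protocol.

```typescript
// Escape the characters that are significant inside XML text content.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Build a sitemap index whose <loc> entries point at the public origin.
// `sitemapPaths` are the paths returned by the sitemap service; only the
// last path segment (e.g. "sitemap-00.xml") is kept, as in the route above.
function buildSitemapIndex(sitemapPaths: string[], proto: string, host: string): string {
  const entries = sitemapPaths
    .map((path) => {
      const lastSegment = path.split('/').pop() ?? '';
      const loc = escapeXml(`${proto}://${host}/${lastSegment}`);
      return `<sitemap><loc>${loc}</loc></sitemap>`;
    })
    .join('');

  return `<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">${entries}</sitemapindex>`;
}
```

In the route, this would be called with the forwarded protocol and host, for example buildSitemapIndex(sitemaps, reqProtocol, publicHost), where publicHost is whatever value you derived from x-forwarded-host.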

3) src/pages/api/robots.ts

Same idea: resolve the correct site, fetch site-specific robots, return as plain text.

src/pages/api/robots.ts
import type { NextApiRequest, NextApiResponse } from 'next';
import { GraphQLRobotsService } from '@sitecore-jss/sitecore-jss-nextjs';
import { siteResolver } from 'lib/site-resolver';
import clientFactory from 'lib/graphql-client-factory';

const robotsApi = async (req: NextApiRequest, res: NextApiResponse): Promise<void> => {
  res.setHeader('Content-Type', 'text/plain');

  // Behind Front Door, the public host arrives in x-forwarded-host; fall back to host otherwise
  const xForwardedHost = req.headers['x-forwarded-host'] as string;
  const rawHost = xForwardedHost || req.headers['host'] || 'localhost';

  // Take the first entry of a comma-separated list and strip any port before resolving the site
  const hostName = rawHost.split(',')[0].trim().split(':')[0];
  const site = siteResolver.getByHost(hostName);

  const robotsService = new GraphQLRobotsService({
    clientFactory,
    siteName: site.name,
  });

  const robotsResult = await robotsService.fetchRobots();

  return res.status(200).send(robotsResult);
};

export default robotsApi;

Gotchas & hard-won lessons

  • Trust the header source: Only treat x-forwarded-* as authoritative when the request definitely came through AFD (or another trusted proxy). If you also serve requests directly, add a simple allowlist check (e.g., by comparing the X-Azure-FDID header that AFD adds against your Front Door ID, or by environment/ingress rules) before trusting x-forwarded-host.
  • Multiple forwarded hosts: Some proxies append lists like a.example.com, b.example.com. AFD typically sends a single value, but if you need to be defensive, split by comma and take the first value.
  • Ports: Strip the port (split(':')[0]) from whichever header you resolve by, x-forwarded-host as well as host, to avoid mismatches in siteResolver.
  • Protocol: Use x-forwarded-proto for absolute URLs; don’t assume https unless you must.
  • Caching: If you cache sitemap.xml/robots.txt, consider varying by x-forwarded-host to keep multi-tenant sites isolated:
    res.setHeader('Vary', 'x-forwarded-host');
  • Preview environments: The middleware preserves useCookieResolution for Vercel preview URLs, which is handy when domain-based resolution isn’t applicable.
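
One way to implement the "trust the header source" check from the list above is to honor x-forwarded-host only when the request carries the expected Front Door ID in the X-Azure-FDID header. This is a sketch, not a complete security control: trustedHost is a hypothetical helper, the expected ID is assumed to come from your own configuration, and a header comparison is best combined with network-level restrictions on who can reach the origin.

```typescript
type HeaderBag = Record<string, string | string[] | undefined>;

// Node exposes repeated headers as arrays; take the first value in that case.
function firstValue(value: string | string[] | undefined): string | undefined {
  return Array.isArray(value) ? value[0] : value;
}

// Returns the forwarded host only when the request carries the expected
// Front Door ID; otherwise falls back to the plain Host header.
function trustedHost(headers: HeaderBag, expectedFrontDoorId: string | undefined): string {
  const frontDoorId = firstValue(headers['x-azure-fdid']);
  const viaFrontDoor = Boolean(expectedFrontDoorId) && frontDoorId === expectedFrontDoorId;

  const forwarded = viaFrontDoor ? firstValue(headers['x-forwarded-host']) : undefined;
  const raw = forwarded || firstValue(headers['host']) || '';

  // First entry of a comma-separated list, without the port.
  return raw.split(',')[0].trim().split(':')[0] || 'localhost';
}
```

In an API route this could be called as trustedHost(req.headers, process.env.AZURE_FRONT_DOOR_ID), where AZURE_FRONT_DOOR_ID is an assumed environment variable name, not something JSS or Next.js defines.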

Quick test script

Make sure your app is only reachable via AFD during testing, or be explicit about headers.

# Simulate AFD passing the real host & proto
curl -i \
  -H "x-forwarded-host: www.example.com" \
  -H "x-forwarded-proto: https" \
  https://your-app-host/robots.txt

curl -i \
  -H "x-forwarded-host: www.example.com" \
  -H "x-forwarded-proto: https" \
  https://your-app-host/sitemap.xml

You should see:

  • robots.txt content for the www.example.com site
  • sitemap.xml links that start with https://www.example.com/...
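
If you would rather check the second expectation automatically than eyeball the curl output, a small script can scan the returned XML for <loc> values that do not start with the public origin. The sketch below uses hypothetical helper names and a deliberately simple regular expression rather than a full XML parser, which is adequate for the flat sitemap format but not for arbitrary XML.

```typescript
// Extract every <loc> value from a sitemap or sitemap index document.
function extractLocs(xml: string): string[] {
  const pattern = /<loc>\s*([^<]+?)\s*<\/loc>/g;
  const locs: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(xml)) !== null) {
    locs.push(match[1]);
  }
  return locs;
}

// Return the <loc> values that do NOT start with the expected public origin.
// The trailing slash prevents "https://www.example.com.evil" from passing.
function locsOutsideOrigin(xml: string, origin: string): string[] {
  const prefix = origin.endsWith('/') ? origin : `${origin}/`;
  return extractLocs(xml).filter((loc) => !loc.startsWith(prefix));
}
```

An empty result from locsOutsideOrigin(body, 'https://www.example.com') means every link in the response points at the public domain.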

Production checklist

  • Middleware runs first (order = -1)
  • Site is resolved by x-forwarded-host when present
  • Absolute links in index sitemap use x-forwarded-proto + host
  • Vary: x-forwarded-host set on SEO endpoints (if caching)
  • Defensive parsing for comma-separated forwarded hosts (optional)
  • Security: only trust headers on requests that came via AFD
  • Observability: add structured logs to confirm resolved site + headers in non-prod
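
For the observability item, a structured log line might look like the sketch below. The field names and the buildSiteResolutionLog / logSiteResolution helpers are assumptions for illustration; adapt them to whatever logging setup you already have.

```typescript
type RequestHeaders = Record<string, string | string[] | undefined>;

interface SiteResolutionLog {
  event: 'site-resolution';
  resolvedSite: string;
  host: string | undefined;
  forwardedHost: string | undefined;
  forwardedProto: string | undefined;
}

// Flatten repeated header values so the log entry always holds plain strings.
const single = (value: string | string[] | undefined): string | undefined =>
  Array.isArray(value) ? value.join(', ') : value;

// Collect the headers that influence site resolution next to the result.
function buildSiteResolutionLog(headers: RequestHeaders, resolvedSite: string): SiteResolutionLog {
  return {
    event: 'site-resolution',
    resolvedSite,
    host: single(headers['host']),
    forwardedHost: single(headers['x-forwarded-host']),
    forwardedProto: single(headers['x-forwarded-proto']),
  };
}

// Emit one JSON line per request, only when logging is enabled.
function logSiteResolution(headers: RequestHeaders, resolvedSite: string, enabled: boolean): void {
  if (!enabled) return;
  console.log(JSON.stringify(buildSiteResolutionLog(headers, resolvedSite)));
}
```

To keep this out of production, pass something like process.env.NODE_ENV !== 'production' as the enabled flag.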

Reusable helper (optional)

If you want to centralize the logic:

// lib/request-host.ts
export function getPublicHost(req: { headers: Record<string, string | string[] | undefined> }) {
  const xfHost = req.headers['x-forwarded-host'];
  const hostRaw = (Array.isArray(xfHost) ? xfHost[0] : xfHost) || req.headers['host'] || '';
  const host = (Array.isArray(hostRaw) ? hostRaw[0] : hostRaw).split(',')[0].trim().split(':')[0];
  return host || 'localhost';
}

export function getPublicProto(req: { headers: Record<string, string | string[] | undefined> }) {
  const xfProto = req.headers['x-forwarded-proto'];
  return (Array.isArray(xfProto) ? xfProto[0] : xfProto) || 'https';
}

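Use in both API routes and middleware to keep behavior consistent. Note that in middleware, NextRequest exposes a Headers object rather than a plain record, so its headers need converting to a plain object before being passed to these functions.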
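
For the middleware case, a small adapter can turn the Fetch-style Headers object into the record shape the helpers accept. This is a sketch; headersToRecord is a hypothetical name, and it relies on the standard Headers behavior of lower-casing names.

```typescript
// Convert a Fetch API Headers object (as exposed by NextRequest in middleware)
// into the plain record shape that Node-style helpers expect.
function headersToRecord(headers: Headers): Record<string, string | undefined> {
  const record: Record<string, string | undefined> = {};
  headers.forEach((value, key) => {
    // Keys arrive lower-cased, matching how Node exposes req.headers.
    record[key] = value;
  });
  return record;
}
```

Inside the plugin's exec method, this would let you write getPublicHost({ headers: headersToRecord(req.headers) }) and share the exact parsing rules with the API routes.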


Wrap-up

These small adjustments let Sitecore JSS + Next.js respect the real public domain when operating behind Azure Front Door. That keeps multisite routing correct and ensures SEO endpoints (sitemap.xml, robots.txt) generate clean, canonical URLs.