---
title: "Dynamic robots.txt in Next.js for Multi-Tenant Sites"
slug: "dynamic-robots-txt-nextjs-multi-tenant"
published: "2026-01-09"
updated: "2026-02-22"
categories:
  - "Next.js"
tags:
  - "dynamic robots.txt"
  - "Next.js robots.txt"
  - "multi-tenant sitemap.xml"
  - "Payload CMS"
  - "App Router route handlers"
  - "unstable_cache tenant lookup"
  - "block AI bots"
  - "middleware matcher regex"
  - "dynamic humans.txt"
  - "tenant-specific sitemaps"
llm-intent: "how-to"
audience-level: "intermediate"
llm-purpose: "Dynamic robots.txt in Next.js: serve per-tenant robots.txt and sitemap.xml with Payload CMS, block AI bots, and fix middleware to preserve tenant routes."
llm-prereqs:
  - "Next.js"
  - "Payload CMS"
  - "TypeScript"
---

**Summary Triples**
- (Dynamic robots.txt in Next.js for Multi-Tenant Sites, expresses-intent, how-to)
- (Dynamic robots.txt in Next.js for Multi-Tenant Sites, covers-topic, dynamic robots.txt)
- (Dynamic robots.txt in Next.js for Multi-Tenant Sites, provides-guidance-for, Dynamic robots.txt in Next.js: serve per-tenant robots.txt and sitemap.xml with Payload CMS, block AI bots, and fix middleware to preserve tenant routes.)

### {GOAL}
Dynamic robots.txt in Next.js: serve per-tenant robots.txt and sitemap.xml with Payload CMS, block AI bots, and fix middleware to preserve tenant routes.

### {PREREQS}
- Next.js
- Payload CMS
- TypeScript

### {STEPS}
1. Create centralized tenant lookup function
2. Add dynamic robots.txt route handler
3. Implement dynamic humans.txt endpoint
4. Adjust middleware matcher to exclude files
5. Test and deploy per-tenant SEO endpoints

<!-- llm:goal="Dynamic robots.txt in Next.js: serve per-tenant robots.txt and sitemap.xml with Payload CMS, block AI bots, and fix middleware to preserve tenant routes." -->
<!-- llm:prereq="Next.js" -->
<!-- llm:prereq="Payload CMS" -->
<!-- llm:prereq="TypeScript" -->

# Dynamic robots.txt in Next.js for Multi-Tenant Sites
> Dynamic robots.txt in Next.js: serve per-tenant robots.txt and sitemap.xml with Payload CMS, block AI bots, and fix middleware to preserve tenant routes.
Matija Žiberna · 2026-01-09

I recently tackled a common challenge in multi-tenant architectures: how to serve unique `robots.txt` and `sitemap.xml` files for different domains running on the same application. A static file in `/public` just doesn't cut it when Tenant A needs to block AI bots while Tenant B wants full indexing.

This guide walks through the robust, cached solution I implemented using Next.js App Router and Payload CMS.

## 1. The Core Utility: Centralized Tenant Lookups

The first step was to stop repeating ourselves. We needed a single, reliable way to resolve the current tenant from the hostname—whether it's a custom domain (`example.com`) or a subdomain (`tenant.app.com`).

I centralized this logic in `src/payload/db/index.ts` using `unstable_cache` to keep performance high. This function is the backbone of our SEO strategy.

```typescript
// File: src/payload/db/index.ts

import { unstable_cache } from "next/cache";
// getPayloadClient, CACHE_KEY, and TAGS are project-local helpers defined in this module

export const getTenantByDomain = async (domain: string) => {
  return await unstable_cache(
    async () => {
      const payload = await getPayloadClient();
      const tenants = await payload.find({
        collection: "tenants",
        where: {
          or: [
            { domain: { equals: domain } },
            { slug: { equals: domain.split('.')[0] } }, // Fall back to the slug for subdomain patterns
          ],
        },
        limit: 1,
      });
      // Cache the first match (or null), keyed by domain and tagged for invalidation
      return tenants.docs[0] || null;
    },
    [CACHE_KEY.TENANT_BY_DOMAIN(domain)],
    {
      tags: [TAGS.TENANTS],
      revalidate: 3600, // Revalidate every hour
    }
  )();
};
```

**Why this matters:**
This function handles the heavy lifting of database queries and caching. By centralizing it, we ensure that `robots.txt`, `sitemap.xml`, and `humans.txt` all "agree" on which tenant is active.
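Because the lookup is tagged with `TAGS.TENANTS`, we can also bust the cache the moment a tenant document changes instead of waiting for the hourly revalidate. Here's a minimal sketch of a Payload `afterChange` hook doing exactly that — the collection fields and the `TAGS` import path are assumptions about the setup:

```typescript
// File: src/payload/collections/Tenants.ts (sketch — fields and import paths are assumptions)

import type { CollectionConfig } from "payload";
import { revalidateTag } from "next/cache";
import { TAGS } from "@/payload/db"; // assumes TAGS is exported from the db module

export const Tenants: CollectionConfig = {
  slug: "tenants",
  fields: [
    { name: "name", type: "text", required: true },
    { name: "domain", type: "text" },
    { name: "slug", type: "text", required: true },
  ],
  hooks: {
    // Invalidate every cached getTenantByDomain() entry when a tenant is created or updated,
    // so robots.txt, sitemap.xml, and humans.txt pick up the change immediately
    afterChange: [
      async ({ doc }) => {
        revalidateTag(TAGS.TENANTS);
        return doc;
      },
    ],
  },
};
```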

## 2. Dynamic Robots.txt with AI Protection

With the tenant lookup in place, I created a dynamic route handler for `robots.txt`. This isn't just a static file anymore; it's code. This allows us to inject the correct sitemap URL for the specific tenant and apply global rules, like blocking AI scrapers.

```typescript
// File: src/app/robots.ts

import type { MetadataRoute } from "next";
import { headers } from "next/headers";
import { getTenantByDomain } from "@/payload/db";

export default async function robots(): Promise<MetadataRoute.Robots> {
  // Get hostname from request headers
  const hostname = (await headers()).get('host') || 'www.adart.com';
  
  // Try to find tenant by domain or subdomain
  const tenant = await getTenantByDomain(hostname);
  
  // If no tenant found, use fallback (adart)
  const baseUrl = tenant?.domain ? `https://${tenant.domain}` : `https://${hostname}`;
  
  return {
    rules: [
      // Block AI Scraping Bots
      {
        userAgent: ["GPTBot", "CCBot", "Google-Extended"],
        disallow: ["/"],
      },
      // Standard bots
      {
        userAgent: "*",
        allow: "/",
        disallow: [
          "/admin",
          "/api",
        ],
        crawlDelay: 1,
      },
    ],
    sitemap: `${baseUrl}/sitemap.xml`,
    host: baseUrl,
  };
}
```

**Key Features:**
- **Dynamic Host:** The `sitemap` link automatically matches the visitor's domain (see the sitemap sketch below).
- **AI Blocking:** Explicit disallow rules for `GPTBot`, `CCBot`, and `Google-Extended` keep AI scrapers away from tenant content.
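Since the robots file points every crawler at `${baseUrl}/sitemap.xml`, the sitemap itself needs to be tenant-aware too. Here's a minimal sketch of `src/app/sitemap.ts` following the same pattern — the `pages` collection, its `tenant` field, and an exported `getPayloadClient` are assumptions about the schema:

```typescript
// File: src/app/sitemap.ts (sketch — the "pages" collection and its fields are assumptions)

import type { MetadataRoute } from "next";
import { headers } from "next/headers";
import { getTenantByDomain, getPayloadClient } from "@/payload/db";

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const hostname = (await headers()).get("host") || "www.adart.com";
  const tenant = await getTenantByDomain(hostname);
  const baseUrl = tenant?.domain ? `https://${tenant.domain}` : `https://${hostname}`;

  // Hypothetical tenant-scoped "pages" collection — swap in your own collections
  const payload = await getPayloadClient();
  const pages = await payload.find({
    collection: "pages",
    where: { tenant: { equals: tenant?.id } },
    limit: 1000,
  });

  return [
    { url: baseUrl, lastModified: new Date() },
    ...pages.docs.map((page) => ({
      url: `${baseUrl}/${page.slug}`,
      lastModified: new Date(page.updatedAt),
    })),
  ];
}
```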

## 3. Dynamic Humans.txt

To give credit where it's due, I also implemented a `humans.txt` endpoint. This is a nice touch that adds personality and transparency to the site, dynamically acknowledging the specific tenant.

```typescript
// File: src/app/humans.txt/route.ts
// Next.js has no humans.ts metadata convention, so a route handler at
// app/humans.txt/route.ts serves the literal /humans.txt path.

import { headers } from "next/headers";
import { getTenantByDomain } from "@/payload/db";

export async function GET() {
  const hostname = (await headers()).get('host') || '';
  const tenant = await getTenantByDomain(hostname);
  const tenantName = tenant?.name || 'Ad Art';

  const content = `/* TEAM */

  Site built by: Ad Art Team
  For: ${tenantName}

/* SITE */

  Standards: HTML5, CSS3, TypeScript
  Components: Payload CMS, Next.js`;

  return new Response(content, {
    headers: { 'Content-Type': 'text/plain; charset=utf-8' },
  });
}
```

## 4. The Critical Fix: Middleware Matcher

This was the tricky part. Even with the files in place, `robots.txt` was returning a 404.

The culprit was `src/middleware.ts`. The matcher regex only excluded a handful of known paths, so requests for these files were still being swallowed by the middleware. I updated the negative lookahead to explicitly exclude **any** path containing a file extension (like `.txt` or `.xml`).

```typescript
// File: src/middleware.ts

export const config = {
  matcher: [
    /*
     * Match all request paths except for:
     * ...
     * 5. Static files (e.g. /favicon.ico, /robots.txt) - Matched by .*\\..*
     */
    '/((?!api|_next|_static|_vercel|.*\\..*).*)',
  ],
};
```

**The Lesson:**
If your middleware runs on file routes, it might try to rewrite them to tenant paths (e.g., `/tenant-slugs/.../robots.txt`), which don't exist. Excluding files from middleware ensures they hit the App Router handlers directly.
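For context, here's roughly what that tenant rewrite looks like; the `/tenant-slugs/<host>` target segment is a simplified assumption about the routing scheme:

```typescript
// File: src/middleware.ts (simplified sketch — the rewrite target segment is an assumption)

import { NextResponse, type NextRequest } from "next/server";

export function middleware(request: NextRequest) {
  const hostname = request.headers.get("host") || "";
  const path = request.nextUrl.pathname;

  // Every matched request is rewritten into a tenant-scoped segment,
  // e.g. /pricing on tenant-a.app.com -> /tenant-slugs/tenant-a.app.com/pricing.
  // Without the matcher exclusion, /robots.txt would be rewritten here too and 404.
  return NextResponse.rewrite(
    new URL(`/tenant-slugs/${hostname}${path}`, request.url)
  );
}

export const config = {
  matcher: ['/((?!api|_next|_static|_vercel|.*\\..*).*)'],
};
```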

## Conclusion

By moving away from static files and leveraging Next.js metadata routes and Route Handlers, we've created an SEO infrastructure that gives us:
1.  **Automatic Sitemaps** per tenant.
2.  **Smart Indexing Rules** that protect against AI scraping.
3.  **Zero Maintenance** when onboarding new tenants.
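
Before shipping, it's worth confirming each domain actually serves its own rules. A quick check script against deployed tenant domains (the hostnames here are placeholders):

```typescript
// check-robots.ts — run with `npx tsx check-robots.ts`; hostnames are placeholders

const hosts = ["https://tenant-a.example.com", "https://tenant-b.example.com"];

for (const base of hosts) {
  const res = await fetch(`${base}/robots.txt`);
  console.log(`${base}/robots.txt -> ${res.status}`);
  console.log(await res.text());
}
```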

Let me know if you have questions!

Thanks, Matija