Payload CMS Vector Search: 4-Step RAG Upgrade Guide

Use Upstash + OpenAI embeddings and Payload Native Jobs to add RAG, semantic search, and context-aware chatbots to…



Imagine if your CMS didn't just store content, but actually understood it.

By integrating a Vector Store (like Upstash) with Payload CMS, you unlock a new layer of intelligence for your data. You're not just building a website anymore; you're building a knowledge base that can power:

  • Chatbots that answer questions based on your specific documentation.
  • RAG (Retrieval Augmented Generation) pipelines to ground AI responses in truth.
  • MCP (Model Context Protocol) servers that let AI agents query your content directly.
  • Semantic Search that finds what users mean, not just what they type.

In this guide, I'll walk you through exactly how to build this architecture.

The Challenge: Speed vs. Intelligence

The naive approach is simple: "When I save a post, generate an embedding and save it to the vector store."

But here is the twist: AI is slow.

Generating high-quality embeddings with OpenAI and syncing them to a database takes time—often 2-3 seconds per record. If you put this logic in your standard afterChange hook, your editors will be staring at a spinning "Save" button. Even worse, if you're on a serverless platform like Vercel, you risk hitting strict timeout limits, causing data to sync partially or not at all.

The Solution: Payload Native Jobs

To make Payload robust enough for RAG, we can't perform these operations synchronously. We need to decouple the "Save" action from the "Embed" action.

We do this using Payload's Native Jobs Queue.

This architecture allows us to:

  1. Fire and Forget: The editor saves instantly.
  2. Process in Background: A dedicated worker handles the heavy lifting.
  3. Auto-Retry: Built-in error handling retries failed jobs automatically, so no data is left behind.

Step 1: Set up the Vector Infrastructure

First, we need a clean way to interact with our Vector Database and our Embedding provider (OpenAI).

File: src/lib/vector/client.ts

import { Index } from '@upstash/vector'

const url = process.env.UPSTASH_VECTOR_REST_URL
const token = process.env.UPSTASH_VECTOR_REST_TOKEN

if (!url || !token) {
    throw new Error('Missing UPSTASH_VECTOR_REST_URL or UPSTASH_VECTOR_REST_TOKEN')
}

// Create a singleton instance for reuse
export const vectorIndex = new Index({ url, token })

File: src/lib/vector/embedding.ts

We use text-embedding-3-small with explicit dimensions to ensure compatibility with your vector index.

import OpenAI from 'openai'

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

export async function generateEmbedding(text: string): Promise<number[]> {
    const sanitizedText = text.replace(/\n/g, ' ')
    const response = await openai.embeddings.create({
        model: 'text-embedding-3-small',
        input: sanitizedText,
        dimensions: 1024, // Critical: Match your Vector Store dimensions
    })
    return response.data[0].embedding
}

Step 2: Create the Background Job

This "Task Handler" is the engine of our new feature. It operates independently of the web server response cycle.

File: src/payload/jobs/vector/upsert.ts

import type { TaskHandler } from 'payload'
import { embedDocument } from '@/lib/vector/operations'

export const vectorUpsertHandler: TaskHandler<any> = async ({ input, req }) => {
    const { docId, collection } = input

    req.payload.logger.info(`[Vector Upsert] Processing ${collection}/${docId} ...`)

    try {
        // 1. Fetch the latest version of the document
        const doc = await req.payload.findByID({ collection, id: docId })

        // 2. Validate availability (e.g. only index published content)
        if (doc._status && doc._status !== 'published') {
            return { output: { message: 'Skipped: Document is not published' } }
        }

        // 3. Perform the embedding (The heavy operation)
        await embedDocument({
            id: docId,
            collection,
            // ... map your document fields to your embedding function here
        })

        return { output: { message: 'Successfully indexed' } }

    } catch (error) {
        const message = error instanceof Error ? error.message : String(error)
        req.payload.logger.error(`[Vector Upsert] Failed: ${message}`)
        throw error // Throwing triggers Payload's automatic retry mechanism!
    }
}
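
The handler above imports embedDocument from src/lib/vector/operations.ts, which isn't shown in this guide. As a minimal sketch (the title/content mapping is an assumption, adapt it to your own collection), it could look like this:

File: src/lib/vector/operations.ts

import { vectorIndex } from './client'
import { generateEmbedding } from './embedding'

interface EmbedDocumentArgs {
    id: string | number
    collection: string
    title?: string
    content?: string
}

export async function embedDocument({ id, collection, title, content }: EmbedDocumentArgs) {
    // Combine the fields you want searchable into a single string
    const text = [title, content].filter(Boolean).join('\n\n')

    // Generate the embedding (1024 dimensions, matching the index)
    const vector = await generateEmbedding(text)

    // Upsert into Upstash, namespacing the ID by collection to avoid collisions
    await vectorIndex.upsert({
        id: `${collection}:${id}`,
        vector,
        metadata: { collection, docId: String(id), title: title ?? '' },
    })
}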

Step 3: The "Fire and Forget" Hook

Now that our heavy lifting is in a job, our hook becomes incredibly fast. It simply checks "Should I sync this?" and if yes, "Add to queue."

File: src/payload/hooks/syncToVectorStore.ts

import type { CollectionAfterChangeHook } from 'payload'

export const syncToVectorStoreAfterChange: CollectionAfterChangeHook = async ({
    doc,
    req,
    collection,
}) => {
    // Only queue when the document is published (collections without drafts have no _status)
    if (doc._status && doc._status !== 'published') return doc

    // Dispatch the job instantly
    await req.payload.jobs.queue({
        task: 'vector-upsert', 
        input: {
            docId: doc.id,
            collection: collection.slug,
        },
    })

    return doc
}

Step 4: Register the Job

Finally, wire everything together in your Payload config.

File: payload.config.ts

import { buildConfig } from 'payload'
import { vectorUpsertHandler } from '@/payload/jobs/vector/upsert'

export default buildConfig({
  // ...
  jobs: {
    // In production, ensure you secure access to the job runner endpoint!
    access: { run: ({ req }) => true }, 
    tasks: [
      {
        slug: 'vector-upsert',
        handler: vectorUpsertHandler,
        retries: 3, // If Upstash or OpenAI blips, we automatically try again
      },
    ],
  },
})
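
Queuing a job does not run it; something still needs to work through the queue. Payload exposes a job runner endpoint (the one the access.run rule above protects), which you can hit from a cron, or you can trigger runs yourself via the Local API. Here is a rough sketch of a cron-triggered route, assuming a Next.js App Router project (the route path and CRON_SECRET check are my own, not part of Payload):

File: src/app/api/run-jobs/route.ts

import { getPayload } from 'payload'
import config from '@payload-config'

export async function GET(request: Request) {
    // Only allow callers that know the cron secret (hypothetical guard)
    if (request.headers.get('authorization') !== `Bearer ${process.env.CRON_SECRET}`) {
        return new Response('Unauthorized', { status: 401 })
    }

    const payload = await getPayload({ config })

    // Process up to 10 queued jobs, including any pending 'vector-upsert' tasks
    await payload.jobs.run({ limit: 10 })

    return Response.json({ ok: true })
}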

Taking it Further: MCP & Chatbots

With this pipeline in place, your Payload CMS is no longer just a database—it's a semantic engine.

1. Build an MCP Server

You can now build a Model Context Protocol (MCP) server that queries vectorIndex. This allows AI agents (like Claude Desktop) to "read" your CMS content directly to answer your questions or help you code.
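
Whether the consumer is an MCP tool or a site search box, the core building block is the same: embed the query, then ask Upstash for the nearest matches. A minimal sketch (the file name and topK default are arbitrary):

File: src/lib/vector/search.ts

import { vectorIndex } from '@/lib/vector/client'
import { generateEmbedding } from '@/lib/vector/embedding'

export async function semanticSearch(query: string, topK = 5) {
    // Embed the query with the same model used at indexing time
    const vector = await generateEmbedding(query)

    // Retrieve the closest matches along with their stored metadata
    return vectorIndex.query({
        vector,
        topK,
        includeMetadata: true,
    })
}

An MCP tool handler can then call semanticSearch and return the matched metadata to the agent.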

2. Context-Aware Chatbots

When a user asks a question on your site, you don't just send it to an LLM. You first query your Upstash index for relevant chunks, append them to the system prompt ("Use the following context..."), and then ask the LLM. This dramatically reduces hallucinations.
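
A rough sketch of that flow, reusing the hypothetical semanticSearch helper from above (the model and prompt wording are just examples):

File: src/lib/chat/answer.ts

import OpenAI from 'openai'
import { semanticSearch } from '@/lib/vector/search'

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

export async function answerWithContext(question: string) {
    // 1. Retrieve the most relevant chunks from the vector store
    const matches = await semanticSearch(question, 5)
    const context = matches.map((m) => JSON.stringify(m.metadata)).join('\n---\n')

    // 2. Ground the LLM in that context before asking it to answer
    const completion = await openai.chat.completions.create({
        model: 'gpt-4o-mini',
        messages: [
            { role: 'system', content: `Use the following context to answer:\n${context}` },
            { role: 'user', content: question },
        ],
    })

    return completion.choices[0].message.content
}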

3. Manual Control

Since your logic is decoupled into a Job, you can easily expose it via a custom endpoint to create a "Refresh Vector" button in your UI, giving editors granular control over indexing without needing developer intervention.
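
For example, a custom endpoint on the collection can queue the same task on demand. A sketch based on Payload 3 custom endpoints (the path and the 'posts' slug are placeholders):

File: src/payload/collections/Posts.ts

import type { CollectionConfig } from 'payload'

export const Posts: CollectionConfig = {
  slug: 'posts',
  fields: [],
  endpoints: [
    {
      path: '/:id/refresh-vector',
      method: 'post',
      handler: async (req) => {
        const id = String(req.routeParams?.id)

        // Reuse the exact background task the afterChange hook dispatches
        await req.payload.jobs.queue({
          task: 'vector-upsert',
          input: { docId: id, collection: 'posts' },
        })

        return Response.json({ queued: true })
      },
    },
  ],
}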


By leveraging Payload Jobs, you've built a robust, scalable foundation for the next generation of AI features.

Thanks, Matija


Matija Žiberna
Full-stack developer, co-founder

I'm Matija Žiberna, a self-taught full-stack developer and co-founder passionate about building products, writing clean code, and figuring out how to turn ideas into businesses. I write about web development with Next.js, lessons from entrepreneurship, and the journey of learning by doing. My goal is to provide value through code—whether it's through tools, content, or real-world software.