How to Seed Payload CMS with CSV Files: A Complete Guide

Replace hardcoded seed data with maintainable CSV files for better content management

By Matija Žiberna

I was building a client project with Payload CMS when I hit a familiar frustration: managing seed data scattered across multiple JavaScript files. Every time content needed updating, I found myself digging through hardcoded objects, making changes, and hoping I didn't break anything. After implementing a CSV-based seeding system, I discovered how much cleaner and more maintainable this approach could be.

This guide shows you exactly how to build a comprehensive CSV seeding system for Payload CMS that handles everything from simple text fields to complex relationships and nested data structures.

The Problem with Hardcoded Seed Data

Traditional Payload seeding typically looks like this:

// File: src/lib/payload/seed/collections/testimonials.ts
const testimonials = [
  {
    name: "Jane Doe",
    content: "Great service!",
    rating: 5,
    // ... more fields
  },
  // ... more objects
]

This approach becomes unwieldy quickly. Content updates require code changes, non-technical team members can't contribute, and managing relationships between collections becomes a nightmare.

Building the CSV Seeding Foundation

Let's start by creating the core infrastructure. First, we need a CSV reader utility that can parse our data files consistently.

// File: src/lib/payload/seed/csvReader.ts
import Papa from 'papaparse'
import fs from 'fs'
import path from 'path'

export async function readCsvFile(filePath: string): Promise<Array<Record<string, any>>> {
  try {
    const fullPath = path.resolve(filePath)
    const csvData = fs.readFileSync(fullPath, 'utf8')
    
    const result = Papa.parse<Record<string, any>>(csvData, {
      header: true,
      skipEmptyLines: true,
      transformHeader: (header) => header.trim(),
      transform: (value) => {
        const trimmed = value.trim()
        // Handle boolean conversion
        if (trimmed === 'true') return true
        if (trimmed === 'false') return false
        // Handle empty values
        if (trimmed === '') return undefined
        return trimmed
      }
    })
    
    if (result.errors.length > 0) {
      console.warn('CSV parsing warnings:', result.errors)
    }
    
    return result.data
  } catch (error) {
    console.error(`Error reading CSV file ${filePath}:`, error)
    throw error
  }
}

This utility handles the common CSV parsing challenges you'll encounter: trimming whitespace, converting boolean strings, and managing empty values. The transformHeader function ensures consistent column naming even if your CSV has extra spaces.
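To make the cell-level conversions concrete, here is the transform in isolation — a standalone copy of the inline function above, so you can see exactly what each cell becomes:

```typescript
// Standalone copy of the per-cell transform used in csvReader.ts
function transformCell(value: string): string | boolean | undefined {
  const trimmed = value.trim()
  if (trimmed === 'true') return true   // boolean conversion
  if (trimmed === 'false') return false
  if (trimmed === '') return undefined  // empty cells become undefined
  return trimmed
}

console.log(transformCell(' true '))  // true (a real boolean, not the string)
console.log(transformCell(''))        // undefined
console.log(transformCell(' 5 '))     // "5" (numbers stay strings; each seeder parses them)
```

Note that numeric cells stay strings at this stage; the individual seeding functions decide whether to call `parseInt` or leave them alone.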

Next, let's set up our directory structure for organizing CSV data:

mkdir -p src/lib/payload/seed/csv-data
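By the end of this guide, the seed folder will look roughly like this (each collection file is built in the sections below):

src/lib/payload/seed/
├── csvReader.ts
├── index.ts
├── csv-data/
│   ├── testimonials.csv
│   ├── faq-items.csv
│   ├── services.csv
│   ├── machinery.csv
│   └── projects.csv
└── collections/
    ├── testimonials.ts
    ├── faq-items.ts
    ├── services.ts
    ├── machinery.ts
    └── projects.ts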

Phase 1: Simple Collections

We'll start with the simplest case - flat data structures with basic field types. Let's implement testimonials seeding.

Create your first CSV file:

// File: src/lib/payload/seed/csv-data/testimonials.csv
csv_id,name,testimonialDate,source,location,service,content,rating
testimonial_jane,"Jane Doe","2023-10-26","google","Ljubljana","Bathroom renovation","Amazing work, highly recommend!",5
testimonial_john,"John Smith","2023-11-15","website","Maribor","Plumbing","Quick and professional service.",4
testimonial_maja,"Maja Novak",,"manual",,"Consultation","Very helpful advice.",5

Now implement the seeding function:

// File: src/lib/payload/seed/collections/testimonials.ts
import { Payload } from 'payload'
import { readCsvFile } from '../csvReader'
import path from 'path'

export async function seedTestimonials(payload: Payload): Promise<any[]> {
  const csvPath = path.join(process.cwd(), 'src/lib/payload/seed/csv-data/testimonials.csv')
  const csvData = await readCsvFile(csvPath)
  
  const createdTestimonials = []
  
  for (const row of csvData) {
    try {
      const testimonial = await payload.create({
        collection: 'testimonials',
        data: {
          // Store the csv_id so later seeders can resolve relationships to this document
          csv_id: row.csv_id,
          name: row.name,
          testimonialDate: row.testimonialDate ? new Date(row.testimonialDate) : undefined,
          source: row.source,
          location: row.location,
          service: row.service,
          content: row.content,
          rating: row.rating ? parseInt(row.rating, 10) : undefined,
        },
      })
      
      createdTestimonials.push(testimonial)
      console.log(`Created testimonial: ${testimonial.name}`)
    } catch (error) {
      console.error(`Error creating testimonial from row:`, row, error)
    }
  }
  
  return createdTestimonials
}

This function demonstrates the core pattern: read CSV data, iterate through rows, map fields to Payload's expected structure, and handle type conversions. Notice how we convert the date string to a Date object and parse the rating as an integer.

Let's add one more simple collection to reinforce the pattern - FAQ items:

// File: src/lib/payload/seed/csv-data/faq-items.csv
csv_id,question,category,answer_html
faq_1,"What are your working hours?","general","<p>Our regular working hours are Monday to Friday, 8:00 to 16:00. For urgent interventions, we are available outside working hours as well.</p>"
faq_2,"In which area do you provide services?","general","<p>We provide services mainly in central Slovenia, including Ljubljana and surroundings, Domžale, Kamnik and Kranj.</p>"

// File: src/lib/payload/seed/collections/faq-items.ts
import { Payload } from 'payload'
import { readCsvFile } from '../csvReader'
import path from 'path'

export async function seedFaqItems(payload: Payload): Promise<any[]> {
  const csvPath = path.join(process.cwd(), 'src/lib/payload/seed/csv-data/faq-items.csv')
  const csvData = await readCsvFile(csvPath)
  
  const createdFaqs = []
  
  for (const row of csvData) {
    try {
      const faqItem = await payload.create({
        collection: 'faqItems',
        data: {
          question: row.question,
          category: row.category,
          answer: {
            root: {
              children: [
                {
                  children: [
                    {
                      detail: 0,
                      format: 0,
                      mode: "normal",
                      style: "",
                      text: row.answer_html?.replace(/<[^>]*>/g, '') || '',
                      type: "text",
                      version: 1,
                    },
                  ],
                  direction: "ltr",
                  format: "",
                  indent: 0,
                  type: "paragraph",
                  version: 1,
                },
              ],
              direction: "ltr",
              format: "",
              indent: 0,
              type: "root",
              version: 1,
            },
          },
        },
      })
      
      createdFaqs.push(faqItem)
      console.log(`Created FAQ: ${faqItem.question}`)
    } catch (error) {
      console.error(`Error creating FAQ from row:`, row, error)
    }
  }
  
  return createdFaqs
}

The FAQ implementation shows how to handle Payload's richText fields. For simplicity, we're converting HTML to plain text and wrapping it in Payload's expected lexical structure. This creates a basic paragraph with the content stripped of HTML tags.
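This seeder and the projects seeder later in the guide both build the same lexical tree by hand. If you prefer, you can factor that into a small helper. This is a sketch of one — the `plainTextToLexical` name is mine, and the node shape should be verified against your editor configuration:

```typescript
// Hypothetical helper: strip HTML tags and wrap the remaining text in the
// minimal Lexical editor state that Payload's richText field expects.
function plainTextToLexical(html: string | undefined) {
  const text = (html ?? '').replace(/<[^>]*>/g, '')
  return {
    root: {
      children: [
        {
          children: [
            { detail: 0, format: 0, mode: 'normal', style: '', text, type: 'text', version: 1 },
          ],
          direction: 'ltr',
          format: '',
          indent: 0,
          type: 'paragraph',
          version: 1,
        },
      ],
      direction: 'ltr',
      format: '',
      indent: 0,
      type: 'root',
      version: 1,
    },
  }
}
```

With this in place, the `answer` field above becomes `answer: plainTextToLexical(row.answer_html)`. For production content where you want to preserve HTML formatting, look into a proper HTML-to-lexical converter instead of stripping tags.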

Phase 2: Complex Data with JSON Arrays

Now we'll tackle collections that require more sophisticated data structures. Services with feature arrays are a perfect example:

// File: src/lib/payload/seed/csv-data/services.csv
csv_id,title,description,priceDisplay,features_json
service_plumbing,"Plumbing","We provide comprehensive solutions for plumbing installations, from planning to implementation and maintenance.","By agreement","[{""featureText"":""New buildings""},{""featureText"":""Renovations""},{""featureText"":""Repairs""}]"
service_installation,"Sanitary Equipment Installation","Professional installation of shower cabins, bathtubs, toilets, sinks and other sanitary equipment.","From €150 onwards","[{""featureText"":""Installation""},{""featureText"":""Connection""},{""featureText"":""Consultation""}]"

// File: src/lib/payload/seed/collections/services.ts
import { Payload } from 'payload'
import { readCsvFile } from '../csvReader'
import path from 'path'

export async function seedServices(payload: Payload): Promise<any[]> {
  const csvPath = path.join(process.cwd(), 'src/lib/payload/seed/csv-data/services.csv')
  const csvData = await readCsvFile(csvPath)
  
  const createdServices = []
  
  for (const row of csvData) {
    try {
      // Parse JSON features
      let features = []
      if (row.features_json) {
        try {
          features = JSON.parse(row.features_json)
        } catch (jsonError) {
          console.warn(`Invalid JSON in features for ${row.title}:`, jsonError)
          features = []
        }
      }
      
      const service = await payload.create({
        collection: 'services',
        data: {
          // Store the csv_id so the projects seeder can resolve this service later
          csv_id: row.csv_id,
          title: row.title,
          description: row.description,
          priceDisplay: row.priceDisplay,
          features: features,
        },
      })
      
      createdServices.push(service)
      console.log(`Created service: ${service.title}`)
    } catch (error) {
      console.error(`Error creating service from row:`, row, error)
    }
  }
  
  return createdServices
}

The key insight here is using JSON strings within CSV cells for complex data structures. We parse the features_json column into an actual JavaScript array before passing it to Payload. This approach scales to any level of complexity while keeping the CSV format manageable.
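Because this parse-with-fallback pattern recurs in every seeder that handles JSON columns, you may want to pull it into a shared utility. A sketch (the `safeJsonParse` helper is my addition, not part of the original seeders):

```typescript
// Hypothetical helper that deduplicates the repeated try/catch JSON parsing.
function safeJsonParse<T>(raw: string | undefined, fallback: T, label: string): T {
  if (!raw) return fallback
  try {
    return JSON.parse(raw) as T
  } catch {
    console.warn(`Invalid JSON in ${label}, using fallback`)
    return fallback
  }
}

// Usage in a seeder:
const features = safeJsonParse<{ featureText: string }[]>(
  '[{"featureText":"New buildings"}]',
  [],
  'features',
)
console.log(features.length) // 1
```

The seeder body then shrinks to `const features = safeJsonParse(row.features_json, [], row.title)`.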

For even more complex nested structures, like machinery specifications:

// File: src/lib/payload/seed/csv-data/machinery.csv
csv_id,tabName,name,description,notes,specifications_json
machine_excavator,"Excavators","Volvo EL70","Light excavator for smaller excavations.","Suitable for urban construction sites.","[{""specName"":""Dimensions"",""specDetails"":[{""detail"":""Length: 5.4m""},{""detail"":""Width: 2.1m""}]},{""specName"":""Weight"",""specDetails"":[{""detail"":""7 tons""}]}]"

// File: src/lib/payload/seed/collections/machinery.ts
import { Payload } from 'payload'
import { readCsvFile } from '../csvReader'
import path from 'path'

export async function seedMachinery(payload: Payload): Promise<any[]> {
  const csvPath = path.join(process.cwd(), 'src/lib/payload/seed/csv-data/machinery.csv')
  const csvData = await readCsvFile(csvPath)
  
  const createdMachinery = []
  
  for (const row of csvData) {
    try {
      let specifications = []
      if (row.specifications_json) {
        try {
          specifications = JSON.parse(row.specifications_json)
        } catch (jsonError) {
          console.warn(`Invalid JSON in specifications for ${row.name}:`, jsonError)
          specifications = []
        }
      }
      
      const machine = await payload.create({
        collection: 'machinery',
        data: {
          tabName: row.tabName,
          name: row.name,
          description: row.description,
          notes: row.notes,
          specifications: specifications,
        },
      })
      
      createdMachinery.push(machine)
      console.log(`Created machine: ${machine.name}`)
    } catch (error) {
      console.error(`Error creating machine from row:`, row, error)
    }
  }
  
  return createdMachinery
}

This demonstrates handling deeply nested JSON structures within CSV files. The specifications field contains an array of objects, where each object has its own array of details. By using JSON strings, we maintain the full data structure while keeping it manageable in spreadsheet software.

Phase 3: Collection Relationships

The most complex scenario involves relationships between collections. Let's implement projects that reference both services and testimonials:

// File: src/lib/payload/seed/csv-data/projects.csv
csv_id,title,description_html,projectStatus,location,metadata_json,tags_json,service_ids,testimonial_ids,project_type
project_renovation,"Novak Bathroom Renovation","<p>Complete bathroom renovation in the Novak family apartment.</p>","completed","Ljubljana","{""startDate"":""2023-09-01"",""completionDate"":""2023-10-15"",""client"":""Novak Family"",""budget"":""10000 EUR""}","[{""tag"":""Renovation""},{""tag"":""Bathroom""}]","service_plumbing","testimonial_jane","renovation"
project_newbuild,"Podlipnik House New Construction","<p>Implementation of all plumbing installations in newly built single-family house.</p>","completed","Domžale","{""completionDate"":""2024-01-20"",""client"":""Mr. Podlipnik""}","[{""tag"":""New Construction""},{""tag"":""House""}]","service_plumbing","","newbuild"

// File: src/lib/payload/seed/collections/projects.ts
import { Payload } from 'payload'
import { readCsvFile } from '../csvReader'
import path from 'path'

// Helper function to look up documents by CSV ID
async function lookupDocumentsByCsvIds(
  payload: Payload,
  collection: string,
  csvIds: string[]
): Promise<string[]> {
  if (!csvIds.length) return []
  
  const results = []
  for (const csvId of csvIds) {
    try {
      const docs = await payload.find({
        collection,
        where: {
          // Assuming you store csv_id in your documents for lookup
          csv_id: { equals: csvId }
        },
        limit: 1,
      })
      
      if (docs.docs.length > 0) {
        results.push(docs.docs[0].id)
      }
    } catch (error) {
      console.warn(`Could not find ${collection} with csv_id ${csvId}`)
    }
  }
  
  return results
}

export async function seedProjects(payload: Payload): Promise<any[]> {
  const csvPath = path.join(process.cwd(), 'src/lib/payload/seed/csv-data/projects.csv')
  const csvData = await readCsvFile(csvPath)
  
  const createdProjects = []
  
  for (const row of csvData) {
    try {
      // Parse JSON fields
      let metadata = {}
      let tags = []
      
      if (row.metadata_json) {
        try {
          metadata = JSON.parse(row.metadata_json)
        } catch (jsonError) {
          console.warn(`Invalid JSON in metadata for ${row.title}:`, jsonError)
        }
      }
      
      if (row.tags_json) {
        try {
          tags = JSON.parse(row.tags_json)
        } catch (jsonError) {
          console.warn(`Invalid JSON in tags for ${row.title}:`, jsonError)
          tags = []
        }
      }
      
      // Handle relationships (csv_id lists are comma-separated, so trim each entry)
      const serviceIds = row.service_ids
        ? await lookupDocumentsByCsvIds(
            payload,
            'services',
            row.service_ids.split(',').map((id: string) => id.trim()).filter(Boolean),
          )
        : []
        
      const testimonialIds = row.testimonial_ids
        ? await lookupDocumentsByCsvIds(
            payload,
            'testimonials',
            row.testimonial_ids.split(',').map((id: string) => id.trim()).filter(Boolean),
          )
        : []
      
      // Convert HTML description to richText (simplified)
      const description = {
        root: {
          children: [
            {
              children: [
                {
                  detail: 0,
                  format: 0,
                  mode: "normal",
                  style: "",
                  text: row.description_html?.replace(/<[^>]*>/g, '') || '',
                  type: "text",
                  version: 1,
                },
              ],
              direction: "ltr",
              format: "",
              indent: 0,
              type: "paragraph",
              version: 1,
            },
          ],
          direction: "ltr",
          format: "",
          indent: 0,
          type: "root",
          version: 1,
        },
      }
      
      const project = await payload.create({
        collection: 'projects',
        data: {
          title: row.title,
          description: description,
          projectStatus: row.projectStatus,
          location: row.location,
          metadata: metadata,
          tags: tags,
          relatedServices: serviceIds,
          relatedTestimonials: testimonialIds,
          // Store csv_id for future lookups
          csv_id: row.csv_id,
        },
      })
      
      createdProjects.push(project)
      console.log(`Created project: ${project.title}`)
    } catch (error) {
      console.error(`Error creating project from row:`, row, error)
    }
  }
  
  return createdProjects
}

The relationship handling here introduces a lookup system. We use CSV IDs to reference documents across collections, then resolve these to actual Payload document IDs. This approach maintains referential integrity while keeping the CSV format readable.

The lookupDocumentsByCsvIds helper function demonstrates how to find previously seeded documents. This assumes you're storing the original csv_id field in your documents, which becomes crucial for managing relationships.
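For that lookup to work, each referenced collection needs a csv_id field in its config. A minimal sketch of what that might look like — the field options here are my suggestion, and depending on your Payload version `CollectionConfig` may import from `payload` or `payload/types`:

```typescript
import type { CollectionConfig } from 'payload'

export const Testimonials: CollectionConfig = {
  slug: 'testimonials',
  fields: [
    { name: 'name', type: 'text', required: true },
    // ... your other fields
    {
      name: 'csv_id',
      type: 'text',
      index: true, // speeds up the where: { csv_id: { equals: ... } } lookup
      admin: { hidden: true }, // editors never need to see or edit this
    },
  ],
}
```

Indexing the field matters once your seed data grows, since every relationship resolution issues a `find` query against it.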

Orchestrating the Complete Seeding Process

Finally, let's tie everything together in a main seeding function:

// File: src/lib/payload/seed/index.ts
import { Payload } from 'payload'
import { seedTestimonials } from './collections/testimonials'
import { seedFaqItems } from './collections/faq-items'
import { seedServices } from './collections/services'
import { seedMachinery } from './collections/machinery'
import { seedProjects } from './collections/projects'

export async function seedDatabase(payload: Payload): Promise<void> {
  console.log('Starting CSV-based database seeding...')
  
  try {
    // Phase 1: Simple collections (no dependencies)
    console.log('Phase 1: Seeding simple collections...')
    await seedTestimonials(payload)
    await seedFaqItems(payload)
    
    // Phase 2: Collections with complex data structures
    console.log('Phase 2: Seeding complex collections...')
    await seedServices(payload)
    await seedMachinery(payload)
    
    // Phase 3: Collections with relationships (depend on previous collections)
    console.log('Phase 3: Seeding collections with relationships...')
    await seedProjects(payload)
    
    console.log('Database seeding completed successfully!')
  } catch (error) {
    console.error('Error during database seeding:', error)
    throw error
  }
}

The order matters here. Collections with relationships must be seeded after their dependencies. This orchestration ensures that when we try to look up related services or testimonials, they already exist in the database.
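To actually run the orchestrator, you need an entry point that boots Payload's Local API and hands `seedDatabase` a payload instance. The exact bootstrap depends on your Payload version; this sketch assumes Payload 3's `getPayload` helper and the `@payload-config` import alias from the official Next.js template — adjust to your setup:

```typescript
// File: src/lib/payload/seed/run.ts (hypothetical entry point)
import { getPayload } from 'payload'
import config from '@payload-config'
import { seedDatabase } from './index'

async function run() {
  const payload = await getPayload({ config })
  await seedDatabase(payload)
  process.exit(0)
}

run().catch((error) => {
  console.error(error)
  process.exit(1)
})
```

You can then execute it with something like `npx tsx src/lib/payload/seed/run.ts`.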

Advanced Features and Best Practices

For production use, consider these enhancements:

Error Handling and Validation:

// Add to your seeding functions
function validateRow(row: any, requiredFields: string[]): boolean {
  for (const field of requiredFields) {
    if (!row[field]) {
      console.error(`Missing required field ${field} in row:`, row)
      return false
    }
  }
  return true
}

Progress Tracking:

// Add progress indicators for large datasets
console.log(`Processing ${csvData.length} records...`)
for (let i = 0; i < csvData.length; i++) {
  const row = csvData[i]
  // ... processing
  if ((i + 1) % 10 === 0) {
    console.log(`Processed ${i + 1}/${csvData.length} records`)
  }
}

Relationship Caching:

// Cache relationship lookups to improve performance
const relationshipCache = new Map<string, string | null>()

async function cachedLookup(payload: Payload, collection: string, csvId: string): Promise<string | null> {
  const cacheKey = `${collection}:${csvId}`
  if (relationshipCache.has(cacheKey)) {
    return relationshipCache.get(cacheKey) ?? null
  }
  
  // Perform the lookup once and cache the result, including misses
  const result = await lookupDocumentsByCsvIds(payload, collection, [csvId])
  const id = result.length > 0 ? result[0] : null
  relationshipCache.set(cacheKey, id)
  return id
}

Installation Requirements

Before implementing this system, install the required dependencies:

npm install papaparse @types/papaparse
# or
pnpm add papaparse @types/papaparse
# or
yarn add papaparse @types/papaparse

Conclusion

This CSV-based seeding approach transforms how you manage Payload CMS data. Instead of hunting through JavaScript files, your content lives in organized, version-controlled CSV files that anyone on your team can edit. You've learned how to handle simple fields, complex JSON structures, and cross-collection relationships while maintaining data integrity.

The phased implementation approach—starting with simple collections and progressively adding complexity—ensures you can adopt this system incrementally. Whether you're seeding a small blog or a complex e-commerce platform, this foundation scales to meet your needs.

Your CSV files become a single source of truth for seed data, your seeding process becomes predictable and maintainable, and your team gains the ability to manage content without touching code.

Let me know in the comments if you have questions, and subscribe for more practical development guides.

Thanks, Matija
