I was building a client project with Payload CMS when I hit a familiar frustration: managing seed data scattered across multiple JavaScript files. Every time content needed updating, I found myself digging through hardcoded objects, making changes, and hoping I didn't break anything. After implementing a CSV-based seeding system, I discovered how much cleaner and more maintainable this approach could be.
This guide shows you exactly how to build a comprehensive CSV seeding system for Payload CMS that handles everything from simple text fields to complex relationships and nested data structures.
The Problem with Hardcoded Seed Data
Traditional Payload seeding typically looks like this:
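A minimal sketch of that style, for illustration only. The `testimonials` array and the `create` callback are hypothetical stand-ins; in a real project the callback would wrap a call such as `payload.create`.

```typescript
// Hypothetical hardcoded seed data: every record lives in source code.
interface TestimonialSeed {
  name: string
  source: string
  content: string
  rating: number
}

const testimonials: TestimonialSeed[] = [
  { name: 'Jane Doe', source: 'google', content: 'Amazing work, highly recommend!', rating: 5 },
  { name: 'John Smith', source: 'website', content: 'Quick and professional service.', rating: 4 },
]

// `create` stands in for whatever persists a document (an assumption, not Payload's API).
async function seedHardcoded(
  create: (data: TestimonialSeed) => Promise<void>,
): Promise<number> {
  for (const testimonial of testimonials) {
    await create(testimonial)
  }
  return testimonials.length
}
```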
This approach becomes unwieldy quickly. Content updates require code changes, non-technical team members can't contribute, and managing relationships between collections becomes a nightmare.
Building the CSV Seeding Foundation
Let's start by creating the core infrastructure. First, we need a CSV reader utility that can parse our data files consistently.
This utility handles the common CSV parsing challenges you'll encounter: trimming whitespace, converting boolean strings, and managing empty values. The transformHeader function ensures consistent column naming even if your CSV has extra spaces.
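The cleanup rules described above can be sketched as two pure helpers. The names `normalizeHeader` and `normalizeValue` are hypothetical, and mapping empty cells to `undefined` is an assumption; the original utility may treat empty values differently.

```typescript
// Strip stray whitespace so " name " and "name" map to the same key.
function normalizeHeader(header: string): string {
  return header.trim()
}

// Trim each cell, map empty cells to undefined, and turn "true"/"false"
// (in any letter case) into real booleans. Everything else stays a string.
function normalizeValue(value: string): string | boolean | undefined {
  const trimmed = value.trim()
  if (trimmed === '') return undefined // assumption: empty cells become undefined
  const lowered = trimmed.toLowerCase()
  if (lowered === 'true') return true
  if (lowered === 'false') return false
  return trimmed
}
```

With Papa Parse, helpers like these would be supplied through its `transformHeader` and `transform` options, alongside `header: true` and `skipEmptyLines: true`.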
Next, let's set up our directory structure for organizing CSV data:
bash
mkdir -p src/lib/payload/seed/csv-data
Phase 1: Simple Collections
We'll start with the simplest case - flat data structures with basic field types. Let's implement testimonials seeding.
Create your first CSV file:
csv
// File: src/lib/payload/seed/csv-data/testimonials.csv
csv_id,name,testimonialDate,source,location,service,content,rating
testimonial_jane,"Jane Doe","2023-10-26","google","Ljubljana","Bathroom renovation","Amazing work, highly recommend!",5
testimonial_john,"John Smith","2023-11-15","website","Maribor","Plumbing","Quick and professional service.",4
testimonial_maja,"Maja Novak",,"manual",,"Consultation","Very helpful advice.",5
This function demonstrates the core pattern: read CSV data, iterate through rows, map fields to Payload's expected structure, and handle type conversions. Notice how we convert the date string to a Date object and parse the rating as an integer.
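The row-to-document mapping can be sketched as a pure function, kept separate from the `payload.create` call so it is easy to test. `toTestimonialData` and both interfaces are hypothetical names, and the field list is taken from the CSV header above rather than from a confirmed collection config.

```typescript
// One parsed row from testimonials.csv (cells arrive as strings or are empty).
interface TestimonialRow {
  csv_id: string
  name: string
  testimonialDate?: string
  source: string
  location?: string
  service: string
  content: string
  rating: string
}

// Data object in the shape the testimonials collection is assumed to expect.
interface TestimonialData {
  csv_id: string
  name: string
  testimonialDate?: Date
  source: string
  location?: string
  service: string
  content: string
  rating: number
}

function toTestimonialData(row: TestimonialRow): TestimonialData {
  const rating = parseInt(row.rating, 10)
  if (Number.isNaN(rating)) {
    throw new Error(`Invalid rating "${row.rating}" for ${row.csv_id}`)
  }
  return {
    csv_id: row.csv_id, // kept so later phases can look this document up
    name: row.name,
    // Empty date cells stay undefined instead of becoming an Invalid Date.
    testimonialDate: row.testimonialDate ? new Date(row.testimonialDate) : undefined,
    source: row.source,
    location: row.location || undefined,
    service: row.service,
    content: row.content,
    rating,
  }
}
```

Each result would then be passed as the `data` argument of `payload.create` for the testimonials collection.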
Let's add one more simple collection to reinforce the pattern - FAQ items:
csv
// File: src/lib/payload/seed/csv-data/faq-items.csv
csv_id,question,category,answer_html
faq_1,"What are your working hours?","general","<p>Our regular working hours are Monday to Friday, 8:00 to 16:00. For urgent interventions, we are available outside working hours as well.</p>"
faq_2,"In which area do you provide services?","general","<p>We provide services mainly in central Slovenia, including Ljubljana and surroundings, Domžale, Kamnik and Kranj.</p>"
The FAQ implementation shows how to handle Payload's richText fields. For simplicity, we're converting HTML to plain text and wrapping it in Payload's expected lexical structure. This creates a basic paragraph with the content stripped of HTML tags.
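A minimal sketch of that conversion follows. Both function names are hypothetical, and the node fields reflect my understanding of the serialized Lexical editor state; check them against the richText output of your own Payload version before relying on them.

```typescript
// Remove tags with a deliberately naive regex. HTML entities are not decoded.
function htmlToPlainText(html: string): string {
  return html.replace(/<[^>]*>/g, '').trim()
}

// Wrap plain text in a single-paragraph serialized Lexical state.
// The exact node fields are an assumption about the serialized format.
function toLexicalParagraph(text: string) {
  return {
    root: {
      type: 'root',
      format: '',
      indent: 0,
      version: 1,
      direction: 'ltr',
      children: [
        {
          type: 'paragraph',
          format: '',
          indent: 0,
          version: 1,
          direction: 'ltr',
          children: [
            { type: 'text', text, detail: 0, format: 0, mode: 'normal', style: '', version: 1 },
          ],
        },
      ],
    },
  }
}
```

The trade-off is that all formatting (bold, links, multiple paragraphs) is lost; preserving it would need a real HTML-to-Lexical conversion.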
Phase 2: Complex Data with JSON Arrays
Now we'll tackle collections that require more sophisticated data structures. Services with feature arrays are a perfect example:
csv
// File: src/lib/payload/seed/csv-data/services.csv
csv_id,title,description,priceDisplay,features_json
service_plumbing,"Plumbing","We provide comprehensive solutions for plumbing installations, from planning to implementation and maintenance.","By agreement","[{""featureText"":""New buildings""},{""featureText"":""Renovations""},{""featureText"":""Repairs""}]"
service_installation,"Sanitary Equipment Installation","Professional installation of shower cabins, bathtubs, toilets, sinks and other sanitary equipment.","From €150 onwards","[{""featureText"":""Installation""},{""featureText"":""Connection""},{""featureText"":""Consultation""}]"
typescript
// File: src/lib/payload/seed/collections/services.ts
import { Payload } from 'payload'
import { readCsvFile } from '../csvReader'
import path from 'path'

export async function seedServices(payload: Payload): Promise<any[]> {
  const csvPath = path.join(process.cwd(), 'src/lib/payload/seed/csv-data/services.csv')
  const csvData = await readCsvFile(csvPath)
  const createdServices = []

  for (const row of csvData) {
    try {
      // Parse JSON features
      let features = []
      if (row.features_json) {
        try {
          features = JSON.parse(row.features_json)
        } catch (jsonError) {
          console.warn(`Invalid JSON in features for ${row.title}:`, jsonError)
          features = []
        }
      }

      const service = await payload.create({
        collection: 'services',
        data: {
          title: row.title,
          description: row.description,
          priceDisplay: row.priceDisplay,
          features: features,
        },
      })

      createdServices.push(service)
      console.log(`Created service: ${service.title}`)
    } catch (error) {
      console.error(`Error creating service from row:`, row, error)
    }
  }

  return createdServices
}
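The key insight here is using JSON strings within CSV cells for complex data structures. Because each cell is quoted, every double quote inside the JSON is escaped by doubling it (""). We parse the features_json column into an actual JavaScript array before passing it to Payload, and fall back to an empty array when the JSON is invalid. This approach handles arbitrarily nested data while keeping each record on a single CSV row.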
For even more complex nested structures, like machinery specifications:
csv
// File: src/lib/payload/seed/csv-data/machinery.csv
csv_id,tabName,name,description,notes,specifications_json
machine_excavator,"Excavators","Volvo EL70","Light excavator for smaller excavations.","Suitable for urban construction sites.","[{""specName"":""Dimensions"",""specDetails"":[{""detail"":""Length: 5.4m""},{""detail"":""Width: 2.1m""}]},{""specName"":""Weight"",""specDetails"":[{""detail"":""7 tons""}]}]"
This demonstrates handling deeply nested JSON structures within CSV files. The specifications field contains an array of objects, where each object has its own array of details. By using JSON strings, we maintain the full data structure while keeping it manageable in spreadsheet software.
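Because nested JSON is easy to break when edited by hand, it can help to check the shape after parsing. The sketch below assumes the structure shown in the CSV row above; `parseSpecifications` and the two interfaces are hypothetical names.

```typescript
interface SpecDetail {
  detail: string
}

interface Specification {
  specName: string
  specDetails: SpecDetail[]
}

// Parse a specifications_json cell and keep only well-formed entries.
// Returns an empty array for empty cells, invalid JSON, or non-array JSON.
function parseSpecifications(cell: string | undefined): Specification[] {
  if (!cell) return []
  let parsed: unknown
  try {
    parsed = JSON.parse(cell)
  } catch {
    return []
  }
  if (!Array.isArray(parsed)) return []
  return parsed.filter(
    (spec): spec is Specification =>
      typeof spec?.specName === 'string' &&
      Array.isArray(spec.specDetails) &&
      spec.specDetails.every((d: unknown) => typeof (d as SpecDetail)?.detail === 'string'),
  )
}
```

Silently dropping malformed entries keeps the seed run going; logging or throwing instead may suit stricter pipelines.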
Phase 3: Collection Relationships
The most complex scenario involves relationships between collections. Let's implement projects that reference both services and testimonials:
csv
// File: src/lib/payload/seed/csv-data/projects.csv
csv_id,title,description_html,projectStatus,location,metadata_json,tags_json,service_ids,testimonial_ids,project_type
project_renovation,"Novak Bathroom Renovation","<p>Complete bathroom renovation in the Novak family apartment.</p>","completed","Ljubljana","{""startDate"":""2023-09-01"",""completionDate"":""2023-10-15"",""client"":""Novak Family"",""budget"":""10000 EUR""}","[{""tag"":""Renovation""},{""tag"":""Bathroom""}]","service_plumbing","testimonial_jane","renovation"
project_newbuild,"Podlipnik House New Construction","<p>Implementation of all plumbing installations in newly built single-family house.</p>","completed","Domžale","{""completionDate"":""2024-01-20"",""client"":""Mr. Podlipnik""}","[{""tag"":""New Construction""},{""tag"":""House""}]","service_plumbing","","newbuild"
The relationship handling here introduces a lookup system. We use CSV IDs to reference documents across collections, then resolve these to actual Payload document IDs. This approach maintains referential integrity while keeping the CSV format readable.
The lookupDocumentsByCsvIds helper function demonstrates how to find previously seeded documents. This assumes you're storing the original csv_id field in your documents, which becomes crucial for managing relationships.
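One way such a helper could look is sketched below. The name matches the helper mentioned above, but the signature and body are my assumption, as is the comma as the separator for multiple IDs in one cell. `FindClient` is a locally declared, minimal slice of a Payload-like client so the example is self-contained; the real `payload.find` accepts many more options.

```typescript
// Minimal client shape: only what this sketch needs from a find-style API.
interface FindClient {
  find(args: {
    collection: string
    where: { csv_id: { in: string[] } }
    limit: number
  }): Promise<{ docs: Array<{ id: string | number; csv_id: string }> }>
}

// Split a cell such as "service_a,service_b" into trimmed, non-empty IDs.
// Assumption: multiple IDs in one cell are comma-separated.
function splitCsvIds(cell: string | undefined): string[] {
  return (cell ?? '')
    .split(',')
    .map((id) => id.trim())
    .filter((id) => id.length > 0)
}

// Resolve CSV IDs to document IDs, preserving the order given in the cell.
// Unknown CSV IDs are logged and skipped.
async function lookupDocumentsByCsvIds(
  client: FindClient,
  collection: string,
  csvIds: string[],
): Promise<Array<string | number>> {
  if (csvIds.length === 0) return []
  const { docs } = await client.find({
    collection,
    where: { csv_id: { in: csvIds } },
    limit: csvIds.length,
  })
  const idByCsvId = new Map(docs.map((doc) => [doc.csv_id, doc.id] as const))
  const resolved: Array<string | number> = []
  for (const csvId of csvIds) {
    const id = idByCsvId.get(csvId)
    if (id === undefined) {
      console.warn(`No ${collection} document found for csv_id "${csvId}"`)
    } else {
      resolved.push(id)
    }
  }
  return resolved
}
```

For a project row, the services relationship would then be resolved with something like `lookupDocumentsByCsvIds(payload, 'services', splitCsvIds(row.service_ids))`, provided the collection has a csv_id field.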
Orchestrating the Complete Seeding Process
Finally, let's tie everything together in a main seeding function:
The order matters here. Collections with relationships must be seeded after their dependencies. This orchestration ensures that when we try to look up related services or testimonials, they already exist in the database.
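The ordering idea can be sketched as a small runner that executes seeders strictly in sequence. `SeedStep` and `runSeedSteps` are hypothetical names, and stopping at the first failure is a design assumption rather than something the original orchestration is known to do.

```typescript
// A seed step wraps one collection's seeding function.
interface SeedStep {
  name: string
  run: () => Promise<unknown[]>
}

// Run steps one after another so dependencies exist before dependents.
// A rejected step aborts the run, since later steps may rely on it.
async function runSeedSteps(steps: SeedStep[]): Promise<Record<string, number>> {
  const counts: Record<string, number> = {}
  for (const step of steps) {
    console.log(`Seeding ${step.name}...`)
    const created = await step.run()
    counts[step.name] = created.length
  }
  return counts
}
```

For the collections in this guide, the steps would be listed with testimonials and services before projects, each wrapping a function such as `seedServices(payload)`.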
Advanced Features and Best Practices
For production use, consider these enhancements:
Error Handling and Validation:
typescript
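// Add to your seeding functions
function validateRow(row: any, requiredFields: string[]): boolean {
  for (const field of requiredFields) {
    const value = row[field]
    // Check for absence explicitly so valid values like false or 0 still pass
    if (value === undefined || value === null || value === '') {
      console.error(`Missing required field ${field} in row:`, row)
      return false
    }
  }
  return true
}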
Progress Tracking:
typescript
// Add progress indicators for large datasets
console.log(`Processing ${csvData.length} records...`)
for (let i = 0; i < csvData.length; i++) {
  const row = csvData[i]
  // ... processing
  if ((i + 1) % 10 === 0) {
    console.log(`Processed ${i + 1}/${csvData.length} records`)
  }
}
Before implementing this system, install the required dependencies:
bash
npm install papaparse @types/papaparse
# or
pnpm install papaparse @types/papaparse
# or
yarn add papaparse @types/papaparse
Conclusion
This CSV-based seeding approach transforms how you manage Payload CMS data. Instead of hunting through JavaScript files, your content lives in organized, version-controlled CSV files that anyone on your team can edit. You've learned how to handle simple fields, complex JSON structures, and cross-collection relationships while maintaining data integrity.
The phased implementation approach—starting with simple collections and progressively adding complexity—ensures you can adopt this system incrementally. Whether you're seeding a small blog or a complex e-commerce platform, this foundation scales to meet your needs.
Your CSV files become a single source of truth for seed data, your seeding process becomes predictable and maintainable, and your team gains the ability to manage content without touching code.
If you're seeding as part of a larger data migration — moving content from another CMS rather than creating fresh seed data — the CMS migration quiz estimates difficulty, cost, and timeline for your specific platform pair before you commit. For a full picture of what a Payload project costs including data migration work, the Payload CMS pricing guide breaks it all down. For handling large import volumes with Payload's transaction layer, Payload CMS large data imports with Drizzle transactions covers the patterns that keep imports reliable at scale.
Let me know in the comments if you have questions, and subscribe for more practical development guides.