Chunked File Uploads: Architecture and Mental Model

By Matija Žiberna · Last updated June 2026 · Tested while building local TV, a self-hosted video platform on Next.js + Payload CMS

Chunked file upload is the pattern that makes large transfers reliable. The short version: you split the file into fixed-size pieces, upload them one at a time, and give the upload a persistent identity on the server so it can resume from exactly where it stopped. That is the entire idea. Everything else in this guide is the reasoning behind the decisions.

This is the architecture and mental model guide. It focuses on the reasoning behind the design rather than a full implementation, so any code here is an illustrative sketch. The follow-up guide, Building a Chunked Upload Service in Go: Step-by-Step Implementation, covers the Go session store, S3 multipart SDK calls, browser chunk client, and finalization callback in detail.

Why single-request uploads fail for large files

The simplest way to upload a file is a single HTTP POST. The browser reads the file, attaches it as a request body, and sends it. The server receives it and saves it. Simple.

For small files this is fine. For anything large — a video longer than a few minutes, a high-resolution photo set, a CAD export — it breaks in practice.

The connection can die mid-transfer. A cellular hiccup, a laptop lid closing, a server timeout: any of these ends the TCP connection. With a single-request upload there is no recovery. The user starts over from byte zero.

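Most HTTP stacks cap the request body size, either by default or by configuration. Nginx's client_max_body_size defaults to 1 MB, while Go's net/http and Node.js frameworks are usually given an explicit limit (http.MaxBytesReader in Go, a body-parser limit in Node.js). These limits exist to prevent memory exhaustion. A 2 GB video is rejected long before it finishes uploading.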

If the server must hold the entire file in memory while processing, a single large upload can consume gigabytes of RAM. With several concurrent users, the server falls over.

Progress reporting is also unreliable. A browser's upload progress event fires as bytes are handed to the OS network stack, and "bytes sent" is not the same as "bytes received and durably stored by the server." If the server buffered everything and then crashed, the progress bar was lying.

Chunked upload solves all of this with one insight: break the file into pieces and negotiate each piece independently.

What chunking actually means

A chunk is a contiguous slice of the file's bytes. You pick a size — 8 MB is a common choice — and divide the file into sequential slices.

A 100 MB file with 8 MB chunks produces 13 chunks: twelve 8 MB chunks and one 4 MB remainder.
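The arithmetic above can be sketched as a small planning function. This is a minimal illustration, not code from the follow-up guide: the names planChunks and ChunkRange are made up here, and sizes are treated as binary megabytes (1 MB = 1,048,576 bytes), which is what the byte ranges later in this guide imply.

```typescript
const MiB = 1024 * 1024;

// One chunk's byte range. `end` is exclusive, matching Blob.slice(start, end).
interface ChunkRange {
  index: number;
  start: number;
  end: number;
}

// Split a file of `fileSize` bytes into sequential ranges of at most `chunkSize` bytes.
function planChunks(fileSize: number, chunkSize: number): ChunkRange[] {
  if (!Number.isInteger(fileSize) || fileSize < 0) {
    throw new RangeError("fileSize must be a non-negative integer");
  }
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const ranges: ChunkRange[] = [];
  let index = 0;
  for (let start = 0; start < fileSize; start += chunkSize) {
    // The last range is clamped to the end of the file, producing the remainder chunk.
    ranges.push({ index, start, end: Math.min(start + chunkSize, fileSize) });
    index += 1;
  }
  return ranges;
}
```

Calling planChunks(100 * MiB, 8 * MiB) yields 13 ranges, the last one covering 4 MiB. In the browser, each range would typically be turned into a request body with file.slice(start, end).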

The browser uploads these one at a time. After each chunk is confirmed by the server, the browser moves to the next. The server tracks how many bytes it has received. If the connection drops and the user resumes, the browser asks the server "how many bytes do you have?" and picks up from that exact byte position.

The abstract protocol looks like this:

Client                                     Server
──────                                     ──────
"I want to upload a 100 MB file"        →
                                        ← "OK, session ID is abc-123"

"Here are bytes 0–8,388,607"            →
                                        ← "Confirmed, I have 8,388,608 bytes"

"Here are bytes 8,388,608–16,777,215"   →
                                        ← "Confirmed, I have 16,777,216 bytes"

... (11 more rounds) ...

"I'm done, finalize it"                 →
                                        ← "Upload complete"

Each exchange is independent. Any of them can fail and be retried without affecting the others.

Key terms

Before going further, here are the terms used throughout. If something feels unclear later, return here.

Term	Meaning
Chunk	One fixed-size slice of a large file, sent as a single HTTP request.
Session	The server-side record that tracks the state of an in-progress upload.
Offset	The byte position where the next chunk should begin.
Object storage	Storage designed for files (not databases): S3, R2, MinIO, Garage.
Multipart upload	An object-storage protocol that lets one file be assembled from separately uploaded parts.
ETag	A token returned by object storage for each uploaded part, used to verify and assemble the final object.
Staged object	A file fully uploaded to object storage but not yet processed or made visible in the application.
Finalization	The step where the staged file becomes a real application record.
Fingerprint	A SHA-256 hash of the file's contents, used for deduplication.

The session: giving an upload a persistent identity

For a resumable upload to work, the upload must have a persistent identity on the server. This is called a session.

When the browser starts an upload, it asks the server to create a session. The server generates a unique identifier — typically a UUID — and returns it. Everything that follows is tied to that ID.

The session stores, at minimum: the declared total file size, the number of bytes received so far, the file name and type, and a reference to wherever the partially-uploaded data is being stored.

The session is the contract between browser and server. It answers the question "where are we?" at any moment during the upload.
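As a rough sketch, the session record and the simplest possible store for it might look like the following. The field names, the MemorySessionStore class, and the injected newId function are all assumptions made for illustration. In practice newId would likely be something like crypto.randomUUID().

```typescript
// Fields the text lists as the minimum session state; names are illustrative.
interface UploadSession {
  id: string;
  declaredSize: number;  // total file size declared at init
  uploadedBytes: number; // bytes the server has committed so far
  fileName: string;
  mimeType: string;
  storageRef: string;    // e.g. a multipart uploadId or a temp-file path
  createdAt: number;     // epoch milliseconds, useful later for expiry
}

// In-memory store: every session is lost when the process restarts.
class MemorySessionStore {
  private readonly sessions = new Map<string, UploadSession>();
  private readonly newId: () => string;

  constructor(newId: () => string) {
    this.newId = newId;
  }

  // Create a session with a fresh ID and zero committed bytes.
  create(init: Omit<UploadSession, "id" | "uploadedBytes">): UploadSession {
    const session: UploadSession = { ...init, id: this.newId(), uploadedBytes: 0 };
    this.sessions.set(session.id, session);
    return session;
  }

  get(id: string): UploadSession | undefined {
    return this.sessions.get(id);
  }

  delete(id: string): boolean {
    return this.sessions.delete(id);
  }
}
```

Keeping the store behind create, get, and delete means a later move to Redis or a database changes the class body, not the upload handlers that call it.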

How much durability does a session need?

For a first implementation, sessions can live in server memory. An in-memory store is simple to reason about and teaches the protocol clearly before adding database complexity. The only downside is that sessions disappear on server restart. The worst outcome is that a user has to start their upload over, which is uncommon enough to accept while learning.

For production with multiple servers or crash-proof resumption requirements, store sessions in a persistent store. Redis, PostgreSQL, and SQLite all work. The session schema is small — a few fields per upload — so this is a light migration.

The session does not need to survive server crashes to enable "resumable" uploads. Resumability means the user can close their laptop, come back tomorrow on the same device, and continue. That works as long as the server is still running when they return. Crash-proof durability is a separate, harder problem that many production implementations address only after validating the protocol.

Where the browser keeps the session ID

The browser must remember the session ID across page refreshes, app backgrounding, and device sleep. The right place is sessionStorage — not localStorage, not a cookie.

sessionStorage persists for the lifetime of the browser tab, survives page refreshes, and clears when the tab is closed. This matches the expected upload lifecycle: if the user closes the tab entirely, they are starting fresh.

The key used to look up the session ID should be derived from the file's identity:

upload-session:<endpoint>:<filename>:<filesize>:<lastModified>

Including lastModified prevents the browser from accidentally resuming the wrong file if the user selects a different file with the same name.
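A minimal sketch of that key scheme follows. The helper names (sessionKey, rememberSession, recallSession) are hypothetical. KeyValueStore is a narrowed stand-in for the Web Storage interface, so window.sessionStorage can be passed in directly and a fake can be used in tests.

```typescript
// The subset of the File interface the key needs.
interface FileIdentity {
  name: string;
  size: number;
  lastModified: number;
}

// The subset of the Web Storage API used here; window.sessionStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Build the lookup key from the endpoint and the file's identity.
function sessionKey(endpoint: string, file: FileIdentity): string {
  return `upload-session:${endpoint}:${file.name}:${file.size}:${file.lastModified}`;
}

// Call after the server returns a session ID.
function rememberSession(store: KeyValueStore, endpoint: string, file: FileIdentity, sessionId: string): void {
  store.setItem(sessionKey(endpoint, file), sessionId);
}

// Call when the user selects a file: a non-null result means "try to resume".
function recallSession(store: KeyValueStore, endpoint: string, file: FileIdentity): string | null {
  return store.getItem(sessionKey(endpoint, file));
}

// Call after the upload completes or is cancelled.
function forgetSession(store: KeyValueStore, endpoint: string, file: FileIdentity): void {
  store.removeItem(sessionKey(endpoint, file));
}
```

A recalled ID is only a hint. The server still decides whether the session exists and how many bytes it holds.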

The offset handshake: how resumption actually works

The mechanism that makes resumption work is an offset header. Before sending any chunk, the browser declares: "I believe you have received N bytes. I am sending the next chunk starting at byte N."

The server checks whether N matches its actual committed byte count. If they match, the chunk is accepted. If not, the server returns its actual committed byte count and the browser adjusts.

PUT /upload/abc-123
x-upload-offset: 8388608
Content-Type: application/octet-stream
[binary chunk body]

The server compares the declared offset with its own committed byte count:

If the server has 8,388,608 bytes: accept the chunk.
If the server has 0 bytes (after a restart): reject with 409 and { uploadedBytes: 0 }.
If the server has 16,777,216 bytes (client is behind): reject with 409 and { uploadedBytes: 16777216 }.

This handshake ensures no gaps and no duplicated bytes in the stored data. It is the critical invariant that makes resumable uploads safe.
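The server side of the handshake can be sketched as one pure function. The names here are illustrative, and the 413 status for exceeding the declared size is an assumption of this sketch. The 409 on offset mismatch is the behavior this guide describes. A real handler would advance uploadedBytes only after the chunk has been durably written.

```typescript
interface SessionState {
  declaredSize: number;
  uploadedBytes: number;
}

type OffsetDecision =
  | { accepted: true; uploadedBytes: number }
  | { accepted: false; status: 409 | 413; uploadedBytes: number };

// Validate the claimed offset, then commit the chunk length to the session.
function applyChunk(session: SessionState, claimedOffset: number, chunkLength: number): OffsetDecision {
  if (claimedOffset !== session.uploadedBytes) {
    // Client and server disagree: report the server's truth so the client can adjust.
    return { accepted: false, status: 409, uploadedBytes: session.uploadedBytes };
  }
  if (session.uploadedBytes + chunkLength > session.declaredSize) {
    // The chunk would push the total past the size declared at init (413 is assumed here).
    return { accepted: false, status: 413, uploadedBytes: session.uploadedBytes };
  }
  session.uploadedBytes += chunkLength;
  return { accepted: true, uploadedBytes: session.uploadedBytes };
}
```

Because every rejection carries the committed byte count, the client never has to guess where to resume.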

Choosing a chunk size

Chunk size is a throughput-versus-overhead tradeoff. The theoretical maximum throughput for a sequential stop-and-wait protocol is:

max throughput ≈ chunk_size / round_trip_time

On a connection with a 200 ms round trip:

Chunk size	Theoretical max
1 MB	5 MB/s
4 MB	20 MB/s
8 MB	40 MB/s
16 MB	80 MB/s
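The table follows directly from the formula, as this small sketch shows. The second function goes slightly beyond the text: it assumes a finite link bandwidth (the linkMBps parameter is hypothetical) to show why real throughput sits below the theoretical ceiling.

```typescript
// Upper bound for stop-and-wait: one chunk per round trip, ignoring transfer time.
function maxThroughputMBps(chunkMB: number, rttMs: number): number {
  return (chunkMB * 1000) / rttMs;
}

// Assumption beyond the text: with finite bandwidth, each round also spends
// chunkMB / linkMBps seconds actually moving the bytes.
function effectiveThroughputMBps(chunkMB: number, rttMs: number, linkMBps: number): number {
  return chunkMB / (rttMs / 1000 + chunkMB / linkMBps);
}
```

With a 200 ms round trip, maxThroughputMBps(8, 200) gives the 40 MB/s from the table. On a link that can only carry 40 MB/s, the same 8 MB chunk achieves roughly half of that, because the sender idles for one round trip per chunk.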

Larger chunks win on throughput, but they come with real costs. More work is lost on failure: if a 16 MB chunk fails halfway through, you resend all 16 MB. More memory is consumed on mobile devices that must hold the chunk while sending. Progress updates happen only once per chunk, so on slow connections users wait longer between visual updates.

8 MB hits a practical sweet spot: it keeps overhead low, stays within mobile memory budgets, gives reasonably granular progress on slower connections, and comfortably exceeds the 5 MB minimum part size enforced by S3 multipart upload (more on that below).

Multipart upload: where the bytes go on the server side

When the server receives a chunk, it needs somewhere to put it while the upload is in progress. Writing to local disk and concatenating at the end works for a single server, but local disk does not replicate, fills up, and ties the upload to a specific machine.

The production pattern is S3 multipart upload, a feature built into S3 and supported by most S3-compatible stores (Cloudflare R2, MinIO, Garage, Google Cloud Storage). It works in three stages:

Initiate. Tell S3 "I want to upload an object at this key." S3 returns an uploadId.

Upload parts. Send each chunk to S3 as a numbered part. S3 confirms receipt with an ETag for each part.

Complete. Send S3 the list of all part numbers and ETags. S3 assembles the object atomically.

The server-side chunk handler maps directly onto this:

session init → CreateMultipartUpload
each chunk PUT → UploadPart
upload complete → CompleteMultipartUpload
session expiry or cancellation → AbortMultipartUpload

The object is not visible in storage until CompleteMultipartUpload succeeds. A partially uploaded file never appears as a complete file, which means no cleanup race condition.
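The bookkeeping behind that mapping is small. The sketch below makes no SDK calls. It only shows how a chunk offset could translate into a part number and how the collected ETags are ordered for completion. The function names are hypothetical, and the { PartNumber, ETag } shape mirrors the part entries that CompleteMultipartUpload takes.

```typescript
// Mirrors the { PartNumber, ETag } entries passed to CompleteMultipartUpload.
interface UploadedPart {
  PartNumber: number;
  ETag: string;
}

// With fixed-size chunks delivered in order, the part number follows from the offset.
function partNumberForOffset(offset: number, chunkSize: number): number {
  if (offset < 0 || offset % chunkSize !== 0) {
    throw new RangeError("offset is not aligned to a chunk boundary");
  }
  return offset / chunkSize + 1; // S3 part numbers are 1-based
}

// Parts are listed in ascending PartNumber order when completing the upload.
function completionList(parts: UploadedPart[]): UploadedPart[] {
  return [...parts].sort((a, b) => a.PartNumber - b.PartNumber);
}
```

Storing the ETag list in the session alongside uploadedBytes is what lets a resumed upload still complete: the completion call needs every part's ETag, not just the byte count.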

Fingerprinting and deduplication

Before starting a chunked upload, the browser can compute a SHA-256 hash of the file locally and ask the server whether that file already exists. If it does, the upload is skipped entirely.

Browser computes SHA-256 of the file
Browser asks server: "Do you have a file with fingerprint X?"
Server checks:
  yes → return the existing file's ID, done
  no  → proceed with chunked upload

Computing SHA-256 in the browser is measurable but rarely painful: typically a few seconds for files in the hundreds of megabytes and longer for multi-gigabyte files, depending on the device. The payoff is significant: if the file is already stored, the "upload" is instantaneous.

The fingerprint computed during dedup preflight can also be passed to the server at finalization, saving the server from re-hashing the file. A 4 GB file hash takes real CPU time; computing it twice is wasteful.
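Both halves of the preflight can be sketched as follows. The names are illustrative, and the fingerprint index is modeled as a plain Map. In a real system that would be a database lookup. In the browser, the digest bytes would typically come from crypto.subtle.digest("SHA-256", data), which resolves to an ArrayBuffer that can be wrapped in a Uint8Array.

```typescript
// Hex-encode digest bytes so the fingerprint can travel as a plain string.
function toHex(bytes: Uint8Array): string {
  let out = "";
  bytes.forEach((b) => {
    out += (b < 16 ? "0" : "") + b.toString(16);
  });
  return out;
}

type PreflightResult =
  | { exists: true; fileId: string }
  | { exists: false };

// Server side: look the fingerprint up in whatever index maps hashes to stored files.
// The index is assumed to store lowercase hex strings.
function dedupPreflight(index: Map<string, string>, fingerprint: string): PreflightResult {
  const fileId = index.get(fingerprint.toLowerCase());
  return fileId === undefined ? { exists: false } : { exists: true, fileId };
}
```

One caveat worth keeping in mind: a client-supplied fingerprint is a claim, not a proof, so it is reasonable for the server to treat it as a hint unless it has verified the hash itself.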

Finalization: separating byte transfer from business logic

Receiving all the bytes is not the same as processing a file. After upload completes, the system typically needs to validate file type and content, generate thumbnails, transcode video, extract metadata, run deduplication checks, and create database records.

None of this should happen during the upload itself. The upload is a byte-transfer concern. Business logic is a separate concern. Mixing them makes the upload slow and fragile.

The clean pattern is a finalization callback:

The upload service completes the multipart upload to object storage.
It calls your application server with metadata: storage key, file name, MIME type, file size, fingerprint.
The application server downloads the staged object once, processes it, and creates the necessary records.
The application server confirms success. The upload service deletes the staged object.

This separation means upload throughput is not affected by slow business logic. Business logic can fail and be retried independently of the upload. The upload service stays focused and simple.

The application server never sees raw bytes in the throughput path — it only participates once, at the end.
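The hand-off can be sketched as a payload builder that refuses to run on an incomplete session. The type and field names are assumptions for illustration. The fields themselves are the ones listed above: storage key, file name, MIME type, file size, and fingerprint.

```typescript
interface SessionSnapshot {
  declaredSize: number;
  uploadedBytes: number;
  fileName: string;
  mimeType: string;
  storageKey: string;
  fingerprint: string | null; // null when the client skipped the dedup preflight
}

interface FinalizePayload {
  storageKey: string;
  fileName: string;
  mimeType: string; // client-declared; the application server re-validates it
  fileSize: number;
  fingerprint: string | null;
}

// Build the callback body, but only once every declared byte has been committed.
function buildFinalizePayload(session: SessionSnapshot): FinalizePayload {
  if (session.uploadedBytes !== session.declaredSize) {
    throw new Error(
      "upload incomplete: " + session.uploadedBytes + " of " + session.declaredSize + " bytes"
    );
  }
  return {
    storageKey: session.storageKey,
    fileName: session.fileName,
    mimeType: session.mimeType,
    fileSize: session.declaredSize,
    fingerprint: session.fingerprint,
  };
}
```

The payload carries a storage key rather than bytes, which is what keeps the application server out of the transfer path.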

Invariants the server must enforce

A naive upload endpoint trusts the client completely. A production upload service enforces hard invariants.

File size must be declared upfront and cannot change. The session records the declared size. Any request that would push the total above it is rejected. This prevents a client from gradually accumulating more storage than they declared.

Chunks must arrive in order with no gaps. The offset header must match the server's committed byte count exactly. Out-of-order chunks are rejected.

File size has a system maximum. An upload that exceeds the size limit is rejected at init time, not discovered mid-upload.

Finalization requires completeness. The server will not call the finalization callback until uploadedBytes === declaredFileSize.

Sessions expire. An upload sitting incomplete for 24 hours has its session evicted and its in-progress multipart upload aborted. Cleanup runs on a timer (every 30 minutes is typical) to prevent orphaned storage from accumulating indefinitely.

Graceful shutdown aborts active uploads. Any in-progress multipart uploads are aborted before the process exits. This prevents orphaned partial uploads that would never complete.
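The expiry rule reduces to a filter over the session list. In this sketch the age is measured from createdAt. That is an assumption, and measuring from the last received chunk would be an equally defensible reading of "sitting incomplete". The names are illustrative.

```typescript
interface ExpirableSession {
  id: string;
  createdAt: number; // epoch milliseconds
}

const SESSION_TTL_MS = 24 * 60 * 60 * 1000; // the 24-hour limit from the text

// Return the IDs of sessions old enough to evict. A cleanup timer would call this,
// abort each session's multipart upload, and then delete the session record.
function expiredSessionIds(
  sessions: ExpirableSession[],
  nowMs: number,
  ttlMs: number = SESSION_TTL_MS
): string[] {
  return sessions.filter((s) => nowMs - s.createdAt >= ttlMs).map((s) => s.id);
}
```

Passing nowMs in rather than reading the clock inside the function keeps the rule easy to test.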

Progress: measuring what actually matters

Progress is meaningless unless it reflects server-confirmed bytes. The browser knows how many bytes it has sent, but "sent" means "handed to the OS network stack," not "received and durably stored."

Measure progress as the server's committed byte count. After each chunk, the server responds with how many bytes it has confirmed. The browser uses this number — not its own send counter — to update the progress bar.

Progress can appear to stall even while data is flowing. That is correct behavior. The upload is not complete until the server says so.

Speed estimates are jittery if computed from raw time-between-responses. An exponential moving average smooths them:

smoothedSpeed = (currentChunkSpeed × α) + (previousSmoothedSpeed × (1 - α))

A smoothing factor around 0.2–0.3 updates quickly when the connection genuinely changes but avoids jitter on every chunk.
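The formula translates into a few lines. The function names and the default of 0.25 are choices made for this sketch. Seeding the average with the first sample is also an assumption, since the formula does not say what to use before any previous value exists.

```typescript
// Raw speed of one chunk in bytes per second, from server-confirmed bytes.
function chunkSpeed(confirmedBytes: number, elapsedMs: number): number {
  return (confirmedBytes * 1000) / elapsedMs;
}

// Exponential moving average: alpha weights the newest sample.
function smoothSpeed(previous: number | null, current: number, alpha: number = 0.25): number {
  if (alpha <= 0 || alpha > 1) {
    throw new RangeError("alpha must be in (0, 1]");
  }
  if (previous === null) {
    return current; // first sample seeds the average
  }
  return current * alpha + previous * (1 - alpha);
}
```

With alpha at 0.25, a jump from 10 MB/s to 20 MB/s moves the displayed speed to 12.5 MB/s on the first chunk, then closer on each chunk after that.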

Retries and error handling

A single chunk failure should not abort the upload. The correct behavior:

A chunk upload returns an error (network timeout, 5xx from the server).
Wait a short backoff period — 1 second, then 3 seconds.
Re-query the server's committed offset.
Retry from the server's confirmed position.

After 2–3 failures on the same chunk, the upload is abandoned and the user is informed. The session ID persists in sessionStorage, so if the user tries again, the upload is resumable from where it stopped.

One error category requires special handling: NotReadableError on the file object. This happens on Android and iOS when a content:// or cloud-synced file handle becomes invalid — the OS revoked access. Retrying is pointless: the file cannot be read. The user must re-select the file from local storage.
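The retry policy above fits in one decision function. This is a sketch with made-up names. The backoff schedule is the 1-second and 3-second pair from the steps above, and a third failure on the same chunk abandons the upload.

```typescript
type RetryDecision =
  | { action: "retry"; delayMs: number }
  | { action: "abandon" }
  | { action: "reselect-file" };

const BACKOFF_MS = [1000, 3000];

// failuresOnThisChunk counts failures including the one that just happened (1, 2, 3, ...).
// errorName is the `name` of the thrown error, e.g. "NotReadableError".
function decideRetry(failuresOnThisChunk: number, errorName: string): RetryDecision {
  if (errorName === "NotReadableError") {
    // The file handle is gone; no amount of retrying can read it again.
    return { action: "reselect-file" };
  }
  if (failuresOnThisChunk < 1) {
    throw new RangeError("failuresOnThisChunk must be at least 1");
  }
  const delayMs = BACKOFF_MS[failuresOnThisChunk - 1];
  return delayMs === undefined ? { action: "abandon" } : { action: "retry", delayMs };
}
```

After waiting delayMs, the client would re-query the server's committed offset and resume from that position rather than from its own counter.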

Go versus Node.js for a dedicated upload service

When you outgrow handling large uploads inside your application server, you might extract the upload path into a dedicated service. The tradeoffs between Go and Node.js are meaningful here.

Node.js is single-threaded with an event loop. CPU-bound work — hashing, buffer manipulation — blocks the event loop and stalls every other request in the process. Streaming large binary bodies through Node.js adds latency to unrelated requests on the same process. Each chunk passes through the HTTP parser, the stream abstraction, and your handler — each hop involves a buffer copy.

Go handles this differently. Each incoming request runs on its own goroutine — a lightweight, independently scheduled unit of execution. The Go runtime multiplexes thousands of goroutines onto a small pool of OS threads. CPU work on one goroutine does not stall others. Streaming bytes from an HTTP request body to an S3 multipart upload is a direct pipeline with minimal copies. The SHA-256 hash is computed as bytes stream through, adding negligible overhead.

Consideration	Node.js	Go
Concurrency model	Single-threaded event loop	Goroutine per request
Large binary streaming	Multiple buffer copies	Direct pipeline
Event loop contention	Measurable under load	None
Integration with JS codebase	Easy	Requires separate service
Standard library for HTTP + crypto	Adequate	Excellent

Node.js is sufficient for moderate upload loads and is easier to integrate into an existing JavaScript codebase. Go is a better fit for a dedicated upload service handling many concurrent large files.

The architecture decision — keeping your application server out of the chunk hot path entirely — is more important than the language choice. Whether you use Go, Rust, or another compiled language, the application server should only participate in finalization.

The four levels of upload architecture

There are four recognizable levels of upload system. Each solves a real problem that the previous level cannot handle.

Level 1 — Single-request upload. The browser sends the whole file in one HTTP POST. The server receives it, saves it, done. Simple, but fails for large files: body size limits, no resumption, memory pressure.

Level 2 — Chunked upload to local disk. The browser slices the file and sends pieces. The server reassembles them on local disk. Progress is tracked per-chunk. The user can resume after a dropped connection. No external dependencies. Sufficient for moderate loads on a single server.

Level 3 — Chunked upload to object storage. The server maps each browser chunk to an object-storage multipart part. The assembled file lands in S3 (or equivalent) rather than the server's filesystem. Scales across multiple servers. Staged objects survive server restarts.

Level 4 — Dedicated upload service. A separate lightweight service (typically Go or Rust) handles all chunked upload traffic. The main application server is completely out of the byte-transfer path and only receives a finalization callback once all bytes arrive.

Most implementations should reach Level 3 before considering Level 4. Level 4 adds real operational complexity — a new service, new deployment, inter-service communication — and is only worth it when the upload path demonstrably bottlenecks the main application.

The complete architecture

Assembling the pieces, a production chunked upload system looks like this:

Browser
  ├── compute fingerprint → dedup preflight → skip if duplicate
  ├── POST /upload/init → get sessionId, store in sessionStorage
  ├── loop:
  │    PUT /upload/<sessionId>  (x-upload-offset header)
  │    on 409: re-query server offset, adjust position
  │    on error: retry with backoff
  └── POST /upload/<sessionId>/complete

Upload Service (Go or similar)
  ├── POST /init    → CreateMultipartUpload, return sessionId
  ├── PUT  /<id>    → validate offset, UploadPart, update session
  ├── GET  /<id>    → return current committed bytes
  ├── DELETE /<id>  → AbortMultipartUpload, remove session
  └── POST /<id>/complete
        ├── CompleteMultipartUpload
        ├── build finalization payload (staged key, metadata, fingerprint)
        └── POST application-server/internal/finalize

Application Server (Node.js or similar)
  └── POST /internal/finalize
        ├── download staged object once
        ├── validate MIME and content
        ├── run deduplication
        ├── create database records
        ├── trigger downstream processing (transcoding, etc.)
        └── confirm → upload service deletes staged object

The application server is only involved once, at the end. Every chunk goes from the browser through the upload service directly to object storage. The application server never sees raw bytes in the throughput path.
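To see the browser loop and the offset check working together, here is a small synchronous simulation. Everything in it is hypothetical: FakeUploadServer stands in for the upload service, there is no real network, and the loseResponseAt parameter fakes a response that was lost after the server had already committed the chunk.

```typescript
// Minimal in-memory stand-in for the upload service.
class FakeUploadServer {
  private committed = 0;
  private readonly declaredSize: number;

  constructor(declaredSize: number) {
    this.declaredSize = declaredSize;
  }

  // PUT /<id>: returns an HTTP-style status and the committed byte count.
  put(offset: number, length: number): { status: number; uploadedBytes: number } {
    if (offset !== this.committed || offset + length > this.declaredSize) {
      return { status: 409, uploadedBytes: this.committed };
    }
    this.committed += length;
    return { status: 200, uploadedBytes: this.committed };
  }

  // GET /<id>: current committed bytes.
  committedBytes(): number {
    return this.committed;
  }
}

// Client loop: always continue from the byte count the server last confirmed.
// Returns the number of PUT requests that were needed.
function runUpload(
  server: FakeUploadServer,
  fileSize: number,
  chunkSize: number,
  loseResponseAt: number
): number {
  let offset = 0;
  let requests = 0;
  while (offset < fileSize) {
    const length = Math.min(chunkSize, fileSize - offset);
    const res = server.put(offset, length);
    requests += 1;
    if (requests === loseResponseAt) {
      // Simulated lost response: the server committed, but the client never heard.
      // The client retries the same offset and the server answers 409 with the truth.
      const retry = server.put(offset, length);
      requests += 1;
      offset = retry.uploadedBytes;
      continue;
    }
    offset = res.uploadedBytes; // 200: advance. 409: adopt the server's number.
  }
  return requests;
}
```

For a 100 MB file in 8 MB chunks with the third response lost, the loop finishes in 14 requests: the 13 chunks plus the one retry that received the 409. No byte is stored twice, because the retry is rejected rather than re-applied.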

What to build first

If you are starting from scratch, build in this order:

Single-request path first. A plain file input with a FormData POST. Get the end-to-end flow working: validation, storage, database records.
Add a size limit. Refuse uploads over a threshold (50 MB is reasonable). When users start hitting the limit, that confirms you need chunking.
Implement the session protocol. Add init, chunk, complete endpoints. Write to local disk for now. This validates the protocol before adding S3 complexity.
Add S3 multipart. Replace local-disk assembly with S3 part uploads. The protocol is unchanged; only the storage backend changes.
Add offset-based resumption. Store committed bytes in the session. Return 409 on offset mismatch. Add the server-status query endpoint. Add client-side sessionStorage. Test by killing the server mid-upload.
Add fingerprint dedup. Client hashes before upload. Server checks before accepting. The biggest quality-of-life improvement after basic resumption.
Extract the upload service. Only do this when the upload path is genuinely bottlenecking your application server.

FAQ

Why not just use a library like tus or Uppy?

You can, and for many projects you should. Tus is a solid open protocol with good client and server implementations. The reason to understand this pattern from first principles is that libraries abstract the decisions — and when something breaks at scale (a specific S3 behavior, a mobile OS file handle issue, a session expiry edge case), you need to know what is actually happening underneath to debug it. Building it once, even at Level 2, teaches you more than reading the tus spec.

What happens if the server crashes mid-upload?

With in-memory sessions, the session is lost on crash. The user has to start over. With persistent sessions (Redis or a database), the session survives the crash. The partially uploaded S3 multipart upload is still live — the user can resume by re-querying the server offset. The AbortMultipartUpload on graceful shutdown handles the clean exit case; crashes are messier and require the session expiry cleanup job to eventually abort orphaned multipart uploads.

Can I upload chunks in parallel instead of sequentially?

Yes, and it significantly improves throughput on high-latency connections. The tradeoff is that managing parallel chunk state is considerably more complex: you need to track which chunks are in flight, handle partial failures without losing confirmed chunks, and enforce ordering when completing the multipart upload (S3 assembles parts by number, not arrival order). Sequential upload is the right starting point. Parallel upload is a production optimization worth adding once the sequential version is stable.

What is the minimum part size in S3, and why does it matter?

S3 enforces a 5 MB minimum on all parts except the last one. If you send a 3 MB chunk (other than the final chunk), UploadPart will succeed but CompleteMultipartUpload will fail. This is why 8 MB is a common chunk size — it comfortably exceeds the 5 MB minimum while staying reasonable for mobile.
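A check like the following could run at session init, before any bytes move. The function name is made up. The 5 MiB minimum is the rule described above, and the 10,000-part ceiling is S3's documented limit on part numbers. chunkSize is assumed to be positive.

```typescript
const MIN_PART_BYTES = 5 * 1024 * 1024; // S3 minimum for every part except the last
const MAX_PART_COUNT = 10000;           // S3 part numbers run from 1 to 10,000

// Returns a description of the problem, or null when the chunk size is usable.
function chunkSizeProblem(fileSize: number, chunkSize: number): string | null {
  const partCount = Math.ceil(fileSize / chunkSize);
  if (partCount > 1 && chunkSize < MIN_PART_BYTES) {
    return "non-final parts would be under 5 MiB, so CompleteMultipartUpload would fail";
  }
  if (partCount > MAX_PART_COUNT) {
    return "the file would need more than 10,000 parts";
  }
  return null;
}
```

The part-count limit also puts a ceiling on file size for a given chunk size: with 8 MiB chunks, 10,000 parts cover a little over 78 GiB, so larger files need larger chunks.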

How should I handle file type validation?

Never trust the MIME type the browser sends — it is trivially spoofable. Validate file type server-side by reading the actual file bytes. For most formats, a magic byte check at the start of the file is sufficient: JPEG files start with FF D8 FF, PNG with 89 50 4E 47, and so on. Do this in the finalization step, not during chunking. Validating during chunking is complex and provides no security benefit since the server already has all the bytes when finalization runs.
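A magic byte check for the two formats mentioned above can be sketched like this. The function name and signature table are illustrative, and a real validator would cover more formats and likely use a maintained detection library.

```typescript
// Leading bytes that identify each format.
const SIGNATURES: { mime: string; bytes: number[] }[] = [
  { mime: "image/jpeg", bytes: [0xff, 0xd8, 0xff] },
  { mime: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47] },
];

// Inspect the first bytes of the staged object; returns null for unknown content.
function sniffMimeType(head: Uint8Array): string | null {
  const match = SIGNATURES.find(
    (sig) => head.length >= sig.bytes.length && sig.bytes.every((b, i) => head[i] === b)
  );
  return match ? match.mime : null;
}
```

During finalization, the sniffed type would be compared with the MIME type declared by the client, and the upload rejected when they disagree.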

Conclusion

Once you have seen this pattern, you will recognize it in the large-file upload paths of the production systems you use. Google Drive, Dropbox, YouTube, and S3 itself all follow the same outline: split the file, give the transfer a server-side identity, negotiate each piece independently, and separate byte transfer from processing.

The session model, the offset handshake, and the finalization callback are the three ideas that do the real work. Everything else — the chunk size choice, the S3 multipart mapping, the dedup preflight — is applied on top of those three invariants.

The implementation guide that follows — Building a Chunked Upload Service in Go — takes these concepts into working code: the Go session store, the S3 multipart SDK calls, the browser chunk client, and the finalization callback between services.

Let me know in the comments if something in the architecture is unclear, and subscribe for more practical development guides from the trenches of building Marta TV.

Thanks, Matija

By Matija Žiberna · Last updated June 2026 · Tested while building local TV, a self-hosted video platform on Next.js + Payload CMS

Why single-request uploads fail for large files

The simplest way to upload a file is a single HTTP POST. The browser reads the file, attaches it as a request body, and sends it. The server receives it and saves it. Simple.

For small files this is fine. For anything large — a video longer than a few minutes, a high-resolution photo set, a CAD export — it breaks in practice.

Every HTTP server (Nginx, Node.js, Go's net/http) has a maximum request body size. These limits exist to prevent memory exhaustion. A 2 GB video gets rejected long before it finishes uploading.

If the server must hold the entire file in memory while processing, a single large upload can consume gigabytes of RAM. With several concurrent users, the server falls over.

Chunked upload solves all of this with one insight: break the file into pieces and negotiate each piece independently.

What chunking actually means

A chunk is a contiguous slice of the file's bytes. You pick a size — 8 MB is a common choice — and divide the file into sequential slices.

A 100 MB file with 8 MB chunks produces 13 chunks: twelve 8 MB chunks and one 4 MB remainder.

The abstract protocol looks like this:

code

Client                                     Server
──────                                     ──────
"I want to upload a 100 MB file"        →
                                        ← "OK, session ID is abc-123"

"Here are bytes 0–8,388,607"            →
                                        ← "Confirmed, I have 8,388,608 bytes"

"Here are bytes 8,388,608–16,777,215"   →
                                        ← "Confirmed, I have 16,777,216 bytes"

... (10 more rounds) ...

"I'm done, finalize it"                 →
                                        ← "Upload complete"

Each exchange is independent. Any of them can fail and be retried without affecting the others.

Key terms

Before going further, here are the terms used throughout. If something feels unclear later, return here.

Term	Meaning
Chunk	One fixed-size slice of a large file, sent as a single HTTP request.
Session	The server-side record that tracks the state of an in-progress upload.
Offset	The byte position where the next chunk should begin.
Object storage	Storage designed for files (not databases): S3, R2, MinIO, Garage.
Multipart upload	An object-storage protocol that lets one file be assembled from separately uploaded parts.
ETag	A token returned by object storage for each uploaded part, used to verify and assemble the final object.
Staged object	A file fully uploaded to object storage but not yet processed or made visible in the application.
Finalization	The step where the staged file becomes a real application record.
Fingerprint	A SHA-256 hash of the file's contents, used for deduplication.

The session: giving an upload a persistent identity

For a resumable upload to work, the upload must have a persistent identity on the server. This is called a session.

The session stores, at minimum: the declared total file size, the number of bytes received so far, the file name and type, and a reference to wherever the partially-uploaded data is being stored.

The session is the contract between browser and server. It answers the question "where are we?" at any moment during the upload.

How much durability does a session need?

Where the browser keeps the session ID

The browser must remember the session ID across page refreshes, app backgrounding, and device sleep. The right place is sessionStorage — not localStorage, not a cookie.

The key used to look up the session ID should be derived from the file's identity:

code

upload-session:<endpoint>:<filename>:<filesize>:<lastModified>

Including lastModified prevents the browser from accidentally resuming the wrong file if the user selects a different file with the same name.

The offset handshake: how resumption actually works

The mechanism that makes resumption work is an offset header. Before sending any chunk, the browser declares: "I believe you have received N bytes. I am sending the next chunk starting at byte N."

The server checks whether N matches its actual committed byte count. If they match, the chunk is accepted. If not, the server returns its actual committed byte count and the browser adjusts.

code

PUT /upload/abc-123
x-upload-offset: 8388608
Content-Type: application/octet-stream
[binary chunk body]

This handshake ensures no gaps and no duplicated bytes in the stored data. It is the critical invariant that makes resumable uploads safe.

Choosing a chunk size

Chunk size is a throughput-versus-overhead tradeoff. The theoretical maximum throughput for a sequential stop-and-wait protocol is:

code

max throughput ≈ chunk_size / round_trip_time

On a connection with a 200 ms round trip:

Chunk size	Theoretical max
1 MB	5 MB/s
4 MB	20 MB/s
8 MB	40 MB/s
16 MB	80 MB/s

Multipart upload: where the bytes go on the server side

The production pattern is S3 multipart upload, which is a feature built into S3 and all S3-compatible stores (Cloudflare R2, MinIO, Garage, Google Cloud Storage). It works in three stages:

Initiate. Tell S3 "I want to upload an object at this key." S3 returns an uploadId.

Upload parts. Send each chunk to S3 as a numbered part. S3 confirms receipt with an ETag for each part.

Complete. Send S3 the list of all part numbers and ETags. S3 assembles the object atomically.

The server-side chunk handler maps directly onto this:

session init → CreateMultipartUpload
each chunk PUT → UploadPart
upload complete → CompleteMultipartUpload
session expiry or cancellation → AbortMultipartUpload

The object is not visible in storage until CompleteMultipartUpload succeeds. A partially uploaded file never appears as a complete file, which means no cleanup race condition.

Fingerprinting and deduplication

Before starting a chunked upload, the browser can compute a SHA-256 hash of the file locally and ask the server whether that file already exists. If it does, the upload is skipped entirely.

code

Browser computes SHA-256 of the file
Browser asks server: "Do you have a file with fingerprint X?"
Server checks:
  yes → return the existing file's ID, done
  no  → proceed with chunked upload

Computing SHA-256 on a large file in the browser takes a few seconds — measurable but not painful. The payoff is significant: if the file is already stored, the "upload" is instantaneous.

Finalization: separating byte transfer from business logic

None of this should happen during the upload itself. The upload is a byte-transfer concern. Business logic is a separate concern. Mixing them makes the upload slow and fragile.

The clean pattern is a finalization callback:

The upload service completes the multipart upload to object storage.
It calls your application server with metadata: storage key, file name, MIME type, file size, fingerprint.
The application server downloads the staged object once, processes it, and creates the necessary records.
The application server confirms success. The upload service deletes the staged object.

This separation means upload throughput is not affected by slow business logic. Business logic can fail and be retried independently of the upload. The upload service stays focused and simple.

The application server never sees raw bytes in the throughput path — it only participates once, at the end.

Invariants the server must enforce

A naive upload endpoint trusts the client completely. A production upload service enforces hard invariants.

Chunks must arrive in order with no gaps. The offset header must match the server's committed byte count exactly. Out-of-order chunks are rejected.

File size has a system maximum. An upload that exceeds the size limit is rejected at init time, not discovered mid-upload.

Finalization requires completeness. The server will not call the finalization callback until uploadedBytes === declaredFileSize.

Graceful shutdown aborts active uploads. Any in-progress multipart uploads are aborted before the process exits. This prevents orphaned partial uploads that would never complete.
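The first three invariants reduce to small pure checks. The sketch below is one possible shape for them. The 409 on offset mismatch follows the protocol described in this guide. The 5 GiB limit, the 413 and 400 status codes, and all type and function names are assumptions.

```typescript
// Minimal session state needed for the checks; a real session holds more.
interface UploadSession {
  declaredFileSize: number;
  uploadedBytes: number; // bytes the server has durably committed
}

type Verdict = { ok: true } | { ok: false; status: number; reason: string };

// Assumed system maximum, for illustration only.
const MAX_FILE_SIZE = 5 * 1024 ** 3;

// Reject oversized uploads at init time rather than mid-upload.
function checkInit(declaredFileSize: number, maxSize: number = MAX_FILE_SIZE): Verdict {
  if (!Number.isInteger(declaredFileSize) || declaredFileSize <= 0) {
    return { ok: false, status: 400, reason: "invalid file size" };
  }
  if (declaredFileSize > maxSize) {
    return { ok: false, status: 413, reason: "file exceeds system maximum" };
  }
  return { ok: true };
}

// A chunk is accepted only if it starts exactly at the committed byte count
// and does not run past the declared size.
function checkChunk(session: UploadSession, offset: number, chunkLength: number): Verdict {
  if (offset !== session.uploadedBytes) {
    return { ok: false, status: 409, reason: "offset mismatch" };
  }
  if (chunkLength <= 0 || offset + chunkLength > session.declaredFileSize) {
    return { ok: false, status: 400, reason: "chunk length out of range" };
  }
  return { ok: true };
}

// Finalization is allowed only once every declared byte is committed.
function canFinalize(session: UploadSession): boolean {
  return session.uploadedBytes === session.declaredFileSize;
}
```

Keeping the checks free of I/O makes them easy to unit test separately from the storage calls.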

Progress: measuring what actually matters

Progress is meaningless unless it reflects server-confirmed bytes. The browser knows how many bytes it has sent, but "sent" means "handed to the OS network stack," not "received and durably stored."

Progress can appear to stall even while data is flowing. That is correct behavior. The upload is not complete until the server says so.

Speed estimates are jittery when computed from the raw time between responses, because individual chunks vary in how long they take. An exponential moving average smooths this out:

smoothedSpeed = (currentChunkSpeed × α) + (previousSmoothedSpeed × (1 - α))

A smoothing factor around 0.2–0.3 updates quickly when the connection genuinely changes but avoids jitter on every chunk.
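A small estimator class shows the formula applied to server-confirmed bytes. This is a sketch: the class name, the method names, and the default α of 0.25 (picked from the range above) are assumptions.

```typescript
// Tracks a smoothed upload speed in bytes per second.
class SpeedEstimator {
  private smoothed: number | null = null;
  private readonly alpha: number;

  constructor(alpha: number = 0.25) {
    this.alpha = alpha;
  }

  // Feed one server-confirmed chunk: how many bytes were committed and how
  // long the round trip took. Returns the updated smoothed speed.
  update(confirmedBytes: number, elapsedMs: number): number {
    if (elapsedMs <= 0) return this.smoothed ?? 0; // ignore degenerate samples
    const current = confirmedBytes / (elapsedMs / 1000);
    this.smoothed =
      this.smoothed === null
        ? current // the first sample seeds the average
        : current * this.alpha + this.smoothed * (1 - this.alpha);
    return this.smoothed;
  }

  // Estimated seconds remaining, or null until a usable speed exists.
  etaSeconds(remainingBytes: number): number | null {
    if (this.smoothed === null || this.smoothed <= 0) return null;
    return remainingBytes / this.smoothed;
  }
}
```

With α = 0.25, a first sample of 1000 B/s followed by a sample of 2000 B/s gives 0.25 × 2000 + 0.75 × 1000 = 1250 B/s.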

Retries and error handling

A single chunk failure should not abort the upload. The correct behavior:

A chunk upload returns an error (network timeout, 5xx from the server).
Wait a short backoff period — 1 second, then 3 seconds.
Re-query the server's committed offset.
Retry from the server's confirmed position.
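The four steps above can be sketched as a retry wrapper around a single chunk send. The `ChunkTransport` interface and the function names are hypothetical. The 1 second and 3 second delays come from the steps above, and giving up after the second retry is an assumption.

```typescript
// Backoff schedule from the steps above: 1 second, then 3 seconds.
const BACKOFF_MS: readonly number[] = [1000, 3000];

// Delay before retry number `attempt` (0-based), or null when retries are exhausted.
function backoffDelayMs(attempt: number): number | null {
  return attempt < BACKOFF_MS.length ? BACKOFF_MS[attempt] : null;
}

// Hypothetical transport. sendChunk uploads the chunk that starts at `offset`
// and resolves to the server's new committed offset, rejecting on a network
// error or 5xx. queryOffset asks the server for its committed byte count.
interface ChunkTransport {
  sendChunk(offset: number): Promise<number>;
  queryOffset(): Promise<number>;
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Send one chunk, retrying from the server's confirmed position on failure.
async function sendWithRetry(
  transport: ChunkTransport,
  offset: number,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<number> {
  let position = offset;
  for (let attempt = 0; ; attempt++) {
    try {
      return await transport.sendChunk(position);
    } catch (err) {
      const delay = backoffDelayMs(attempt);
      if (delay === null) throw err; // out of retries: surface the error
      await sleep(delay);
      // Never trust the client's own idea of progress after a failure:
      // the server may have committed the chunk before the connection died.
      position = await transport.queryOffset();
    }
  }
}
```

Re-querying before each retry is the important part. If the failed request actually reached the server, resending from the old offset would be rejected as an offset mismatch.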

Go versus Node.js for a dedicated upload service

When you outgrow handling large uploads inside your application server, you might extract the upload path into a dedicated service. The tradeoffs between Go and Node.js are meaningful here.

Consideration	Node.js	Go
Concurrency model	Single-threaded event loop	Goroutine per request
Large binary streaming	Multiple buffer copies	Direct pipeline
Event loop contention	Measurable under load	None
Integration with JS codebase	Easy	Requires separate service
Standard library for HTTP + crypto	Adequate	Excellent

The four levels of upload architecture

There are four recognizable levels of upload system. Each solves a real problem that the previous level cannot handle.

The complete architecture

Assembling the pieces, a production chunked upload system looks like this:

Browser
  ├── compute fingerprint → dedup preflight → skip if duplicate
  ├── POST /upload/init → get sessionId, store in sessionStorage
  ├── loop:
  │    PUT /upload/<sessionId>  (x-upload-offset header)
  │    on 409: re-query server offset, adjust position
  │    on error: retry with backoff
  └── POST /upload/<sessionId>/complete

Upload Service (Go or similar)
  ├── POST /init    → CreateMultipartUpload, return sessionId
  ├── PUT  /<id>    → validate offset, UploadPart, update session
  ├── GET  /<id>    → return current committed bytes
  ├── DELETE /<id>  → AbortMultipartUpload, remove session
  └── POST /<id>/complete
        ├── CompleteMultipartUpload
        ├── build finalization payload (staged key, metadata, fingerprint)
        └── POST application-server/internal/finalize

Application Server (Node.js or similar)
  └── POST /internal/finalize
        ├── download staged object once
        ├── validate MIME and content
        ├── run deduplication
        ├── create database records
        ├── trigger downstream processing (transcoding, etc.)
        └── confirm → upload service deletes staged object
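The browser loop in the diagram comes down to repeatedly asking what the next request looks like, given the server's committed offset. The helper below is a sketch of that step. The path and the `x-upload-offset` header come from the diagram, while the `ChunkRequest` type and function name are assumptions.

```typescript
// Describes the next chunk request without performing any I/O.
interface ChunkRequest {
  method: "PUT";
  path: string;
  headers: Record<string, string>;
  start: number; // inclusive byte offset into the file
  end: number; // exclusive byte offset, suitable for file.slice(start, end)
}

// Returns the next request to send, or null when every byte is committed
// and the client should call the complete endpoint instead.
function nextChunkRequest(
  sessionId: string,
  committedBytes: number,
  fileSize: number,
  chunkSize: number,
): ChunkRequest | null {
  if (committedBytes >= fileSize) return null;
  const end = Math.min(committedBytes + chunkSize, fileSize);
  return {
    method: "PUT",
    path: `/upload/${sessionId}`,
    headers: { "x-upload-offset": String(committedBytes) },
    start: committedBytes,
    end,
  };
}
```

Because the helper is driven only by the committed offset, the same code serves a fresh upload, a resumed one, and the adjustment after a 409.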

What to build first

If you are starting from scratch, build in this order:

Single-request path first. A plain file input with a FormData POST. Get the end-to-end flow working: validation, storage, database records.
Add a size limit. Refuse uploads over a threshold; 50 MB is reasonable. Once users start hitting the limit, you have confirmation that you need chunking.
Implement the session protocol. Add init, chunk, complete endpoints. Write to local disk for now. This validates the protocol before adding S3 complexity.
Add S3 multipart. Replace local-disk assembly with S3 part uploads. The protocol is unchanged; only the storage backend changes.
Add offset-based resumption. Store committed bytes in the session. Return 409 on offset mismatch. Add the server-status query endpoint. Add client-side sessionStorage. Test by killing the server mid-upload.
Add fingerprint dedup. Client hashes before upload. Server checks before accepting. The biggest quality-of-life improvement after basic resumption.
Extract the upload service. Only do this when the upload path is genuinely bottlenecking your application server.

FAQ

Why not just use a library like tus or Uppy?

What happens if the server crashes mid-upload?

Can I upload chunks in parallel instead of sequentially?

What is the minimum part size in S3, and why does it matter?

How should I handle file type validation?

Conclusion

Let me know in the comments if something in the architecture is unclear, and subscribe for more practical development guides from the trenches of building Marta TV.

Thanks, Matija
