HTTP multipart requests: The developer’s guide to conquering mixed content APIs

- HTTP multipart requests are just plaintext requests with boundaries — nothing to fear here!
- Born from email (MIME), they’re now essential for modern APIs mixing JSON and files
- Perfect for file uploads with metadata, processing multiple files, and working with mixed content types
- HTTP clients make the implementation straightforward
Picture this: You’re staring at a multipart HTTP request in your network inspector, watching a jumbled mess of boundaries, headers, and binary data fly by, feeling like you’re missing some crucial piece of knowledge. The good news? You’re not. HTTP multipart requests might look intimidating, but they’re built on the same simple principles that make the rest of HTTP so powerful.
If you’ve ever struggled with API endpoints that need both JSON metadata and file uploads, this post will change how you think about HTTP multipart forever.
The multipart mystery
Here’s the thing about HTTP multipart: It’s gotten an unfair reputation as being complex or magical. Developers often treat it like a black box — something that “just works” when you use the right library, but feels impossible to debug when things go wrong.
This fear usually stems from a few common misconceptions:
- “It’s binary and unreadable” — Actually, the boundaries and part headers are human-readable plaintext; only the content of file parts is raw binary.
- “It’s a completely different protocol” — Nope, it’s still just HTTP with structured content.
- “You need special tools to work with it” — While tools help, understanding the format makes everything clearer.
The reality is that multipart requests follow the same simple, text-based approach that makes HTTP so elegant. Once you understand the structure, you’ll see it’s no more complex than a well-formatted email.
HTTP is just text
Before diving into multipart specifics, remember what makes HTTP practical: In its classic HTTP/1.1 form, it’s human-readable plaintext. When you make a request to an API, you’re sending text that looks like this:
    POST https://api.nutrient.io/build HTTP/1.1
    Content-Type: application/json
    Authorization: Bearer pdf_live_rest_of_your_api_token

    {
      "parts": [
        {
          "file": {
            "url": "https://www.nutrient.io/api/assets/downloads/samples/docx/document.docx"
          }
        }
      ]
    }
That’s it. No magic, no binary protocols. Just structured text that both humans and computers can easily parse and understand. This simplicity makes HTTP incredibly easy to use, document, debug, and test.
This transparency extends to multipart requests too. They’re built using the same plaintext approach, just with a bit more structure to handle multiple pieces of content in a single request.
The history of multipart
To understand why multipart exists, we need to go back to email. In the early days of the internet, email could only handle plain text. But as people wanted to send attachments, documents, and formatted content, the Internet Engineering Task Force (IETF) created Multipurpose Internet Mail Extensions (MIME).
MIME introduced the concept of multipart messages: a way to package multiple pieces of content with different types into a single message. Think of it as a digital envelope that can hold a letter, some photos, and a document, each clearly labeled and separated.
When the web started evolving beyond simple form submissions, developers faced the same challenge email had solved: How do you send different types of content together? The solution was elegant: Adapt MIME’s multipart format for HTTP.
This evolution happened gradually:
- HTML forms with file uploads (1990s) — The first real need for mixing form data with binary files.
- AJAX and rich web applications (2000s) — Developers needed programmatic control over multipart uploads, which eventually led to today’s FormData API in JavaScript.
- Mobile APIs and modern applications (2010s and beyond) — APIs required sophisticated content mixing, like JSON metadata with multiple file attachments.
The current formal standard, RFC 7578, defines multipart/form-data for returning form values, building on the solid foundation of MIME while optimizing for web use cases.
As the name suggests, the multipart/form-data MIME type reflects the original purpose of HTTP multipart: submitting HTML forms.
Understanding multipart structure
A multipart message is like a container with clearly marked sections. Each section has its own headers and content, separated by boundary markers that act like dividers.
Here’s what a real multipart request looks like:
    POST https://api.nutrient.io/build HTTP/1.1
    Content-Type: multipart/form-data; boundary=----FormBoundary7MB4YZxkMrZu
    Authorization: Bearer pdf_live_rest_of_your_api_token

    ------FormBoundary7MB4YZxkMrZu
    Content-Disposition: form-data; name="instructions"
    Content-Type: application/json

    {
      "parts": [{ "file": "scanned" }],
      "actions": [{ "type": "ocr", "language": "english" }]
    }
    ------FormBoundary7MB4YZxkMrZu
    Content-Disposition: form-data; name="scanned"; filename="scanned.pdf"
    Content-Type: application/pdf

    %PDF-1.4
    1 0 obj
    <</Type /Catalog/Pages 2 0 R>>
    endobj
    [...binary PDF content...]
    ------FormBoundary7MB4YZxkMrZu--
The next sections will break this down.
Boundary markers
The boundary (----FormBoundary7MB4YZxkMrZu) is a unique divider for parts in the multipart message. It:
- Must be unique enough that it won’t appear in the content.
- Appears at the start of each part with two leading dashes.
- Ends the entire message when followed by two trailing dashes.
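As a rough illustration of these rules, the sketch below assembles a text-only multipart body by hand. The helper name `buildMultipartBody` and the sample parts are assumptions made for this example; in practice, `FormData` or your HTTP client generates the boundary and body for you.

```javascript
// Sketch: assemble a multipart/form-data body by hand (text parts only).
// `buildMultipartBody` is a hypothetical helper, not part of any library.
function buildMultipartBody(parts, boundary) {
  const CRLF = "\r\n";
  let body = "";
  for (const part of parts) {
    // Each part starts with the boundary prefixed by two dashes.
    body += `--${boundary}${CRLF}`;
    body += `Content-Disposition: form-data; name="${part.name}"${CRLF}`;
    if (part.contentType) {
      body += `Content-Type: ${part.contentType}${CRLF}`;
    }
    // A blank line separates the part headers from the part content.
    body += CRLF + part.content + CRLF;
  }
  // The closing delimiter has two leading and two trailing dashes.
  body += `--${boundary}--${CRLF}`;
  return body;
}

const boundary = "----FormBoundary7MB4YZxkMrZu";
const body = buildMultipartBody(
  [
    { name: "title", content: "My Document" },
    {
      name: "instructions",
      contentType: "application/json",
      content: JSON.stringify({ parts: [{ file: "scanned" }] }),
    },
  ],
  boundary,
);
console.log(body);
```

Printing the result shows the same shape as the request above: the boundary value from the header, prefixed with two dashes, in front of every part.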
Part headers
Each part has its own headers. The usual ones are:
- Content-Disposition — Describes the part’s purpose and provides metadata. Most important is the name, which represents the “key” of the part in the multipart message.
- Content-Type — Specifies the MIME type of this part’s content.
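To make the role of these headers concrete, here is a minimal sketch that extracts the name and filename from a Content-Disposition value. The function `parseContentDisposition` is hypothetical and handles only simple quoted parameters; real parsers also cover escaping and encoded filenames.

```javascript
// Sketch: pull the `name` and `filename` parameters out of a part's
// Content-Disposition header. Handles only simple quoted values.
function parseContentDisposition(headerValue) {
  const params = {};
  // Matches key="value" pairs such as name="scanned".
  const pattern = /;\s*([\w-]+)="([^"]*)"/g;
  for (const match of headerValue.matchAll(pattern)) {
    params[match[1].toLowerCase()] = match[2];
  }
  return {
    type: headerValue.split(";")[0].trim(),
    name: params.name,
    filename: params.filename,
  };
}

const disposition = parseContentDisposition(
  'form-data; name="scanned"; filename="scanned.pdf"',
);
console.log(disposition);
```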
Part contents
After the headers comes a blank line, followed by the actual content. This can be:
- Plain text (like JSON or form values)
- Binary data (like files)
- Any other content type you need to transmit
The beauty of this format is its flexibility. You can mix any combination of content types, assuming that each part is properly labeled and separated.
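The structure described above can be sketched as a small splitter that breaks a text-only body into headers and content. `splitMultipart` and the sample body are assumptions for illustration only; it ignores binary content and many edge cases, so production code should rely on a well-tested multipart parser.

```javascript
// Sketch: split a text-only multipart body into parts. Illustration only.
function splitMultipart(rawBody, boundaryValue) {
  const delimiter = `--${boundaryValue}`;
  return rawBody
    .split(delimiter)
    // Drop the preamble before the first delimiter and the closing "--" marker.
    .slice(1, -1)
    .map((chunk) => {
      const trimmed = chunk.replace(/^\r\n/, "").replace(/\r\n$/, "");
      // Headers and content are separated by the first blank line.
      const blankLine = trimmed.indexOf("\r\n\r\n");
      const headers = {};
      for (const line of trimmed.slice(0, blankLine).split("\r\n")) {
        const colon = line.indexOf(":");
        headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
      }
      return { headers, content: trimmed.slice(blankLine + 4) };
    });
}

const sampleBody = [
  "------FormBoundary7MB4YZxkMrZu",
  'Content-Disposition: form-data; name="title"',
  "",
  "My Document",
  "------FormBoundary7MB4YZxkMrZu",
  'Content-Disposition: form-data; name="instructions"',
  "Content-Type: application/json",
  "",
  '{"parts":[{"file":"scanned"}]}',
  "------FormBoundary7MB4YZxkMrZu--",
  "",
].join("\r\n");

const parsedParts = splitMultipart(sampleBody, "----FormBoundary7MB4YZxkMrZu");
console.log(parsedParts);
```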
Common use cases
Multipart isn’t just a theoretical concept — it shows up constantly in real-world development. From classic HTML forms to modern APIs that need to bundle JSON instructions with binary files, multipart solves the problem of transmitting mixed content in one atomic request.
Form submissions with file uploads
The most common multipart pattern is the traditional HTML form with file uploads:
    <form enctype="multipart/form-data" method="post" action="/upload">
      <input type="text" name="title" value="My Document" />
      <input type="file" name="document" />
      <input type="submit" value="Upload" />
    </form>
Submission of this form results in a multipart request with both text fields and file content, each in its own part.
Mixed content API: JSON metadata with binary files
Modern APIs often need to mix multiple content types. For document-centric APIs like those Nutrient powers, multipart is essential.
Consider an API endpoint that needs to:
- Accept document files in various formats
- Include processing instructions
- Accept metadata controlling aspects of document ingestion
Consider an example of a typical multipart request that creates a new document in Document Engine using its powerful processing capabilities. This example will use JavaScript’s FormData API to illustrate how simple it is to create such requests in real applications:
    // JavaScript provides the convenient `FormData` API to produce bodies of HTTP multipart requests.
    // `pdfFile` and `summaryDocxFile` are assumed to be `File` or `Blob` objects, e.g. from an <input type="file">.
    const formData = new FormData();

    // Documents to process.
    formData.append("report", pdfFile, "report.pdf");
    formData.append("summary", summaryDocxFile, "summary.docx");

    // Instructions for processing the files on upload.
    formData.append(
      "instructions",
      JSON.stringify({
        parts: [{ file: "report" }, { file: "summary" }],
        actions: [{ type: "watermark", text: "CONFIDENTIAL" }],
        output: { format: "pdfa" },
      }),
    );

    // Example of a part that contains a primitive text property.
    formData.append("document_id", "Custom Document ID");

    // Document storage properties.
    formData.append(
      "storage",
      JSON.stringify({
        backend: "S3",
        bucketName: "free-tenants-documents",
        bucketRegion: "us-west-2",
      }),
    );

    // Don't set Content-Type manually: fetch derives it, including the boundary, from the FormData body.
    fetch("/api/documents", { method: "POST", body: formData });
This pattern lets you specify complex processing logic alongside the actual files, all in a single HTTP request.
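When debugging, it can help to check what a FormData object holds before it is sent. The sketch below assumes a runtime with global `FormData` and `Blob` (modern browsers or Node.js 18+); `describeParts` and the sample values are placeholders invented for this example.

```javascript
// Sketch: list the parts a FormData object will send.
const form = new FormData();
form.append("document_id", "Custom Document ID");
form.append("report", new Blob(["%PDF-1.4"], { type: "application/pdf" }), "report.pdf");

function describeParts(formData) {
  const lines = [];
  for (const [name, value] of formData.entries()) {
    // String values become plain text parts; everything else is a File.
    lines.push(
      typeof value === "string"
        ? `${name}: "${value}"`
        : `${name}: file ${value.name} (${value.type}, ${value.size} bytes)`,
    );
  }
  return lines;
}

console.log(describeParts(form).join("\n"));
```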
Alternatives to multipart
While multipart is widely adopted, there are multiple alternatives that were developed over the years to address specific use cases.
Base64 encoding in JSON
Probably the simplest approach to transferring mixed content is to Base64-encode the binary content and send it as a value in a JSON payload. This approach encodes binary data, like images or documents, into a text format that can be safely included in JSON objects, enabling you to send files alongside other structured data without needing special handling.
It’s conceptually straightforward: Encode the file, wrap it in JSON, and send it via a standard HTTP POST. However, this method comes with significant tradeoffs that make it suitable only for specific scenarios involving smaller files.
Pros:
- Simple to implement
- Works with standard JSON APIs
- Easy to debug and test
Cons:
- 33 percent size overhead from Base64 encoding
- With most JSON parsers, the entire content must be loaded into memory when working with such payloads
- Poor performance for large files
Use when:
- Files are small (< 1MB)
- Simplicity is paramount
- You use a limited client environment (e.g. some low-code/no-code tools don’t have good support for HTTP multipart)
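A minimal sketch of this approach, and of where the size overhead comes from, might look like the following. It assumes Node.js (for `Buffer`); the payload field names and the 3,000-byte stand-in file are made up for the example.

```javascript
// Sketch: embed a file in a JSON payload as Base64 and measure the overhead.
const fileBytes = Buffer.alloc(3000, 0x41); // stand-in for real file content

const payload = JSON.stringify({
  filename: "report.pdf",
  contentType: "application/pdf",
  data: fileBytes.toString("base64"),
});

// Base64 maps every 3 input bytes to 4 output characters.
const encodedLength = fileBytes.toString("base64").length;
const overhead = encodedLength / fileBytes.length - 1;
console.log(encodedLength); // 4000
console.log(`${(overhead * 100).toFixed(1)}% larger`); // "33.3% larger"

// The receiver decodes the field back into the original bytes.
const decoded = Buffer.from(JSON.parse(payload).data, "base64");
console.log(decoded.equals(fileBytes)); // true
```

Note that both sides hold the full encoded string in memory here, which is the second drawback listed above.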
Separate API calls
Another approach splits file uploads into distinct operations, first creating a record with metadata, and then uploading the file content in a second request. This pattern treats metadata and file content as separate resources with their own endpoints, allowing each to be optimized independently. While this increases complexity by requiring coordination between multiple requests, it provides greater flexibility for scenarios where metadata needs to be processed, validated, or stored before the actual file upload occurs, or when you need to support advanced features like resumable uploads or file replacements.
Pros:
- Clean separation of concerns
- Better error handling granularity
- Easier to cache and optimize individually
Cons:
- Multiple roundtrips increase latency
- Complex error recovery scenarios
- Transactional state is hard to manage because it spans multiple HTTP requests
Use when:
- Operations are logically separate
- You need to support resumable uploads
- Metadata and files have different lifecycles
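The two-step pattern could be sketched roughly as follows. The endpoints (`/api/documents` and `/api/documents/:id/content`), the response shape, and the `fetchImpl` parameter (which only exists so the function can be exercised without a network) are all hypothetical.

```javascript
// Sketch of the two-step pattern against hypothetical endpoints.
async function uploadInTwoSteps(metadata, fileContent, fetchImpl = fetch) {
  // Step 1: create the record from metadata only.
  const createResponse = await fetchImpl("/api/documents", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(metadata),
  });
  if (!createResponse.ok) {
    throw new Error(`Creating the record failed: ${createResponse.status}`);
  }
  const { id } = await createResponse.json();

  // Step 2: upload the raw file bytes to the record created above.
  const uploadResponse = await fetchImpl(`/api/documents/${id}/content`, {
    method: "PUT",
    headers: { "Content-Type": "application/pdf" },
    body: fileContent,
  });
  if (!uploadResponse.ok) {
    // The record now exists without content; the caller must retry or clean up.
    throw new Error(`Uploading content for ${id} failed: ${uploadResponse.status}`);
  }
  return id;
}
```

The failure branch in step 2 is where the transactional concern shows up: a record can exist without its file until the client retries or the server cleans up.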
Binary protocols
Binary protocols involve designing custom data formats that pack file content and metadata into highly optimized byte sequences, transmitting data in their most compact form without the overhead of text-based formats like JSON or HTTP.
This approach requires defining your own wire format, serialization rules, and communication protocol, often using technologies like Protocol Buffers or completely custom binary formats. While this delivers maximum performance and minimal bandwidth usage, it comes at the cost of significant implementation complexity and reduced interoperability with standard web tooling.
Additionally, debugging binary protocols can be challenging due to their opaque nature.
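To give a feel for what defining your own wire format involves, here is a toy sketch of a length-prefixed layout: a 4-byte metadata length, the metadata as JSON, then the file bytes. The layout and the names `packMessage` and `unpackMessage` are invented for this example and don't correspond to any real protocol.

```javascript
// Sketch of a made-up framing: [4-byte metadata length][metadata JSON][file bytes].
function packMessage(metadata, fileContent) {
  const metaBytes = new TextEncoder().encode(JSON.stringify(metadata));
  const message = new Uint8Array(4 + metaBytes.length + fileContent.length);
  // Big-endian unsigned 32-bit length prefix.
  new DataView(message.buffer).setUint32(0, metaBytes.length);
  message.set(metaBytes, 4);
  message.set(fileContent, 4 + metaBytes.length);
  return message;
}

function unpackMessage(message) {
  const view = new DataView(message.buffer, message.byteOffset, message.byteLength);
  const metaLength = view.getUint32(0);
  const metaBytes = message.subarray(4, 4 + metaLength);
  return {
    metadata: JSON.parse(new TextDecoder().decode(metaBytes)),
    fileContent: message.subarray(4 + metaLength),
  };
}

const packed = packMessage({ name: "scanned.pdf" }, new Uint8Array([0x25, 0x50, 0x44, 0x46]));
const unpacked = unpackMessage(packed);
console.log(unpacked.metadata.name, unpacked.fileContent.length);
```

Even this tiny format forces decisions about byte order, length limits, and versioning that multipart already settles for you.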
Pros:
- Maximum efficiency
- Custom optimization opportunities
- Precise control over wire format
Cons:
- Complex to implement and debug
- Poor tooling support
- Harder to evolve and maintain
Use when:
- Performance is critical
- Bandwidth is extremely limited
- You’re building specialized systems
Security considerations
While multipart requests simplify client-side implementation, they introduce several security considerations that require careful server-side handling. The flexibility that makes multipart requests powerful also creates potential attack vectors if they’re not properly validated.
Always remember that multipart’s convenience on the client side should never compromise security on the server side. Treat every part of a multipart request as potentially untrusted input that requires validation.
File upload security
The main use case for multipart is file uploads, which can introduce significant security risks if not handled correctly. Here are the most important areas to focus on when handling file uploads on the server side:
File size limits — Always enforce reasonable file size limits to prevent denial-of-service attacks. Large uploads can exhaust server memory, disk space, or bandwidth.
Content type validation — Never trust the Content-Type header alone. Validate file contents using proper file type detection libraries that examine file signatures rather than relying on client-provided headers.
Filename sanitization — User-provided filenames can contain path traversal sequences (../../../etc/passwd) or other malicious content. Always sanitize and validate filenames before storage.
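The three checks above might be sketched as follows. The 10 MB limit, the function names, and the two-entry signature table are assumptions for illustration, not a complete or recommended policy; a dedicated file type detection library covers far more formats.

```javascript
// Sketch of three upload checks with hypothetical names and limits.
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024; // assumed 10 MB limit

function assertWithinSizeLimit(byteLength, limit = MAX_UPLOAD_BYTES) {
  if (byteLength > limit) {
    throw new Error(`Upload of ${byteLength} bytes exceeds limit of ${limit}`);
  }
}

// Detect the type from leading "magic bytes" instead of the Content-Type header.
const SIGNATURES = [
  { type: "application/pdf", bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] }, // "%PDF-"
  { type: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
];

function detectFileType(content) {
  const match = SIGNATURES.find((sig) => sig.bytes.every((b, i) => content[i] === b));
  return match ? match.type : null;
}

// Keep only the final path segment and a conservative character set.
function sanitizeFilename(filename) {
  const baseName = filename.split(/[\\/]/).pop();
  const cleaned = baseName.replace(/[^A-Za-z0-9._-]/g, "_").replace(/^\.+/, "");
  return cleaned || "upload";
}

console.log(sanitizeFilename("../../../etc/passwd")); // "passwd"
console.log(detectFileType(new TextEncoder().encode("%PDF-1.4"))); // "application/pdf"
```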
Content validation
Beyond file-specific security, the multipart format itself requires careful validation. Attackers can exploit parsing vulnerabilities or overwhelm servers through malformed requests:
Parse boundaries safely — Malformed boundary markers can cause parsing errors or resource exhaustion. Use well-tested multipart parsing libraries rather than implementing your own parser.
Limit part count — Restrict the number of parts in a multipart request to prevent attacks that send thousands of small parts to exhaust server resources.
Validate JSON parts — When parts contain JSON data, apply the same validation rules you would for any JSON API endpoint — validate schemas, sanitize input, and check for injection attacks.
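As a hedged sketch of the last two checks, the code below caps the number of parts and validates the shape of a JSON part before using it. `MAX_PARTS`, the function names, and the expected `instructions` shape (an array of entries with a string `file`) are assumptions for this example; a schema validation library would normally replace the hand-written checks.

```javascript
// Sketch: cap the number of parts and validate a JSON part's shape.
const MAX_PARTS = 20; // assumed limit

function assertPartCount(parts, maxParts = MAX_PARTS) {
  if (parts.length > maxParts) {
    throw new Error(`Too many parts: ${parts.length} (limit ${maxParts})`);
  }
}

function parseInstructions(jsonText) {
  let value;
  try {
    value = JSON.parse(jsonText);
  } catch {
    throw new Error("instructions part is not valid JSON");
  }
  if (value === null || typeof value !== "object" || !Array.isArray(value.parts)) {
    throw new Error("instructions.parts must be an array");
  }
  for (const part of value.parts) {
    if (part === null || typeof part !== "object" || typeof part.file !== "string") {
      throw new Error("each entry in instructions.parts needs a string `file`");
    }
  }
  return value;
}

const instructions = parseInstructions('{"parts":[{"file":"scanned"}]}');
console.log(instructions.parts.length);
```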
Infrastructure considerations
Production systems handling multipart uploads need additional safeguards at the infrastructure level to prevent resource exhaustion and maintain system stability:
Temporary file cleanup — Ensure uploaded files are properly cleaned up, even if processing fails. Implement timeouts and cleanup routines to prevent disk space exhaustion.
Rate limiting — Apply rate limiting not just to request count, but also to upload volume per client to prevent abuse.
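Limiting upload volume, rather than only request count, could be sketched with a simple in-memory fixed-window counter. The class name, window size, and byte budget are assumptions for this example; a real deployment would typically keep these counters in shared storage so that all server instances see them.

```javascript
// Sketch: in-memory, fixed-window limit on uploaded bytes per client.
class UploadVolumeLimiter {
  constructor({ maxBytesPerWindow, windowMs, now = Date.now }) {
    this.maxBytesPerWindow = maxBytesPerWindow;
    this.windowMs = windowMs;
    this.now = now;
    this.usage = new Map(); // clientId -> { windowStart, bytes }
  }

  // Returns true and records the upload if the client stays within budget.
  tryConsume(clientId, byteLength) {
    const currentTime = this.now();
    let entry = this.usage.get(clientId);
    if (!entry || currentTime - entry.windowStart >= this.windowMs) {
      entry = { windowStart: currentTime, bytes: 0 };
      this.usage.set(clientId, entry);
    }
    if (entry.bytes + byteLength > this.maxBytesPerWindow) {
      return false;
    }
    entry.bytes += byteLength;
    return true;
  }
}

let fakeTime = 0; // injected clock keeps the example deterministic
const limiter = new UploadVolumeLimiter({
  maxBytesPerWindow: 1000,
  windowMs: 60_000,
  now: () => fakeTime,
});
console.log(limiter.tryConsume("client-a", 800)); // true
console.log(limiter.tryConsume("client-a", 300)); // false
fakeTime = 60_000;
console.log(limiter.tryConsume("client-a", 300)); // true
```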
Embrace the multipart
HTTP multipart requests aren’t the mysterious, complex protocol they might seem at first glance. They’re a logical extension of HTTP’s plaintext philosophy, designed to solve the real-world problem of mixing different content types in a single request.
When you’re building APIs that handle document uploads, image processing, or any scenario involving mixed content types, multipart isn’t just an option — it’s often the most elegant solution. It provides atomic operations, clear content separation, and broad compatibility across clients and servers.
The next time you see a multipart request in your network inspector, don’t treat it as needless complexity. Instead, look at it as a well-structured, readable format that solves the challenge of mixing different types of content in a single HTTP request with very little overhead.
Ready to implement robust document processing APIs? Explore how Nutrient Document Web Services APIs handle complex multipart uploads with built-in processing, conversion, and security features that make your document workflows effortless.