Secure Image Uploads in Next.js: 5 Edge Cases [2026]

Validating an image upload by its magic bytes confirms the file starts like a PNG. It does not confirm the file is safe. The robust defense for images specifically is to re-encode every upload through a maintained image library before you store or serve it. Re-encoding strips EXIF, rebuilds the container so an appended payload is discarded, and gives you one place to enforce a pixel limit, which is the only practical guard against a decompression bomb.

This is the image-layer companion to the secure file uploads guide. That post covers the upload architecture: a private bucket, Row Level Security (RLS) on storage.objects, server-issued object keys, and a magic-byte check on the bytes that arrive. This post starts where that gate ends. Once a file claims to be an image and passes the byte check, five things can still go wrong that are specific to image formats, and the fix for four of them is the same single step.

The honest framing first: a magic-byte check is necessary. It is just not sufficient for images, because an image is a container that can legally carry data your parser, your CDN, or the next viewer's browser will treat as more than pixels.

TL;DR:

Magic bytes are necessary, not sufficient. A polyglot is a valid image AND a payload at the same time, so it passes any signature check [5][6]. The Content-Type header is worse: it "cannot be trusted, as it is trivial to spoof" [5].
Re-encode, don't just validate. Sharp strips all metadata by default [1], rebuilds the pixel data so appended bytes vanish, and rejects oversized inputs via limitInputPixels [2]. OWASP: "image rewriting techniques destroys any kind of malicious content injected in an image" [5].
EXIF leaks location. Phone photos embed GPS coordinates. Serving the original file exposes "where the user is (or was)," which CWE-359 names as private personal information [4].
A decompression bomb is small on disk, huge in memory. A tiny file with valid headers can decode to a multi-gigabyte buffer [3]. Size limits on the file never catch it; only a pixel limit does [2].
SVG is a document, not a raster. You cannot re-encode a script out of it. Reject it for image fields, or handle it as untrusted HTML at the serving layer.

Why isn't a magic-byte check enough for images?
What does re-encoding an image upload actually fix?
How do you stop an image decompression bomb?
What does EXIF metadata leak when you serve an upload?
Is your image library itself an attack surface?
Where does SVG fit in?
Building image hardening into the upload pipeline

Why isn't a magic-byte check enough for images?

A magic-byte check confirms the file starts like an image. It says nothing about what comes after the header. A polyglot file is valid as two formats at once: a real PNG header and pixel data, with a payload appended or interleaved so the same bytes also parse as a ZIP, a script, or an HTML document. It passes the signature check because, read as an image, it genuinely is one.

This is the gap between "the bytes look like an image" and "the bytes are only an image." Magic-byte verification, the kind the file-uploads guide recommends with the file-type library, is the right first gate. It blocks the crude attack: an executable renamed to avatar.png. It does not block a file that is a working image and something else.

The Content-Type header is weaker still. OWASP states it plainly: "The Content-Type for uploaded files is provided by the user, and as such cannot be trusted, as it is trivial to spoof" [5]. So you have two signals, the claimed type and the leading bytes, and a polyglot satisfies both.
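To make the gap concrete, here is a minimal sketch of a leading-bytes gate and two inputs. The helper name startsWithPngSignature and both byte arrays are hypothetical, and the "polyglot" is only a stand-in (a PNG signature followed by placeholder bytes and a ZIP local-file header), not a real decodable image. It shows only that a check on the first bytes never looks at what follows them.

```typescript
// The 8-byte signature every PNG file starts with.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]

// Hypothetical gate: inspects only the leading bytes, like a typical magic-byte check.
function startsWithPngSignature(bytes: Uint8Array): boolean {
  return PNG_SIGNATURE.every((byte, i) => bytes[i] === byte)
}

// Stand-in for a polyglot: PNG signature, placeholder bytes, then the
// ZIP local-file header "PK\x03\x04" appended at the end.
const zipTail = [0x50, 0x4b, 0x03, 0x04]
const polyglotLike = Uint8Array.from([...PNG_SIGNATURE, 0x00, 0x00, 0x00, 0x0d, ...zipTail])

// The crude attack: a Windows executable ("MZ" header) renamed to avatar.png.
const renamedExe = Uint8Array.from([0x4d, 0x5a, 0x90, 0x00])

const gateResults = {
  polyglotLike: startsWithPngSignature(polyglotLike), // passes: the tail is never inspected
  renamedExe: startsWithPngSignature(renamedExe), // rejected: wrong leading bytes
}
```

The gate does its job on the renamed executable and is blind to the appended tail, which is the whole argument for treating it as a first gate rather than the defense.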

CWE-434 frames the consequence as a class: the weakness exists when "the product allows the upload or transfer of dangerous file types that are automatically processed within its environment" [6]. The key phrase is automatically processed. A polyglot sitting inert in object storage is harmless. The danger is what reads it next:

Your image processor parses the whole file, including the part that is not pixels (this is how the ImageMagick class of bugs fires, covered below).
A browser that receives the file inline with the wrong content type sniffs it and runs the HTML or script half.
A downstream service (a virus scanner, a thumbnail worker, an OCR step) opens it with a different parser that trips on the payload.

The fix is not a smarter signature check. Attackers will always find a header that satisfies your detector. The fix is to stop trusting the uploaded bytes as a unit and rebuild the image from its pixels alone.

What does re-encoding an image upload actually fix?

Re-encoding fixes four problems at once: it strips metadata, discards any bytes that are not pixel data, normalizes the format to one you control, and gives you a single place to enforce a pixel limit. You decode the upload to a raw pixel buffer, then write a brand-new file from that buffer. Whatever the attacker appended, embedded, or hid in metadata never makes it into the output, because the output is built only from decoded pixels.

OWASP recommends exactly this for images: "applying image rewriting techniques destroys any kind of malicious content injected in an image" [5]. A polyglot survives a magic-byte check, but it does not survive a decode-and-re-encode. The PNG-plus-ZIP file decodes to pixels; the ZIP tail is not pixels, so when you re-encode, it is gone. You are not detecting the payload. You are throwing away everything that is not an image.

In a Next.js app the natural home for this is a Server Action or a background worker that runs after the upload lands, using Sharp, the standard libvips wrapper for Node.js:

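import sharp from 'sharp'
import { createAdminClient } from '@/lib/supabase/admin'

const BUCKET = 'user-uploads'

// Runs server-side, after the file is in the private bucket.
export async function reencodeUpload(objectKey: string) {
  const admin = createAdminClient()

  // Fail loudly if the download did not return a file.
  const { data, error } = await admin.storage.from(BUCKET).download(objectKey)
  if (error || !data) throw new Error(`Download failed for ${objectKey}`)
  const original = Buffer.from(await data.arrayBuffer())

  // Decode to pixels, then write a fresh file. Appended bytes,
  // EXIF, and embedded payloads are all dropped here.
  const clean = await sharp(original, { limitInputPixels: 24_000_000 })
    .rotate() // bake EXIF orientation into pixels before metadata is stripped
    .webp({ quality: 82 })
    .toBuffer()

  const cleanKey = objectKey.replace(/\.[^.]+$/, '.webp')
  const { error: uploadError } = await admin.storage
    .from(BUCKET)
    .upload(cleanKey, clean, { contentType: 'image/webp', upsert: true })
  if (uploadError) throw new Error(`Upload failed for ${cleanKey}`)

  // Remove the original only when it is a different object. If the key already
  // ended in .webp, the upsert above replaced it in place, and removing it
  // would delete the clean file.
  if (cleanKey !== objectKey) {
    await admin.storage.from(BUCKET).remove([objectKey])
  }
}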

Two details carry most of the weight. The .webp() (or .jpeg(), .png()) call is what rebuilds the file from pixels, which is the part that neutralizes the polyglot. And Sharp strips metadata for free: "By default all metadata will be removed, which includes EXIF-based orientation" [1]. You get the EXIF-stripping benefit (covered below) without asking for it, which is why .rotate() comes first.

To be clear about what ships where: SecureStartKit does not bundle an image-processing pipeline. The template's upload pattern is the one from the file-uploads guide, a signed upload URL issued from a Server Action that validates with Zod and writes the object key to a regular Postgres table. Re-encoding is the hardening layer you add on top when an upload field accepts images specifically. The template gives you the secure transport; this is the image-format defense that sits behind it.

One caveat worth stating: re-encoding is a raster operation. It works on JPEG, PNG, WebP, GIF, AVIF, and other pixel formats. It does not work on SVG, which is a different problem with a different fix, covered at the end of this post.

How do you stop an image decompression bomb?

You stop a decompression bomb by rejecting the image based on its decoded pixel dimensions, not its file size. A decompression bomb is a small file with valid headers that expands to an enormous raster when decoded, exhausting memory before you ever finish reading it. CWE-409 calls this data amplification: "a compressed input with a very high compression ratio that produces a large output," where CPU and memory "can be quickly consumed" [3].

This is the edge case that file-size limits miss completely. A 50KB upload sails through a 25MB bucket cap. But a 50KB PNG can declare dimensions of 50,000 by 50,000 pixels, and decoding that allocates roughly 50000 × 50000 × 4 bytes, about 10GB of RAM, in one shot. Your bucket's allowedMimeTypes and file_size_limit see a small, well-typed PNG and wave it through. The danger lives in the gap between the compressed size on disk and the decoded size in memory.
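The arithmetic is easy to check by hand. The sketch below reads the declared width and height from a PNG's IHDR chunk and multiplies them out, assuming 4 bytes per pixel (8-bit RGBA). The function names readPngDeclaredSize and decodedBytes are hypothetical, and the 24-byte header is built by hand for illustration; it is not a complete PNG.

```typescript
// PNG layout: 8-byte signature, then the IHDR chunk (4-byte length, 4-byte type),
// then width and height as big-endian 32-bit integers at offsets 16 and 20.
function readPngDeclaredSize(bytes: Uint8Array): { width: number; height: number } | null {
  if (bytes.length < 24) return null
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength)
  if (view.getUint32(12) !== 0x49484452) return null // chunk type must be "IHDR"
  return { width: view.getUint32(16), height: view.getUint32(20) }
}

// Worst-case decoded size, assuming 4 bytes per pixel (8-bit RGBA).
function decodedBytes(width: number, height: number, bytesPerPixel = 4): number {
  return width * height * bytesPerPixel
}

// A hand-built header that declares 50,000 by 50,000 pixels.
const header = new Uint8Array(24)
header.set([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a], 0)
const headerView = new DataView(header.buffer)
headerView.setUint32(8, 13) // IHDR data length
headerView.setUint32(12, 0x49484452) // "IHDR"
headerView.setUint32(16, 50_000) // declared width
headerView.setUint32(20, 50_000) // declared height

const declared = readPngDeclaredSize(header)
const worstCase = declared ? decodedBytes(declared.width, declared.height) : 0
// worstCase is 10_000_000_000 bytes, about 10GB, from 24 bytes of header.
```

Nothing in those 24 bytes trips a file-size or MIME-type rule, which is why the limit has to be expressed in pixels.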

Sharp guards this with limitInputPixels, set by default to 268402689 pixels (0x3FFF squared). The docs are precise about what it does: "Do not process input images where the number of pixels (width x height) exceeds this limit" [2]. Pass a smaller number to tighten it to what your app actually needs. An avatar is never 24 megapixels, so cap it:

// Reject before allocating the pixel buffer. The default is ~268M
// pixels; an avatar pipeline should be far stricter.
const clean = await sharp(input, { limitInputPixels: 16_000_000 })
  .resize(512, 512, { fit: 'cover' })
  .webp()
  .toBuffer()
// Throws "Input image exceeds pixel limit" before decoding the bomb.

There is a subtlety in the docs you have to respect: the limit "Assumes image dimensions contained in the input metadata can be trusted" [2]. Sharp reads the declared dimensions from the header and rejects early, which is exactly what you want against a bomb that honestly declares 50,000 by 50,000 pixels. A more careful attacker can lie about dimensions in the header, so the pixel limit is one layer, not the whole defense. Pair it with three things:

A real file-size limit on the bucket, so the compressed payload itself stays bounded.
A wall-clock timeout on the processing job, so a slow decode cannot hang a worker indefinitely.
Re-encoding off the request path, in a background job rather than inline, so a bomb that does get through degrades a worker instead of your user-facing function. The file-uploads guide makes the same point about the magic-byte scan: heavy per-file work belongs in a background job, not on the upload request.
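The wall-clock timeout from the list above can be sketched as a small wrapper around any promise-returning job. The name withTimeout is hypothetical, and the important caveat is in the comment: racing a promise stops the caller from waiting, but it does not cancel the decode that is already running, so it complements a worker-level kill switch rather than replacing one. Sharp also documents its own .timeout({ seconds }) operation, which is worth checking against the version you run.

```typescript
// Hypothetical helper: rejects if `work` has not settled within `ms` milliseconds.
// It stops the caller from waiting; it does NOT cancel the underlying work.
function withTimeout<T>(work: Promise<T>, ms: number, label = 'image processing'): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms)
  })
  // Whichever settles first wins; the timer is always cleared afterwards.
  return Promise.race([work, deadline]).finally(() => {
    if (timer !== undefined) clearTimeout(timer)
  })
}
```

In a background job this would wrap the re-encode call, for example withTimeout(reencodeUpload(objectKey), 10_000), with the 10-second budget being an arbitrary placeholder you would tune to your own images.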

Because a decompression bomb is also a resource-exhaustion attack, the same abuse controls that protect any expensive endpoint apply here. Rate-limiting the upload and processing path by user ID, the pattern from the Server Actions rate-limiting guide, stops one account from queuing ten thousand bombs and chewing through your worker budget.

What does EXIF metadata leak when you serve an upload?

Serving an uploaded photo unmodified leaks whatever the camera wrote into its EXIF block, and for phone photos that routinely includes GPS coordinates, the device model, and a capture timestamp. CWE-359 lists "Geographic location - where the user is (or was)" as exactly the kind of private personal information a product must not expose to unauthorized actors [4]. A user uploads a profile picture taken at home, and the original file you serve to every other user carries the latitude and longitude of their house.

This is a quiet failure because nothing breaks. The image renders fine. The leak is invisible until someone runs exiftool on the file your CDN is happily serving. EXIF rides along in JPEG and some PNG and WebP files as a metadata block the image viewer ignores and a privacy auditor does not.

The fix is the re-encode you are already doing. Sharp strips metadata by default, so a re-encoded image has no EXIF unless you explicitly ask to keep it. The docs are unambiguous: with keepMetadata not called, "the default behaviour ... is to convert to the device-independent sRGB colour space and strip all metadata, including the removal of any ICC profile" [1]. The metadata leak closes as a side effect of the polyglot defense, which is why re-encoding earns its place as the single image-hardening step rather than a separate one.

The one trap: EXIF also carries the orientation flag, the tag that tells a viewer "this photo was shot in portrait, rotate it 90 degrees." Strip EXIF naively and a batch of phone photos all render sideways, because the pixels were stored landscape and the rotation lived only in the metadata you just deleted. Sharp's .rotate() with no arguments reads the EXIF orientation and bakes it into the pixel data, so you keep the correct orientation after the metadata is gone. Call it before the format output, as in the re-encode example above. You want the rotation in the pixels and everything else discarded.
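If you want to audit JPEGs you already serve without shelling out to exiftool, a rough check is to walk the segments at the start of the file and look for an APP1 segment that begins with the Exif header. This is a minimal sketch under stated assumptions: jpegHasExif is a hypothetical helper, it handles JPEG only (WebP and PNG store EXIF in their own chunk types), and it returns false on anything malformed instead of trying to recover, so treat a false as "not found", not as proof.

```typescript
const EXIF_HEADER = [0x45, 0x78, 0x69, 0x66, 0x00, 0x00] // "Exif\0\0"

// Returns true when a JPEG carries an APP1 segment that starts with the Exif header.
// Only the segments before the image data are inspected.
function jpegHasExif(bytes: Uint8Array): boolean {
  if (bytes[0] !== 0xff || bytes[1] !== 0xd8) return false // no SOI marker: not a JPEG
  let offset = 2
  while (offset + 4 <= bytes.length) {
    if (bytes[offset] !== 0xff) return false // every segment starts with 0xFF
    const marker = bytes[offset + 1]
    if (marker === 0xda || marker === 0xd9) return false // start of scan or end of image
    const length = (bytes[offset + 2] << 8) | bytes[offset + 3] // includes its own 2 bytes
    if (length < 2) return false
    const payload = offset + 4
    if (marker === 0xe1 && EXIF_HEADER.every((b, i) => bytes[payload + i] === b)) return true
    offset += 2 + length // skip marker (2 bytes) plus the segment
  }
  return false
}
```

Run against the originals in a bucket, a helper like this gives a quick count of how many files would have leaked metadata if served as uploaded.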

Is your image library itself an attack surface?

Yes. Image parsers are large bodies of C and C++ that handle dozens of formats, and they have a long history of memory-corruption and remote-code-execution bugs. The canonical case is ImageTragick (CVE-2016-3714), where a crafted image achieved remote code execution through ImageMagick. Checking the file extension was no defense, because ImageMagick picks its parser from the content: "ImageMagick tries to guess the type of the file by it's content, so exploitation doesn't depend on the file extension" [7]. A file named logo.png that was actually an MVG script triggered a shell command during conversion.

ImageTragick is a decade old, but the pattern is not. ImageMagick shipped multiple critical CVEs through 2025 and 2026, including format-string RCE and use-after-free bugs in its parsers. The library you choose and how you run it both matter:

Verify magic bytes before the processor touches the file. The official ImageTragick guidance is to "verify that all image files begin with the expected 'magic bytes' ... before sending them to ImageMagick for processing" [7]. This is the same first gate from the file-uploads guide, and here it is load-bearing for a different reason: it keeps a delegate-triggering payload away from a parser that runs delegates.
Disable the coders you do not use. ImageMagick's policy.xml lets you turn off the dangerous coders (the URL, MVG, and MSL handlers that ImageTragick abused) [7]. If you process JPEG and PNG, you do not need a coder that fetches URLs.
Prefer a narrower, well-maintained library and keep it patched. Sharp wraps libvips, which has a smaller and more auditable surface than the full ImageMagick delegate system. The patch discipline is real work: when libvips disclosed CVE-2025-29769, the fix shipped in libvips 8.16.1, bundled in Sharp 0.34.1 [8]. The same advisory is a good argument for re-encoding as defense in depth, because Sharp was unaffected by that particular bug since it "will always attempt to convert 'multiband' input to another colourspace, typically sRGB, before processing" [8]. The normalize step that defeats polyglots also narrowed the blast radius of a parser CVE.
Run processing in isolation. Decode untrusted images in a background worker or an isolated function, not in the request handler that holds your session and your service role key. If a parser bug does fire, you want it to crash a sandboxed job, not execute in the context that can read your database.

The takeaway is not "image libraries are too scary to use." It is that the decode step runs attacker-controlled bytes through native code, so you treat it like any other untrusted-input boundary: minimal surface, patched, and isolated.

Where does SVG fit in?

SVG is the exception to everything above, because it is not a raster you can re-encode. SVG is an XML document the browser parses with the same engine that runs HTML, and the format permits a <script> element. There are no pixels to decode and rebuild, so the re-encode defense that neutralizes a JPEG polyglot does not apply. An SVG that contains script is not a corrupted image; it is a working document doing what the format allows.

That makes SVG an XSS vector the moment you serve an uploaded one inline from an origin that holds the viewer's session. The full serving-layer defense, sanitizing with DOMPurify's SVG profile, serving user files with Content-Disposition: attachment, and isolating the origin, is covered in the guide on rendering user HTML and Markdown safely. There is no reason to repeat it here.

The ingest-layer decision, the one this post is about, is simpler and comes first: decide whether you accept SVG at all. For an avatar, a logo upload, or a product image, you almost never need it, and the cleanest policy is to reject SVG and accept only raster formats you can re-encode. If you genuinely need user-supplied SVG (an icon library, a diagram tool), treat it as untrusted HTML from the moment it arrives, never as "just an image," and route it through the sanitize-and-serve-as-attachment path rather than the re-encode path. The mistake is letting SVG ride the same upload lane as JPEG and PNG, where "it's an image" quietly means "it skipped the document checks."
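A reject-by-default policy can be sketched as an allowlist of raster signatures, where anything that does not match, SVG included, returns null. This is a hand-rolled stand-in for illustration: detectRasterFormat and matchesAt are hypothetical names, the four formats are an assumed allowlist, and a maintained detector such as the file-type library the file-uploads guide recommends covers far more cases. Passing this gate is still only the first step; the re-encode follows it.

```typescript
type RasterFormat = 'jpeg' | 'png' | 'gif' | 'webp'

function matchesAt(bytes: Uint8Array, offset: number, expected: number[]): boolean {
  return expected.every((byte, i) => bytes[offset + i] === byte)
}

// Returns the detected raster format, or null for everything else (SVG, HTML, PDF, ...).
function detectRasterFormat(bytes: Uint8Array): RasterFormat | null {
  if (matchesAt(bytes, 0, [0xff, 0xd8, 0xff])) return 'jpeg'
  if (matchesAt(bytes, 0, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'png'
  if (matchesAt(bytes, 0, [0x47, 0x49, 0x46, 0x38])) return 'gif' // "GIF8"
  // WebP is a RIFF container: "RIFF" at offset 0 and "WEBP" at offset 8.
  const isRiff = matchesAt(bytes, 0, [0x52, 0x49, 0x46, 0x46])
  if (isRiff && matchesAt(bytes, 8, [0x57, 0x45, 0x42, 0x50])) return 'webp'
  return null
}
```

Because SVG is text, it never matches a binary signature here, so it is rejected without the upload field needing an SVG-specific rule.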

Building image hardening into the upload pipeline

Image upload security is two layers, and they answer different questions. The transport layer, covered in the secure file uploads guide, answers "who can put a file here and who can read it back": a private bucket, RLS on storage.objects, server-issued keys, signed URLs, and a magic-byte gate. The image layer answers "is this thing actually safe to decode and serve," and its core move is to re-encode every raster upload so EXIF is stripped, polyglot tails are discarded, and a pixel limit caps the bomb.

If you ship one thing from this post, ship the re-encode step. Decode to pixels, cap limitInputPixels to what your app needs, bake orientation in with .rotate(), write a fresh file, and do it in a background job behind a rate limit. Keep the parser patched, disable coders you do not use, and reject SVG from raster upload fields. That sequence closes four of the five edge cases with one operation and hands the fifth to the sanitization layer.

The patterns SecureStartKit ships, signed upload URLs, Zod-validated Server Actions, and backend-only access to storage, give you the secure foundation to bolt this onto. If you are auditing an existing upload flow, the SaaS security checklist covers the bucket, policy, and key-handling checks that sit underneath the image work, and the multi-tenant storage guide covers keeping one tenant's uploads invisible to another. Get the transport right first, then make the bytes themselves safe.
