The Base64 Bug That Corrupted Our Azure Blob Storage Filenames

The incident

Our image processing pipeline stored user-uploaded files in Azure Blob Storage. To create unique, predictable blob names without relying on sequential IDs or GUIDs, we generated a key by Base64-encoding the combination of the user ID and the original filename. The idea was sound: a deterministic key means idempotent uploads (re-uploading the same file to the same user account produces the same blob name), and Base64 produces a compact, printable string from arbitrary bytes.

For weeks this worked fine. Then we started getting intermittent "file not found" reports from users downloading their files. The files were definitely uploaded successfully — we could see them in Azure Storage Explorer. But the download endpoint was returning 404. Stranger still, the bug only affected a subset of files. We couldn't figure out what the affected files had in common until we looked more carefully at the blob names.

What the buggy code looked like

// Original implementation — generates blob names that break in URLs
private static string GetBlobKey(string userId, string fileName)
{
    // Combine the user ID and filename into a single byte array
    var combined = $"{userId}:{fileName}";
    var bytes = Encoding.UTF8.GetBytes(combined);

    // Standard Base64 — compact, unique, deterministic
    return Convert.ToBase64String(bytes);
}

// In the controller
[HttpGet("files/{blobKey}")]
public async Task<IActionResult> DownloadFile(string blobKey)
{
    // blobKey arrives from the URL path — already URL-decoded by ASP.NET Core
    var blobClient = _containerClient.GetBlobClient(blobKey);
    if (!await blobClient.ExistsAsync()) return NotFound();
    // ...
}

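The problem is buried in that innocent-looking call to Convert.ToBase64String. Standard Base64 (RFC 4648, section 4) uses 64 characters: A–Z, a–z, 0–9, +, and /. The last two are not URL-safe. When a blob key like dXNlcjEyMzphbG1vc3QgdGhlcmU+ (the encoding of user123:almost there>) was embedded in a download URL as /api/files/dXNlcjEyMzphbG1vc3QgdGhlcmU+, the + was decoded as a space before the key reached our lookup. Strictly speaking, + means space only in form-encoded query strings, but some clients, proxies, and URL-decoding helpers apply that rule to the whole URL, so an unescaped + cannot be relied on to survive the trip. The path segment that ASP.NET Core received was dXNlcjEyMzphbG1vc3QgdGhlcmU followed by a trailing space, which didn't match any blob in storage.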

The / case was even worse. A key containing /, such as dXNlcjE6cG/DqXNpZQ== (the encoding of user1:poésie), splits the URL into two path segments: /api/files/dXNlcjE6cG and DqXNpZQ==. The route didn't match at all, so the request returned a 404 before reaching our controller. This is why only some files were affected: only keys whose Base64 output happened to contain a + or / character showed the problem. For most files, every 6-bit group of the input mapped to a letter or digit.
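
Both failure modes can be reproduced without Azure or ASP.NET Core. The following is a minimal sketch, not the pipeline's code: the class UrlSafetyDemo and its method names are hypothetical, and the inputs user123:almost there> and user1:poésie are chosen for illustration rather than taken from the incident data.

```csharp
using System;
using System.Net;
using System.Text;

public static class UrlSafetyDemo
{
    // Standard Base64 of the UTF-8 bytes, the same encoding step as the original GetBlobKey.
    public static string StandardKey(string userId, string fileName) =>
        Convert.ToBase64String(Encoding.UTF8.GetBytes($"{userId}:{fileName}"));

    // Form-style decoding (application/x-www-form-urlencoded) turns '+' into a space.
    public static string FormDecode(string value) => WebUtility.UrlDecode(value);

    // A literal '/' in a key acts as a path separator once the key is placed in a URL path.
    public static string[] PathSegments(string key) => key.Split('/');
}
```

StandardKey("user123", "almost there>") returns dXNlcjEyMzphbG1vc3QgdGhlcmU+, and passing that through FormDecode replaces the final + with a space. StandardKey("user1", "poésie") returns dXNlcjE6cG/DqXNpZQ==, which PathSegments splits into two pieces.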

Why the bug was hard to notice

In Base64, whether a + or / appears depends entirely on the bit pattern of the input: every three bytes are split into four 6-bit groups, and only groups with the value 62 or 63 map to + or /. For most user IDs and common filenames, the encoded output consists entirely of alphanumeric characters (plus any trailing = padding). The problem only manifests for specific byte combinations, which in our dataset was about 4% of all uploads. In testing, we always used simple ASCII usernames and filenames, none of which happened to produce the problematic characters. The bug went undetected until our user base grew large enough that the 4% failure rate became noticeable.
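
To make the bit-pattern dependence concrete, here is a small sketch of how one three-byte group becomes four Base64 characters. It is a teaching aid rather than a replacement for Convert.ToBase64String; the class Base64Bits and its methods are hypothetical names.

```csharp
using System;

public static class Base64Bits
{
    // The standard alphabet from RFC 4648 section 4: index 62 is '+', index 63 is '/'.
    private const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Splits three input bytes (24 bits) into four 6-bit alphabet indices.
    public static int[] Sextets(byte b0, byte b1, byte b2)
    {
        int bits = (b0 << 16) | (b1 << 8) | b2;
        return new[]
        {
            (bits >> 18) & 0x3F,
            (bits >> 12) & 0x3F,
            (bits >> 6) & 0x3F,
            bits & 0x3F,
        };
    }

    // Maps the four indices to their characters in the standard alphabet.
    public static string EncodeGroup(byte b0, byte b1, byte b2)
    {
        int[] indices = Sextets(b0, b1, b2);
        var chars = new char[4];
        for (int i = 0; i < 4; i++) chars[i] = Alphabet[indices[i]];
        return new string(chars);
    }
}
```

For the bytes of "re>" the indices are 28, 38, 20, 62, so the group encodes as cmU+. For "???" they are 15, 51, 60, 63, giving Pz8/. A filename made only of letters and digits never puts 62 or 63 in any position, which is why simple test data stays clean.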

The fix

// Option 1: Manual replacement — no extra packages, works everywhere in .NET
private static string GetBlobKey(string userId, string fileName)
{
    var combined = $"{userId}:{fileName}";
    var bytes = Encoding.UTF8.GetBytes(combined);

    return Convert.ToBase64String(bytes)
        .Replace('+', '-')  // + is not URL-safe
        .Replace('/', '_')  // / breaks URL path segments
        .TrimEnd('=');      // = padding can be omitted in Base64URL
}

// Option 2: Use Microsoft.AspNetCore.WebUtilities (already in every ASP.NET Core project)
// WebEncoders.Base64UrlEncode produces URL-safe Base64 directly
private static string GetBlobKey(string userId, string fileName)
{
    var combined = $"{userId}:{fileName}";
    var bytes = Encoding.UTF8.GetBytes(combined);
    return WebEncoders.Base64UrlEncode(bytes); // produces - and _ instead of + and /
}

// Decoding back (to verify a blob key or look up a file)
private static string DecodeKey(string blobKey)
{
    // WebEncoders maps - and _ back to + and / and restores the missing padding
    var bytes = WebEncoders.Base64UrlDecode(blobKey);
    return Encoding.UTF8.GetString(bytes);
}

Both options produce the same URL-safe output. The manual replacement approach works in any .NET context, including .NET class libraries that don't reference ASP.NET Core. The WebEncoders approach is cleaner and handles edge cases like missing padding automatically — use it in ASP.NET Core projects where the package is already available.
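
The decoder shown above depends on WebEncoders. For class libraries that use the manual replacement approach, a matching manual decoder might look like the sketch below. Base64UrlManual is a hypothetical helper name; the only framework calls are Convert.ToBase64String and Convert.FromBase64String.

```csharp
using System;
using System.Text;

public static class Base64UrlManual
{
    // Same transformation as Option 1: swap the two unsafe characters and drop padding.
    public static string Encode(byte[] bytes) =>
        Convert.ToBase64String(bytes)
            .Replace('+', '-')
            .Replace('/', '_')
            .TrimEnd('=');

    public static byte[] Decode(string value)
    {
        string s = value.Replace('-', '+').Replace('_', '/');

        // Convert.FromBase64String requires a length that is a multiple of 4,
        // so restore the '=' padding that Encode removed.
        switch (s.Length % 4)
        {
            case 2: s += "=="; break;
            case 3: s += "="; break;
            case 1: throw new FormatException("Invalid Base64URL length.");
        }

        return Convert.FromBase64String(s);
    }
}
```

Encoding the UTF-8 bytes of user1:poésie yields dXNlcjE6cG_DqXNpZQ, and Decode returns the original bytes.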

We also added a validation step in the upload handler to detect if a generated key contains any URL-unsafe characters before using it, so a similar bug would be caught immediately during testing. Debug.Assert is compiled out of Release builds, so the check only runs in Debug builds:

// Guard: catch URL-unsafe keys during development
private static string GetBlobKey(string userId, string fileName)
{
    var bytes = Encoding.UTF8.GetBytes($"{userId}:{fileName}");
    var key = WebEncoders.Base64UrlEncode(bytes);

    // Sanity check — Base64URL should never contain + / or = characters
    System.Diagnostics.Debug.Assert(
        !key.Contains('+') && !key.Contains('/') && !key.Contains('='),
        $"Unsafe Base64 key generated: {key}");

    return key;
}
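
Because our original test data never produced a + or /, an assertion alone would not have caught the bug. One way to close that gap is a test that pushes many pseudo-random byte sequences through the encoder and checks the output alphabet. The sketch below assumes the manual Base64URL transformation; KeyAlphabetCheck, CountUnsafe, and the seed and count parameters are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class KeyAlphabetCheck
{
    // The Base64URL alphabet from RFC 4648 section 5, with padding omitted.
    private static readonly Regex UrlSafe = new Regex(@"^[A-Za-z0-9_-]+\z");

    // Stand-in for the encoding step of GetBlobKey (manual Base64URL).
    public static string Encode(byte[] bytes) =>
        Convert.ToBase64String(bytes).Replace('+', '-').Replace('/', '_').TrimEnd('=');

    public static bool IsUrlSafe(string key) => UrlSafe.IsMatch(key);

    // Counts how many of `count` pseudo-random inputs encode to a key
    // containing a character outside the URL-safe alphabet.
    public static int CountUnsafe(Func<byte[], string> encoder, int count, int seed)
    {
        var rng = new Random(seed); // fixed seed keeps the test deterministic
        int unsafeKeys = 0;
        for (int i = 0; i < count; i++)
        {
            var bytes = new byte[rng.Next(1, 64)];
            rng.NextBytes(bytes);
            if (!IsUrlSafe(encoder(bytes))) unsafeKeys++;
        }
        return unsafeKeys;
    }
}
```

A unit test can then assert that CountUnsafe(KeyAlphabetCheck.Encode, 1000, 1) is zero. Passing Convert.ToBase64String as the encoder instead gives a nonzero count, which confirms that the test data does exercise the problematic characters.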

Standard Base64 vs Base64URL: a quick reference

The distinction matters everywhere binary data gets embedded in text-based protocols:

Standard Base64 (RFC 4648 §4): uses + and / for values 62 and 63. Output is padded with = to a multiple of four characters. Use this for email attachments (MIME), binary data in JSON payloads, and HTTP Basic Auth headers. Do not use in URLs.
Base64URL (RFC 4648 §5): uses - and _ instead. Padding is typically omitted. This is what JWTs use for their header, payload, and signature segments, what OAuth 2.0 PKCE code challenges use, and what you should use for any Base64 value embedded in a URL or filename.

Note that Convert.ToBase64String in .NET always produces standard Base64. It has no option for URL-safe output. For URL-safe output, use either WebEncoders.Base64UrlEncode (ASP.NET Core) or the manual replace pattern; on .NET 9 and later, the System.Buffers.Text.Base64Url class is another option.

Where else this bug hides in .NET codebases

I've seen the same mistake in three other patterns after this incident:

State parameters in OAuth flows. Serializing state as standard Base64 and passing it as the state query parameter. The + decodes as a space and the round-trip breaks.
Email confirmation tokens stored in URLs. ASP.NET Core Identity generates confirmation tokens, and some codebases place them in confirmation link URLs as standard Base64. The correct method is WebEncoders.Base64UrlEncode, which the scaffolded Identity UI pages use when building those links, but custom token handlers often miss this.
Cursor-based pagination tokens. Encoding a page cursor (containing a timestamp and record ID) as standard Base64 and including it as a query parameter. A + in the encoded output is decoded as a space, so the cursor fails to decode on the next page request; if the cursor is placed in a path segment instead, a / produces a 404.
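
For the pagination case, a cursor that survives a query string round-trip could be sketched as follows. The binary layout (8 bytes of Unix milliseconds followed by 8 bytes of record ID) and the CursorToken name are assumptions made for illustration; real cursors will carry whatever fields the query sorts on.

```csharp
using System;
using System.Buffers.Binary;

public static class CursorToken
{
    // Hypothetical cursor layout: 8 bytes of Unix milliseconds, then 8 bytes of record ID.
    public static string Encode(DateTimeOffset timestamp, long recordId)
    {
        var bytes = new byte[16];
        BinaryPrimitives.WriteInt64BigEndian(bytes.AsSpan(0, 8), timestamp.ToUnixTimeMilliseconds());
        BinaryPrimitives.WriteInt64BigEndian(bytes.AsSpan(8, 8), recordId);

        // Base64URL: no '+', '/', or '=' for a query string parser to reinterpret.
        return Convert.ToBase64String(bytes)
            .Replace('+', '-')
            .Replace('/', '_')
            .TrimEnd('=');
    }

    public static (DateTimeOffset Timestamp, long RecordId) Decode(string token)
    {
        string s = token.Replace('-', '+').Replace('_', '/');
        // Restore padding to a multiple of 4 before handing off to the standard decoder.
        s = s.PadRight(s.Length + (4 - s.Length % 4) % 4, '=');

        byte[] bytes = Convert.FromBase64String(s);
        if (bytes.Length != 16) throw new FormatException("Unexpected cursor length.");

        long millis = BinaryPrimitives.ReadInt64BigEndian(bytes.AsSpan(0, 8));
        long recordId = BinaryPrimitives.ReadInt64BigEndian(bytes.AsSpan(8, 8));
        return (DateTimeOffset.FromUnixTimeMilliseconds(millis), recordId);
    }
}
```

Because the token contains only letters, digits, - and _, form-style decoding such as WebUtility.UrlDecode leaves it unchanged. The same shape works for an OAuth state value.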

The generalizable lesson

Any time you embed Base64 output in a URL — whether in a path segment, a query parameter, a filename, or a cookie value — you must use Base64URL. The standard variant is not interchangeable with the URL-safe variant, and the encoding produces visually similar output (both use alphanumeric characters for most bytes). This makes the bug silent until you hit the specific byte patterns that produce + or /, which only happens for a fraction of inputs.

Make it a rule in your codebase: Convert.ToBase64String for binary data in JSON, email, and non-URL contexts; WebEncoders.Base64UrlEncode for everything that ends up in a URL. The Base64 Encoder on DevToolsHub uses standard Base64 — if you paste a string and see + or / in the output, you know you need the URL-safe variant before embedding it in a link.
