TinyBase64: A Minimal Base64 Encoder/Decoder for Tiny Projects
Base64 encoding is a ubiquitous way to represent binary data as ASCII text, useful for embedding images in HTML/CSS, serializing small blobs for text-only channels, or transmitting data where binary is unsupported. For tiny projects—embedded devices, minimal libraries, or single-file scripts—you often want a Base64 implementation that’s compact, dependency-free, and easy to audit. TinyBase64 delivers exactly that: a small, readable encoder/decoder with minimal memory overhead and no external dependencies.
Why choose TinyBase64
- Simplicity: A focused implementation that does only Base64 encode/decode, making it easy to inspect and trust.
- Small footprint: Minimal code size and low RAM use—suited for microcontrollers and single-file utilities.
- Portability: Plain C/JavaScript/Python variants are easy to drop into most projects.
- Performance: Reasonable speed using straightforward bit operations; optimized inner loops keep allocations low.
Design goals
- Keep the core implementation under ~200 lines of code.
- Avoid dynamic allocations when possible; support in-place operations or user-provided buffers.
- Provide predictable behavior for padding and invalid input (e.g., return errors or ignore non-Base64 characters depending on the API).
- Provide both strict (error on invalid input) and lenient (skip whitespace/non-Base64 bytes) decode modes.
Base64 basics (short)
Base64 maps each 3 bytes (24 bits) of input into 4 printable characters (6 bits each). If input length isn’t a multiple of 3, padding using ‘=’ is added to produce an output whose length is a multiple of 4.
Example implementations
C (compact, no heap)
c
// tinybase64.c - minimal Base64 encode/decode #includestatic const char b64_enc[] = “ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/”; static const signed char b64_dec[256] = { /* initialize all to -1, and set indices for valid chars / }; size_t tb64_encode(const unsigned char in, size_t inlen, char out) { size_t i=0, o=0; while (i+2 < inlen) { unsigned v = (in[i]<<16) | (in[i+1]<<8) | in[i+2]; out[o++] = b64_enc[(v >> 18) & 63]; out[o++] = b64_enc[(v >> 12) & 63]; out[o++] = b64_enc[(v >> 6) & 63]; out[o++] = b64_enc[v & 63]; i += 3; } if (i < inlen) { unsigned v = in[i] << 16; if (i+1 < inlen) v |= in[i+1] << 8; out[o++] = b64_enc[(v >> 18) & 63]; out[o++] = b64_enc[(v >> 12) & 63]; out[o++] = (i+1 < inlen) ? b64_enc[(v >> 6) & 63] : ’=’; out[o++] = ’=’; } return o; // out length } ssize_t tb64_decode(const char in, size_t inlen, unsigned char out) { size_t i=0, o=0; int pad=0; unsigned buf=0; int count=0; for (; i<inlen; ++i) { unsigned char c = in[i]; signed char v = b64_dec[c]; if (v == -1) continue; // skip non-base64 (lenient); change to return error for strict buf = (buf << 6) | v; if (++count == 4) { out[o++] = (buf >> 16) & 0xFF; if (in[i-1] != ’=’) out[o++] = (buf >> 8) & 0xFF; if (in[i] != ’=’) out[o++] = buf & 0xFF; buf = count = 0; } } return o; }
Notes: initialize b64dec array to -1 and set indices 0..63 mapping to corresponding characters; adapt error handling to your needs. This variant intentionally skips non-Base64 bytes to be tolerant of whitespace.
JavaScript (single-file, browser/node)
javascript
const TinyBase64 = { encode: (bytes) => { let s = “; const enc = “ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/”; for (let i=0;i<bytes.length;i+=3) { const a = bytes[i], b = bytes[i+1]||0, c = bytes[i+2]||0; const v = (a<<16)|(b<<8)|c; s += enc[(v>>18)&63] + enc[(v>>12)&63] + (i+1<bytes.length?enc[(v>>6)&63]:’=’) + (i+2<bytes.length?enc[v&63]:’=’); } return s; }, decode: (str) => { const dec = new Uint8Array(Math.floor(str.length3/4)); const table = (() => { const t = new Array(256).fill(-1); for(let i=0;i<64;i++) t[“ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/”[i].charCodeAt(0)] = i; return t; })(); let out=0, buf=0, count=0; for (let i=0;i<str.length;i++) { const v = table[str.charCodeAt(i)]; if (v === -1) continue; buf = (buf<<6)|v; if (++count==4) { dec[out++] = (buf>>16)&255; if (str[i-1] !== ’=’) dec[out++] = (buf>>8)&255; if (str[i] !== ’=’) dec[out++] = buf&255; buf = count = 0; } } return dec.slice(0,out); } };
Python (concise, no stdlib base64)
python
_b64 = “ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/” def encode(b: bytes) -> str: out = [] for i in range(0, len(b), 3): chunk = b[i:i+3] v = int.from_bytes(chunk, ‘big’) << (8*(3-len(chunk))) out.append(_b64[(v>>18)&63]) out.append(_b64[(v>>12)&63]) out.append(_b64[(v>>6)&63] if len(chunk)>1 else ’=’) out.append(_b64[v&63] if len(chunk)>2 else ’=’) return “.join(out) def decode(s: str) -> bytes: table = {c:i for i,c in enumerate(_b64)} buf = 0; cnt=0; out = bytearray() for ch in s: if ch == ’=’: buf = (buf<<6); cnt+=1; continue if ch not in table: continue buf = (buf<<6)|table[ch]; cnt+=1 if cnt==4: out.extend(((buf>>16)&255, (buf>>8)&255, buf&255)) buf = cnt = 0 # remove padding bytes if s.endswith(’==’): return bytes(out[:-2]) if s.endswith(’=’): return bytes(out[:-1]) return bytes(out)
API choices and trade-offs
- Strict decode: returns an error on invalid characters — useful for security-sensitive code.
- Lenient decode: skips whitespace and unknown bytes — useful for user-provided or legacy inputs.
- In-place encoding/decoding: reduces heap use but requires careful buffer sizing.
- Streaming vs whole-buffer: streaming supports large inputs with constant memory; whole-buffer is simpler for tiny projects.
Integration tips
- For microcontrollers, prefer the C version with user-supplied output buffer and avoid dynamic memory.
- For web clients, prefer the JS version but use built-in btoa/atob when available and inputs are ASCII; TinyBase64 covers arbitrary binary safely.
- For scripts, Python’s stdlib base64 is fine, but TinyBase64 is helpful when avoiding stdlib for portability or auditability.
Tests and verification
- Verify encode-decode roundtrip with random inputs.
- Test edge cases: empty input, lengths 1 and 2 mod 3, strings with whitespace, invalid characters, and all possible byte values.
- Compare outputs to a standard library implementation (e.g., OpenSSL, Python base64, Node Buffer).
Conclusion
TinyBase64 provides a pragmatic, minimal Base64 encoder/decoder ideal for tiny projects where simplicity, auditability, and low resource use matter. Use strict or lenient modes depending on your input trust level, and prefer in-place or streaming APIs on constrained devices.
If you want, I can produce a ready-to-drop single-file implementation in C, JavaScript, or Python with tests and build instructions—tell me which language.
Leave a Reply