PDF Changer
21 browser-only PDF tools. Nothing leaves the browser. I built it because every online PDF tool I tested sends your files to a server, and most of them pipe your download tokens through Google Analytics.
The core problem
I ran network captures on iLovePDF and Smallpdf. iLovePDF sends your download token to Google Analytics — so Google knows you merged a tax return. Smallpdf made 215 requests during a single merge operation. Both upload your files to their servers for processing.
PDF Changer processes everything client-side. The CSP header blocks all network access from the processing sandbox. If the code tries to phone home, the browser kills the request and the Verified Processing Environment (VPE) monitor logs the attempt.
Verified Processing Environment
Three monitors run concurrently during every operation: PerformanceObserver watches for network activity, a CSP violation listener catches blocked requests, and a MutationObserver detects DOM injection. WebRTC is monkey-patched to block ICE candidate leaks. All events hash into a tamper-evident HMAC chain — alter one event and every subsequent hash breaks.
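A minimal sketch of how the three monitors could be wired to a single event sink, using only standard browser APIs. `startMonitors`, `MonitorEvent`, and the `record` callback are illustrative names and not taken from the PDF Changer source; the WebRTC patch and the HMAC chaining are left out.

```typescript
// Illustrative only: names and event shapes are assumptions, not PDF Changer's source.
type MonitorEvent =
  | { kind: "network"; url: string; initiator: string }
  | { kind: "csp-violation"; blockedUri: string; directive: string }
  | { kind: "dom-mutation"; addedNodes: number };

function startMonitors(record: (e: MonitorEvent) => void): () => void {
  // 1. Resource timing entries reveal requests the page actually made.
  const perf = new PerformanceObserver((list) => {
    const entries = list.getEntries() as PerformanceResourceTiming[];
    for (let i = 0; i < entries.length; i++) {
      record({ kind: "network", url: entries[i].name, initiator: entries[i].initiatorType });
    }
  });
  perf.observe({ type: "resource", buffered: true });

  // 2. The browser fires this event whenever the CSP blocks something.
  const onViolation = (e: SecurityPolicyViolationEvent): void => {
    record({ kind: "csp-violation", blockedUri: e.blockedURI, directive: e.violatedDirective });
  };
  document.addEventListener("securitypolicyviolation", onViolation);

  // 3. Any node added anywhere in the document counts as possible injection.
  const dom = new MutationObserver((records) => {
    let added = 0;
    for (let i = 0; i < records.length; i++) added += records[i].addedNodes.length;
    if (added > 0) record({ kind: "dom-mutation", addedNodes: added });
  });
  dom.observe(document.documentElement, { childList: true, subtree: true });

  // The caller stops all three monitors when the operation ends.
  return () => {
    perf.disconnect();
    document.removeEventListener("securitypolicyviolation", onViolation);
    dom.disconnect();
  };
}
```

Returning a single stop function keeps the monitors scoped to one operation, so events from an idle page do not end up in the report.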
The output is a downloadable audit report: timestamped event log, HMAC chain integrity proof, and a summary of what happened during processing. The user can verify independently that nothing leaked.
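One part of that verification needs no key: confirming that every entry points at the hash of the entry before it. The sketch below checks only that linkage, assuming the report exposes `payload`/`hash` pairs whose payload JSON carries `seq` and `prev` fields. `ChainEntry`, `LinkCheck`, and `verifyLinks` are hypothetical names, and recomputing each HMAC (which requires the key) is not shown.

```typescript
// Assumed report shape: one entry per event, in order.
interface ChainEntry {
  payload: string; // JSON text containing at least { seq, prev }
  hash: string;    // MAC of payload, as stored in the report
}

interface LinkCheck {
  ok: boolean;
  brokenAt: number; // index of the first bad entry, or -1 when ok
  reason: string;   // empty when ok
}

function verifyLinks(entries: ChainEntry[], genesis: string): LinkCheck {
  let expectedPrev = genesis;
  for (let i = 0; i < entries.length; i++) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(entries[i].payload);
    } catch {
      return { ok: false, brokenAt: i, reason: "payload is not valid JSON" };
    }
    if (typeof parsed !== "object" || parsed === null) {
      return { ok: false, brokenAt: i, reason: "payload is not an object" };
    }
    const fields = parsed as { seq?: unknown; prev?: unknown };
    // Sequence numbers must count up from zero with no gaps or reordering.
    if (fields.seq !== i) {
      return { ok: false, brokenAt: i, reason: "sequence number mismatch" };
    }
    // Each entry must name the hash of the entry before it.
    if (fields.prev !== expectedPrev) {
      return { ok: false, brokenAt: i, reason: "prev does not match previous hash" };
    }
    expectedPrev = entries[i].hash;
  }
  return { ok: true, brokenAt: -1, reason: "" };
}
```

A removed or reordered event shows up as a sequence or `prev` mismatch at the first affected index. An edited event is only caught once its HMAC is recomputed with the key.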
Byte-level metadata stripping
pdf-lib can't reach embedded image streams. So the scrubber scans raw bytes for JPEG SOI markers (FF D8) and PNG magic (89 50 4E 47), then excises APP1/APP2/APP13/APP14 segments and tEXt/iTXt/eXIf/iCCP chunks at the byte level. Overlapping regions get merge-sorted before excision.
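The merge-then-excise step can be sketched as follows. This is an illustration of the range arithmetic only, with hypothetical names (`ByteRange`, `mergeRanges`, `excise`); it assumes half-open ranges and does not attempt to keep the surrounding PDF stream lengths or offsets consistent.

```typescript
// A half-open byte range [start, end) to remove. Illustrative type.
interface ByteRange {
  start: number;
  end: number;
}

// Sort by start offset, then fold overlapping or touching ranges together.
function mergeRanges(ranges: ByteRange[]): ByteRange[] {
  const sorted = ranges.slice().sort((a, b) => a.start - b.start);
  const merged: ByteRange[] = [];
  for (let i = 0; i < sorted.length; i++) {
    const cur = sorted[i];
    const last = merged[merged.length - 1];
    if (last !== undefined && cur.start <= last.end) {
      last.end = Math.max(last.end, cur.end);
    } else {
      merged.push({ start: cur.start, end: cur.end });
    }
  }
  return merged;
}

// Copy everything outside the merged ranges into a new buffer.
function excise(buf: Uint8Array, ranges: ByteRange[]): Uint8Array {
  // Clamp to the buffer and drop empty ranges so the size arithmetic stays valid.
  const clamped = ranges
    .map((r) => ({ start: Math.max(0, r.start), end: Math.min(buf.length, r.end) }))
    .filter((r) => r.end > r.start);
  const merged = mergeRanges(clamped);

  let removed = 0;
  for (let i = 0; i < merged.length; i++) removed += merged[i].end - merged[i].start;

  const out = new Uint8Array(buf.length - removed);
  let src = 0;
  let dst = 0;
  for (let i = 0; i < merged.length; i++) {
    const keep = buf.subarray(src, merged[i].start);
    out.set(keep, dst);
    dst += keep.length;
    src = merged[i].end;
  }
  out.set(buf.subarray(src), dst);
  return out;
}
```

Merging first matters because an EXIF thumbnail is itself a JPEG inside an APP1 segment, so a naive scan can report a segment that lies wholly inside another one. Removing both independently would cut bytes twice.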
Most "metadata removal" tools don't touch embedded images. I only found out by hex-dumping a pdf-lib output.
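For the PNG half of the scrubber described above, every chunk carries its own length, so the metadata chunks can be located by walking the chunk list from the signature. The sketch below is an assumption about how such a walk might look, not the contents of src/scrub/png.ts; `PngChunk` and `findPngMetadataChunks` are hypothetical names, and CRCs are not verified.

```typescript
// Location of one chunk; size covers length field, type, data, and CRC.
interface PngChunk {
  offset: number;
  type: string;
  size: number;
}

const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
const METADATA_CHUNKS = ["tEXt", "iTXt", "eXIf", "iCCP"];

// `start` is the offset of a candidate PNG signature inside the larger buffer.
function findPngMetadataChunks(buf: Uint8Array, start: number): PngChunk[] {
  const found: PngChunk[] = [];
  for (let i = 0; i < PNG_SIGNATURE.length; i++) {
    if (buf[start + i] !== PNG_SIGNATURE[i]) return found; // not a PNG here
  }
  let pos = start + PNG_SIGNATURE.length;
  // A chunk is: 4-byte big-endian data length, 4-byte type, data, 4-byte CRC.
  while (pos + 12 <= buf.length) {
    const dataLen =
      ((buf[pos] << 24) | (buf[pos + 1] << 16) | (buf[pos + 2] << 8) | buf[pos + 3]) >>> 0;
    const type = String.fromCharCode(buf[pos + 4], buf[pos + 5], buf[pos + 6], buf[pos + 7]);
    const size = 12 + dataLen;
    if (pos + size > buf.length) break; // truncated, or a false signature match
    if (METADATA_CHUNKS.indexOf(type) !== -1) {
      found.push({ offset: pos, type, size });
    }
    if (type === "IEND") break; // last chunk of the image
    pos += size;
  }
  return found;
}
```

Because each chunk includes its own CRC, removing a whole chunk leaves the remaining chunks valid without recomputing anything.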
Printer tracking dots
Many colour laser printers print a grid of faint yellow dots on every page, encoding the date, time, and printer serial number. On the Xerox DocuColor the grid is 15×8, in a pattern documented by the EFF and TU Dresden. PDF Changer decodes the Xerox DocuColor pattern and shows the user what's embedded before they share the document.
Only covers Xerox DocuColor patterns. Other manufacturers use different encodings that aren't publicly documented.
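A rough sketch of the decoding step, assuming the dots have already been located and thresholded into an 8-row by 15-column boolean grid. The layout used here (top row as parity row, the remaining rows weighted 64 down to 1, odd parity per column, and the field-to-column mapping) follows my reading of the EFF's published DocuColor guide and should be treated as an assumption to check against that guide. All names are hypothetical and this is not PDF Changer's decoder.

```typescript
// grid[row][col], true = dot present. Row 0 is assumed to be the parity row.
type DotGrid = boolean[][];

interface DecodedTime {
  minute: number;
  hour: number;
  day: number;
  month: number;
  year: number; // two-digit year as stored, without century
}

// Assumed 0-based column index of each field (column 0 being row parity).
const FIELD_COLUMNS = { minute: 1, hour: 4, day: 5, month: 6, year: 7 };

// Read rows 1..7 of a column as a 7-bit number, most significant bit first.
function columnValue(grid: DotGrid, col: number): number {
  let value = 0;
  for (let row = 1; row <= 7; row++) {
    value = (value << 1) | (grid[row][col] ? 1 : 0);
  }
  return value;
}

// A column is consistent when it holds an odd number of dots, parity dot included.
function hasOddParity(grid: DotGrid, col: number): boolean {
  let dots = 0;
  for (let row = 0; row <= 7; row++) {
    if (grid[row][col]) dots++;
  }
  return dots % 2 === 1;
}

// Returns null when the grid has the wrong shape or a field column fails parity.
function decodeTimestamp(grid: DotGrid): DecodedTime | null {
  if (grid.length !== 8) return null;
  for (let row = 0; row < 8; row++) {
    if (grid[row].length !== 15) return null;
  }
  const cols = [
    FIELD_COLUMNS.minute,
    FIELD_COLUMNS.hour,
    FIELD_COLUMNS.day,
    FIELD_COLUMNS.month,
    FIELD_COLUMNS.year,
  ];
  for (let i = 0; i < cols.length; i++) {
    if (!hasOddParity(grid, cols[i])) return null;
  }
  return {
    minute: columnValue(grid, FIELD_COLUMNS.minute),
    hour: columnValue(grid, FIELD_COLUMNS.hour),
    day: columnValue(grid, FIELD_COLUMNS.day),
    month: columnValue(grid, FIELD_COLUMNS.month),
    year: columnValue(grid, FIELD_COLUMNS.year),
  };
}
```

Rejecting on a parity failure rather than guessing is a deliberate choice in this sketch: showing a user a wrong serial number or date is worse than showing nothing.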
pdf-lib alone clears PDF-level fields and stops there: the embedded image streams hold separate metadata that the library cannot reach. PDF Changer’s differentiator is the byte-level scanner in src/scrub/jpeg.ts and src/scrub/png.ts. Xerox DocuColor dots are decoded; other manufacturers’ patterns aren’t publicly documented, so no tool covers them.
From the source
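
// Shape inferred from the object pushed in findJpegMarkers below.
interface Marker {
  offset: number; // position of the 0xFF that starts the segment
  type: number;   // marker byte, e.g. 0xE1 for APP1
  length: number; // full segment size, including the 2-byte marker
}

function findJpegMarkers(buf: Uint8Array): Marker[] {
  const markers: Marker[] = [];
  for (let i = 0; i < buf.length - 1; i++) {
    if (buf[i] !== 0xff || buf[i + 1] !== 0xd8) continue;
    // Found SOI: walk forward to find APP segments
    let pos = i + 2;
    while (pos < buf.length - 3) {
      if (buf[pos] !== 0xff) break;
      const type = buf[pos + 1];
      // SOS starts the entropy-coded scan data and EOI has no length field,
      // so length-based walking is only valid up to this point.
      if (type === 0xda || type === 0xd9) break;
      // Big-endian length; it counts its own two bytes but not the marker
      const len = (buf[pos + 2] << 8) | buf[pos + 3];
      if (type >= 0xe0 && type <= 0xef) {
        markers.push({ offset: pos, type, length: len + 2 });
      }
      pos += len + 2;
    }
  }
  return markers;
}

// Chain, VpeEvent, GENESIS and hmacSha256 are defined elsewhere in the source.
async function appendEvent(chain: Chain, event: VpeEvent): Promise<Chain> {
  // Each payload embeds the previous hash, so editing any earlier event
  // invalidates every hash after it.
  const payload = JSON.stringify({
    seq: chain.length,
    prev: chain.at(-1)?.hash ?? GENESIS,
    event,
    ts: Date.now(),
  });
  const hash = await hmacSha256(chain.key, payload);
  // Carry the key over explicitly: spreading into a new array copies only the entries
  return Object.assign([...chain, { payload, hash }], { key: chain.key });
}

What it doesn't do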
- No E2E browser tests. Unit and integration tests cover the processing pipeline but not the full UI flow.
- Printer dot decoding only covers Xerox DocuColor. Other manufacturers use undocumented patterns.
- The VPE audit is self-attested. The site delivers the JavaScript that creates the audit, so a compromised server could serve modified code. This is the code delivery problem (same limitation as ProtonMail, Bitwarden, MEGA).
- OCR (Tesseract.js) runs in the browser. Accuracy on scanned documents varies significantly with image quality.
Stack
React SPA + pdf-lib + PDF.js + Tesseract.js + Web Crypto (browser layer). Iframe sandbox with CSP connect-src 'none'. Service Worker for PWA/offline. Hono + Cloudflare Workers + D1 (edge API). WebAuthn passkeys + ECDSA P-256 offline entitlements.