
Why Black Boxes Don't Actually Redact PDFs (And What Does)

TaxRedact Team
4 min read

You’ve got a PDF with a Social Security number you need to hide. So you open it in Adobe Reader, draw a black rectangle over the SSN, and save. Done, right?

Wrong. That SSN is still in your document—fully readable by anyone who knows where to look.

The Black Box Illusion

When you draw a shape over text in a PDF viewer, you’re adding a new layer on top of the existing content. Think of it like placing a sticky note over text on a printed page—the text is still there underneath.

The original text remains in the file, stored as searchable, selectable, copy-paste-able data. The black box is just a visual element that sits on top.

The Copy-Paste Test

Here’s how easy it is to “unredact” a black-boxed PDF:

  1. Open the “redacted” PDF
  2. Press Ctrl+A (or Cmd+A) to select all text
  3. Press Ctrl+C (or Cmd+C) to copy
  4. Paste into a text editor

Surprise: the “hidden” text shows up in your editor, fully intact.

Real-World Consequences

This isn’t a theoretical problem. High-profile redaction failures have exposed sensitive information with serious consequences:

The Paul Manafort Filing (2019)

Lawyers for Paul Manafort filed a court document with black boxes covering confidential information about his business dealings. Within hours, journalists discovered they could simply copy-paste the “redacted” text, revealing details about his interactions with a Russian associate.

TSA Security Procedures (2009)

The Transportation Security Administration accidentally published airport screening procedures in which the redactions were only black boxes drawn over the text. The “hidden” text revealed security vulnerabilities, checkpoint bypass methods, and screening exemptions.

Countless FOIA Requests

Government agencies regularly release Freedom of Information Act documents with redactions that are only cosmetic. Researchers have found that many “redacted” government documents still contain fully readable sensitive information.

What Actual Redaction Looks Like

True PDF redaction removes the underlying data from the document’s content stream. After proper redaction:

  • The text is deleted, not covered
  • Copy-paste returns nothing (or replacement text like “REDACTED”)
  • The file size often decreases, because data has been removed rather than added (a useful hint, not proof on its own)
  • Search won’t find the redacted content
  • PDF inspection tools show no trace of the original text

How PDFs Store Text

To understand why black boxes fail, you need to understand PDF structure. A PDF file contains:

  1. Content streams — The actual text data, fonts, and positioning
  2. Annotations — Additional elements like highlights, comments, and shapes
  3. Metadata — Document properties, author info, etc.

When you draw a black rectangle, you’re adding an annotation layer. The content stream—where the text lives—remains untouched.

True redaction tools modify the content stream itself, permanently removing the text characters from the file.
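
To make the distinction concrete, here is a minimal Python sketch. It assumes a simplified, uncompressed content stream and a made-up SSN; real PDFs usually compress their streams and may encode text through font tables, so treat this as an illustration of the idea rather than a working extractor. The names extract_text and redact are hypothetical helpers, not part of any PDF library.

```python
import re

# Simplified, uncompressed page content stream (hypothetical example data).
# "Tj" is the PDF operator that shows the string in parentheses before it.
content_stream = b"BT /F1 12 Tf 72 700 Td (SSN: 123-45-6789) Tj ET"

# A black rectangle drawn in a viewer is typically stored as a separate
# annotation object. It describes a shape and never touches the stream above.
black_box_annotation = (
    b"<< /Type /Annot /Subtype /Square /Rect [70 695 200 715] /IC [0 0 0] >>"
)


def extract_text(stream: bytes) -> list[str]:
    """Return the literal strings shown by Tj operators in a content stream."""
    return [m.decode("latin-1") for m in re.findall(rb"\((.*?)\)\s*Tj", stream)]


def redact(stream: bytes, secret: bytes) -> bytes:
    """Toy redaction: delete the secret bytes from the stream itself."""
    return stream.replace(secret, b"")


# The annotation exists, but extraction only reads the content stream.
print(extract_text(content_stream))  # ['SSN: 123-45-6789']

# Removing the characters from the stream is what actually changes the result.
print(extract_text(redact(content_stream, b"123-45-6789")))  # ['SSN: ']
```

The annotation never appears in the extraction path, which is why drawing it changes nothing. Only editing the stream removes the digits.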

Tools That Provide Real Redaction

Not all PDF tools offer true redaction. Here’s what to look for:

Does Provide True Redaction:

  • Adobe Acrobat Pro (using the dedicated Redaction tool, not shapes)
  • TaxRedact (AI-powered, removes text from content stream)
  • Some specialized legal/government tools

Does NOT Provide True Redaction:

  • Drawing rectangles or shapes in any PDF viewer
  • Using the highlight tool with black color
  • Screenshot and image overlay methods
  • Most free PDF editors

How to Verify Your Redaction Worked

Before sharing any redacted document, test it:

  1. Copy-paste test — Select all text, paste elsewhere. Redacted content shouldn’t appear.
  2. Search test — Search for words that should be redacted. Nothing should be found.
  3. File size check — True redaction usually reduces file size since data is removed.
  4. Use pdftotext — Run pdftotext document.pdf - to extract all text. Redacted content shouldn’t appear.
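
If you check documents often, the pdftotext step can be scripted. The Python sketch below assumes pdftotext is installed and on your PATH. The patterns and the function names pdf_text and find_leaks are illustrative assumptions, so adjust them to whatever you actually redacted.

```python
import re
import subprocess

# Hypothetical patterns for illustration; extend them for your own data.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EIN": re.compile(r"\b\d{2}-\d{7}\b"),
}


def pdf_text(path: str) -> str:
    """Extract text by running `pdftotext <path> -`, which writes to stdout."""
    result = subprocess.run(
        ["pdftotext", path, "-"], capture_output=True, text=True, check=True
    )
    return result.stdout


def find_leaks(text: str) -> dict[str, list[str]]:
    """Return every pattern match still present in the extracted text."""
    hits = {name: rx.findall(text) for name, rx in PATTERNS.items()}
    return {name: found for name, found in hits.items() if found}
```

Typical use would be leaks = find_leaks(pdf_text("document.pdf")). An empty result only means this check found nothing in the extractable text. It does not cover metadata or text inside images, so keep the manual tests above as well.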

The Bottom Line

If you’re using black boxes, highlighter tools, or image overlays to “redact” sensitive data, you’re not protecting anything. The data remains in the file, waiting to be discovered.

Use a tool that actually removes the data—not one that just hides it from view. Your sensitive information deserves real protection.


Need to redact sensitive data from your PDFs? TaxRedact uses AI to find SSNs, EINs, and other sensitive information, then permanently removes them from your document—not just covers them up.