How to Black Out Text in PDF (+ Verify It Worked)

You searched for how to black out text in a PDF. The top results will show you how to draw a black box. This guide is different: it explains why that box does not remove the text underneath, shows you the methods that actually work, and — crucially — covers the verification step that every other guide skips entirely.

Key takeaways:

Drawing a black rectangle in a PDF viewer places a visual layer on top of text. The text itself remains in the file's content stream and is trivially recoverable.
True PDF redaction requires a tool that strips the underlying content, not one that paints over it.
After redacting, you must verify. The verification step is not optional — it is the only way to know the redaction worked.
Free tools exist, but most browser-based "redaction" services upload your document to a server, which is itself a data exposure risk.

The consequences of getting this wrong range from embarrassment to criminal liability. In 2019, lawyers for Paul Manafort accidentally revealed sealed details of his case to the world by filing a PDF with black boxes drawn over the sensitive text. The boxes looked correct on screen. The text was readable by anyone who copied it from the file and pasted it into a text editor.

Let us start with why this failure mode exists.

Why does blacking out text in a PDF fail?

A PDF file is not a flat image. It is a structured document format that separates what content exists from how that content is rendered. Understanding this distinction is the key to understanding why the black-box approach fails.

When a PDF is created, text is encoded into the file's content stream — a sequence of operators that describe where characters are placed, what font they use, and what size they appear. This content stream is stored independently of any visual elements placed on top of it. When you open the file in a viewer, the renderer reads those instructions and draws the text on screen.

Now consider what happens when you use a drawing tool to add a black rectangle over some text. The PDF viewer adds a new drawing instruction — "paint a black filled rectangle at coordinates X, Y with width W and height H" — to the file. This instruction is layered on top of the existing content. The original text instructions are still there, unchanged, below the new rectangle in the rendering order.

The rendering engine draws the black box over the text, so it looks redacted on screen. But the underlying content stream still contains every character of the original text. Any application that reads the content stream directly — rather than rendering it visually — will see the original text in full. This includes:

PDF text extraction tools (pdftotext, PyMuPDF, pdfplumber)
Screen readers and accessibility software
Search engines that index PDFs
Copy-and-paste from any PDF viewer that supports text selection
PDF parsers used by automated processing systems
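
The layering described above can be sketched in a few lines of Python. This is a toy illustration, not a PDF parser: the content stream below is a hypothetical, uncompressed fragment, and the sample text and the helper name extract_text_operands are assumptions made for the example. Real content streams are usually compressed and use font-specific encodings.

```python
import re

# Hypothetical, simplified page content stream: one line of text shown with
# the Tj (show text) operator inside a BT ... ET text object.
original_stream = b"BT /F1 12 Tf 72 700 Td (Account number: 12345678) Tj ET"

# What a drawing tool appends: set the fill colour to black (rg),
# define a rectangle (re), and fill it (f).
black_box = b"0 0 0 rg 70 695 200 16 re f"

# The "blacked out" page is simply the old instructions plus the new ones.
covered_stream = original_stream + b"\n" + black_box


def extract_text_operands(stream: bytes) -> list[str]:
    """Return the literal strings passed to the Tj operator."""
    return [m.decode("latin-1") for m in re.findall(rb"\((.*?)\)\s*Tj", stream)]


# The rectangle changes what is drawn, not what is stored.
print(extract_text_operands(covered_stream))
```

The extraction result is identical before and after the rectangle is appended, which is the whole problem: the box is one more drawing instruction, and nothing was taken away.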

The problem goes deeper still. Some PDF creation workflows produce files with multiple content layers — a concept explored in detail in the guide to phantom layers and PDF technical anatomy. In those documents, even a properly placed redaction annotation may fail to cover content on a non-default layer that the viewer is not displaying.

The short answer to why blacking out fails: a visual layer is not a data layer. Covering text visually does nothing to the underlying data structure. Redaction must operate on the content stream, not the rendering surface.

What happens when PDF blackout redaction fails?

The failure mode is not theoretical. It has exposed state secrets, compromised criminal investigations, and embarrassed government agencies. A full catalogue of the most significant incidents is documented in the PDF redaction failures case study collection, but two cases and one broader pattern illustrate the problem most clearly.

The Manafort filing (January 2019)

In January 2019, lawyers for Paul Manafort (Donald Trump's former campaign chairman, then in jail) filed a court document in which they disputed prosecutors' allegations that he had lied to investigators. The document contained passages that described Manafort's alleged contacts with a Russian associate named Konstantin Kilimnik, including the sharing of campaign polling data.

Manafort's defense team applied black rectangles over the sensitive passages before filing. The document appeared correctly redacted on screen. Journalists and court watchers who downloaded the filing and attempted to copy the blacked-out text discovered they could paste the original words directly into a text editor.

Within hours, the content of the supposedly sealed passages, including names, locations, financial figures, and the nature of the alleged contact, was published by multiple news outlets. The error was attributed to the use of a tool that drew over the text rather than removing it from the content stream. The defense team had to file a corrected, properly redacted document the following day.

The information revealed was significant enough that legal commentators suggested it may have affected the course of the investigation.

The TSA security procedures leak

A Transportation Security Administration document describing screening procedures for airport security checkpoints was posted publicly after an attempted redaction using the same black-box approach. The document had been created by exporting from a word processor and then applying drawn rectangles in a PDF application.

Because the document contained vector-formatted text rather than a scanned image, the text layer survived intact beneath the redaction marks. Security researchers who obtained the document were able to extract the complete text, including details about which categories of traveler received expedited screening, the specific items screeners were instructed to examine more closely, and internal procedural thresholds.

The incident prompted a congressional inquiry and a review of how federal agencies handle document redaction.

The court system pattern

Both cases above follow a pattern that recurs throughout court systems, government agencies, and legal practices. Courts publish enormous volumes of partially redacted documents. Security researchers who systematically analyse these filings have found that a significant proportion of "redacted" documents filed before roughly 2015 contain recoverable text beneath their black boxes.

The pattern matters because court documents often contain home addresses of witnesses, Social Security numbers, medical records, and confidential business information — data that was supposed to be permanently removed. In most jurisdictions, once a document is filed publicly, it cannot be recalled.

How to properly black out text in Adobe Acrobat Pro

Adobe Acrobat Pro (not the free Acrobat Reader) contains a genuine redaction tool that modifies the content stream rather than painting over it. It is the most widely used professional PDF redaction solution and the one recommended in most legal and government workflows.

The process has four stages. Work through them in order and do not skip the final save step.

Stage 1: Open the Redact toolset

Open your document in Acrobat Pro. Go to Edit > Redact in the menu bar. In older versions of Acrobat, this may be under Tools > Redact or Tools > Protect & Standardize > Redact. The Redact toolbar will appear.

Stage 2: Mark content for redaction

Click Mark for Redaction. Your cursor changes to a crosshair. Draw a rectangle over each piece of content you want to remove. You can also:

Double-click a word to mark it individually
Triple-click a line to mark the entire line
Use Find & Redact (search bar within the Redact panel) to find and mark all instances of a specific term, phone number, or pattern across the entire document

Marked regions appear with a red border before the redaction is applied. At this stage, nothing has been removed — you are only queuing content for removal.

Stage 3: Apply the redaction

Click Apply in the Redact toolbar (or Apply All Marks in some versions). Acrobat will warn you that this action is permanent and irreversible. Confirm. The tool will:

1. Remove the marked text from the content stream
2. Remove any associated metadata or comments for the marked region
3. Replace the region visually with a solid black rectangle (or your configured fill colour)

Stage 4: Save as a new file

This step is critical and is the one most commonly missed. After applying redactions, use File > Save As and save the document with a new filename. Do not use Save (Ctrl+S). The reason: Acrobat's Save As operation strips undo history and ensures the removed content is not preserved in the file's incremental update structure. Using plain Save can leave the redacted content recoverable from the file's update log.

Note on Acrobat Reader: The free version of Acrobat Reader does not include the Redact tool. If you see a prompt to upgrade or if the Redact menu option is greyed out, you are using Reader. You cannot perform proper PDF redaction with Reader.

How to black out text in PDF without Acrobat

Acrobat Pro is expensive. A subscription costs around $240 per year as of early 2026. Depending on your operating system and the nature of the document, several free or lower-cost alternatives can perform genuine PDF redaction.

macOS: Preview and macOS Sonoma redaction

The Preview application included with macOS has long frustrated users who assumed its markup tools could redact text. Drawing a black rectangle in Preview's markup toolbar creates a vector shape layer on top of the text — not a redaction. The underlying text remains selectable and copyable.

Recent versions of macOS, including Sonoma (14.0, released 2023), include a limited Redact tool in Preview; the tool itself dates back to macOS 11 Big Sur. It appears as a black marker icon in the markup toolbar. When used correctly on a text-based PDF, it does remove the text content from the selected area. However, the feature has known limitations: it does not reliably handle multi-column layouts, and it does not strip document metadata. For simple, single-column documents with a small number of redactions, it is usable. For anything more complex, treat it as a starting point and verify the result.

Browser-based tools: client-side rasterization

The safest web-based PDF tools are those that perform all processing inside your browser using WebAssembly or JavaScript PDF libraries, with no server upload. obfuscate.online takes this approach: when you load a PDF, the processing happens entirely within your browser tab. The tool can rasterize PDF pages, converting each page from a text-and-vector document into a flat bitmap image, which permanently eliminates the text layer. Once rasterized, there is no text left in the content stream to extract.

Be cautious with other browser-based tools. Most "online PDF redactor" services upload your document to a remote server for processing. You are then trusting that server not to log, store, or process the content you just tried to redact. For sensitive documents, that trade-off is rarely acceptable.

Linux: command-line tools

On Linux, three command-line approaches come up most often, but only the last two actually remove text.

Using pdftk and coordinate-based masking: pdftk can add annotation overlays, but like other overlay approaches this does not remove the underlying text. It is not a true redaction tool.

Using qpdf and content stream editing: qpdf --qdf decompresses a PDF into a human-readable format where content streams can be manually edited. This is powerful but requires understanding of PDF syntax and is error-prone for complex documents.

Using pdf-redactor (Python): The pdf-redactor library for Python allows regex-based content stream substitution — finding and replacing text within the raw content stream. It is the most reliable open-source approach for automated redaction pipelines.

For single documents, the most practical free Linux approach is often to rasterize each page using ImageMagick or Ghostscript and reassemble into a PDF, accepting the trade-off of losing text searchability.
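
The rasterize-and-reassemble route can be sketched as a small Python wrapper around Ghostscript. This is a minimal sketch under stated assumptions: Ghostscript is installed as gs, its pdfimage24 output device is available (it writes a PDF whose pages are plain RGB images, so rendering and reassembly happen in one step), and the function names and the 200 dpi default are choices made for this example.

```python
import shutil
import subprocess


def build_rasterize_command(src: str, dst: str, dpi: int = 200) -> list[str]:
    """Build a Ghostscript command that re-renders every page as an image."""
    return [
        "gs",
        "-sDEVICE=pdfimage24",  # output device: PDF made of 24-bit RGB page images
        f"-r{dpi}",             # render resolution in dots per inch
        "-o", dst,              # output file; -o also implies batch mode, no pauses
        src,
    ]


def rasterize(src: str, dst: str, dpi: int = 200) -> None:
    """Run the command, failing loudly if Ghostscript is missing or errors out."""
    if shutil.which("gs") is None:
        raise RuntimeError("Ghostscript (gs) was not found on PATH")
    subprocess.run(build_rasterize_command(src, dst, dpi), check=True)


if __name__ == "__main__":
    # Hypothetical file names, shown for illustration only.
    print(" ".join(build_rasterize_command("boxed.pdf", "flattened.pdf")))
```

Rasterizing preserves whatever is visible, so the black boxes must already cover the sensitive text in the input file. Higher resolutions keep the remaining text legible at the cost of file size.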

How to verify your PDF redaction actually removed the text

This section is the one most guides skip. It is also the most important section in this article. Applying a redaction method is not enough: you must confirm the redaction worked before distributing the file. Four verification methods exist, ranging from a 30-second quick check to a thorough technical audit.

Method 1: Select All and Copy (quick check, 30 seconds)

Open the redacted PDF in any viewer that supports text selection. Press Ctrl+A (or Cmd+A on Mac) to select all content on the page. Press Ctrl+C (or Cmd+C) to copy. Open a plain text editor (Notepad, TextEdit, gedit) and paste.

If the redacted text appears in the pasted content, the redaction failed. If the redacted area is represented by a blank space or is absent entirely, this is a positive indicator — but not a guarantee.

Limitation: This method catches the most common failure mode (drawn rectangles over text) but will miss certain edge cases, particularly when text is encoded as a form field rather than inline content, or when the content stream uses non-standard encoding.

Method 2: Rasterize and OCR verification (checks what is visible)

This method tests the visual result. The principle: convert the redacted PDF to a flat image, then run optical character recognition (OCR) on that image. If the OCR output contains the text that was supposed to be redacted, the redaction failed.

The logic is sound because a flat image has no text layer. The only text that OCR can detect in an image is text that is visually present. If the redacted region appears as a solid black rectangle in the image, OCR will either return nothing or return a meaningless symbol. If OCR returns the original text, the covering was not applied properly: the box was misplaced, too small, or semi-transparent. Note what this test cannot see: text that still sits in the content stream beneath an opaque box renders as solid black, so OCR passes even though the text remains extractable. The result is therefore definitive only when the file you distribute is the rasterized one. For a PDF that keeps its text layer, combine this method with Method 3.

Tools for this approach:

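1. Convert PDF pages to images: pdftoppm -r 150 -png redacted.pdf page (creates page-1.png, page-2.png, and so on, with zero-padded numbers for longer documents; without -png the output is PPM)
2. Run OCR: tesseract page-1.png output (creates output.txt)
3. Search the output text for terms that should have been redacted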
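
The three steps can be chained in a short Python script. This is a sketch, assuming the pdftoppm and tesseract command-line tools are installed and on PATH; the function names, the 300 dpi default, and the whitespace-insensitive matching are choices made for this example rather than part of either tool.

```python
import re
import subprocess
import tempfile
from pathlib import Path


def normalize(text: str) -> str:
    """Collapse whitespace and ignore case so OCR line breaks do not hide a match."""
    return re.sub(r"\s+", " ", text).casefold()


def find_leaks(text: str, terms: list[str]) -> list[str]:
    """Return the terms that still appear in the given text."""
    haystack = normalize(text)
    return [term for term in terms if normalize(term) in haystack]


def ocr_pdf(pdf_path: str, dpi: int = 300) -> str:
    """Render each page to PNG with pdftoppm, then OCR each image with tesseract."""
    with tempfile.TemporaryDirectory() as tmp:
        prefix = str(Path(tmp) / "page")
        subprocess.run(
            ["pdftoppm", "-r", str(dpi), "-png", pdf_path, prefix], check=True
        )
        chunks = []
        for image in sorted(Path(tmp).glob("page-*.png")):
            # "stdout" as the output base makes tesseract print the text.
            result = subprocess.run(
                ["tesseract", str(image), "stdout"],
                check=True, capture_output=True, text=True,
            )
            chunks.append(result.stdout)
    return "\n".join(chunks)
```

A call such as find_leaks(ocr_pdf("redacted.pdf"), ["Jane Roe"]) (file name and term are hypothetical) returns an empty list when none of the terms is visible. OCR is imperfect, so an empty result suggests the text is not visible but does not prove it; a visual inspection of the page images is still worthwhile.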

Alternatively, obfuscate.online can rasterize your PDF in-browser, and you can then check whether the resulting image-based PDF has any selectable text — it should not. For a deeper exploration of how PDF layers create hidden text that survives visual redaction, see phantom layers and PDF technical anatomy.

For automated verification pipelines that need to process many documents, consider integrating OCR-based redaction checking with the broader automated data anonymization workflows used for structured data — the same principle of systematic verification applies.

Method 3: PDF parser extraction

PDF text extraction libraries read the content stream directly, bypassing the rendering layer entirely. Run your redacted document through one of the following:

Python/pdfplumber: import pdfplumber; pdf = pdfplumber.open('redacted.pdf'); print(pdf.pages[0].extract_text())
Command line/pdftotext: pdftotext redacted.pdf - (outputs extracted text to stdout)
Node.js/pdf-parse: Reads the raw content stream and returns all text tokens
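
The pdfplumber one-liner above reads only the first page. A slightly fuller sketch scans every page and reports which page each leak is on. It assumes pdfplumber is installed (pip install pdfplumber); the function names and the list of forbidden terms are hypothetical, introduced for this example.

```python
def extract_pages(pdf_path: str) -> list[str]:
    """Return the extracted text of every page, not just the first."""
    import pdfplumber  # third-party; imported here so the helper below works without it

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() can yield nothing for image-only pages, hence the fallback.
        return [page.extract_text() or "" for page in pdf.pages]


def pages_with_leaks(pages: list[str], terms: list[str]) -> dict[int, list[str]]:
    """Map 1-based page numbers to the forbidden terms found on that page."""
    hits: dict[int, list[str]] = {}
    for number, text in enumerate(pages, start=1):
        found = [term for term in terms if term.casefold() in text.casefold()]
        if found:
            hits[number] = found
    return hits
```

For example, pages_with_leaks(extract_pages("redacted.pdf"), ["Jane Roe", "000-00-0000"]) returns an empty dictionary when none of the terms survive in the text layer. Extracted text can break a phrase across lines, so searching for short, distinctive fragments tends to be more reliable than searching for whole sentences.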

If any of these tools return the text that was supposed to be removed, the redaction failed. This method is faster than OCR and catches content-stream-level failures that the Select All test might miss.

Method 4: Metadata check

Redaction tools can fail to strip metadata even when they correctly remove inline text. Before distributing a redacted document, check:

Document properties: In Acrobat, File > Properties > Description. Check Author, Title, Subject, and Keywords fields — these sometimes contain names or system paths.
XMP metadata: Use exiftool redacted.pdf on the command line to dump all embedded metadata.
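Incremental update history: As noted above, if Acrobat's plain Save was used instead of Save As, the file's incremental update log may contain the pre-redaction content. More than one %%EOF marker in the raw file is a hint that earlier revisions were appended rather than rewritten (linearized files can legitimately contain two). Tools that rewrite the file, such as qpdf, generally emit only the current revision, so their output will not show superseded content; inspect the original bytes instead.
Embedded attachments: Some PDFs embed source files (Word documents, spreadsheets) as attachments. These are not redacted by content-stream tools. Check the Attachments navigation pane in Acrobat.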
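
A rough first pass over these checks can be done on the raw bytes with nothing but the Python standard library. This is a heuristic sketch, not a substitute for exiftool or a real PDF parser: the function names are made up for this example, and because objects can live inside compressed object streams, a missing marker does not prove the item is absent.

```python
from pathlib import Path


def audit_raw_pdf(data: bytes) -> dict[str, object]:
    """Look for byte patterns that suggest metadata, attachments, or old revisions."""
    markers = {
        "info_author": b"/Author",            # Author entry in the document info dictionary
        "xmp_packet": b"<x:xmpmeta",          # start of an embedded XMP metadata packet
        "embedded_files": b"/EmbeddedFiles",  # name tree that holds file attachments
    }
    report: dict[str, object] = {
        name: (needle in data) for name, needle in markers.items()
    }
    # Each incremental save appends a new section ending in %%EOF.
    report["eof_markers"] = data.count(b"%%EOF")
    return report


def audit_file(path: str) -> dict[str, object]:
    """Convenience wrapper that reads the file from disk."""
    return audit_raw_pdf(Path(path).read_bytes())
```

Any True value, or an eof_markers count above one, is a prompt to look closer with the tools listed above, not a verdict on its own.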

Quick reference — redaction method comparison

The table below summarises the most common approaches to PDF text removal, scored across the criteria that matter most for a genuine redaction decision.

Method	Removes text layer?	Free?	Supports verification?	Server upload?
Draw black rectangle (Preview, Word, annotation tools)	No	Yes	N/A — fails	No
Adobe Acrobat Pro Redact	Yes	No ($240/yr)	Yes — use Save As	No
macOS Sonoma Preview Redact	Mostly	Yes (macOS only)	Partial	No
obfuscate.online (rasterize)	Yes — converts to image	Yes	Yes — no text layer	No (client-side)
pdf-redactor (Python library)	Yes	Yes	Yes — use pdftotext	No
Online PDF services (most)	Varies	Often free tier	Rarely documented	Yes
Ghostscript rasterize + reassemble	Yes — converts to image	Yes	Yes — no text layer	No

The "Free?" and "Server upload?" columns often trade off against each other: tools that are free frequently require a server upload, which defeats the purpose of redacting sensitive content. The exception is client-side browser tools that use WebAssembly or JavaScript PDF engines, which process the document locally while remaining free to use.

For a broader framework on what it means to truly sanitize data rather than obscure it, the data sanitization guide covers the underlying principles that apply across formats, not just PDF.

Frequently asked questions

Does blacking out text in a PDF actually remove it?

No, in the vast majority of cases it does not. When you draw a black rectangle using annotation or drawing tools in Preview, Google Docs, LibreOffice, or most PDF viewers, you are adding a visual layer on top of the text. The text remains in the file's content stream and can be extracted by copying and pasting, using a PDF text extraction library, or opening the file in a text editor. True removal requires a tool that modifies the content stream itself, not the rendering layer. Only dedicated redaction tools — Adobe Acrobat Pro's Redact function, the macOS Sonoma Preview redaction tool, or rasterization — actually remove the underlying text data.

What is the difference between blacking out text and redacting it?

"Blacking out" typically describes the visual act of covering text with a black shape. It says nothing about whether the underlying data has been removed. "Redacting" — in its technical, legally meaningful sense — means permanently removing content from a document so it cannot be recovered by any means. A proper redaction removes the text from the content stream, strips associated metadata, and may also rasterize the page to eliminate any remaining structured data. The two terms are often used interchangeably in everyday language, but they describe fundamentally different operations. A document that has been visually blacked out but not technically redacted looks redacted while remaining fully readable to anyone who knows how to extract PDF text.

How can I verify that my PDF redaction actually worked?

The fastest method is the Select All and Copy test: open the redacted PDF, select all text on the affected page, copy it, and paste into a plain text editor. If the redacted text appears, the redaction failed. For a more thorough check, use a PDF text extraction tool such as pdftotext on the command line or the Python pdfplumber library, which read the content stream directly rather than relying on visual rendering. A further check is rasterize-and-OCR: convert the page to a flat image, run OCR, and confirm that the supposedly removed text is not detectable in the image. OCR sees only what is rendered, so it complements content-stream extraction rather than replacing it.

How can I black out text in a PDF for free?

On macOS Sonoma or later, the Preview app includes a basic redaction tool that genuinely removes text (with some limitations on complex layouts). On any platform, Ghostscript can rasterize PDF pages for free from the command line, converting each page to a flat image and eliminating the text layer entirely. The Python pdf-redactor library is free and open-source for programmatic content-stream redaction. For a no-install browser option, obfuscate.online processes PDFs client-side — your document never leaves your browser — and can convert pages to image-only format. Avoid free online services that require uploading your document to a remote server; the upload itself is a data exposure event for sensitive content.
