Question 1

Which highlight types are recognised?

Accepted Answer

The standard PDF highlight annotation (subtype "Highlight") that every common reading app produces. Squiggly, underline and strikeout annotations (subtypes "Squiggly", "Underline" and "StrikeOut") are also captured if you enable that option.
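As a rough illustration of that selection, the sketch below filters annotation records by subtype. It is not the tool's actual code: the function name and the `includeOtherMarkup` flag are hypothetical stand-ins for the option mentioned above, and the only thing assumed about each record is a `subtype` string, which is how PDF.js annotation data labels them.

```typescript
// Minimal shape of an annotation record; only the subtype matters here.
interface AnnotationLike {
  subtype: string;
}

// Text-markup subtype names as defined in the PDF specification.
const HIGHLIGHT_ONLY: ReadonlySet<string> = new Set(["Highlight"]);
const ALL_TEXT_MARKUP: ReadonlySet<string> = new Set([
  "Highlight",
  "Underline",
  "Squiggly",
  "StrikeOut",
]);

// `includeOtherMarkup` is a hypothetical name for the toggle in the answer above.
function selectMarkupAnnotations<T extends AnnotationLike>(
  annotations: readonly T[],
  includeOtherMarkup: boolean,
): T[] {
  const wanted = includeOtherMarkup ? ALL_TEXT_MARKUP : HIGHLIGHT_ONLY;
  return annotations.filter((a) => wanted.has(a.subtype));
}
```

With the flag off, only "Highlight" records pass; with it on, the three other text-markup subtypes pass too, while drawings such as "Ink" are always skipped.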

Question 2

Will it work on scanned PDFs?

Accepted Answer

Only if the scanned document was OCR-processed and has a real text layer. The extractor uses Mozilla’s PDF.js renderer to map highlight rectangles to the underlying glyph runs; without a text layer there is nothing to map to. If the file has no text layer, run it through the OCR PDF tool first.
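To make the mapping step concrete, here is a minimal sketch of one way to match highlight rectangles against text runs. It is an illustration under assumptions, not the extractor's real logic: the `Rect` and `TextRun` types and the centre-point test are invented for this example, boxes are assumed to be in PDF page coordinates (origin at the bottom-left), and runs on the same line are assumed to share the same bottom edge.

```typescript
// Axis-aligned box in PDF page coordinates (origin bottom-left).
interface Rect {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

// A run of text with its bounding box (hypothetical shape for this sketch).
interface TextRun {
  text: string;
  box: Rect;
}

// True when the centre point of `inner` lies within `outer`.
function centreInside(inner: Rect, outer: Rect): boolean {
  const cx = (inner.x1 + inner.x2) / 2;
  const cy = (inner.y1 + inner.y2) / 2;
  return cx >= outer.x1 && cx <= outer.x2 && cy >= outer.y1 && cy <= outer.y2;
}

// Collect the runs covered by any highlight rectangle, in reading order.
function textUnderHighlight(
  runs: readonly TextRun[],
  highlightRects: readonly Rect[],
): string {
  const hits = runs.filter((run) =>
    highlightRects.some((r) => centreInside(run.box, r)),
  );
  // Top line first (larger y), then left to right within a line.
  const sorted = hits
    .slice()
    .sort((a, b) => b.box.y1 - a.box.y1 || a.box.x1 - b.box.x1);
  return sorted.map((run) => run.text).join(" ");
}
```

If `runs` is empty, as it would be for a scan with no text layer, the function returns an empty string, which is the "nothing to map to" case described above.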

Question 3

How is the output organised?

Accepted Answer

Highlights are grouped by page in the order they appear, prefixed with their page number. Pick the .txt format for raw lines or Markdown for a structured outline ready to paste into Notion / Obsidian / Bear.
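The grouping described above can be sketched as follows. The exact layout the tool writes is not documented here, so the "p. N:" prefix for .txt and the "## Page N" heading with bullets for Markdown are assumptions chosen for illustration, as are the type and function names.

```typescript
interface PageHighlight {
  page: number;
  text: string;
}

type ExportFormat = "txt" | "md";

function renderHighlights(
  items: readonly PageHighlight[],
  format: ExportFormat,
): string {
  // Group by page, keeping highlights in the order they were encountered.
  const byPage = new Map<number, string[]>();
  for (const item of items) {
    const bucket = byPage.get(item.page) ?? [];
    bucket.push(item.text);
    byPage.set(item.page, bucket);
  }

  const pages = Array.from(byPage.keys()).sort((a, b) => a - b);
  const lines: string[] = [];
  for (const page of pages) {
    const texts = byPage.get(page) ?? [];
    if (format === "md") {
      // Markdown: one heading per page, one bullet per highlight.
      lines.push(`## Page ${page}`, "");
      for (const t of texts) lines.push(`- ${t}`);
      lines.push("");
    } else {
      // Plain text: one raw line per highlight, prefixed with the page number.
      for (const t of texts) lines.push(`p. ${page}: ${t}`);
    }
  }
  return lines.join("\n").trimEnd() + "\n";
}
```

The same list of highlights feeds both formats, so the choice of output only affects presentation, not which text is exported.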

Question 4

Does it include the surrounding context?

Accepted Answer

Not by default: only the highlighted text itself is exported. To add surrounding context, enable the post-processing toggle, which extends each extract to the nearest sentence boundary on either side.
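Extending an extract to sentence boundaries can be sketched like this. It is a deliberately naive illustration, assuming the highlight is known as a character range within the page text and treating ".", "!" and "?" as the only sentence terminators, so abbreviations such as "e.g." would cut the context short. The function name is hypothetical.

```typescript
// Expand the half-open range [start, end) in `pageText` to whole sentences.
function expandToSentence(pageText: string, start: number, end: number): string {
  const terminator = /[.!?]/;

  // Walk left until the previous character ends the preceding sentence.
  let left = start;
  while (left > 0 && !terminator.test(pageText.charAt(left - 1))) {
    left--;
  }

  // Walk right until a terminator is reached, unless the highlight
  // already ends on one.
  let right = end;
  const endsOnTerminator = end > 0 && terminator.test(pageText.charAt(end - 1));
  if (!endsOnTerminator) {
    while (right < pageText.length && !terminator.test(pageText.charAt(right))) {
      right++;
    }
    if (right < pageText.length) right++; // include the terminator itself
  }

  return pageText.slice(left, right).trim();
}
```

For example, a highlight covering only "quick brown" inside "First sentence. The quick brown fox jumps. Last one." expands to "The quick brown fox jumps."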

Question 5

Is anything uploaded?

Accepted Answer

No. The PDF is parsed by Mozilla’s PDF.js running inside your browser tab; the extracted text is built in memory and offered as a download, so nothing leaves your device.

Question 6

Which highlight tools is this compatible with?

Accepted Answer

Anything that produces a standard PDF highlight annotation, which covers essentially every major desktop and mobile PDF reader as well as the highlight tools built into iOS Books and Android’s default PDF viewer. The output is the same regardless of which app produced the highlight, because the annotation format is standardised.

Question 7

Why do some highlights come back with garbled or missing text?

Accepted Answer

There are three common causes. First, the PDF was scanned without OCR, so there is no text under the highlight to map to. Second, the text layer is unusually fragmented and Mozilla’s PDF.js could not reassemble the glyph order. Third, the mark was drawn in the source app as a freeform shape rather than as a real highlight annotation. For the first cause, run OCR on the PDF before extracting; for the third, redo the mark with the standard highlight tool in your reading app.
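The three causes can be told apart with a simple check, sketched below. This is a hypothetical diagnostic, not something the tool is known to expose: it assumes you can count the text runs on the page and read the annotation's subtype, and it treats a fragmented text layer as the remaining explanation once the other two are ruled out.

```typescript
type ExtractProblem =
  | "no-text-layer"
  | "not-a-highlight"
  | "fragmented-text-layer";

// Text-markup subtype names as defined in the PDF specification.
const TEXT_MARKUP_SUBTYPES = ["Highlight", "Underline", "Squiggly", "StrikeOut"];

// Hypothetical helper: guess why an annotation produced no usable text.
function diagnoseEmptyExtract(
  pageTextRunCount: number,
  annotationSubtype: string,
): ExtractProblem {
  // Cause 1: a scan without OCR has no text runs at all.
  if (pageTextRunCount === 0) return "no-text-layer";
  // Cause 3: freehand drawings and shapes are not text-markup annotations.
  if (TEXT_MARKUP_SUBTYPES.indexOf(annotationSubtype) === -1) {
    return "not-a-highlight";
  }
  // Cause 2: text exists and the annotation is valid, so the glyph order
  // is the likely culprit.
  return "fragmented-text-layer";
}
```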

Question 8

Can it export comments and notes attached to highlights?

Accepted Answer

Yes. If you typed a popup note on a highlight in a standards-compliant PDF reader, the comment is exported alongside the highlight text, prefixed with "Note:" in the .txt format and rendered as italic text in the .md format.
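The two note renderings can be sketched as a small formatter. The function name is hypothetical, and collapsing line breaks and escaping asterisks are choices made for this example so that the Markdown italics stay intact, not documented behaviour of the tool.

```typescript
type NoteFormat = "txt" | "md";

// Render a popup note for the chosen output format.
function renderNote(note: string, format: NoteFormat): string {
  // Collapse line breaks and repeated spaces so the note stays on one line.
  const clean = note.replace(/\s+/g, " ").trim();
  if (clean === "") return "";
  if (format === "md") {
    // Escape asterisks so they do not end the italic span early.
    return `*${clean.replace(/\*/g, "\\*")}*`;
  }
  return `Note: ${clean}`;
}
```

An empty or whitespace-only note yields an empty string, so highlights without a comment gain no extra line.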

Question 9

Does it work on iPad highlights done with Apple Pencil?

Accepted Answer

It works on highlights produced with the Highlighter tool in apps like GoodNotes and PDF Expert, because those produce real highlight annotations. It does not work on freeform pen markings drawn directly with the Pencil, because those are stored as freehand drawing annotations, not text highlights.

Question 10

Will it find highlights in a password-protected PDF?

Accepted Answer

Not directly. A password-protected PDF is encrypted, and that encryption covers the annotation contents, so the extractor cannot read them. Use the Remove PDF Password tool first to unlock the file with the correct password, then run the unlocked output through this tool.

Question 11

How does this compare to most reference-manager exports?

Accepted Answer

The underlying mechanism is the same in principle: both map highlight rectangles to the text beneath them. The differences are practical. This tool runs entirely client-side without an account, accepts any PDF (not just papers stored in a particular reference manager), and lets you choose between .txt and Markdown output at export time.

PDF Highlight Extractor — Export Highlights to Text

