How to Analyze a PDF: Structure, Fonts, and Properties

A practical look at what PDF analysis shows you — and why it matters for debugging, compliance, and optimization.

Quick summary

Quick answer: Analyzing a PDF means inspecting its structure without changing it — pages, fonts, embedded images, form fields, security settings, and format version. Adobe Acrobat's Preflight tool gives the deepest view; Mapsoft's PDF Hub Analyze PDF is the fastest online option; pdfinfo and qpdf --json cover command-line workflows.

You can also analyze a PDF online for free using Mapsoft's PDF Hub — no installation required.

What PDF analysis reveals

A PDF analysis is a read-only inspection of the file. It tells you what's in the document, how it's put together, and whether anything looks wrong — without modifying the file. The information you can expect to see:

  • Basic properties. PDF version, page count, the size of each page, file size, creation and modification dates, and the producer application (e.g. "Microsoft Word 2024" or "Adobe Acrobat 2023").
  • Fonts. The list of fonts used, whether each is embedded, subsetted, or only referenced, and its encoding. Missing fonts are a major source of rendering surprises.
  • Images and resources. Count and total size of images, their resolution, color space, compression (JPEG, JPEG 2000, CCITT, JBIG2), and the proportion of the file they occupy.
  • Structure. Bookmarks, named destinations, page labels, logical structure tree (important for accessibility), layers (OCGs), and attached files.
  • Security and permissions. Whether the file is encrypted, what encryption method, which operations are permitted (printing, copying, form filling).
  • Interactive elements. Form fields (AcroForm or XFA), annotations, JavaScript actions, links, multimedia objects.
  • Metadata. Both the DocInfo dictionary and any XMP stream. Our post on PDF metadata goes deeper on this.
  • Conformance. Whether the file claims conformance to PDF/A, PDF/X, PDF/UA, or another ISO sub-standard, and whether it actually meets the requirements.
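The difference between claiming conformance and meeting it is easier to see once you know where the claim lives. For PDF/A, the claim is a pair of XMP properties, pdfaid:part and pdfaid:conformance. The sketch below only reads that claim from raw bytes; it validates nothing. It assumes the XMP packet is stored uncompressed and uses the conventional pdfaid prefix, and the function names are illustrative rather than part of any tool mentioned here.

```python
import re

# Matches both XMP serializations: attribute form (pdfaid:part="2")
# and element form (<pdfaid:part>2</pdfaid:part>).
_PATTERN = r'{name}\s*(?:=\s*["\']|>)\s*([^"\'<\s]+)'


def _xmp_value(data: bytes, name: str):
    """Return the first value found for an XMP property, or None."""
    pattern = _PATTERN.format(name=re.escape(name)).encode("ascii")
    match = re.search(pattern, data)
    return match.group(1).decode("ascii", "replace") if match else None


def claimed_pdfa(data: bytes):
    """Return the PDF/A level the file claims (e.g. 'PDF/A-2b'), or None.

    Assumption: the XMP metadata is uncompressed. This reports the claim
    only; a validator is still needed to confirm the file meets it.
    """
    part = _xmp_value(data, "pdfaid:part")
    if part is None:
        return None
    conformance = _xmp_value(data, "pdfaid:conformance") or ""
    return f"PDF/A-{part}{conformance.lower()}"
```

Call it with the file's bytes, for example claimed_pdfa(open("file.pdf", "rb").read()). A non-None result tells you which validation profile to run, not whether the file passes it.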

Why analyze a PDF?

  • Debugging rendering problems. "Why does this look wrong in Acrobat?" almost always leads back to a missing font, a bad color space, or a transparency-flattening issue that analysis surfaces immediately.
  • Pre-press and print verification. Before sending a PDF to a commercial printer, confirm color spaces, font embedding, image resolution, and PDF/X conformance. An analysis pass catches most "file rejected" situations before they happen.
  • Compliance checking. PDF/A for archival, PDF/UA for accessibility, the FDA's PDF requirements for pharmaceutical submissions: each has a specific set of requirements, and analysis tells you whether the file meets them.
  • Security review. Is the document encrypted? What are its permissions? Are there hidden JavaScript actions? For documents received from outside an organization, a quick analysis is a basic security hygiene step.
  • Optimization planning. Before compressing or rewriting a file, analysis tells you where the bytes are: big raster images, unused fonts, orphaned resources, or simply a bloated page tree. That lets you target the real cause rather than reaching for a generic compress button.
  • Forensics and discovery. Legal and investigative workflows often need to know exactly what's in a PDF, including hidden elements, embedded files, and revision history in incremental updates.

Analysis methods

Method 1 — Adobe Acrobat's Preflight

Preflight is the most comprehensive PDF analysis tool available. In Acrobat Pro, open the document and go to Tools → Print Production → Preflight (or Edit → Preflight in newer builds). Preflight ships with dozens of built-in profiles — PDF/A-2b compliance, PDF/X-4 compliance, RGB-to-CMYK check, low-resolution image detection, and so on.

Run a profile and Preflight produces a detailed report: every issue, its severity, and the specific objects affected. You can click through a finding to jump to the page and object in question. It's slower than a quick summary tool, but for high-stakes work — a document going to press, or a submission that has to pass a compliance check — it's the right tool.

Document Properties (File → Properties) gives a faster, shallower view: version, security, fonts, initial view. For most routine questions, this panel is enough.

Method 2 — Analyze online, free

Mapsoft's Analyze PDF tool takes an uploaded file and returns a structured report: version, page count, page size, security, fonts (with embedding status), image count and breakdown by type, form fields, bookmarks, linearization status, and attached files. It's the quickest way to answer "what's in this PDF?" when you don't have Acrobat or don't want to open a file from an unknown sender.

Use Analyze PDF when you need a summary in thirty seconds. Use Preflight when you need an audit trail and issue-level detail.

Method 3 — Command line (pdfinfo, qpdf)

For scripting and automation, three standard tools cover most needs:

  • pdfinfo file.pdf (part of poppler-utils) outputs a compact summary: title, author, producer, creation date, pages, page size, encryption, tagged status, PDF version. Fast, portable, easy to grep.
  • qpdf --json file.pdf emits a full machine-readable dump of the file's object structure. Combined with jq, this is powerful: you can answer "which pages use non-embedded fonts?" or "how many images are larger than 1 MB?" from a shell one-liner.
  • exiftool file.pdf gives a deep metadata read, including XMP fields and custom schemas that other tools skip.
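Because pdfinfo prints one "Key: value" pair per line, its output is easy to fold into a script. The following is a minimal sketch, assuming pdfinfo is installed and on PATH; the helper names parse_pdfinfo and pdfinfo are made up for illustration.

```python
import subprocess


def parse_pdfinfo(text: str) -> dict:
    """Turn pdfinfo's 'Key:   value' lines into a dict of strings."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as dates keep theirs.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def pdfinfo(path: str) -> dict:
    """Run pdfinfo on a file and parse its output.

    Assumption: the pdfinfo binary from poppler-utils is on PATH.
    Raises CalledProcessError if pdfinfo exits with an error.
    """
    result = subprocess.run(
        ["pdfinfo", path], capture_output=True, text=True, check=True
    )
    return parse_pdfinfo(result.stdout)
```

All values come back as strings, so convert as needed, for example int(pdfinfo("file.pdf")["Pages"]). Keeping the parser separate from the subprocess call makes it possible to test against saved output without the binary installed.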

Common issues analysis surfaces

Some findings come up often enough to recognize at a glance:

  • Unembedded fonts. The PDF references a font but doesn't include it. Readers substitute a similar font, which shifts layout. Fix: re-export with "Embed all fonts" enabled.
  • RGB images in a CMYK workflow. Commercial print expects CMYK. RGB images need to be converted and may look different — verify color-critical images after conversion.
  • Low-resolution images. Anything below 150 DPI for color or 600 DPI for line art will look soft in print. Catch this before the file goes to press.
  • Large file size. Usually one of: huge un-downsampled images, unused fonts that weren't stripped, many revisions embedded as incremental updates, or uncompressed streams. Analysis shows you which. See also How to Reduce PDF File Size.
  • Security restrictions. Discovering at production time that a PDF is password-protected or print-restricted is painful. An early analysis catches it.
  • Old PDF version. A file declaring PDF 1.3 predates features a modern pipeline expects: transparency arrived in 1.4, layers (optional content groups) in 1.5, and embedded OpenType fonts in 1.6. Occasionally you need to rewrite to a newer version.
  • Non-conformance to a declared standard. A PDF that claims PDF/A-1b but doesn't actually meet the requirements will fail a validator. Preflight's matching profile tells you exactly which rules it breaks.
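The low-resolution check in the list above comes down to simple arithmetic: what matters is the resolution of an image at the size it is placed on the page, not its pixel count alone. PDF page units are points, with 72 points to the inch, so effective DPI is pixels divided by (placed size in points / 72). A small sketch of that calculation, with illustrative function names and the 150 DPI color threshold from the list as the default:

```python
def effective_dpi(pixels: int, placed_points: float) -> float:
    """Resolution of one image dimension as placed on the page.

    72 PDF points equal one inch, so placed_points / 72 is the size in inches.
    """
    if placed_points <= 0:
        raise ValueError("placed size must be positive")
    return pixels / (placed_points / 72.0)


def is_low_resolution(width_px: int, height_px: int,
                      width_pt: float, height_pt: float,
                      threshold_dpi: float = 150.0) -> bool:
    """True if either dimension falls below the threshold.

    Pass threshold_dpi=600 for line art, per the guideline above.
    """
    lowest = min(effective_dpi(width_px, width_pt),
                 effective_dpi(height_px, height_pt))
    return lowest < threshold_dpi
```

For example, a 600-pixel-wide image placed 288 points (4 inches) wide works out to 150 DPI, right at the threshold for color. Scale the same image up to 6 inches on the page and it drops to 100 DPI.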

Analysis as a habit

PDFs that reach you come from hundreds of sources, each with its own generator, each with its own quirks. Running a quick analysis on anything you didn't create yourself — and certainly on anything you're about to print, sign, or submit — is cheap insurance. Thirty seconds with a summary tool like Mapsoft's Analyze PDF, or a few minutes with Preflight for anything higher-stakes, catches a surprising fraction of the issues that show up at print time or in a compliance review.

Frequently Asked Questions

What's inside a PDF file?

A header, a sequence of numbered objects (pages, fonts, images, forms, and the rest of the content), a cross-reference table that indexes those objects, and a trailer. Analysis tools read the object graph and report what they find without modifying anything.
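To make that layout concrete, here is a rough sketch that looks for three of those landmarks in the raw bytes: the header version, the last startxref offset (which points at the most recent cross-reference section), and the number of %%EOF markers. It is a heuristic illustration, not a parser. The function name is made up, and a real tool would go on to read the cross-reference data and the catalog, which can declare a newer version than the header.

```python
import re


def scan_structure(data: bytes) -> dict:
    """Locate a few structural landmarks in raw PDF bytes (heuristic only)."""
    # The header is expected near the start of the file.
    header = re.search(rb"%PDF-(\d+\.\d+)", data[:1024])
    # Each revision ends with 'startxref <offset>' followed by '%%EOF'.
    startxrefs = re.findall(rb"startxref\s+(\d+)", data)
    return {
        "header_version": header.group(1).decode("ascii") if header else None,
        "last_xref_offset": int(startxrefs[-1]) if startxrefs else None,
        # More than one marker usually means incremental updates
        # (or a linearized file, which also carries two).
        "eof_markers": data.count(b"%%EOF"),
    }
```

Run it as scan_structure(open("file.pdf", "rb").read()). Because the file is opened for reading only, the original is left untouched, which is the property that makes analysis safe to repeat.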

Can I check what fonts a PDF uses?

Yes. Adobe Acrobat's File → Properties → Fonts tab lists every font and its embedding status. Mapsoft's Analyze PDF shows the same information in its online report. Command line: pdffonts from poppler-utils.
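For scripted checks, the pdffonts table can be reduced to a list of unembedded fonts. The sketch below assumes the usual pdffonts layout: two header lines, then one row per font with the columns name, type, encoding, emb, sub, uni, and object ID. It also assumes font names and encodings contain no spaces, which is why it reads the fixed columns from the right. The parse_pdffonts name is illustrative.

```python
def parse_pdffonts(text: str) -> list:
    """Parse pdffonts output into a list of dicts.

    Assumptions: the first two lines are the column header and a dashed
    rule; each row ends with emb, sub, uni, object number, generation.
    The type column may contain spaces (e.g. 'Type 1', 'CID TrueType').
    """
    fonts = []
    for line in text.splitlines()[2:]:
        tokens = line.split()
        if len(tokens) < 8:
            continue  # blank or unexpected line
        fonts.append({
            "name": tokens[0],
            "type": " ".join(tokens[1:-6]),
            "encoding": tokens[-6],
            "embedded": tokens[-5] == "yes",
            "subset": tokens[-4] == "yes",
        })
    return fonts
```

A list comprehension such as [f["name"] for f in parse_pdffonts(output) if not f["embedded"]] then gives the fonts a reader would have to substitute.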

How do I find a PDF's security settings?

Acrobat: File → Properties → Security tab shows permissions and encryption method. Online tools report the same in their security section. Command line: pdfinfo's Encrypted: line reads "no" for an unprotected file and "yes", followed by the permission flags, when the file is protected.
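Under the hood, the permissions these tools display come from a single integer, the /P entry of the encryption dictionary, in which individual bits grant individual operations. The sketch below decodes the bit positions defined in the PDF specification for the standard security handler; the dictionary keys and the function name are my own labels, not names from any tool.

```python
# Bit positions are 1-indexed, as in the PDF specification.
PERMISSION_BITS = {
    3: "print",
    4: "modify",
    5: "copy",
    6: "annotate",
    9: "fill_forms",
    10: "extract_for_accessibility",
    11: "assemble",
    12: "print_high_quality",
}


def decode_permissions(p: int) -> dict:
    """Map a /P value to a dict of permission name -> allowed (bool)."""
    # /P is stored as a signed 32-bit integer; view it as unsigned bits.
    flags = p & 0xFFFFFFFF
    return {
        name: bool(flags & (1 << (bit - 1)))
        for bit, name in PERMISSION_BITS.items()
    }
```

A value of -1 sets every bit and so permits everything, while -3904 permits none of the listed operations. These flags are instructions to the reader software, so how strictly they are honored depends on the application opening the file.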

Is there a free tool to analyze PDFs?

Several. Mapsoft's PDF Hub Analyze PDF is free to use online. Open-source options include pdfinfo, pdffonts, qpdf, and exiftool. Adobe Acrobat Reader (free) shows basic properties but not the full Preflight analysis that Acrobat Pro provides.

Will analyzing a PDF change the file?

No. Analysis is read-only by definition. It reads the file, builds a report, and leaves the original untouched. This is why it's safe to run on files from untrusted sources — as long as the tool itself is trustworthy.

Related Articles

PDF File Structure Explained: Headers, Objects & Cross-Reference Tables

The internal anatomy analysis tools read — essential background for interpreting what a report actually tells you.

PDF Document Metadata

Metadata (DocInfo and XMP) is one of the most useful outputs of an analysis pass. This post goes deeper on how it's structured.

PDF 2.0 vs 1.7: Key Differences Between Versions

Analysis tells you a PDF's version. This post explains what that version means for features and compatibility.
