Question 1

Why convert office documents to Markdown?

Accepted Answer

Markdown is plain text, so it is version-controllable and parses cleanly in AI agents and note-taking tools. PDF and Word files are binary containers that have to be unpacked before an LLM can read them. Converting first saves tokens, speeds up answers, and makes the content searchable.

Question 2

Does this work offline or in the browser?

Accepted Answer

Yes, fully. Once the tool page has loaded, everything runs in the browser tab. You can keep the network tab of the developer tools open to verify this: there is no server round-trip and no upload.

Question 3

What happens with images in PDFs and Word documents?

Accepted Answer

If the source contains images, you get two download buttons: one for a plain `.md` file and one for a ZIP containing the `.md` file plus an `images/` folder. The Markdown file links to the images by relative path, so it is ready for import into Obsidian or Hugo.
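
The ZIP layout described above can be sketched in Python. This is only an illustration of the folder structure and the relative links, not the tool's browser code; the function names, file names and placeholder image bytes are hypothetical.

```python
import io
import zipfile


def image_link(alt_text: str, file_name: str) -> str:
    """Build a Markdown image link that is relative to the .md file."""
    return f"![{alt_text}](images/{file_name})"


def build_sidecar_zip(md_name: str, markdown: str, images: dict[str, bytes]) -> bytes:
    """Pack a Markdown file and its images into one in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(md_name, markdown)
        for file_name, data in images.items():
            # Images sit in images/ next to the .md file, so relative links resolve.
            archive.writestr(f"images/{file_name}", data)
    return buffer.getvalue()


# Hypothetical sample document with one image.
markdown = "# Report\n\n" + image_link("Figure 1", "figure-1.png") + "\n"
zip_bytes = build_sidecar_zip(
    "report.md", markdown, {"figure-1.png": b"placeholder image bytes"}
)
```

Because the link contains no absolute path, the unpacked folder can be moved into a vault or a site's content directory as a unit without breaking the image references.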

Question 4

Which converter handles Excel workbooks with multiple sheets?

Accepted Answer

The XLSX converter walks every sheet and writes each one as its own H1 section containing the corresponding pipe table. A merged cell keeps its value in the top-left cell while the other cells of the merge stay empty, which preserves the table structure for reading.
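
The output layout described above can be sketched in Python. This is a minimal illustration, not the converter's actual code: the function names and the sample workbook are hypothetical, and a merged cell is assumed to arrive as a value in its top-left position followed by empty strings.

```python
def _cell(value: str) -> str:
    """Escape pipes so a cell value cannot break the table row."""
    return value.replace("|", "\\|")


def sheet_to_markdown(name: str, rows: list[list[str]]) -> str:
    """Render one sheet as an H1 heading followed by a pipe table."""
    header, *body = rows
    lines = [
        f"# {name}",
        "",
        "| " + " | ".join(_cell(c) for c in header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in body:
        lines.append("| " + " | ".join(_cell(c) for c in row) + " |")
    return "\n".join(lines)


def workbook_to_markdown(sheets: dict[str, list[list[str]]]) -> str:
    """Join all sheets, one H1 section per sheet, in workbook order."""
    return "\n\n".join(sheet_to_markdown(n, r) for n, r in sheets.items())


# "North" spans two merged rows: the second row keeps an empty first cell.
workbook = {
    "Sales": [["Region", "Q1", "Q2"], ["North", "10", "12"], ["", "8", "9"]],
    "Notes": [["Key", "Value"], ["owner", "ops"]],
}
result = workbook_to_markdown(workbook)
```

The empty cell in the third row of `Sales` keeps the column count constant, which is what lets the pipe table stay aligned.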

Question 5

Do Word documents keep lists and tables intact?

Accepted Answer

Yes. Lists, tables, bold and italic text, headings and footnotes all transfer. Deeply nested lists convert less reliably than flat ones, so occasional manual cleanup may be needed. That is a known lossy property of the Word-to-Markdown conversion path, not a bug.

Question 6

What happens with scanned PDFs?

Accepted Answer

The PDF converter automatically detects when a PDF is image-only and offers OCR. The OCR model also runs in the browser: the first activation downloads about 7 MB once, after which the model stays local.

Question 7

How large can a file be?

Accepted Answer

PDF up to 100 MB, DOCX up to 50 MB, XLSX up to 20 MB, HTML up to 25 MB, CSV up to 50 MB. On devices with less than 4 GB RAM the limits are halved so the tab does not crash.
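
The halving rule can be written out as a small Python sketch. It assumes 1 MB = 1024 × 1024 bytes and that the available RAM is already known; how the tool actually measures memory is not stated here, and the names below are hypothetical.

```python
MB = 1024 * 1024  # assumption: binary megabytes

# Base limits per format, taken from the answer above.
BASE_LIMITS_MB = {"pdf": 100, "docx": 50, "xlsx": 20, "html": 25, "csv": 50}


def max_file_size_bytes(fmt: str, device_ram_gb: float) -> int:
    """Return the size limit in bytes, halved on devices with less than 4 GB RAM."""
    limit_mb = BASE_LIMITS_MB[fmt]
    if device_ram_gb < 4:
        limit_mb = limit_mb / 2
    return int(limit_mb * MB)


def is_within_limit(fmt: str, size_bytes: int, device_ram_gb: float) -> bool:
    """Check a file size against the limit for its format."""
    return size_bytes <= max_file_size_bytes(fmt, device_ram_gb)
```

For example, a 60 MB PDF passes `is_within_limit("pdf", 60 * MB, 8)` but fails on a device with 2 GB RAM, where the PDF limit drops to 50 MB.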

Question 8

What is YAML frontmatter and do I need it?

Accepted Answer

Frontmatter is a small YAML block at the start of a `.md` file with fields like `title`, `created` and `source`. Obsidian, Logseq, Hugo and Astro read it automatically. The converters write frontmatter by default; untick the box if you want a plain `.md` file.
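
To show what such a block looks like, here is a minimal Python sketch that prepends frontmatter to a Markdown body. The field names come from the answer above; the function name, the quoting style and the sample values are assumptions, and the converters' exact output may differ.

```python
from datetime import date


def with_frontmatter(body: str, title: str, created: date, source: str) -> str:
    """Prepend a YAML frontmatter block to a Markdown document."""

    def quote(value: str) -> str:
        # Double-quote and escape so colons or quotes in values stay valid YAML.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    lines = [
        "---",
        f"title: {quote(title)}",
        f"created: {created.isoformat()}",
        f"source: {quote(source)}",
        "---",
        "",
    ]
    return "\n".join(lines) + body


# Hypothetical sample values.
doc = with_frontmatter(
    "# Q3 Report\n", "Q3 Report", date(2024, 1, 15), "q3-report.pdf"
)
```

The resulting file starts with the `---` delimited block, followed directly by the original Markdown content.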

Question 9

Are my files really not uploaded anywhere?

Accepted Answer

Correct, nothing is uploaded. There is no server-side path: parsing and conversion happen exclusively inside your browser tab. You can confirm this in the network tab of the developer tools, where no data leaves your machine during conversion.

Question 10

Which converter handles ODS, XLS or TSV?

Accepted Answer

The XLSX converter also accepts `.xls` and `.ods` files directly, because the underlying library reads both natively. Tab-separated `.tsv` files belong to the CSV converter, which detects the delimiter automatically.
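
How the CSV converter detects the delimiter is not documented here. One common heuristic, sketched below in Python as an assumption rather than the tool's actual method, is to try each candidate and keep the one that splits the first rows into the most columns with a consistent count. The function name and candidate list are hypothetical.

```python
import csv

CANDIDATES = [",", ";", "\t"]


def detect_delimiter(text: str, max_rows: int = 10) -> str:
    """Guess the delimiter from the first non-blank rows; default to a comma."""
    lines = [line for line in text.splitlines() if line.strip()][:max_rows]
    best, best_columns = ",", 1
    for candidate in CANDIDATES:
        widths = {len(row) for row in csv.reader(lines, delimiter=candidate)}
        # A plausible delimiter yields the same column count on every row.
        if len(widths) == 1:
            columns = widths.pop()
            if columns > best_columns:
                best, best_columns = candidate, columns
    return best
```

Parsing with `csv.reader` instead of counting characters means a comma inside a quoted field is not mistaken for a separator. Python's standard library also offers `csv.Sniffer` for the same job.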

| Format | Inputs | Typical use case | Max file size | Images |
| --- | --- | --- | --- | --- |
| PDF | PDF (digital, or scanned with OCR) | Bring reports, studies and books into AI workflows | 100 MB | Sidecar ZIP |
| DOCX | Word documents (.docx) | Move notes and manuscripts from Word into Obsidian/Logseq | 50 MB | Sidecar ZIP |
| XLSX/ODS | Excel and OpenDocument (.xlsx, .xls, .ods) | Export multi-sheet workbooks as pipe tables | 20 MB | None |
| HTML | HTML files or pasted snippets | Bring web content and newsletters into note vaults | 25 MB | As URL references |
| CSV/TSV | CSV, TSV (comma, semicolon, tab) | Hand off data exports as compact Markdown tables | 50 MB | None |

Markdown Converter — Office Documents for AI Workflows