CONVERTER FAMILY

Markdown Converter — Office Documents for AI Workflows

Five tools that turn your files into Markdown entirely in the browser: no upload, no signup. Built for modern AI workflows with Obsidian, Logseq, Hugo and Claude Code, but just as useful for plain-text note keeping.

Which converter for which format?

| Format | Inputs | Typical use case | Max file size | Images |
| --- | --- | --- | --- | --- |
| PDF | PDF (digital + scanned with OCR) | Bring reports, studies, books into AI workflows | 100 MB | Sidecar ZIP |
| DOCX | Word documents (.docx) | Move notes and manuscripts from Word into Obsidian/Logseq | 50 MB | Sidecar ZIP |
| XLSX/ODS | Excel and OpenDocument (.xlsx, .xls, .ods) | Export multi-sheet workbooks as pipe tables | 20 MB | |
| HTML | HTML files or pasted snippets | Bring web content and newsletters into note vaults | 25 MB | As URL refs |
| CSV/TSV | CSV, TSV (comma, semicolon, tab) | Hand off data exports as compact Markdown tables | 50 MB | |

Why Markdown now?

Markdown has become the lingua franca for AI agents and personal knowledge vaults. Each of the five conversions has a clear motivation:

  • Obsidian import: PDF books and DOCX notes land in your vault with YAML frontmatter, with no plugin or external tool needed.
  • Claude Code wiki: Karpathy's LLM-wiki idea needs clean Markdown files. The converters produce the deterministic structure an LLM works well with.
  • RAG pipeline prep: Before embedding a knowledge base, convert documents into structured Markdown first, then chunk. Markdown saves 30–60% of tokens vs raw text.
  • Hugo / Astro / Jekyll: Static sites need Markdown. If the source content lives in Word, Excel or a PDF, conversion is the bridge.
  • Logseq migration: Logseq expects Markdown with its own syntax flavour, and the frontmatter template here is compatible with it.
  • Newsletter archive: Move email HTML into a searchable note collection. The HTML converter with paste mode covers that.
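
To make the "convert first, then chunk" step from the RAG item concrete, here is a minimal Python sketch that splits converted Markdown at headings. The function name `chunk_by_heading` and the chunk shape are assumptions for illustration, not part of the converters; a real pipeline would probably also cap chunk size.

```python
import re

# ATX headings: one to six '#' characters followed by whitespace and a title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")

def chunk_by_heading(markdown: str) -> list[dict]:
    """Split Markdown into one chunk per heading (hypothetical helper)."""
    chunks: list[dict] = []
    title = ""
    buffer: list[str] = []
    in_fence = False

    def flush() -> None:
        # Store the text collected under the current heading, if any.
        text = "\n".join(buffer).strip()
        if text:
            chunks.append({"title": title, "text": text})
        buffer.clear()

    for line in markdown.splitlines():
        # Lines inside fenced code blocks are never treated as headings.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            title = match.group(2).strip()
        else:
            buffer.append(line)
    flush()
    return chunks
```

Each chunk keeps its heading as `title`, so the heading can be stored as metadata next to the embedding.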

Privacy

There is no server upload. All five converters run only in your browser. You can keep using the tools offline once they have loaded. No tracking, no signup, no cookies beyond the strictly necessary.

Frequently asked questions

Why convert office documents to Markdown?
Markdown is text-based, version-controllable and parses cleanly in every AI agent and note tool. PDF and Word are binary containers an LLM has to unpack. Converting first saves tokens, speeds up answers, and makes content searchable.
Does this work offline, in the browser?
Yes, fully. Once the tool page has loaded, everything runs in the browser tab. Keep the network tab of the developer tools open to verify: there is no server round-trip and no upload.
What happens with images in PDFs and Word documents?
If the source contains images, you get two download buttons: a plain `.md` file and a ZIP with the `.md` file plus an `images/` folder. The Markdown file links images with relative paths, ready for Obsidian or Hugo imports.
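
As an illustration of that ZIP layout, the following Python sketch packs a Markdown file and an `images/` folder into one archive. The helper name `build_sidecar_zip` and the file names are hypothetical; the converters themselves run in the browser and their implementation is not shown here.

```python
import io
import zipfile

def build_sidecar_zip(md_name: str, markdown: str, images: dict[str, bytes]) -> bytes:
    """Return ZIP bytes holding the .md file plus an images/ folder (sketch)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The Markdown file sits at the archive root.
        archive.writestr(md_name, markdown)
        # Images go under images/, matching relative links in the Markdown.
        for name, data in images.items():
            archive.writestr(f"images/{name}", data)
    return buffer.getvalue()
```

A link such as `![Figure 1](images/fig1.png)` then resolves once the archive is unpacked into a vault or a Hugo content folder.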
Which converter handles Excel with multiple sheets?
The XLSX converter walks every sheet and writes each as its own H1 section with the corresponding pipe table. Merged cells keep their value in the top-left cell while the others stay empty, which preserves the table structure for reading.
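
The shape of that output can be sketched in Python as follows. It assumes the workbook has already been read into plain lists of rows, with `None` standing in for the empty cells of a merged range; the function names are hypothetical and this is not the converter's actual code.

```python
def sheet_to_markdown(name: str, rows: list[list[object]]) -> str:
    """Render one sheet as an H1 heading plus a pipe table (first row = header)."""
    width = max(len(row) for row in rows)

    def cell(value: object) -> str:
        # Empty (e.g. merged-away) cells become empty strings; pipes are escaped.
        text = "" if value is None else str(value)
        return text.replace("|", "\\|").replace("\n", " ")

    lines = [f"# {name}", ""]
    for index, row in enumerate(rows):
        padded = list(row) + [None] * (width - len(row))
        lines.append("| " + " | ".join(cell(v) for v in padded) + " |")
        if index == 0:
            # Separator row that marks the first row as the table header.
            lines.append("| " + " | ".join("---" for _ in range(width)) + " |")
    return "\n".join(lines)

def workbook_to_markdown(sheets: dict[str, list[list[object]]]) -> str:
    """Join all non-empty sheets, one H1 section per sheet."""
    return "\n\n".join(sheet_to_markdown(n, r) for n, r in sheets.items() if r)
```

Padding short rows to the widest row keeps every line of the table at the same column count, which most Markdown renderers expect.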
Do Word documents keep lists and tables intact?
Yes: lists, tables, bold/italic, headings and footnotes all transfer. Deeply nested lists are harder than flat ones and may need occasional manual cleanup; that is a known lossy property of the conversion path, not a bug.
What happens with scanned PDFs?
The PDF converter automatically detects when a PDF is image-only and offers OCR. The OCR model also runs in the browser: the first activation downloads about 7 MB once, after which the model stays local.
How large can a file be?
PDF up to 100 MB, DOCX up to 50 MB, XLSX up to 20 MB, HTML up to 25 MB, CSV up to 50 MB. On devices with less than 4 GB RAM the limits are halved so the tab does not crash.
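
The limit rule can be written down as a small Python sketch. The limits and the 4 GB threshold come from the answer above; the function name, the use of 1 MB = 1024 × 1024 bytes, and the way device memory is passed in are assumptions, since the page does not say how the tools measure RAM.

```python
from typing import Optional

MB = 1024 * 1024  # assumption: binary megabytes
LIMITS_MB = {"pdf": 100, "docx": 50, "xlsx": 20, "html": 25, "csv": 50}

def max_bytes(kind: str, device_memory_gb: Optional[float]) -> int:
    """Return the upload-free size limit in bytes for one converter (sketch)."""
    limit = LIMITS_MB[kind] * MB
    # Halve the limit on devices that report less than 4 GB of RAM.
    if device_memory_gb is not None and device_memory_gb < 4:
        limit //= 2
    return limit
```

Passing `None` for unknown memory keeps the full limit, which is one possible design choice; a more cautious tool could treat unknown memory as low.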
What is YAML frontmatter and do I need it?
Frontmatter is a small YAML block at the start of a `.md` file with fields like `title`, `created` and `source`. Obsidian, Logseq, Hugo and Astro read it automatically. The converters write frontmatter by default; untick the box to get a plain `.md` file.
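
For readers who have not seen frontmatter before, this Python sketch prepends such a block to a Markdown body. The field names `title`, `created` and `source` are the ones named above; the exact value formats (quoted strings, ISO date) and the helper name `with_frontmatter` are assumptions.

```python
from datetime import date

def with_frontmatter(body: str, title: str, source: str, created: date) -> str:
    """Prepend a YAML frontmatter block to a Markdown body (sketch)."""

    def quote(value: str) -> str:
        # Double-quoted YAML scalar with backslashes and quotes escaped.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    header = [
        "---",
        f"title: {quote(title)}",
        f"created: {created.isoformat()}",
        f"source: {quote(source)}",
        "---",
        "",
    ]
    # One blank line separates the frontmatter from the body.
    return "\n".join(header) + "\n" + body
```

Called with `title="Annual Report"`, `source="report.pdf"` and `date(2024, 5, 1)`, the file starts with a block delimited by two `---` lines, followed by a blank line and the original body.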
Are my files really not uploaded anywhere?
There is no server path. Parsing and conversion happen exclusively inside your browser tab. You can see this in the network tab of the developer tools — no data leaves your machine during conversion.
Which converter handles ODS, XLS or TSV?
The XLSX converter also accepts `.xls` and `.ods` directly — the underlying library reads both natively. `.tsv` files (tab-separated) belong to the CSV converter; it detects the delimiter automatically.
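
Automatic delimiter detection can work along the lines of the following Python sketch. It is a deliberately naive heuristic (it ignores quoted fields) and not necessarily what the CSV converter does; the function name `detect_delimiter` is hypothetical.

```python
def detect_delimiter(sample: str, candidates: str = ",;\t") -> str:
    """Guess the delimiter from the first lines of a text sample (naive sketch)."""
    lines = [line for line in sample.splitlines() if line.strip()][:20]
    best, best_score = ",", 0
    for delimiter in candidates:
        counts = [line.count(delimiter) for line in lines]
        # A real delimiter must appear in the header line.
        if not counts or counts[0] == 0:
            continue
        # Reward delimiters that occur equally often on every line.
        consistent = sum(1 for c in counts if c == counts[0])
        score = consistent * counts[0]
        if score > best_score:
            best, best_score = delimiter, score
    return best
```

Requiring the delimiter in the header line is what lets a semicolon-separated file with decimal commas (`Tea;3,50`) resolve to `;` rather than `,`.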