Markdown Converter — Office Documents for AI Workflows
Five browser-based tools that turn your files into Markdown, with no upload and no signup. Built for modern AI workflows with Obsidian, Logseq, Hugo and Claude Code, but just as useful for plain-text note keeping.
Which converter for which format?
| Format | Inputs | Typical use case | Max file size | Images |
|---|---|---|---|---|
| PDF | Digital and scanned PDFs (with OCR) | Bring reports, studies, books into AI workflows | 100 MB | Sidecar ZIP |
| DOCX | Word documents (.docx) | Move notes and manuscripts from Word into Obsidian/Logseq | 50 MB | Sidecar ZIP |
| XLSX/ODS | Excel and OpenDocument (.xlsx, .xls, .ods) | Export multi-sheet workbooks as pipe-tables | 20 MB | — |
| HTML | HTML files or paste snippets | Bring web content and newsletters into note vaults | 25 MB | as URL refs |
| CSV/TSV | CSV, TSV (comma, semicolon, tab) | Hand off data exports as compact markdown tables | 50 MB | — |
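The XLSX/ODS row mentions exporting multi-sheet workbooks as pipe tables. As a rough illustration of that output shape (one H1 section per sheet, merged cells keeping their value only in the top-left cell, as the FAQ describes), here is a minimal Python sketch. It is not the converter's actual code: the function names and the sample workbook are hypothetical, and it assumes the sheets have already been read into plain lists of rows.

```python
def escape_cell(value):
    """Render one cell; empty cells (e.g. the covered part of a merge) become ''."""
    if value is None:
        return ""
    # A literal pipe would end the cell early, so escape it; newlines become spaces.
    return str(value).replace("|", "\\|").replace("\n", " ")


def sheet_to_table(rows):
    """Turn a list of rows into a pipe table, using the first row as the header."""
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    # Pad short rows so every line has the same number of columns.
    padded = [list(row) + [None] * (width - len(row)) for row in rows]
    lines = []
    for index, row in enumerate(padded):
        lines.append("| " + " | ".join(escape_cell(cell) for cell in row) + " |")
        if index == 0:
            lines.append("|" + "---|" * width)
    return "\n".join(lines)


def workbook_to_markdown(sheets):
    """Write each sheet as its own H1 section followed by its pipe table."""
    sections = [f"# {name}\n\n{sheet_to_table(rows)}" for name, rows in sheets.items()]
    return "\n\n".join(sections) + "\n"


# Hypothetical workbook: "Q1" spans two columns in the source (a merged cell),
# so only the top-left cell carries the value and its neighbour stays empty.
workbook = {
    "Sales": [["Region", "Q1", None], ["North", 10, 12]],
    "Notes": [["Key", "Value"], ["owner", "a|b"]],
}
print(workbook_to_markdown(workbook))
```

Leaving the covered cells empty keeps every row at the same column count, which is what keeps the pipe table valid.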
Why Markdown now?
Markdown has become the lingua franca for AI agents and personal knowledge vaults. Each of the five conversions has a clear motivation:
- Obsidian import: PDF books and DOCX notes land in your vault with YAML frontmatter, with no plugin or external tool required.
- Claude Code wiki: Karpathy's LLM-wiki idea needs clean Markdown files. The converters produce a consistent, deterministic structure that LLMs handle well.
- RAG pipeline prep: Before embedding a knowledge base, convert documents into structured Markdown first, then chunk. Markdown saves 30–60% of tokens compared with raw text.
- Hugo / Astro / Jekyll: Static sites need Markdown. If the source content lives in Word, Excel or a PDF, conversion is the bridge.
- Logseq migration: Logseq expects Markdown with its own syntax flavour; the frontmatter template here is compatible.
- Newsletter archive: Move email HTML into a searchable note collection. The HTML converter's paste mode covers that.
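The RAG prep item suggests converting first and chunking afterwards. A minimal Python sketch of that downstream step might look like the following: it separates the frontmatter block from the body, then starts a new chunk at every heading. The function names and the sample document are hypothetical, and the frontmatter parsing only handles flat `key: value` lines; a real pipeline would use a proper YAML parser.

```python
import re

HEADING = re.compile(r"^#{1,6}\s+(.*)$")
FENCE = "`" * 3  # code fence marker, built this way to keep the example readable


def split_frontmatter(text):
    """Separate a leading '---' delimited block from the body.

    Only flat 'key: value' lines are parsed here.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            meta = {}
            for line in lines[1:end]:
                key, sep, value = line.partition(":")
                if sep:
                    meta[key.strip()] = value.strip().strip('"')
            return meta, "\n".join(lines[end + 1:])
    return {}, text  # no closing delimiter: treat everything as body


def chunk_by_heading(body):
    """Start a new chunk at every Markdown heading outside code fences."""
    chunks, title, buffer, in_fence = [], None, [], False

    def flush():
        # Skip a leading chunk that has neither a title nor any text.
        if title is not None or any(line.strip() for line in buffer):
            chunks.append({"title": title, "text": "\n".join(buffer).strip()})

    for line in body.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            title, buffer = match.group(1).strip(), []
        else:
            buffer.append(line)
    flush()
    return chunks


# Hypothetical converter output with the default frontmatter fields.
doc = """---
title: Annual report
created: 2024-01-15
source: report.pdf
---
Intro paragraph.

# Summary
Revenue grew.

## Details
More text.
"""

meta, body = split_frontmatter(doc)
for chunk in chunk_by_heading(body):
    print(meta.get("source"), "|", chunk["title"], "|", chunk["text"])
```

Carrying the `source` field along with each chunk makes it possible to trace an embedded passage back to the original document later.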
Tools at a glance
- PDF: Bring reports, studies, books into AI workflows
- DOCX: Move notes and manuscripts from Word into Obsidian/Logseq
- XLSX/ODS: Export multi-sheet workbooks as pipe-tables
- HTML: Bring web content and newsletters into note vaults
- CSV/TSV: Hand off data exports as compact markdown tables
Privacy
There is no server upload. All five converters run only in your browser. You can keep using the tools offline once they have loaded. No tracking, no signup, no cookies beyond the strictly necessary.
Frequently asked questions
- Why convert office documents to Markdown?
- Markdown is text-based, version-controllable and parses cleanly in every AI agent and note tool. PDF and Word are binary containers an LLM has to unpack. Converting first saves tokens, speeds up answers, and makes content searchable.
- Does this work offline or in the browser?
- Yes, fully. Once the tool page has loaded, everything runs in the browser tab. You can keep the network tab of the developer tools open and verify that there is no server round-trip and no upload.
- What happens with images in PDFs and Word documents?
- If the source contains images, you get two download buttons: a plain `.md` file and a ZIP with `.md` plus an `images/` folder. The markdown file links images relatively — ready for Obsidian or Hugo imports.
- Which converter handles Excel with multiple sheets?
- The XLSX converter walks every sheet and writes each as its own H1 section with the corresponding pipe table. Merged cells keep their value in the top-left cell while the others stay empty, which preserves the table structure for reading.
- Do Word documents keep lists and tables intact?
- Yes: lists, tables, bold/italic, headings and footnotes all transfer. Deeply nested lists are harder than flat ones, so occasional manual cleanup may be needed. That is a known lossy property of the conversion path, not a bug.
- What happens with scanned PDFs?
- The PDF converter automatically detects when a PDF is image-only and offers OCR. The OCR model runs in the browser too — first activation downloads about 7 MB once, then stays local.
- How large can a file be?
- PDF up to 100 MB, DOCX up to 50 MB, XLSX up to 20 MB, HTML up to 25 MB, CSV up to 50 MB. On devices with less than 4 GB RAM the limits are halved so the tab does not crash.
- What is YAML frontmatter and do I need it?
- Frontmatter is a small YAML block at the start of a `.md` file with fields like `title`, `created` and `source`. Obsidian, Logseq, Hugo and Astro read it automatically. The converters write frontmatter by default — you can untick the box for a plain `.md`.
- Are my files really not uploaded anywhere?
- There is no server path. Parsing and conversion happen exclusively inside your browser tab. You can see this in the network tab of the developer tools — no data leaves your machine during conversion.
- Which converter handles ODS, XLS or TSV?
- The XLSX converter also accepts `.xls` and `.ods` directly — the underlying library reads both natively. `.tsv` files (tab-separated) belong to the CSV converter; it detects the delimiter automatically.
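To make the delimiter auto-detection in the last answer more concrete, here is a minimal Python sketch of one possible heuristic. It is an assumption, not the converter's actual method: it tries comma, semicolon and tab, and picks the one that splits the first lines into the same number of columns (more than one) on every row. The function names are hypothetical, and quoted fields that contain line breaks are not handled in the sampling step.

```python
import csv
import io

CANDIDATES = [",", ";", "\t"]


def detect_delimiter(text, sample_lines=10):
    """Pick the candidate that splits the sample into the most consistent columns."""
    sample = "\n".join(text.splitlines()[:sample_lines])
    best, best_width = ",", 1  # fall back to comma when nothing fits
    for delimiter in CANDIDATES:
        rows = [r for r in csv.reader(io.StringIO(sample), delimiter=delimiter) if r]
        widths = {len(r) for r in rows}
        # A plausible delimiter yields the same column count on every row.
        if len(widths) == 1:
            width = widths.pop()
            if width > best_width:
                best, best_width = delimiter, width
    return best


def to_pipe_table(text):
    """Convert delimited text into a pipe table, first row as header."""
    delimiter = detect_delimiter(text)
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter) if r]
    if not rows:
        return ""
    width = max(len(r) for r in rows)
    lines = []
    for index, row in enumerate(rows):
        # Escape literal pipes and pad short rows to the full width.
        cells = [c.strip().replace("|", "\\|") for c in row] + [""] * (width - len(row))
        lines.append("| " + " | ".join(cells) + " |")
        if index == 0:
            lines.append("|" + "---|" * width)
    return "\n".join(lines)


# Tab-separated input whose values contain commas (decimal commas, for example).
print(to_pipe_table("name\tscore\nAda\t1,5\n"))
```

Requiring a consistent column count is what keeps the decimal comma in `1,5` from being mistaken for a delimiter in the tab-separated example.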