How do you use this tool?
- Drag a DOCX file in or use the file picker — up to 25 MB per file
- Check the live preview and adjust options if needed
- Click 'Download Markdown'; multiple files come back as a ZIP with an audit report
Why convert DOCX to Markdown?
Microsoft Word documents are the standard format for business and knowledge content. As long as they stay in Word, everything is fine — but the moment a second platform enters the picture (Obsidian vault, Hugo site, Claude Code documentation, RAG index, Logseq graph), DOCX becomes a dead end. Markdown is plain-text based, diffable, readable in every editor, and the format that almost every AI and knowledge pipeline in 2026 understands natively.
This tool builds the bridge. Drop in a .docx file, and the tool opens the Office Open XML archive in the browser, reads headings, paragraphs, lists, tables and inline formatting, and emits GitHub Flavored Markdown.
How does the conversion work?
DOCX files are ZIP archives containing XML documents — that’s Office Open XML, the Word standard format since 2007. An established open-source library unzips the archive and walks the XML structure in a two-stage pass: first into semantic HTML (with heading levels, lists, tables and inline formats preserved), then from HTML into Markdown.
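The "ZIP archive containing XML" idea can be seen in a few lines of code. The sketch below is illustrative only and is not the tool's implementation, which runs in the browser: it uses Python's standard library, builds a tiny stand-in archive in memory (not a complete, valid DOCX), and reads paragraph text straight from word/document.xml instead of going through the HTML stage. The function names are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_sample_archive() -> bytes:
    """Build a tiny ZIP holding only word/document.xml (illustration, not a valid DOCX)."""
    document_xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Quarterly briefing</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Revenue grew </w:t></w:r><w:r><w:t>4%.</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", document_xml)
    return buffer.getvalue()


def read_paragraphs(docx_bytes: bytes) -> list[str]:
    """Unzip the archive and join the text runs (w:t) of each paragraph (w:p)."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        text = "".join(t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs


print(read_paragraphs(build_sample_archive()))
```

A real document adds styles, numbering, relationships and media parts on top of this, which is why a dedicated library handles the full walk.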
Word styles are mapped to Markdown syntax: Heading 1 → #, Heading 2 → ##, bullet lists → -, numbered lists → 1., bold → ** and italic → *. Images are extracted from the word/media/ folder of the ZIP and saved next to the Markdown file as referenced files.
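As a rough illustration of that mapping, the following Python sketch turns a paragraph style plus a list of text runs into one Markdown line. The style names in the table, the run tuple layout and the function names are assumptions made for this example, not the tool's actual data model.

```python
# Hypothetical lookup from Word paragraph style names to Markdown line prefixes.
STYLE_PREFIXES = {
    "Heading 1": "# ",
    "Heading 2": "## ",
    "List Bullet": "- ",
    "List Number": "1. ",
}


def render_runs(runs):
    """Wrap bold runs in ** and italic runs in *; runs are (text, bold, italic) tuples."""
    parts = []
    for text, bold, italic in runs:
        if bold:
            text = f"**{text}**"
        if italic:
            text = f"*{text}*"
        parts.append(text)
    return "".join(parts)


def render_paragraph(style, runs):
    """Prefix the rendered runs according to the style; unknown styles get no prefix."""
    return STYLE_PREFIXES.get(style, "") + render_runs(runs)


print(render_paragraph("Heading 1", [("Overview", False, False)]))
print(render_paragraph("Normal", [("Use ", False, False), ("bold", True, False), (".", False, False)]))
```

Every numbered item is emitted as 1. here; Markdown renderers renumber ordered lists on display, so the output still reads correctly.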
What is the tool actually used for?
- Obsidian wiki from Word briefings. Briefings, onboarding docs, internal guides become Markdown files with backlinks.
- Claude Code wiki from architecture docs. Architecture specs in Word end up as .md files alongside the code modules.
- Hugo / Astro content migration. Existing Word content libraries become static-site content.
- RAG index preparation. Markdown chunks cleanly along heading boundaries — DOCX structures would have to be extracted first.
- Logseq block import. Heading hierarchies turn into Logseq block structures.
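To make the RAG point above concrete, here is a minimal Python sketch of splitting converted Markdown along heading boundaries. It is an assumption about how a downstream pipeline might chunk the output, not part of this tool. The function name and the chunk dictionary layout are hypothetical, and the sketch does not special-case fenced code blocks that contain lines starting with #.

```python
import re

# ATX heading: one to six # characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_heading(markdown: str) -> list[dict]:
    """Split Markdown into one chunk per heading; text before the first heading is kept too."""
    chunks = []
    current = {"heading": None, "level": 0, "lines": []}

    def flush():
        # Keep a chunk if it has a heading or any non-blank body text.
        if current["heading"] is not None or any(l.strip() for l in current["lines"]):
            chunks.append({
                "heading": current["heading"],
                "level": current["level"],
                "text": "\n".join(current["lines"]).strip(),
            })

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            current = {
                "heading": match.group(2).strip(),
                "level": len(match.group(1)),
                "lines": [],
            }
        else:
            current["lines"].append(line)
    flush()
    return chunks


sample = "# Intro\nHello.\n\n## Details\nMore text.\n"
for chunk in chunk_by_heading(sample):
    print(chunk)
```

Keeping the heading level on each chunk lets an indexer rebuild the section path later if it wants to attach parent headings as context.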
What stays — what doesn’t?
Preserved: heading hierarchies (# through ######), paragraphs,
ordered and unordered lists with nesting, simple tables as GFM pipe
tables, inline formats (bold, italic, strikethrough, inline code),
hyperlinks with anchor text, images as external file references,
footnotes and endnotes.
Deliberately dropped: tracked changes, comments, formatting features without a Markdown equivalent (custom fonts, text colors, specific pixel spacings), headers and footers, page margins, Word form fields, embedded objects (OLE embeds, Excel inserts).
Markdown has a deliberately minimal formatting palette — what Word offers goes far beyond that, and much of it is noise inside a knowledge base anyway. We convert the meaningful, not the visual.
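For the "simple tables as GFM pipe tables" case, the output shape can be sketched as follows. This is a hedged Python illustration with a hypothetical function name. It treats the first row as the header, pads short rows, and escapes pipe characters, and it makes no claim about how the tool handles merged cells or nested tables.

```python
def to_pipe_table(rows):
    """Render a list of rows (first row = header) as a GFM pipe table."""
    header, *body = rows
    width = len(header)

    def fmt(cells):
        # Pad short rows, escape literal pipes, and flatten line breaks inside cells.
        cells = list(cells) + [""] * (width - len(cells))
        escaped = [str(c).replace("|", "\\|").replace("\n", " ") for c in cells]
        return "| " + " | ".join(escaped) + " |"

    lines = [fmt(header), "| " + " | ".join("---" for _ in header) + " |"]
    lines += [fmt(row) for row in body]
    return "\n".join(lines)


print(to_pipe_table([["Name", "Qty"], ["Bolt", "4"], ["Nut"]]))
```

The delimiter row of dashes is what makes a GFM parser recognize the block as a table, so it is emitted even when the table has no body rows.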
How does the tool keep my Word document private?
Most free DOCX-to-Markdown services upload your Word file to a server. For confidential briefings, contracts, internal drafts or HR records that is rarely acceptable. Even if the provider claims to delete the file after 24 hours — the upload itself is the issue.
None of that happens here. The DOCX archive is unzipped and parsed
inside the browser tab via web standards (fetch, WebAssembly,
native HTML parser). Open the Network panel of your developer tools and
verify: no file leaves your machine. The conversion itself also runs
fully offline after the first page load.
Which related converters exist?
This tool is part of the Markdown converter family:
- PDF to Markdown — including scanned PDFs via OCR.
- HTML to Markdown — file or paste mode.
- XLSX to Markdown — Excel including XLS and ODS.
- CSV to Markdown — also TSV with delimiter auto-detection.