Runs local · no upload

HTML to Markdown

Converts HTML files or pasted snippets into Markdown, with a sanitize pass and GFM output.

How It Works

  1. Bring HTML in

    Drag and drop a file, or paste HTML directly into the paste box. Both modes go through the same conversion path.

  2. Check sanitize options

    Scripts and inline handlers are stripped by default. External resource links stay as Markdown links.

  3. Take the Markdown

    A live preview shows the result. Copy to clipboard with the button or download as `.md`.

Privacy

There is no server path. HTML is parsed, sanitized and turned into Markdown inside your browser tab. The paste box also doesn't talk to any servers — content stays in the tab.

HTML is everywhere: in web exports, email source, note-app backups, scraped content. Markdown is the format in which knowledge gets maintained long-term. This tool takes HTML, either as a file or via paste, and returns clean Markdown. Scripts, <style> blocks, inline event handlers and iframes are stripped during the sanitize pass; heading hierarchies, lists, tables and inline formats stay. Everything runs in your browser tab.

01 — How to Use

How do you use this tool?

  1. Drag an HTML file in — or paste HTML source directly into the paste box
  2. Optional: review sanitize options (strip scripts, keep external images?)
  3. Click 'Convert' and download the Markdown or grab it via the copy button

Why convert HTML to Markdown?

HTML is the source-code format of the web, and very often the form in which you find knowledge: an email newsletter that’s only searchable as HTML source; a note backup from Evernote or OneNote; a scraped article; a CMS post export. Markdown, by contrast, is the format in which you maintain knowledge long-term: diffable, plain-text, readable in any editor, natively understood by Obsidian and Logseq.

This tool builds the bridge. Drop in an HTML file or paste HTML source code; the tool parses the DOM structure, strips scripts and inline handlers, and writes GitHub Flavored Markdown with headings, lists, tables and inline formats preserved.

How does the conversion work technically?

HTML is parsed into a DOM tree via the browser’s native HTML parser — the same mechanism every web page uses internally, but in sandboxed mode without script execution. A sanitize pass strips <script> tags, <style> blocks, inline event handlers, and <iframe> embeds before further processing. That makes it safe to convert even scraped or foreign HTML files.
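To make the sanitize pass concrete, here is a minimal sketch in Python using only the standard library's `html.parser`. It is an illustration, not the tool's actual code: the tool runs in the browser, and the names `Sanitizer`, `STRIPPED_TAGS` and `sanitize` are made up for this example. It also assumes that "inline event handlers" means attributes whose names start with `on`.

```python
from html import escape
from html.parser import HTMLParser

# Elements dropped together with everything inside them (assumed list,
# taken from the sanitize description above).
STRIPPED_TAGS = {"script", "style", "iframe"}


class Sanitizer(HTMLParser):
    """Re-emits HTML while dropping stripped elements and on* attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a stripped element

    def _start(self, tag, attrs):
        # html.parser lowercases attribute names, so a prefix check is enough.
        kept = [(k, v) for k, v in attrs if not k.startswith("on")]
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in kept
        )
        return f"<{tag}{rendered}>"

    def handle_starttag(self, tag, attrs):
        if tag in STRIPPED_TAGS:
            self.skip_depth += 1
        elif not self.skip_depth:
            self.out.append(self._start(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: emit once, never open a skip region.
        if tag not in STRIPPED_TAGS and not self.skip_depth:
            self.out.append(self._start(tag, attrs))

    def handle_endtag(self, tag):
        if tag in STRIPPED_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_source: str) -> str:
    parser = Sanitizer()
    parser.feed(html_source)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)
```

With this sketch, `sanitize('<p onclick="x()">Hi</p><script>alert(1)</script>')` returns `<p>Hi</p>`. A production sanitizer would need to cover more cases (for example `javascript:` URLs), which this example leaves out.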

A proven open-source HTML-to-Markdown library translates the DOM tree into Markdown: headings (<h1> → #, <h2> → ##), paragraphs (<p> → empty line between blocks), lists (<ul> → -, <ol> → 1.), tables (<table> → GFM pipe tables), inline formats (<strong> → **, <em> → *, <code> → `).
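The mapping above can be sketched in a few lines of Python. This is a toy converter built on the standard library's `html.parser`, not the library the tool uses (the source does not name it). The names `MarkdownSketch` and `to_markdown` are hypothetical, and the sketch covers only headings, paragraphs, flat lists and the three inline formats; nested lists, links and tables are left out.

```python
from html.parser import HTMLParser

INLINE_MARKERS = {"strong": "**", "b": "**", "em": "*", "i": "*", "code": "`"}
HEADINGS = {f"h{n}": "#" * n for n in range(1, 7)}


class MarkdownSketch(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []       # finished Markdown blocks
        self.current = []      # text fragments of the block being built
        self.list_kind = None  # "ul", "ol", or None outside a list
        self.items = []        # rendered items of the open list

    def _take_text(self):
        # Collapse runs of whitespace, as HTML rendering would.
        text = " ".join("".join(self.current).split())
        self.current = []
        return text

    def _flush(self, prefix=""):
        text = self._take_text()
        if text:
            self.blocks.append(prefix + text)

    def handle_starttag(self, tag, attrs):
        if tag in INLINE_MARKERS:
            self.current.append(INLINE_MARKERS[tag])
        elif tag in ("ul", "ol"):
            self._flush()  # loose text before the list becomes a paragraph
            self.list_kind, self.items = tag, []

    def handle_endtag(self, tag):
        if tag in INLINE_MARKERS:
            self.current.append(INLINE_MARKERS[tag])
        elif tag in HEADINGS:
            self._flush(HEADINGS[tag] + " ")
        elif tag == "p":
            self._flush()
        elif tag == "li" and self.list_kind:
            marker = "-" if self.list_kind == "ul" else f"{len(self.items) + 1}."
            self.items.append(f"{marker} {self._take_text()}")
        elif tag in ("ul", "ol") and self.list_kind:
            self.blocks.append("\n".join(self.items))
            self.list_kind, self.items = None, []

    def handle_data(self, data):
        self.current.append(data)


def to_markdown(html_source: str) -> str:
    parser = MarkdownSketch()
    parser.feed(html_source)
    parser.close()
    parser._flush()  # trailing text outside any block element
    return "\n\n".join(parser.blocks)
```

Blocks are joined with an empty line, which is how the paragraph rule (<p> → empty line between blocks) shows up in the output.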

What are typical use cases?

  • Archive email newsletters. Newsletters in HTML format come into your Obsidian vault as Markdown.
  • Prepare web content for AI prompts. A long article becomes Markdown that fits inside your Claude or GPT prompt.
  • Note migration from Evernote / OneNote. HTML note exports land as clean .md files in your new note system.
  • Blog migration to Hugo / Astro. Existing HTML posts become Markdown posts with frontmatter that static-site generators understand.
  • Wiki content from Confluence exports. HTML exports from Confluence/SharePoint become Markdown pages for Notion alternatives.
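For the blog-migration case, the source does not say which frontmatter fields are written. As a rough sketch of what YAML frontmatter in front of a converted post can look like, the following Python helper prepends a `---` delimited block. The function name `with_frontmatter` and the fields `title` and `date` are assumptions chosen for illustration.

```python
import json
from datetime import date


def with_frontmatter(markdown: str, title: str, published: date) -> str:
    """Prepend a YAML frontmatter block (hypothetical fields) to Markdown."""
    lines = [
        "---",
        # A JSON string literal is also a valid YAML double-quoted string,
        # so json.dumps takes care of quoting and escaping.
        f"title: {json.dumps(title, ensure_ascii=False)}",
        f"date: {published.isoformat()}",
        "---",
        "",
    ]
    return "\n".join(lines) + "\n" + markdown
```

Calling `with_frontmatter("# Hello", "My post", date(2024, 1, 15))` yields a document that starts with the `---` block, followed by an empty line and the Markdown body.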

What stays — what doesn’t?

Preserved: heading hierarchies (<h1> through <h6>), paragraphs, ordered and unordered lists with nesting, tables without merges as GFM pipe tables, inline formats (bold, italic, inline code, strikethrough), hyperlinks with anchor text, block quotes, code blocks with language hint, images as Markdown references (original src is kept).

Deliberately stripped: scripts, <style> blocks, inline event handlers, external iframes (sanitize pass). That’s non-negotiable — especially with scraped or clipped HTML, it protects against hidden script payloads.

Not 1:1 convertible: tables with colspan/rowspan (flagged with a hint block, because Markdown pipe tables don’t support cell merging), CSS layout tricks, JavaScript-generated content (scripts are stripped and never executed, so content that JS would generate is not rendered).
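The table rule can be illustrated with a short Python sketch: simple tables become GFM pipe tables, and any cell with a colspan or rowspan other than 1 makes the function give up. The names `TableReader` and `table_to_gfm` are hypothetical, and returning `None` is an assumption standing in for whatever hint block the tool actually emits.

```python
from html.parser import HTMLParser


class TableReader(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.rows = []           # list of rows, each a list of cell texts
        self.cell = None         # text fragments while inside <td>/<th>
        self.has_merges = False  # True once a colspan/rowspan != 1 is seen

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self.cell = []
            for name, value in attrs:
                if name in ("colspan", "rowspan") and (value or "1").strip() != "1":
                    self.has_merges = True

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.cell is not None:
            if self.rows:
                self.rows[-1].append(" ".join("".join(self.cell).split()))
            self.cell = None

    def handle_data(self, data):
        if self.cell is not None:
            self.cell.append(data)


def table_to_gfm(table_html: str):
    """Return a GFM pipe table, or None if the table has merged cells."""
    reader = TableReader()
    reader.feed(table_html)
    reader.close()
    if reader.has_merges or not reader.rows:
        return None  # caller decides how to flag the unconvertible table

    def line(cells):
        # A literal pipe inside a cell must be escaped in GFM tables.
        return "| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |"

    header, *body = reader.rows
    separator = "| " + " | ".join(["---"] * len(header)) + " |"
    return "\n".join([line(header), separator] + [line(row) for row in body])
```

The first row is treated as the header because a GFM pipe table always needs one; a table without <th> cells still converts, with its first row promoted.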

How does the tool keep my HTML private?

When you’re parsing an HTML export from a private note app, the last thing you want is the content going to a foreign server. Newsletter archives can also carry confidential tracking IDs or personal greeting fields embedded in the HTML.

None of that happens here. The HTML is parsed and converted to Markdown inside the browser tab via web standards (native HTML parser, WebAssembly). Open the Network panel of your developer tools and watch: no request, no upload, no server communication. The paste box runs entirely client-side as well.
