
Markdown to HTML: The Conversion Gotchas That Cost Me an Afternoon

Markdown is almost a standard. The "almost" is where the problems live. Different parsers handle tables, nested lists, code blocks, and inline HTML differently enough that converting a large markdown document can surface a dozen rendering bugs you didn't expect.

The conversion that started this

I was migrating a documentation site from a Confluence wiki to a static site generator. The export gave me a folder of Markdown files generated by a Confluence-to-Markdown tool. The static site generator used a different Markdown parser. About 15% of the pages had rendering issues — tables that didn't render, code blocks that merged with adjacent text, links that broke.

I caught them by running each file through a browser-based Markdown converter and checking the HTML output before publishing, so the issues never went live. This is the list of what I found.

Why Markdown isn't actually a standard

John Gruber's original Markdown spec from 2004 left many edge cases unspecified. Different specs and implementations (CommonMark, GitHub Flavored Markdown, Pandoc, MultiMarkdown, Python-Markdown, marked.js) have made different choices for the ambiguous cases. CommonMark was created in 2014 specifically to pin those cases down in an unambiguous spec, and GitHub Flavored Markdown (GFM) extends CommonMark with tables, task lists, strikethrough, and autolinks.

If you write Markdown for one system and render it in another, the differences in parser behavior will cause inconsistencies.

The five gotchas I ran into

1. Tables: outer pipes are optional in GFM, but not in every parser

GFM recommends, but does not require, a pipe character at the start and end of each row. Writing both is the most portable form:

| Column 1 | Column 2 |
|----------|----------|
| Cell 1   | Cell 2   |

Parsers differ in how lenient they are. GFM itself accepts rows without outer pipes; stricter parsers reject them. Others require the delimiter row (the one under the header) to have at least three dashes per column (---). Confluence-exported Markdown sometimes omits the leading pipe, and my target parser rendered every such table on the migrated site as plain text paragraphs instead of an HTML table. Fix: add the leading pipe to every row, including the delimiter row.
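For a folder of exported files, adding the pipes by hand gets tedious. Here is a minimal sketch of how the fix could be scripted. The function names are my own, and the table detection is a heuristic (a row containing a pipe, directly followed by a delimiter row), so it assumes tables are not sitting inside code blocks.

```python
import re

# A delimiter row such as "---|---", "|:--|--:|", or "| --- | --- |".
DELIMITER_ROW = re.compile(r"^\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)+\|?\s*$")


def add_outer_pipes(line: str) -> str:
    """Wrap a table row in leading and trailing pipes if they are missing."""
    row = line.strip()
    if not row.startswith("|"):
        row = "| " + row
    # A row ending in an escaped pipe (\|) still needs a real closing pipe.
    if not row.endswith("|") or row.endswith("\\|"):
        row = row + " |"
    return row


def normalize_tables(markdown: str) -> str:
    """Add outer pipes to every row of every detected table."""
    lines = markdown.split("\n")
    out = []
    in_table = False
    for i, line in enumerate(lines):
        next_is_delimiter = i + 1 < len(lines) and bool(
            DELIMITER_ROW.match(lines[i + 1])
        )
        if not in_table and "|" in line and next_is_delimiter:
            in_table = True  # header row found
        elif in_table and "|" not in line:
            in_table = False  # a blank or pipe-free line ends the table
        out.append(add_outer_pipes(line) if in_table else line)
    return "\n".join(out)
```

Ordinary sentences that happen to contain a pipe are left alone, because they are not followed by a delimiter row.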

2. Fenced code blocks: backtick count matters

Code blocks delimited by triple backticks work in most parsers. But if the code inside the block contains triple backticks (common in documentation about Markdown itself), you need to use more backticks in the fence:

````
This block contains ``` inside
````

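Under CommonMark, the closing fence must use the same character as the opening fence and be at least as long, so a four-backtick fence is not closed by three backticks inside it. Parsers that ignore the fence length will close the code block at the first triple backtick inside the content, causing the rest to render as normal text.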
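If you generate Markdown programmatically, the fence length can be derived from the content instead of guessed. A small sketch, with helper names of my own choosing:

```python
import re


def fence_for(code: str) -> str:
    """Return a backtick fence longer than any backtick run inside `code`."""
    longest = max((len(run) for run in re.findall(r"`+", code)), default=0)
    return "`" * max(3, longest + 1)


def wrap_in_fence(code: str, info: str = "") -> str:
    """Wrap `code` in a fence that its own content cannot close early."""
    fence = fence_for(code)
    return f"{fence}{info}\n{code}\n{fence}"
```

Content with no backticks gets the usual three-backtick fence; content that contains a triple backtick gets four. This only helps with parsers that honor fence length, as CommonMark-compliant ones do.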

3. Nested lists require consistent indentation

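CommonMark nests a list item when it is indented at least to the column where the parent item's text starts: 2 spaces under "- ", 3 spaces under "1. ". Parsers that follow the original Markdown rules, such as Python-Markdown, expect 4 spaces per level. Confluence exports sometimes use 3 spaces for nesting, which is valid in some parsers and not in others. The symptom: a list that looks correct in the original system renders as a flat, unnested list in the destination.

Fix: before migration, standardize all list indentation to what the target parser expects. 4 spaces per level is accepted by most parsers; 2 spaces works for bullet lists under CommonMark. Never mix indentation widths in the same list.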
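Standardizing indentation can be scripted too. The sketch below is one possible approach, not a full Markdown parser: it tracks the indent widths it has seen on list-item lines and rewrites each item to a fixed width per nesting level. It only touches list-item lines, so lists with indented continuation paragraphs or code blocks (and list-like lines inside code) still need a manual look. The function name and the `width` parameter are my own.

```python
import re

# Leading spaces, then a bullet or ordered marker, then the item text.
LIST_ITEM = re.compile(r"^( *)((?:[-*+]|\d+[.)])\s+.*)$")


def normalize_list_indent(markdown: str, width: int = 4) -> str:
    """Re-indent nested list items to `width` spaces per nesting level."""
    out = []
    stack = []  # indent widths seen for each open nesting level
    for line in markdown.split("\n"):
        match = LIST_ITEM.match(line)
        if not match:
            if line.strip():
                stack = []  # non-list text ends the current list
            out.append(line)
            continue
        indent = len(match.group(1))
        # Dedent: close every level that is deeper than this item.
        while stack and indent < stack[-1]:
            stack.pop()
        # Indent: open a new level when this item is deeper than the last.
        if not stack or indent > stack[-1]:
            stack.append(indent)
        level = len(stack) - 1
        out.append(" " * (width * level) + match.group(2))
    return "\n".join(out)
```

Run against a 3-space Confluence-style list, each nested item moves to 4 spaces per level, or 2 with `width=2`.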

4. Inline HTML: sanitized or passed through?

Markdown allows inline HTML, and most parsers pass it through to the rendered output. But some parsers (particularly those used in public-facing CMSes) sanitize or strip HTML tags for security reasons. If your Markdown contains inline <div>, <span>, or custom HTML elements and they disappear in the rendered output, the parser is sanitizing them.

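This is a deliberate security choice, not a bug. If you need raw HTML in your output, you need a parser that passes it through, and you need to trust your input source. marked.js, for example, leaves HTML untouched by default; its old sanitize option was deprecated and later removed, and its documentation recommends running the output through a separate sanitizer such as DOMPurify when the input is untrusted.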
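Before migrating, it helps to know which files depend on raw HTML at all. The following is a rough sketch using only regular expressions, so treat the counts as a triage aid rather than a precise parse. It ignores fenced blocks, inline code spans, and autolinks like <https://example.com>; the names are hypothetical.

```python
import re
from collections import Counter

# A fenced block: an opening fence, anything, then the same fence on its own line.
FENCED_BLOCK = re.compile(r"^(`{3,}|~{3,}).*?^\1[ \t]*$", re.MULTILINE | re.DOTALL)
INLINE_CODE = re.compile(r"`[^`\n]*`")
# An opening tag: the name must be followed by whitespace, "/" or ">",
# which excludes autolinks such as <https://...> and <user@example.com>.
OPEN_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9-]*)(?=[\s/>])[^<>]*>")


def raw_html_tags(markdown: str) -> Counter:
    """Count opening HTML tag names that appear outside code."""
    text = FENCED_BLOCK.sub("", markdown)
    text = INLINE_CODE.sub("", text)
    return Counter(name.lower() for name in OPEN_TAG.findall(text))
```

Any file where this returns a non-empty counter is worth previewing in the target parser, since those are the elements a sanitizer would strip.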

5. Line breaks: trailing spaces vs backslash

In original Markdown, a hard line break within a paragraph requires two trailing spaces at the end of the line. CommonMark also supports a backslash at the end of the line. Many editors strip trailing whitespace automatically, which silently removes hard line breaks from Markdown files.

If your Markdown has prose that should break at specific points (addresses, poems, code examples written as prose) and the line breaks are disappearing in the rendered output, trailing spaces were stripped. The fix is to use the backslash line break (a \ at the end of the line) instead of trailing spaces, since backslashes survive editor formatting. Check that the target parser supports it first: the backslash break comes from CommonMark, not from the original Markdown syntax.
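The conversion has to happen before any editor strips the trailing spaces. A minimal sketch, assuming the target parser supports CommonMark backslash breaks: it skips fenced code blocks (honoring fence length) but does not recognize indented code blocks, and the function name is my own.

```python
import re

FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})")


def spaces_to_backslash_breaks(markdown: str) -> str:
    """Replace two-or-more trailing spaces with a backslash hard break."""
    lines = markdown.split("\n")
    out = []
    open_fence = None  # the fence string that opened the current code block
    for i, line in enumerate(lines):
        fence = FENCE.match(line)
        if open_fence is not None:
            # Inside a code block: copy verbatim until a matching closing fence.
            closes = (
                fence is not None
                and fence.group(1)[0] == open_fence[0]
                and len(fence.group(1)) >= len(open_fence)
                and line.strip() == fence.group(1)
            )
            if closes:
                open_fence = None
            out.append(line)
            continue
        if fence:
            open_fence = fence.group(1)
            out.append(line)
            continue
        # A hard break only matters when more text follows in the same paragraph.
        next_has_text = i + 1 < len(lines) and lines[i + 1].strip() != ""
        if line.endswith("  ") and line.strip() and next_has_text:
            line = line.rstrip(" ") + "\\"
        out.append(line)
    return "\n".join(out)
```

Trailing spaces on the last line of a paragraph are left as they are, since they do not produce a break anyway.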

Checking output before publishing

The fastest way to verify Markdown renders correctly: paste it into a Markdown to HTML converter, ideally one built on the same parser as your target, and switch between the rendered preview and raw HTML output. The raw HTML shows you exactly what the parser produced, including any wrapping tags, attribute handling, and element nesting.

For a large migration (dozens or hundreds of files), write a test script that runs each file through the target parser and checks for:

  • Files where the output contains fewer HTML elements than expected (a sign that a list or table failed to parse)
  • Files where <pre> or <code> tags appear in unexpected positions (code block boundary misalignment)
  • Files shorter than expected (truncation from an unclosed block)
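Here is a sketch of what such a test script could look like. The `render` argument is a stand-in for whatever function calls your target parser (Markdown string in, HTML string out); everything else uses the Python standard library. The checks are heuristics and the names and the 0.5 threshold are my own assumptions: source lines that merely look like list items or table delimiters inside code blocks will produce false positives, so treat the report as a list of files to eyeball.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from pathlib import Path
from typing import Callable, Dict, List

LIST_ITEM = re.compile(r"^\s*(?:[-*+]|\d+[.)])\s+\S")
DELIMITER_ROW = re.compile(r"^\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)+\|?\s*$")
FENCE_TEXT = "`" * 3  # a literal code fence that leaked into rendered text


def alnum_count(text: str) -> int:
    return sum(ch.isalnum() for ch in text)


class OutputStats(HTMLParser):
    """Collects tag counts, visible text size, and fences outside <pre>."""

    def __init__(self) -> None:
        super().__init__()
        self.tags = Counter()
        self.alnum = 0
        self.pre_depth = 0
        self.leaked_fences = 0

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1
        if tag == "pre":
            self.pre_depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self.pre_depth:
            self.pre_depth -= 1

    def handle_data(self, data):
        self.alnum += alnum_count(data)
        if self.pre_depth == 0 and FENCE_TEXT in data:
            self.leaked_fences += 1


def check_page(markdown: str, render: Callable[[str], str],
               min_text_ratio: float = 0.5) -> List[str]:
    """Return a list of suspected rendering problems for one page."""
    stats = OutputStats()
    stats.feed(render(markdown))
    stats.close()
    lines = markdown.split("\n")
    problems = []

    expected_tables = sum(1 for line in lines if DELIMITER_ROW.match(line))
    if stats.tags["table"] < expected_tables:
        problems.append(f"expected {expected_tables} table(s), got {stats.tags['table']}")

    expected_items = sum(1 for line in lines if LIST_ITEM.match(line))
    if stats.tags["li"] < expected_items:
        problems.append(f"expected {expected_items} list item(s), got {stats.tags['li']}")

    if stats.leaked_fences:
        problems.append("code fence rendered as text outside <pre>")

    if stats.alnum < min_text_ratio * alnum_count(markdown):
        problems.append("output text is much shorter than the source")
    return problems


def check_folder(root: str, render: Callable[[str], str]) -> Dict[str, List[str]]:
    """Run check_page over every .md file under `root`."""
    report = {}
    for path in sorted(Path(root).rglob("*.md")):
        problems = check_page(path.read_text(encoding="utf-8"), render)
        if problems:
            report[str(path)] = problems
    return report
```

Comparing alphanumeric characters rather than raw length keeps table and list markup in the source from skewing the "shorter than expected" check.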

Markdown flavors at a glance

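| Flavor        | Tables | Strikethrough  | Task lists                 | Used by       |
|---------------|--------|----------------|----------------------------|---------------|
| CommonMark    | No     | No             | No                         | Base spec     |
| GFM           | Yes    | Yes (~~text~~) | Yes                        | GitHub, Gitea |
| MultiMarkdown | Yes    | Yes            | No                         | iA Writer     |
| Pandoc MD     | Yes    | Yes            | Yes (task_lists extension) | Pandoc        |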

Written by Achraf A., founder of TheFreeAITools — built in Morocco. The Confluence migration described above involved 340 pages; about 50 required manual fixes after automated conversion.
