Skip to content

prepareList has O(n^2) complexity from per-item events.splice #49

@heiskr

Description

@heiskr

Initial checklist

Affected package

mdast-util-from-markdown@2.0.3

Steps to reproduce

import {fromMarkdown} from 'mdast-util-from-markdown'

// Generate a document with many list items.
const n = 5000
const doc = Array.from({length: n}, (_, i) => `* item ${i}`).join('\n')

const t0 = performance.now()
fromMarkdown(doc)
const t1 = performance.now()

console.log(`${n} items: ${(t1 - t0).toFixed(0)} ms`)

On my machine, 5000 items takes ~1400 ms.
10000 items takes ~5500 ms. 4x time for 2x input, so quadratic.

We hit this on docs.github.com. We're the GitHub Docs team. Our GraphQL objects reference page has ~6400 list items and ~185000 events. prepareList takes ~3.2s locally and ~10s in production on that page. remarkParse is 84-97% of render time on our autogenerated API reference pages.

Actual behavior

prepareList calls events.splice() twice per list item. Once for exit, once for enter. Each splice shifts every element after the insertion point. Processing n items does O(n^2) element copies total.

With 5000 list items the events array is ~30000 entries. Each splice shifts most of that tail.

Expected behavior

Linear time. Collect the insertions during the walk, apply them in a single backward merge pass. Same output.

A patch doing this brings 5000 items from ~1400 ms to ~500 ms:

https://github.com/github/docs/blob/main/patches/mdast-util-from-markdown%2B2.0.3.patch

The patch also tightens the backward line-ending scan to while (tailIndex-- > start) instead of while (tailIndex--) to prevent walking past the current list's start.

Runtime

node@24.13.0

Operating system

macOS Tahoe 26.4

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions