For now, we use various PDF-to-Markdown converters.
We will concentrate on docling, but we have to modify the scripts to:
- combine tables that span multiple pages
- incorporate footnotes
For the current test case it's sufficient to manually rebuild the table and its footnotes, but future scripts should consolidate them automatically.
See:
- Part one table 22: /prj/doctoral_letters/guide/data/guidelines/docling/pdf_pages/_62/tables/table_000.json
- Part two table 22: /prj/doctoral_letters/guide/data/guidelines/docling/pdf_pages/_63/tables/table_000.json
- Footnotes: /prj/doctoral_letters/guide/data/guidelines/markdown/pages/_63.md
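
A rough sketch of what the automatic consolidation could look like. It assumes each table part has already been loaded into a plain list of rows (each row a list of cell strings, header row first) and that footnotes are available as a list of strings. The function name `merge_table_parts` and this in-memory layout are hypothetical; they do not reflect the actual structure of the docling `table_000.json` files, which would need their own loader.

```python
def merge_table_parts(parts, footnotes=()):
    """Concatenate table fragments from consecutive pages into one table.

    parts: list of fragments; each fragment is a list of rows,
           each row a list of cell strings (assumed layout, not docling's).
    footnotes: iterable of footnote strings to attach to the merged table.
    """
    # Ignore fragments that contain no rows at all.
    parts = [part for part in parts if part]
    if not parts:
        return {"header": [], "rows": [], "footnotes": list(footnotes)}

    # The first row of the first fragment is taken as the table header.
    header = parts[0][0]
    rows = []
    for part in parts:
        # Continuation pages often repeat the header; drop that copy.
        body = part[1:] if part[0] == header else part
        rows.extend(body)

    return {"header": header, "rows": rows, "footnotes": list(footnotes)}
```

With two fragments that both start with the same header row, the result contains the header once, followed by the data rows of both pages in order, with the footnotes attached. A continuation page that does not repeat the header is appended unchanged.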