
A Fundamentally Different Approach to AI Slide Creation

The Problem

Every current approach conflates two different tasks:

| Task | LLM Capability | Current Approach |
| --- | --- | --- |
| Content reasoning: what should the slide say, which data matters, what's the thesis | World-class | The LLM does this well |
| Spatial rendering: pixel-perfect placement, proportional sizing, color accuracy, EMU coordinates | Terrible | The LLM does this badly |

Claude for PowerPoint tries to do both. The plugin's python-pptx skills try to do both. Our HTML → PPTX pipeline partially separates them. But none fully solves the problem.

The Insight

Professional slides are structured data rendered through templates. They are NOT creative art.

A Goldman Sachs DCF valuation slide is:

```json
{
  "template": "gs_financial_table",
  "title": "Preliminary Discounted Cash Flow Analysis",
  "data": {
    "columns": ["2023A", "2024A", "2025A", "2026E", "2027E"],
    "rows": [
      {"label": "Revenue", "values": [1058651, 1154599, 1422530, 1564783, 1689966], "format": "number", "style": "bold"},
      {"label": "% Growth", "values": [0.113, 0.091, 0.232, 0.100, 0.080], "format": "percent", "style": "italic_gray"}
    ]
  },
  "source": "Source: Company filings, FactSet, GS estimates",
  "confidential": true,
  "slide_number": 15
}
```
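A row's `format` and `style` fields are rendering instructions, not prose, so a small deterministic formatter can turn the raw values above into display strings. A minimal sketch (the function names are illustrative, not part of any existing engine):

```python
def format_value(value, fmt):
    """Render one raw value according to the row's 'format' field."""
    if fmt == "number":
        return f"{value:,.0f}"        # 1058651 -> "1,058,651"
    if fmt == "percent":
        return f"{value * 100:.1f}%"  # 0.113 -> "11.3%"
    return str(value)

def format_row(row):
    """Format every value in one 'rows' entry of the slide JSON."""
    return [row["label"]] + [format_value(v, row["format"]) for v in row["values"]]
```

Because formatting lives in the engine, the LLM emits raw numbers and the slide can never show an inconsistently formatted figure.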

A McKinsey bubble matrix is:

```json
{
  "template": "mckinsey_bubble_matrix",
  "title": "Most institutions are testing GenAI use cases, with super regionals leading deployment",
  "subtitle": "Number of use cases by institution type and development stage",
  "y_axis": ["Mega banks (8)", "Super regionals (7)", "Core regionals (13)", "Other (5)"],
  "x_axis": ["Ideation", "Proof of Concept", "Pilot", "Full Deployment", "Discontinued"],
  "data": [[30, 9, 9, 4, 2], [15, 12, 6, 6, 0], [54, 13, 13, 6, 0], [8, 9, 6, 4, 0]],
  "circle_color": "#1B2A4A",
  "label_color": "#FFFFFF"
}
```

The LLM's job is to produce this JSON. A deterministic rendering engine converts it to PPTX.
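Even the visual encoding in a template like `mckinsey_bubble_matrix` is deterministic: circle sizes are computed from the counts, never guessed. A sketch, assuming the common convention that bubble area (not diameter) encodes the value; the function name and the 1-inch maximum are assumptions, not part of any existing engine:

```python
import math

EMU_PER_INCH = 914400  # PowerPoint's English Metric Units per inch

def bubble_diameters_emu(counts, max_diameter_emu=EMU_PER_INCH):
    """Scale circle diameters so AREA is proportional to each count.

    The largest count gets max_diameter_emu; zero counts get no bubble.
    """
    peak = max(c for row in counts for c in row)
    return [
        [int(max_diameter_emu * math.sqrt(c / peak)) if c else 0 for c in row]
        for row in counts
    ]
```

With area encoding, the 54-use-case cell reads as six times the 9-use-case cell, matching the data rather than the model's spatial guesswork.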

The Architecture

┌─────────────────────────────────────────────────────┐
│  Layer 1: CONTENT REASONING (LLM)                   │
│                                                     │
│  Input: "Build a DCF valuation summary for ADUS"    │
│  + adus_10k_master.json (data context)              │
│  + slide_type_catalog.json (available templates)    │
│                                                     │
│  Output: Structured slide JSON (content + template  │
│  selection + data binding)                          │
│                                                     │
│  The LLM NEVER specifies x/y coordinates, font     │
│  sizes, EMU values, or shape IDs. It specifies      │
│  WHAT, not WHERE.                                   │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│  Layer 2: TEMPLATE LIBRARY (Deterministic)          │
│                                                     │
│  Pre-built, pixel-perfect templates for each        │
│  common slide pattern:                              │
│                                                     │
│  IB Templates:                                      │
│  ├─ gs_cover              (GS cover format)         │
│  ├─ gs_section_divider    (navy bg, white text)     │
│  ├─ gs_financial_table    (right-aligned numbers)   │
│  ├─ gs_bar_chart          (proportional bars)       │
│  ├─ gs_sensitivity_grid   (highlighted base case)   │
│  ├─ moelis_key_terms      (navy pills + content)    │
│  ├─ moelis_cover          (circle + navy rectangle) │
│  ├─ ib_su_table           (sources & uses)          │
│  ├─ ib_football_field     (valuation range)         │
│  └─ ib_buyer_universe     (color-coded grid)        │
│                                                     │
│  Consulting Templates:                              │
│  ├─ mckinsey_cover        (gradient + curves)       │
│  ├─ mckinsey_split_layout (chart left, bullets right)│
│  ├─ mckinsey_color_table  (intensity-coded cells)   │
│  ├─ mckinsey_bubble_matrix (sized circles in grid)  │
│  ├─ mckinsey_contents     (cyan bg, outlined item)  │
│  ├─ mckinsey_dark_bars    (dark bg + orange badges)  │
│  └─ bcg_matrix            (2x2 strategic framework) │
│                                                     │
│  Each template is a python-pptx function that takes │
│  a JSON schema and produces a pixel-perfect slide.  │
│  Templates are HAND-CRAFTED once by a designer,     │
│  then reused infinitely.                            │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│  Layer 3: RENDERING ENGINE (Deterministic)          │
│                                                     │
│  For each slide in the JSON array:                  │
│  1. Look up template by name                        │
│  2. Validate data against template schema           │
│  3. Render PPTX using template function             │
│  4. Verify: right-alignment, proportionality,       │
│     table objects (not text), color accuracy         │
│                                                     │
│  Output: .pptx file that opens cleanly in PowerPoint│
│  (uses xlsxwriter-equivalent approach for PPTX)     │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│  Layer 4: VERIFICATION (Deterministic)              │
│                                                     │
│  - Number provenance: every value traces to source  │
│  - Table verification: all tables are table objects  │
│  - Proportionality: bar heights match data ratios   │
│  - Text extraction: no Unicode corruption           │
│  - Color accuracy: hex values match spec            │
│  - Layout: no element overlap                       │
└─────────────────────────────────────────────────────┘
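Most of these checks collapse to a few lines once slides are structured data rather than pixels. A sketch of two of them; the names and the 1% tolerance are illustrative, and the proportionality check assumes nonzero data values:

```python
def check_provenance(slide_values, source_values):
    """Return slide numbers that do NOT appear in the extracted source data."""
    allowed = set(source_values)
    return [v for v in slide_values if v not in allowed]

def check_proportionality(heights_emu, data, tol=0.01):
    """True if bar heights match the data's ratios within tol relative error."""
    scale = heights_emu[0] / data[0]  # EMU per data unit, from the first bar
    return all(abs(h - d * scale) <= tol * d * scale
               for h, d in zip(heights_emu, data))
```

A failed check rejects the slide before it ever reaches the user, which is what makes "pixel-perfect by construction" auditable rather than asserted.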

Why This Is Different

Current Approach (Plugin)

User: "Build a DCF summary slide"
  → LLM reasons about content AND tries to write python-pptx code
  → LLM guesses x=Inches(2.5), y=Inches(3.1), width=Inches(4.8)
  → Numbers not right-aligned
  → Bars not proportional
  → Table is text with tabs, not a table object

Proposed Approach

User: "Build a DCF summary slide"
  → LLM selects template: "gs_financial_table"
  → LLM fills schema: {title: "...", data: [...], source: "..."}
  → Rendering engine maps data → pre-built pixel-perfect template
  → Numbers are ALWAYS right-aligned (template guarantees it)
  → Bars are ALWAYS proportional (template function computes heights)
  → Tables are ALWAYS real table objects (template creates them correctly)
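The "always proportional" guarantee is nothing more than arithmetic inside the template function. A minimal sketch for a `gs_bar_chart`-style template; the 4-inch plot height and function name are assumptions (EMU = 914,400 per inch):

```python
EMU_PER_INCH = 914400

def bar_heights_emu(values, plot_height_in=4.0):
    """Scale bars linearly so the tallest one fills the plot area.

    The LLM supplies only `values`; every pixel dimension is derived
    here, so proportionality holds by construction.
    """
    peak = max(values)
    plot_emu = plot_height_in * EMU_PER_INCH
    return [int(v / peak * plot_emu) for v in values]
```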

The LLM no longer needs to be good at spatial layout. It only needs to be good at:

  1. Choosing the right template
  2. Writing the right action title
  3. Selecting the right data
  4. Sourcing numbers correctly

The Template Library Is the Product

The real competitive moat isn't the LLM — it's the template library.

Banks pay $50K+/year for tools like UpSlide, Macabacus, and Templafy precisely because they provide PIXEL-PERFECT TEMPLATES that guarantee formatting consistency. If Anthropic builds a library of 30-50 templates covering the common IB + consulting slide patterns, Claude becomes the first AI tool that produces bank-ready output.

Minimum Viable Template Library (30 templates)

IB Core (12):

  1. Cover (GS style)
  2. Cover (Moelis style)
  3. Disclaimer
  4. Table of Contents / Agenda
  5. Section Divider
  6. Financial Data Table (right-aligned)
  7. Sensitivity Grid (highlighted base case)
  8. Sources & Uses
  9. Key Transaction Terms (pill labels)
  10. Buyer Universe Grid
  11. Process Timeline
  12. Football Field (valuation range)

IB Charts (6):

  13. Bar Chart (proportional, with data labels)
  14. Stacked Bar Chart (with logos)
  15. Line Chart (dual axis)
  16. Waterfall Chart (bridge)
  17. Pie/Donut Chart
  18. Scatter Plot

Consulting Core (8):

  19. Cover (McKinsey gradient style)
  20. Contents (cyan/colored background)
  21. Split Layout (chart + bullets)
  22. Color-Coded Data Table
  23. Bubble Matrix
  24. Dark Background with Callout Badges
  25. Framework Slide (2x2 matrix)
  26. Executive Summary (dense bullets)

Universal (4):

  27. Appendix Cover
  28. Contact/Next Steps
  29. Blank with Title Only
  30. Full-Bleed Image with Overlay Text

How This Changes the Survey Answer

Current answer: "Claude's financial cognition is elite but its spatial execution hits a hard ceiling."

Better answer: "The product architecture is wrong. Claude shouldn't be writing python-pptx code or Office JS calls — it should be selecting from a template library and filling structured schemas. The LLM handles content reasoning (action titles, data selection, narrative). A deterministic rendering engine handles spatial execution (pixel placement, right-alignment, proportional sizing). The template library is the product. Banks pay $50K+/year for UpSlide templates. Anthropic should build 30 templates covering IB + consulting patterns and let Claude fill them. Every slide produced would be pixel-perfect by construction, not by luck."

Implementation Path

Phase 1: Prove the concept (1 week)

  • Build 5 templates (GS cover, financial table, sensitivity grid, McKinsey split layout, Moelis key terms)
  • Define the JSON schema for each
  • Write the rendering engine (python-pptx functions)
  • Have Claude produce the JSON from natural language prompts
  • Compare output quality against current plugin
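The rendering engine in this first phase can be little more than a template registry plus a validation pass; the python-pptx calls live inside each registered function. A sketch with hypothetical names, where the render callables are stubbed out:

```python
TEMPLATES = {}  # template name -> {"required": field set, "render": callable}

def register(name, required_fields, render_fn):
    """Register one hand-crafted template with its schema and renderer."""
    TEMPLATES[name] = {"required": set(required_fields), "render": render_fn}

def render_deck(slides):
    """Engine steps 1-3: look up, validate, and render each slide spec."""
    rendered = []
    for i, spec in enumerate(slides):
        tpl = TEMPLATES.get(spec["template"])
        if tpl is None:
            raise ValueError(f"slide {i}: unknown template {spec['template']!r}")
        missing = tpl["required"] - spec.keys()
        if missing:
            raise ValueError(f"slide {i}: missing fields {sorted(missing)}")
        rendered.append(tpl["render"](spec))
    return rendered
```

In production each `render_fn` would return a populated python-pptx slide; the structure is what matters here: the LLM's JSON either satisfies a template's schema or is rejected before anything is drawn.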

Phase 2: Scale the library (1 month)

  • Build remaining 25 templates from real bank/consulting deck analysis
  • Add template customization (swap brand colors, fonts)
  • Build template creation tool (designer builds template → auto-generates JSON schema)
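The "auto-generates JSON schema" step can begin as plain structural inference over one example payload the designer supplies; full JSON Schema generation can come later. A minimal sketch (illustrative only):

```python
def infer_schema(example):
    """Infer a field -> type-name schema from an example slide payload."""
    if isinstance(example, dict):
        return {k: infer_schema(v) for k, v in example.items()}
    if isinstance(example, list):
        return [infer_schema(example[0])] if example else []
    return type(example).__name__
```

For instance, `infer_schema({"confidential": True, "slide_number": 15})` returns `{"confidential": "bool", "slide_number": "int"}`, which the rendering engine can reuse for its validation pass.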

Phase 3: Integrate with data layer (2 months)

  • Connect to /extract command (Phase 1 of our pipeline)
  • Auto-populate templates from master JSON
  • Add cross-slide consistency verification
  • Build /deck command that produces a full 12-slide presentation from a single prompt

Why Banks Would Pay For This

Current state: A first-year analyst spends 6-8 hours building a 30-slide board presentation: roughly 2 hours on the analysis and 4-6 hours on formatting.

With template-based AI: The analyst tells Claude "Build a 30-slide board deck for ADUS with DCF, LBO, buyer list, and process timeline." Claude produces the structured JSON in 30 seconds. The rendering engine produces pixel-perfect slides in 5 seconds. The analyst reviews for 30 minutes.

Time savings: 6 hours → 30 minutes. Per deck. Per deal. Per week.

That's the value proposition that justifies enterprise pricing. And it only works if the output is pixel-perfect by construction — which requires the template architecture, not the "LLM tries to draw slides" approach.