
Goal.md Hallucinated References #238

@asimsinan

Description

Stage 01 goal.md includes unverified / hallucinated references

Summary

Stage 01 (TOPIC_INIT) currently asks the model to provide recent papers for “trend validation” and then writes the raw LLM output directly into goal.md without grounding or verification.
From an LLM systems perspective, this is a predictable failure mode: the prompt requests highly specific bibliographic facts (paper title, year, venue, SOTA context), but the generation path has no retrieval, no identifier resolution, and no post-hoc validation. As a result, goal.md can contain a mix of:

  • real papers,
  • real papers with incorrect metadata,
  • plausible but fabricated references,
  • overconfident benchmark/SOTA claims.

This is especially risky because Stage 01 is upstream of the rest of the pipeline and presents these citations as factual context, which can anchor later reasoning.
1. The prompt induces citation hallucination
   Stage 01 explicitly asks for recent papers and benchmark/SOTA framing, which pushes the model toward precise factual recall under uncertainty.
2. The output is treated as authoritative
   The generated goal.md is saved as a pipeline artifact and used as context downstream.
3. Verification exists later, but not here
   The pipeline already has citation-verification logic for the references.bib / final-paper stages, but Stage 01 bypasses it entirely, so the earliest planning artifact is currently the least trustworthy one.
4. Early hallucinations shape downstream framing
   Even if later bibliography stages are verified, early hallucinated references can still shape problem framing, novelty claims, benchmark choice, and related-work positioning.
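One possible mitigation is a post-hoc validation pass over goal.md before it is saved. The sketch below assumes a hypothetical citation format (bulleted lines containing a year in parentheses) and stubs out the actual resolver; a real version would query an external index such as Crossref or arXiv:

```python
import re

# Hypothetical citation line format, e.g.:
#   "- Smith et al. (2025). Some Title. NeurIPS."
CITATION_RE = re.compile(r"^\s*[-*]\s+(?P<ref>.+\(20\d{2}\).+)$")

def resolve(ref: str) -> bool:
    """Placeholder resolver (assumption, not part of the project).

    A real implementation would look the reference up in an external
    index (Crossref, arXiv, Semantic Scholar) and return True only on a
    close title match yielding a DOI or arXiv ID.
    """
    return False  # conservative default: untrusted until looked up

def annotate_unverified(goal_md: str) -> str:
    """Tag every unresolvable citation line so downstream stages see the risk."""
    out = []
    for line in goal_md.splitlines():
        m = CITATION_RE.match(line)
        if m and not resolve(m.group("ref")):
            line += "  <!-- UNVERIFIED: no resolvable DOI/arXiv ID -->"
        out.append(line)
    return "\n".join(out)
```

Even this conservative gate would prevent fabricated references from entering the pipeline unmarked, which addresses the anchoring problem described above.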

Evidence

1. Stage 01 prompt explicitly requests named recent papers

researchclaw/prompts.py:

TREND VALIDATION (MANDATORY):
- Identify 2-3 recent papers (2024-2026) that establish the relevance of this research direction.
- Name the specific benchmark/dataset that will be used for evaluation.
- If no standard benchmark exists, explain how results will be measured.
- State whether SOTA results exist on this benchmark and what they are.
- Add a 'Benchmark' subsection listing: name, source, metrics, current SOTA (if known).
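A complementary fix is to stop requesting unverifiable specifics in the prompt itself. The wording below is purely illustrative (it is not the project's actual prompt text) and shows how the instruction could be softened so the model is allowed to abstain instead of fabricating:

```python
# Illustrative softened wording; name and text are assumptions, not
# the actual contents of researchclaw/prompts.py.
TREND_VALIDATION_SOFTENED = """\
TREND VALIDATION:
- Only cite papers you can identify by DOI or arXiv ID; otherwise describe
  the research direction without naming specific papers.
- Mark any benchmark/SOTA numbers you are not certain of as UNVERIFIED.
- Prefer well-known benchmarks; if none exists, explain how results will
  be measured instead of inventing one.
"""
```

Combined with a post-hoc check, this shifts the prompt from demanding precise bibliographic recall to permitting honest uncertainty.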
