
Goal.md Hallucinated References #238

@asimsinan

Description

Stage 01 goal.md includes unverified / hallucinated references

Summary

Stage 01 (TOPIC_INIT) currently asks the model to provide recent papers for “trend validation” and then writes the raw LLM output directly into goal.md without grounding or verification.
From an LLM systems perspective, this is a predictable failure mode: the prompt requests highly specific bibliographic facts (paper title, year, venue, SOTA context), but the generation path has no retrieval, no identifier resolution, and no post-hoc validation. As a result, goal.md can contain a mix of:

  • real papers,
  • real papers with incorrect metadata,
  • plausible but fabricated references,
  • overconfident benchmark/SOTA claims.

This is especially risky because Stage 01 is upstream of the rest of the pipeline and presents these citations as factual context, which can anchor later reasoning.
1. The prompt induces citation hallucination
   Stage 01 explicitly asks for recent papers and benchmark/SOTA framing, which pushes the model toward precise factual recall under uncertainty.
2. The output is treated as authoritative
   The generated goal.md is saved as a pipeline artifact and used as context downstream.
3. Verification exists later, but not here
   The pipeline already has citation-verification logic for the references.bib / final-paper stages, but Stage 01 bypasses it entirely, so the earliest planning artifact is currently the least trustworthy one.
4. Early hallucinations shape downstream framing
   Even if later bibliography stages are verified, early hallucinated references can still shape problem framing, novelty claims, benchmark choice, and related-work positioning.
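One possible mitigation is a post-hoc validation pass over goal.md before it is saved. The sketch below assumes a hypothetical citation format (bulleted lines containing a year in parentheses) and stubs out the actual resolver; a real version would query an external index such as Crossref or arXiv:

```python
import re

# Hypothetical citation line format, e.g.:
#   "- Smith et al. (2025). Some Title. NeurIPS."
CITATION_RE = re.compile(r"^\s*[-*]\s+(?P<ref>.+\(20\d{2}\).+)$")

def resolve(ref: str) -> bool:
    """Placeholder resolver (assumption, not part of the project).

    A real implementation would look the reference up in an external
    index (Crossref, arXiv, Semantic Scholar) and return True only on a
    close title match yielding a DOI or arXiv ID.
    """
    return False  # conservative default: untrusted until looked up

def annotate_unverified(goal_md: str) -> str:
    """Tag every unresolvable citation line so downstream stages see the risk."""
    out = []
    for line in goal_md.splitlines():
        m = CITATION_RE.match(line)
        if m and not resolve(m.group("ref")):
            line += "  <!-- UNVERIFIED: no resolvable DOI/arXiv ID -->"
        out.append(line)
    return "\n".join(out)
```

Even this conservative gate would prevent fabricated references from entering the pipeline unmarked, which addresses the anchoring problem described above.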

Evidence

1. Stage 01 prompt explicitly requests named recent papers

researchclaw/prompts.py:

TREND VALIDATION (MANDATORY):
- Identify 2-3 recent papers (2024-2026) that establish the relevance of this research direction.
- Name the specific benchmark/dataset that will be used for evaluation.
- If no standard benchmark exists, explain how results will be measured.
- State whether SOTA results exist on this benchmark and what they are.
- Add a 'Benchmark' subsection listing: name, source, metrics, current SOTA (if known).
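A complementary fix is to stop requesting unverifiable specifics in the prompt itself. The wording below is purely illustrative (it is not the project's actual prompt text) and shows how the instruction could be softened so the model is allowed to abstain instead of fabricating:

```python
# Illustrative softened wording; name and text are assumptions, not
# the actual contents of researchclaw/prompts.py.
TREND_VALIDATION_SOFTENED = """\
TREND VALIDATION:
- Only cite papers you can identify by DOI or arXiv ID; otherwise describe
  the research direction without naming specific papers.
- Mark any benchmark/SOTA numbers you are not certain of as UNVERIFIED.
- Prefer well-known benchmarks; if none exists, explain how results will
  be measured instead of inventing one.
"""
```

Combined with a post-hoc check, this shifts the prompt from demanding precise bibliographic recall to permitting honest uncertainty.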
