After data generation, you can use [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) and [xtuner](https://github.com/InternLM/xtuner) to finetune your LLMs.
## 📌 Latest Updates
- 🎉 **2026.04.13**: The paper based on GraphGen, *Knowledge-to-Verification: Exploring RLVR for LLMs in Knowledge-Intensive Domains*, has been accepted to the **ACL 2026** Main Conference! Congratulations!
- **2026.02.04**: GraphGen now supports HuggingFace Datasets as an input data source for data generation.
- **2026.01.15**: **LLM benchmark synthesis** now supports single/multiple-choice, fill-in-the-blank, and true-or-false questions, ideal for education 🌟🌟
Reinforcement Learning with Verifiable Rewards (RLVR) has shown promising potential to enhance the reasoning capabilities of large language models (LLMs). However, its application to knowledge-intensive domains remains underexplored due to the scarcity of high-quality verifiable data. By leveraging **GraphGen** for automated verifiable data synthesis, we extend RLVR to these broader domains. We applied reinforcement learning directly to the Qwen2.5-7B base model using the synthesized data, without any prior SFT. Here are the results.