Skip to content

haynafi/scholar-ai

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ScholarAI — Academic Research Search Engine

AI-enhanced academic paper discovery with hybrid ranking
Powered by OpenAlex (260M+ scholarly works) · Scopus verification · Local AI summaries

Papers AI Free MIT


Why ScholarAI?

Large language models like ChatGPT and Gemini often hallucinate citations — generating fake paper titles, non-existent DOIs, and fabricated authors. ScholarAI solves this by searching real academic databases directly.

ChatGPT / Gemini Google Scholar ScholarAI
Real papers with DOIs ❌ Often hallucinated
Live citation counts
AI-powered summaries ✅ (but fake sources) (real source + AI)
Scopus indexed badge
Citation export (BibTeX/APA) ❌ Unreliable ✅ Formatted correctly
Excel bulk export
Advanced filters ❌ Limited ❌ Basic ✅ Year, citations, type, author, OA
Hybrid ranking algorithm ❌ Opaque ✅ Relevance 50% + Citations 30% + Recency 20%
Open source & self-hosted

Features

  • Hybrid Ranking — Papers scored by relevance (50%), citation count (30%), and recency (20%)
  • AI Summaries — Local Llama/Ollama integration (free, private, no API key needed)
  • Scopus Verification — Shows "Scopus Indexed" badge with direct Scopus links
  • Advanced Filters — Year range, minimum citations, paper type, author name, open access only
  • Citation Export — BibTeX and APA format, single paper or batch
  • Excel Export — Download full search results as .xlsx
  • Rate Limiting — Built-in abuse protection for production deployment
  • Security Headers — HSTS, X-Frame-Options, XSS protection, CORS hardening

Architecture

┌─────────────┐     ┌──────────────────────┐     ┌─────────────────┐
│  Frontend   │────▶│   Backend (FastAPI)   │────▶│   OpenAlex API  │
│  Next.js    │     │                      │     │   260M+ papers  │
│  Tailwind   │     │  ┌────────────────┐  │     └─────────────────┘
│  shadcn/ui  │     │  │ Ollama (Llama) │  │     ┌─────────────────┐
└─────────────┘     │  │ AI Summaries   │  │────▶│   Scopus API    │
                    │  └────────────────┘  │     │   Verification  │
                    └──────────────────────┘     └─────────────────┘

Quick Start

Prerequisites

  • Python 3.10+
  • Node.js 18+
  • Ollama (optional, for AI summaries) — ollama.com

1. Clone & Configure

git clone https://github.com/YOUR_USERNAME/scholar-ai.git
cd scholar-ai

# Create environment file
cp .env.example .env
# Edit .env with your settings

2. Backend

pip install -r requirements.txt
python main.py

API runs at http://localhost:9999 · Docs at http://localhost:9999/docs

3. Frontend

cd web
npm install
npm run dev

Frontend runs at http://localhost:3000

4. AI Summaries (Optional)

# Install Ollama from https://ollama.com, then:
ollama pull llama3.1

The backend auto-detects Ollama and enables AI summaries — no configuration needed.

Configuration

Copy .env.example to .env and customize:

Variable Description Default
APP_ENV development or production development
LLM_PROVIDER auto, ollama, or openai auto
OLLAMA_MODEL Ollama model name llama3.2
SCOPUS_API_KEY Free key from dev.elsevier.com
OPENALEX_EMAIL Email for faster API rate limits
CORS_ORIGINS Allowed frontend origins *

See .env.example for all options including rate limits and OpenAI fallback.

API Endpoints

Method Endpoint Description Rate Limit
GET /search Search papers with hybrid ranking 30/min
GET /scopus/check Check Scopus indexing by DOI 20/min
POST /summarize AI paper summary (Ollama/OpenAI) 10/min
POST /cite Generate BibTeX or APA citation 30/min
POST /cite/batch Batch citation export 5/min
GET /export Export results to Excel (.xlsx) 5/min
GET /trending Publication trends by field 30/min
GET /health System status & feature availability 60/min

Tech Stack

  • Backend: Python 3, FastAPI, slowapi (rate limiting)
  • Frontend: Next.js, TypeScript, Tailwind CSS, shadcn/ui, Lucide Icons
  • AI: Ollama + Llama 3 (local, free) · OpenAI (optional fallback)
  • Data: OpenAlex API (260M+ works) · Scopus API (indexing verification)

Production Deployment

For production, set in .env:

APP_ENV=production
CORS_ORIGINS=https://your-domain.com

Production mode enables:

  • Security headers (HSTS, X-Frame-Options, CSP)
  • API docs hidden
  • Binds to 0.0.0.0 for reverse proxy / tunnel access
  • Structured logging with request timing

Data Sources & Credibility

All paper metadata comes from OpenAlex — a free, open catalog of 260M+ scholarly works maintained by a nonprofit. Data is sourced from Crossref, PubMed, DOAJ, and publisher feeds. Every paper has a verifiable DOI.

AI summaries are the only generated content and are clearly labeled. Paper metadata is never hallucinated.

Citation: Priem, J., Piwowar, H., & Orr, R. (2022). OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts. ArXiv. https://arxiv.org/abs/2205.01833

License

MIT — free for academic and commercial use.

About

AI-enhanced academic paper search engine powered by OpenAlex (260M+ works), Scopus verification, and local Llama AI summaries. Every result is a real paper , no hallucinations.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors