Commit 7843b42

Create README.md
1 parent 8ef940b commit 7843b42

1 file changed

Lines changed: 52 additions & 0 deletions

README.md

@@ -0,0 +1,52 @@
Open-source framework and scripts for harvesting datasets into [PortalJS](https://portaljs.com).
This repo is designed as a **template**: fork or clone it to quickly set up your own dataset harvesting pipelines.

It includes:

* Reusable scripts for extracting datasets from common sources (APIs, CSVs, spreadsheets, etc.)
* A plug-and-play **ETL framework** for transforming and publishing datasets
* A GitHub Actions workflow for automated harvesting
* Config-driven setup, so pipelines do not need to be hard-wired

## 🚀 Quickstart

1. **Use this template**
   Click **“Use this template”** on GitHub to bootstrap your own repo.

2. **Configure harvesters**
   Edit `config.yml` to define dataset sources and pipelines:

   ```yaml
   sources:
     - name: world-bank
       type: api
       url: https://api.worldbank.org/v2/
       format: json
   ```

3. **Run**

   TODO

4. **Automate with GitHub Actions**
   Push your repo. Harvesting then runs on a schedule using the included workflow (`.github/workflows/harvest.yml`).
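
The `config.yml` entry from step 2 can be sanity-checked before any harvesting runs. The following is a minimal sketch, assuming the file has already been parsed into a Python dict (for example with a YAML library) and that `name`, `type`, `url`, and `format` are all required. The README does not say which keys are mandatory, and `validate_sources` is a hypothetical helper, not part of this repo.

```python
# Keys taken from the example entry in config.yml. Treating all four as
# required is an assumption; the README does not specify this.
REQUIRED_KEYS = ("name", "type", "url", "format")


def validate_sources(config: dict) -> list[str]:
    """Return human-readable problems found in config["sources"]."""
    sources = config.get("sources")
    if not isinstance(sources, list) or not sources:
        return ["config must contain a non-empty 'sources' list"]

    problems = []
    seen_names = set()
    for index, source in enumerate(sources):
        label = f"sources[{index}]"
        if not isinstance(source, dict):
            problems.append(f"{label}: expected a mapping")
            continue
        # Report every missing or empty key, in a stable order.
        for key in REQUIRED_KEYS:
            if not source.get(key):
                problems.append(f"{label}: missing '{key}'")
        # Source names are assumed to be unique identifiers.
        name = source.get("name")
        if isinstance(name, str) and name:
            if name in seen_names:
                problems.append(f"{label}: duplicate name '{name}'")
            seen_names.add(name)
    return problems


# The Quickstart example entry, as it would look after YAML parsing.
example = {
    "sources": [
        {
            "name": "world-bank",
            "type": "api",
            "url": "https://api.worldbank.org/v2/",
            "format": "json",
        }
    ]
}
print(validate_sources(example))  # prints []
```

Returning a list of problems instead of raising on the first one lets a CI job report every misconfigured source in a single run.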

## 🛠 Features

* **Modular scripts** – add your own connectors or reuse provided ones
* **Config-driven** – no need to edit code for new datasets
* **CI/CD ready** – run pipelines directly in GitHub Actions
* **Extensible** – works with PortalJS or standalone
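
One way to read "modular" and "config-driven" together is that each source `type` maps to a connector function, so a new dataset needs only a new config entry. The sketch below illustrates that idea under stated assumptions: `CONNECTORS`, `register`, `harvest_api`, and `harvest` are hypothetical names and not this repo's actual API, and the connector is a stub that makes no network request.

```python
from typing import Callable

# Maps a source "type" (as used in config.yml) to a connector function.
# Hypothetical registry, for illustration only.
CONNECTORS: dict[str, Callable[[dict], list[dict]]] = {}


def register(source_type: str):
    """Decorator that registers a connector under the given source type."""
    def decorator(func: Callable[[dict], list[dict]]):
        CONNECTORS[source_type] = func
        return func
    return decorator


@register("api")
def harvest_api(source: dict) -> list[dict]:
    # Stub: a real connector would fetch source["url"] and parse the response.
    return [{"source": source["name"], "url": source["url"]}]


def harvest(source: dict) -> list[dict]:
    """Dispatch one config entry to the connector registered for its type."""
    source_type = source.get("type")
    connector = CONNECTORS.get(source_type)
    if connector is None:
        raise ValueError(f"no connector registered for type {source_type!r}")
    return connector(source)


records = harvest({
    "name": "world-bank",
    "type": "api",
    "url": "https://api.worldbank.org/v2/",
    "format": "json",
})
print(records)
```

With this layout, supporting a new source type means writing one decorated function; the dispatch code and existing config entries stay unchanged.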

## 📦 Repo Structure

TODO

## 🤝 Contributing

PRs and new connectors welcome!
Please open an issue if you’d like to propose a new feature or source integration.

## 📄 License

MIT License. See [LICENSE](./LICENSE) for details.
