data-engineering-notebooks

A curated list of experimental data pipelines used to build datasets across domains

Pre-requisites

  • Python 3.x
  • Jupyter Lab

Skills needed

  • Web scraping
  • Consuming RESTful APIs
  • Pandas
  • Polars
  • DuckDB

Quick Setup

The project uses pip to keep track of its dependencies. To install pip, follow the instructions here.

Once pip is installed, run the following commands to set up the project on your local machine:

git clone git@github.com:nathanbaleeta/data-engineering-notebooks.git

cd data-engineering-notebooks

python3 -m venv venv

source venv/bin/activate

pip install --quiet pandas requests jupyterlab duckdb

To freeze the libraries, use:

pip freeze > requirements.txt

To install packages from the frozen file, use:

pip install -r requirements.txt
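
Before launching the notebook, it can help to confirm that the packages from the install step are visible inside the active virtual environment. The snippet below is a minimal sketch, not part of the repository: the helper name `installed_versions` and the `REQUIRED` list are assumptions made for illustration, and the list simply mirrors the four packages named in the install command. It relies only on the standard library's `importlib.metadata`.

```python
from importlib import metadata

# Hypothetical list mirroring the packages in the install command.
REQUIRED = ["pandas", "requests", "jupyterlab", "duckdb"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in the current environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Run it with the virtual environment activated. Any package reported as NOT INSTALLED can be added with pip install and the requirements file frozen again.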

Launch notebook

jupyter lab