A modular collection of Docker environments for data engineering, data science, and machine learning projects.
Each environment lives in its own directory under `environments/`.

| Environment | Framework | Python | Java | Base OS | Description |
|---|---|---|---|---|---|
| spark247-jupyter | Spark 2.4.7 | 3.8 | OpenJDK 8 | Ubuntu 20.04 | Spark with Jupyter Notebook |
| pyspark-notebook | Spark (latest) | 3.10 | OpenJDK 8 | Ubuntu 22.04 | PySpark with Jupyter Notebook |
| pyspark24 | Spark 2.4.0 | 3.6 | OpenJDK 8 | Ubuntu 20.04 | PySpark 2.4 data science workbench |
| multimodal-sbt | Spark 2.4.0 | - | OpenJDK 8 | - | Scala SBT project for building and managing Spark applications |
| rapids-notebook | RAPIDS | 3.10 | - | Ubuntu 22.04 | GPU-accelerated data science with NVIDIA RAPIDS |
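
Because each environment has its own directory, building one usually comes down to pointing `docker build` at that directory. The sketch below rests on two assumptions this README does not confirm: that each directory has a `Dockerfile` at its root, and that the notebook images serve Jupyter on its default port 8888. The `env_commands` helper is hypothetical and only prints the commands, so they can be checked against the actual layout before anything is run.

```shell
# Hypothetical helper: print (do not execute) the Docker commands for one environment.
# Assumes environments/<name>/Dockerfile exists; adjust the path if the layout differs.
env_commands() {
  name="$1"
  # Build the image, tagged with the environment name.
  printf 'docker build -t %s environments/%s\n' "$name" "$name"
  # Run it interactively; port 8888 is an assumption (Jupyter's default).
  printf 'docker run --rm -it -p 8888:8888 %s\n' "$name"
}

env_commands spark247-jupyter
```

Copy the printed commands into a terminal once they match the repository. The `rapids-notebook` image would probably also need `--gpus all` on `docker run`, and `multimodal-sbt` is an SBT project that may not follow this pattern.
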

This project is licensed under the MIT License.