🏆 A ranked list of awesome Python open-source libraries & tools. Updated weekly.
This curated list contains 400 awesome open-source projects with a total of 2.1M stars grouped into 28 categories. All projects are ranked by a project-quality score, which is calculated from various metrics automatically collected from GitHub and different package managers. If you would like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome!
🧙‍♂️ Discover other best-of lists or create your own.
📫 Subscribe to our newsletter for updates and trending projects.
- Data Serialization 16 projects
- Data Containers & Dataframes 31 projects
- Data Structures 15 projects
- Data Validation 15 projects
- Algorithms & Design Patterns 4 projects
- Date & Time Utilities 9 projects
- File & Path Utilities 10 projects
- Compatibility 7 projects
- Cryptography 7 projects
- Infrastructure & DevOps 20 projects
- Process Utilities 4 projects
- Asynchronous Programming 7 projects
- Configuration 9 projects
- CLI Development 20 projects
- Development Tools 1 project
- Data Caching 6 projects
- GUI Development 10 projects
- Computer & Machine Vision 2 projects
- Machine Learning & Data Engineering 1 project
- Text Data 12 projects
- Web Development 1 project
- Database Clients 64 projects
- Data Loading & Extraction 31 projects
- Data Pipelines & Streaming 44 projects
- File Formats 3 projects
- Code Inspection 4 projects
- General Utilities 15 projects
- Python Implementations 6 projects
- Others 21 projects
- 🥇🥈🥉 Combined project-quality score
- ⭐️ Star count from GitHub
- 🐣 New project (less than 6 months old)
- 💤 Inactive project (6 months no activity)
- 💀 Dead project (12 months no activity)
- 📈📉 Project is trending up or down
- ➕ Project was recently added
- ❗️ Warning (e.g. missing/risky license)
- 👨‍💻 Contributors count from GitHub
- 🔀 Fork count from GitHub
- 📋 Issue count from GitHub
- ⏱️ Last update timestamp on package manager
- 📥 Download count from package manager
- 📦 Number of dependent projects
Pandas related project
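The activity markers in the legend above can be read as simple date thresholds. The following is a minimal sketch of that reading, not the generator's actual logic: the helper name `activity_marker`, the 30-day approximation of a month, and the precedence (dead, then inactive, then new) are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Thresholds from the legend; a "month" is approximated as 30 days (assumption).
NEW_PROJECT_AGE = timedelta(days=6 * 30)
INACTIVE_AFTER = timedelta(days=6 * 30)
DEAD_AFTER = timedelta(days=12 * 30)


def activity_marker(created: date, last_activity: date, today: date) -> str:
    """Return the legend emoji for a project, or "" if none applies.

    Hypothetical helper: the precedence of the checks is an assumption.
    """
    idle = today - last_activity
    if idle >= DEAD_AFTER:
        return "💀"  # 12 months without activity
    if idle >= INACTIVE_AFTER:
        return "💤"  # 6 months without activity
    if today - created < NEW_PROJECT_AGE:
        return "🐣"  # less than 6 months old
    return ""
```

For example, a project created years ago whose last activity was more than a year before `today` would be marked 💀 under these assumptions.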
flatbuffers (🥇45 · ⭐ 26K) - FlatBuffers: Memory Efficient Serialization Library. Apache-2
marshmallow (🥈41 · ⭐ 7.2K) - A lightweight library for converting complex objects to and from.. MIT
orjson (🥈36 · ⭐ 8K) - Fast, correct Python JSON library supporting dataclasses, datetimes,.. Apache-2
jsonpickle (🥈36 · ⭐ 1.3K) - Python library for serializing any arbitrary object graph into.. BSD-3
cloudpickle (🥉28 · ⭐ 1.9K) - Extended pickling support for Python objects. BSD-3
Show 8 hidden projects...
- simplejson (🥈35 · ⭐ 1.7K) - simplejson is a simple, fast, extensible JSON.. ❗Unlicensed
- pyasn1 (🥈34 · ⭐ 250 · 💀) - Generic ASN.1 library for Python. BSD-2
- ultrajson (🥉28 · ⭐ 4.5K) - Ultra fast JSON decoder and encoder written in C with Python.. ❗Unlicensed
- dill (🥉27 · ⭐ 2.4K · 📉) - serialize all of Python. ❗Unlicensed
- msgpack (🥉26 · ⭐ 2.1K) - MessagePack serializer implementation for Python.. ❗Unlicensed
- python-rapidjson (🥉26 · ⭐ 530) - Python wrapper around rapidjson. ❗Unlicensed
- hickle (🥉26 · ⭐ 500 · 💀) - a HDF5-based python pickle replacement. MIT
- pysimdjson (🥉24 · ⭐ 760 · 💤) - Python bindings for the simdjson project. ❗Unlicensed
General-purpose data containers as well as utilities & extensions for pandas.
h5py (🥇44 · ⭐ 2.2K) - HDF5 for Python -- The h5py package is a Pythonic interface to the HDF5.. BSD-3
Bottleneck (🥈36 · ⭐ 1.2K) - Fast NumPy array functions written in C. BSD-2
Vaex (🥈33 · ⭐ 8.5K) - Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization.. MIT
StaticFrame (🥉29 · ⭐ 480) - Immutable and statically-typeable DataFrames with runtime type and.. MIT
datasketch (🥉27 · ⭐ 2.9K) - MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog,.. MIT
Pandas Summary (🥉25 · ⭐ 530) - Engine for ML/Data tracking, visualization,.. Apache-2 
pickleDB (🥉21 · ⭐ 1.1K) - pickleDB is an in memory key-value store using Python's orjson module.. BSD-3
Show 15 hidden projects...
- numpy (🥇51 · ⭐ 32K) - The fundamental package for scientific computing with Python. ❗Unlicensed
- Blaze (🥈32 · ⭐ 3.2K · 💀) - NumPy and Pandas interface to Big Data. BSD-3
- docarray (🥈32 · ⭐ 3.1K · 💀) - Represent, send, store and search multimodal data. Apache-2
- Koalas (🥉29 · ⭐ 3.4K · 💀) - Koalas: pandas API on Apache Spark. Apache-2 spark
- Arctic (🥉28 · ⭐ 3.1K · 💀) - Arctic is a high performance datastore for numeric data. ❗️LGPL-2.1
- sklearn-pandas (🥉28 · ⭐ 2.9K · 💀) - Pandas integration with sklearn. ❗️Zlib sklearn
- datatable (🥉28 · ⭐ 1.9K · 💀) - A Python package for manipulating 2-dimensional tabular data.. MPL-2.0
- swifter (🥉27 · ⭐ 2.6K · 💀) - A package which efficiently applies any function to a pandas.. MIT
- bcolz (🥉27 · ⭐ 960 · 💀) - A columnar data container that can be compressed. BSD-3
- Pandaral·lel (🥉24 · ⭐ 3.8K · 💀) - A simple and efficient tool to parallelize Pandas.. BSD-3 jupyter
- pandasql (🥉20 · ⭐ 1.3K · 💀) - sqldf for pandas. MIT
- fletcher (🥉20 · ⭐ 230 · 💀) - Pandas ExtensionDType/Array backed by Apache Arrow. MIT
- daffy (🥉19 · ⭐ 57) - Lightweight DataFrame validation decorators for Pandas, Polars, Modin,.. MIT
- Bounter (🥉16 · ⭐ 930 · 💀) - Efficient Counter that uses a limited (bounded) amount of memory.. MIT
- PandaPy (🥉9 · ⭐ 550 · 💀) - PandaPy has the speed of NumPy and the usability of.. ❗Unlicensed
pyrsistent (🥇32 · ⭐ 2.2K) - Persistent/Immutable/Functional data structures for Python. MIT
python-benedict (🥈29 · ⭐ 1.6K) - dict subclass with keylist/keypath support, built-in I/O.. MIT
python-box (🥉24 · ⭐ 2.8K) - Python dictionaries with advanced dot notation access. MIT
Show 9 hidden projects...
- addict (🥇31 · ⭐ 2.5K · 💀) - The Python Dict that's better than heroin. MIT
- ordered-set (🥇31 · ⭐ 230 · 💀) - A mutable set that remembers the order of its entries. One of.. MIT
- sqlitedict (🥈29 · ⭐ 1.2K · 💀) - Persistent dict, backed by sqlite3 and pickle, multithread-.. Apache-2
- multidict (🥉27 · ⭐ 490) - The multidict implementation. ❗Unlicensed
- python-sortedcontainers (🥉26 · ⭐ 3.9K · 💀) - Python Sorted Container Types: Sorted List, Sorted.. ❗Unlicensed
- glom (🥉26 · ⭐ 2.1K) - Python's nested data operator (and CLI), for all your declarative.. ❗Unlicensed
- immutables (🥉26 · ⭐ 1.2K · 💀) - A high-performance immutable mapping type for Python. Apache-2
- munch (🥉23 · ⭐ 780 · 💀) - A Munch is a Python dictionary that provides attribute-style access (a.. MIT
- cleverdict (🥉15 · ⭐ 100 · 💀) - A JSON-friendly data structure which allows both object attributes.. MIT
jsonschema (🥇41 · ⭐ 4.9K) - An implementation of the JSON Schema specification for Python. MIT
voluptuous (🥈35 · ⭐ 1.8K) - CONTRIBUTIONS ONLY: Voluptuous, despite the name, is a Python data.. BSD-3
validators (🥈35 · ⭐ 1.1K) - Python Data Validation for Humans. MIT
python-email-validator (🥉31 · ⭐ 1.4K) - A robust email syntax and deliverability validation.. Unlicense
dirty-equals (🥉25 · ⭐ 980) - Doing dirty (but extremely useful) things with equals. MIT
Show 5 hidden projects...
- strictyaml (🥉29 · ⭐ 1.6K · 💀) - Type-safe YAML parser and validator. MIT
- schematics (🥉27 · ⭐ 2.6K · 💀) - Python Data Structures for Humans. ❗Unlicensed
- typical (🥉17 · ⭐ 180 · 💀) - Typical: Fast, simple, & correct data-validation using Python 3 typing. MIT
- validr (🥉12 · ⭐ 220 · 💀) - A simple, fast, extensible python library for data validation. ❗Unlicensed
- dataklasses (🥉7 · ⭐ 810 · 💀) - A different spin on dataclasses. ❗Unlicensed
🔗 python-patterns ( ⭐ 43K) - Collection of design patterns/idioms in Python.
algorithms (🥇33 · ⭐ 25K) - Minimal examples of data structures and algorithms in Python. MIT
transitions (🥉32 · ⭐ 6.5K · 💤) - A lightweight, object-oriented finite state machine.. MIT
Show 1 hidden project...
dateparser (🥈39 · ⭐ 2.8K) - python parser for human readable dates. BSD-3
python-dateutil (🥉30 · ⭐ 2.6K) - Useful extensions to the standard Python datetime features. Apache-2
tzlocal (🥉30 · ⭐ 220 · 💤) - A Python module that tries to figure out what your local timezone is. MIT
Show 2 hidden projects...
- isodate (🥉32 · ⭐ 170 · 💀) - ISO 8601 date/time parser. BSD-3
- parsedatetime (🥉29 · ⭐ 710 · 💀) - Parse human-readable date/time strings. Apache-2
filesystem_spec (🥇40 · ⭐ 1.3K) - A specification that python filesystems should adhere to. BSD-3
pyfilesystem2 (🥉26 · ⭐ 2.1K · 💤) - Python's Filesystem abstraction layer. MIT
scandir (🥉25 · ⭐ 540 · 💤) - Better directory iterator and faster os.walk(). Archived, as this.. BSD-3
Show 4 hidden projects...
- appdirs (🥉32 · ⭐ 1.1K · 💀) - A small Python module for determining appropriate platform-specific.. MIT
- path (🥉30 · ⭐ 1.1K) - Object-oriented file system path manipulation. ❗Unlicensed
- zipp (🥉22 · ⭐ 67) - A pathlib-compatible Zipfile object wrapper. ❗Unlicensed
- Unipath (🥉19 · ⭐ 510 · 💀) - An object-oriented approach to Python file/directory.. ❗Unlicensed
Show 6 hidden projects...
- future (🥇37 · ⭐ 1.2K · 💀) - Easy, clean, reliable Python 2/3 compatibility. MIT
- dataclasses (🥈27 · ⭐ 590 · 💀) - An implementation of PEP 557: Data Classes. Apache-2
- typing (🥉26 · ⭐ 1.7K) - Python static typing home. Hosts the documentation and a user.. ❗Unlicensed
- pathlib2 (🥉26 · ⭐ 84 · 💀) - Backport of pathlib aiming to support the full stdlib Python API. MIT
- contextlib2 (🥉26 · ⭐ 38) - contextlib2 is a backport of the standard library's contextlib.. ❗️psfrag
- futures (🥉23 · ⭐ 240 · 💀) - Backport of the concurrent.futures package to Python 2.6 and.. ❗Unlicensed
Show 6 hidden projects...
- cryptography (🥇45 · ⭐ 7.6K) - cryptography is a package designed to expose.. ❗Unlicensed
- keyring (🥈37 · ⭐ 1.4K) - Store and access your passwords safely. ❗Unlicensed
- tink (🥉36 · ⭐ 14K · 💀) - Tink is a multi-language, cross-platform, open source library that.. Apache-2
- pycryptodomex (🥉36 · ⭐ 3.2K) - A self-contained cryptographic library for Python. ❗Unlicensed
- asn1crypto (🥉31 · ⭐ 360 · 💀) - Python ASN.1 library with a focus on performance and a pythonic API. MIT
- rsa (🥉21 · ⭐ 490 · 💤) - Python-RSA is a pure-Python RSA implementation. ❗Unlicensed
ansible (🥇48 · ⭐ 68K) - Ansible is a radically simple IT automation platform that makes your.. ❗️GPL-3.0
docker-compose (🥈41 · ⭐ 37K) - Define and run multi-container applications with Docker. Apache-2
paramiko (🥈39 · ⭐ 9.7K) - The leading native Python SSHv2 protocol library. ❗️LGPL-2.1
kubernetes (🥉35 · ⭐ 7.5K) - Official Python client library for kubernetes. Apache-2
Show 8 hidden projects...
- awscli (🥈37 · ⭐ 17K) - Universal Command Line Interface for Amazon Web Services. ❗Unlicensed
- schedule (🥉34 · ⭐ 12K · 💀) - Python job scheduling for humans. MIT
- parallel-ssh (🥉26 · ⭐ 1.3K) - Asynchronous parallel SSH client library. ❗️LGPL-2.1
- fabtools (🥉23 · ⭐ 1.3K · 💀) - Tools for writing awesome Fabric files. BSD-2
- storm (🥉21 · ⭐ 3.9K · 💀) - Manage your SSH like a boss. MIT
- pypyr (🥉21 · ⭐ 640 · 💀) - pypyr task-runner cli & api for automation pipelines. Automate.. Apache-2
- wssh (🥉16 · ⭐ 1.4K · 💀) - SSH to WebSockets Bridge. MIT
- Grai (🥉11 · ⭐ 310 · 💤) - Platform to programmatically manage, test, and debug data.. ❗️MIT-0
supervisor (🥇37 · ⭐ 9K) - Supervisor process control system for Unix.. ❗️Repoze Public License
Show 2 hidden projects...
- pexpect (🥈35 · ⭐ 2.8K · 💤) - A Python module for controlling interactive programs in a.. ❗Unlicensed
- ptyprocess (🥉21 · ⭐ 240 · 💤) - Run a subprocess in a pseudo terminal. ❗Unlicensed
anyio (🥇41 · ⭐ 2.4K) - High level asynchronous concurrency and networking framework that works on.. MIT
Show 3 hidden projects...
python-dotenv (🥇38 · ⭐ 8.7K) - Reads key-value pairs from a .env file and can set them as.. BSD-3
omegaconf (🥈34 · ⭐ 2.4K) - Flexible Python configuration system. The last one you will ever need. BSD-3
gin-config (🥉28 · ⭐ 2.1K) - Gin provides a lightweight configuration framework for Python. Apache-2
Show 2 hidden projects...
- python-decouple (🥉32 · ⭐ 3K · 💀) - Strict separation of config from code. MIT
- Dynaconf (🥉16 · ⭐ 5 · 💤) - dynaconf mirror (mainly for stats)- ORIGINAL REPO ON -.. MIT
rich (🥇47 · ⭐ 56K) - Rich is a Python library for rich text and beautiful formatting in the terminal. MIT
python-fire (🥈37 · ⭐ 28K · 💤) - Python Fire is a library for automatically generating.. Apache-2
python-prompt-toolkit (🥈37 · ⭐ 10K) - Library for building powerful interactive command line.. BSD-3
argcomplete (🥈33 · ⭐ 1.6K) - Python and tab completion, better together. Apache-2
ConfigArgParse (🥈33 · ⭐ 760) - Drop-in replacement for argparse with added support for config.. MIT
questionary (🥉32 · ⭐ 2.1K) - Python library to build pretty command line user prompts Easy to use.. MIT
asciimatics (🥉30 · ⭐ 4.3K · 💤) - A cross platform package to do curses-like operations, plus.. Apache-2
blessings (🥉28 · ⭐ 1.5K · 💤) - A thin, practical wrapper around terminal capabilities in Python. MIT
Click Extra (🥉23 · ⭐ 110) - Drop-in replacement for Click to make user-friendly and.. ❗️GPL-2.0
Show 6 hidden projects...
- wcwidth (🥈35 · ⭐ 450) - Python library that measures the width of strings in a terminal. ❗Unlicensed
- docopt-ng (🥉24 · ⭐ 220 · 💀) - Humane command line arguments parser. Now with maintenance,.. MIT
- clint (🥉23 · ⭐ 97 · 💀) - Python Command-line Application Tools. ISC
- bashplotlib (🥉20 · ⭐ 1.9K · 💀) - plotting in the terminal. MIT
- colout (🥉19 · ⭐ 1.2K · 💀) - Color text streams with a polished command line interface. ❗️GPL-3.0
- onecite (🥉12 · ⭐ 56) - An intelligent toolkit to automatically parse, complete, and format.. MIT
🔗 best-of-python-dev ( ⭐ 1.2K) - A ranked list of awesome python developer tools and libraries. Updated..
cachetools (🥇41 · ⭐ 2.7K) - Extensible memoizing collections and decorators. MIT
pylibmc (🥉27 · ⭐ 490 · 💤) - A Python wrapper around the libmemcached interface from TangentOrg. BSD-3
Show 2 hidden projects...
- cached-property (🥈29 · ⭐ 710 · 💀) - A decorator for caching properties in classes. BSD-3
- beaker (🥉25 · ⭐ 540) - WSGI middleware for sessions and caching. ❗Unlicensed
🔗 best-of-web-python - Web UI ( ⭐ 2.7K) - Collection of libraries to implement web-based UIs.
kivy (🥇40 · ⭐ 19K) - Open source UI framework written in Python, running on Windows, Linux, macOS,.. MIT
DearPyGui (🥈38 · ⭐ 15K) - Dear PyGui: A fast and powerful Graphical User Interface Toolkit for.. MIT
Gooey (🥉30 · ⭐ 22K) - Turn (almost) any Python command line program into a full GUI application.. MIT
Eel (🥉24 · ⭐ 6.7K · 💤) - A little Python library for making simple Electron-like HTML/JS GUI apps. MIT
Show 4 hidden projects...
- PySimpleGUI (🥈34 · ⭐ 14K · 📈) - PySimpleGUI is a Python package that enables Python.. ❗️PySimpleGUI License
- Phoenix (🥉27 · ⭐ 2.6K) - wxPython's Project Phoenix. A new implementation of wxPython,.. ❗Unlicensed
- enaml (🥉27 · ⭐ 1.6K) - Declarative User Interfaces for Python. ❗Unlicensed
- flexx (🥉25 · ⭐ 3.3K · 💀) - Write desktop and web apps in pure Python. BSD-2
🔗 best-of-ml-python - Computer Vision ( ⭐ 23K) - Collection of computer vision and image processing..
Show 1 hidden project...
🔗 best-of-ml-python ( ⭐ 23K) - A ranked list of awesome machine learning Python libraries. Updated..
🔗 best-of-ml-python - NLP ( ⭐ 23K) - Collection of text processing and NLP libraries.
phonenumbers (🥇35 · ⭐ 3.7K) - Python port of Google's libphonenumber. Apache-2
python-slugify (🥈33 · ⭐ 1.6K) - Returns unicode slugs. MIT
pyahocorasick (🥉30 · ⭐ 1.1K) - Python module (C extension and plain python) implementing Aho-.. BSD-3
price-parser (🥉20 · ⭐ 340) - Extract price amount and currency symbol from a raw text string. BSD-3
Show 3 hidden projects...
- humanize (🥉32 · ⭐ 1.7K · 💀) - python humanize functions. MIT
- millify (🥉16 · ⭐ 110 · 💀) - Convert long numbers into a human-readable format in Python. MIT
- awesome-slugify (🥉13 · ⭐ 490 · 💀) - Python flexible slugify function. ❗Unlicensed
🔗 best-of-web-python ( ⭐ 2.7K) - A ranked list of awesome python libraries for web development. Updated..
Libraries for connecting to, operating, and querying databases.
SQLAlchemy (🥇46 · ⭐ 12K) - The Database Toolkit for Python. MIT
google-cloud-storage (🥇45 · ⭐ 5.3K) - Google Cloud Client Libraries for Python. Apache-2
azure-storage-blob (🥇44 · ⭐ 5.5K) - This repository is for active development of the Azure SDK.. MIT
elasticsearch (🥇42 · ⭐ 4.4K) - Official Python client for Elasticsearch. Apache-2
kafka-python (🥇41 · ⭐ 5.9K) - Python client for Apache Kafka. Apache-2
sqlmodel (🥈40 · ⭐ 18K) - SQL databases in Python, designed for simplicity, compatibility, and.. MIT pydantic
MongoEngine (🥈38 · ⭐ 4.4K · 💤) - A Python Object-Document-Mapper for working with MongoDB. MIT
AWS Data Wrangler (🥈38 · ⭐ 4.1K) - pandas on AWS - Easy integration with Athena, Glue,.. Apache-2 
tortoise-orm (🥈37 · ⭐ 5.5K) - Familiar asyncio ORM for python, built with relations in mind. Apache-2
mysqlclient (🥈36 · ⭐ 2.5K) - MySQL/MariaDB connector for Python. ❗️GPL-2.0
Prometheus Client (🥈35 · ⭐ 4.3K) - Prometheus instrumentation library for Python.. Apache-2
Elasticsearch DSL (🥈35 · ⭐ 3.9K · 💤) - High level Python client for Elasticsearch. Apache-2
PyPika (🥈35 · ⭐ 2.9K) - PyPika is a python SQL query builder that exposes the full richness.. Apache-2
google-cloud-bigtable (🥈34 · ⭐ 5.3K · 📈) - This library has moved to.. Apache-2
dataset (🥈34 · ⭐ 4.9K) - Easy-to-use data handling for SQL data stores with support for implicit.. MIT
Cassandra Driver (🥉33 · ⭐ 1.4K) - Python Driver for Apache Cassandra. Apache-2
python-bigquery (🥉30 · ⭐ 800) - This library has moved to.. Apache-2
ODMantic (🥉29 · ⭐ 1.2K) - Sync and Async ODM (Object Document Mapper) for MongoDB based on python.. ISC
libcloud (🥉28 · ⭐ 2.1K) - Apache Libcloud is a Python library that hides differences between.. Apache-2
pandas-gbq (🥉27 · ⭐ 490) - This library has moved to https://github.com/googleapis/google-.. BSD-3
s3transfer (🥉25 · ⭐ 230 · 📉) - Amazon S3 Transfer Manager for Python. Apache-2
psycopg3 (🥉19 · ⭐ 2.4K) - New generation PostgreSQL database adapter for the Python.. ❗️LGPL-3.0
- GitHub (👨‍💻 96 · 🔀 230 · 📋 670 - 5% open · ⏱️ 01.04.2026): git clone https://github.com/psycopg/psycopg
Show 22 hidden projects...
- confluent-kafka-python (🥈37 · ⭐ 470) - Confluent's Kafka Python Client. ❗Unlicensed
- psycopg2 (🥈35 · ⭐ 3.6K) - PostgreSQL database adapter for the Python programming language. ❗Unlicensed
- pyodbc (🥈34 · ⭐ 3.1K) - Python ODBC bridge. ❗️MIT-0
- SQLAlchemy-Utils (🥈34 · ⭐ 1.3K) - Various utility functions and datatypes for SQLAlchemy. ❗Unlicensed
- influxdb (🥉30 · ⭐ 1.7K · 💀) - Python client for InfluxDB. MIT
- redis-py-cluster (🥉30 · ⭐ 1.1K · 💀) - Python cluster client for the official redis cluster... MIT
- gino (🥉29 · ⭐ 2.8K · 💀) - GINO Is Not ORM - a Python asyncio ORM on SQLAlchemy core. BSD-3
- cx-Oracle (🥉29 · ⭐ 890 · 💤) - Obsolete Python interface to Oracle Database, now.. ❗Unlicensed
- mongo-connector (🥉28 · ⭐ 1.9K · 💀) - MongoDB data stream pipeline tools by YouGov (adopted.. Apache-2
- Databases (🥉25 · ⭐ 4K · 💀) - Async database support for Python. BSD-3
- neo4j-driver (🥉25 · ⭐ 1K) - Neo4j Bolt driver for Python. ❗Unlicensed
- pyhdb (🥉24 · ⭐ 320 · 💀) - SAP HANA Connector in pure Python. Apache-2
- cloudant (🥉24 · ⭐ 160 · 💀) - A Python library for Cloudant and CouchDB. Apache-2
- prisma (🥉23 · ⭐ 2.1K · 💀) - Prisma Client Python is an auto-generated and fully type-safe.. Apache-2
- aioprometheus (🥉22 · ⭐ 190 · 💀) - A Prometheus Python client library for asyncio-based.. MIT
- Queries (🥉21 · ⭐ 260 · 💀) - PostgreSQL database access simplified. BSD-3
- db.py (🥉20 · ⭐ 1.2K · 💀) - db.py is an easier way to interact with your databases. BSD-2
- PyMODM (🥉19 · ⭐ 350 · 💀) - A Pythonic, object-oriented interface for working with MongoDB. Apache-2
- py2neo (🥉19 · ⭐ 32 · 💀) - EOL! Py2neo is a comprehensive Neo4j driver library and toolkit for.. Apache-2
- gsheets-db-api (🥉18 · ⭐ 220 · 💀) - A Python DB-API and SQLAlchemy dialect to Google Spreadsheets. MIT
- lazydata (🥉14 · ⭐ 620 · 💀) - Lazydata: Scalable data dependencies for Python projects. Apache-2
- SuperSQLite (🥉13 · ⭐ 710 · 💀) - A supercharged SQLite library for Python. MIT
Libraries for loading, collecting, and extracting data from a variety of data sources and formats.
Datasets (🥇45 · ⭐ 21K) - The largest hub of ready-to-use datasets for AI models with fast,.. Apache-2
xmltodict (🥇37 · ⭐ 5.7K) - Python module that makes working with XML feel like you are working.. MIT
python-magic (🥈35 · ⭐ 2.9K) - A python wrapper for libmagic. MIT
Intake (🥈34 · ⭐ 1.1K) - Intake is a lightweight package for finding, investigating, loading and.. BSD-2
csvkit (🥈31 · ⭐ 6.4K) - A suite of utilities for converting to and working with CSV, the king of.. MIT
smart-open (🥈31 · ⭐ 3.4K) - Utils for streaming large files (S3, HDFS, gzip, bz2...). MIT
snorkel (🥈30 · ⭐ 6K) - A system for quickly generating training data with weak supervision. Apache-2
img2dataset (🥉27 · ⭐ 4.4K · 💤) - Easily turn large sets of image urls to an image dataset. Can.. MIT
rows (🥉21 · ⭐ 880) - A common, beautiful interface to tabular data, no matter the format. ❗️LGPL-3.0
Upgini (🥉21 · ⭐ 350) - Data search & enrichment library for Machine Learning Easily find and add.. BSD-3
Singer (🥉20 · ⭐ 1.3K · 💤) - Standard for moving data between databases, web APIs, files,.. ❗️AGPL-3.0
csvs-to-sqlite (🥉15 · ⭐ 930 · 💤) - Convert CSV files into a SQLite database. Apache-2
Show 14 hidden projects...
- xlwings (🥈36 · ⭐ 3.3K) - xlwings is a Python library that makes it easy to call Python.. ❗Unlicensed
- SDV (🥈33 · ⭐ 3.5K) - Synthetic data generation for tabular data. ❗Unlicensed
- pandas-datareader (🥈30 · ⭐ 3.2K · 💤) - Extract data from a wide range of Internet sources.. ❗Unlicensed
- xlrd (🥈30 · ⭐ 2.2K · 💤) - Please use openpyxl where you can... ❗Unlicensed
- PDFMiner (🥉29 · ⭐ 5.3K · 💀) - Python PDF Parser (Not actively maintained). Check out pdfminer.six. MIT
- borb (🥉26 · ⭐ 3.6K) - borb is a library for reading, creating and manipulating PDF files.. ❗Unlicensed
- tabulator-py (🥉26 · ⭐ 240 · 💀) - Python library for reading and writing tabular data via streams. MIT
- excalibur (🥉22 · ⭐ 1.8K · 💀) - A web interface to extract tabular data from PDFs. MIT
- deepdish (🥉22 · ⭐ 270 · 💀) - Flexible HDF5 saving/loading and other data science tools from the.. BSD-3
- pyexcel-xlsx (🥉22 · ⭐ 120 · 💤) - A wrapper library to read, manipulate and write data.. ❗Unlicensed
- messytables (🥉21 · ⭐ 390 · 💀) - Tools for parsing messy tabular data. This is now.. ❗Unlicensed
- datatest (🥉20 · ⭐ 300 · 💀) - Tools for test driven data-wrangling and data validation. ❗Unlicensed
- Squirrel (🥉12 · ⭐ 280 · 💀) - A Python library that enables ML teams to share, load, and.. Apache-2
- Wayback-Archive (🥉7 · ⭐ 7 · 🐣) - Download complete websites from the Wayback Machine with.. ❗️GPL-3.0
Libraries for data batch- and stream-processing, workflow automation, job scheduling, and other data pipeline tasks.
Airflow (🥇50 · ⭐ 46K) - Platform to programmatically author, schedule, and monitor workflows. Apache-2
- GitHub (👨‍💻 4.3K · 🔀 16K · 📥 69K · 📦 19K · 📋 14K - 8% open · ⏱️ 16.04.2026): git clone https://github.com/apache/airflow
- PyPI (📥 19M / month): pip install apache-airflow
- Conda (📥 2M · ⏱️ 16.04.2026): conda install -c conda-forge airflow
- Docker Hub (📥 1.6B · ⭐ 630 · ⏱️ 15.04.2026): docker pull apache/airflow
Celery (🥇47 · ⭐ 28K) - Asynchronous task queue/job queue based on distributed message passing. BSD-3
Prefect (🥇45 · ⭐ 22K) - Prefect is a workflow orchestration framework for building resilient.. Apache-2
luigi (🥈38 · ⭐ 19K) - Luigi is a Python module that helps you build complex pipelines of batch.. Apache-2
Kedro (🥈38 · ⭐ 11K) - Kedro is a toolbox for production-ready data science. It uses software.. Apache-2
Great Expectations (🥈37 · ⭐ 11K) - Always know what to expect from your data. Apache-2
Activeloop (🥈35 · ⭐ 9.1K) - Deeplake is AI Data Runtime for Agents. It provides serverless.. Apache-2
zenml (🥈35 · ⭐ 5.3K) - ZenML : One AI Platform from Pipelines to Agents. https://zenml.io. Apache-2
ploomber (🥉23 · ⭐ 3.6K · 💤) - The fastest way to build data pipelines. Develop iteratively,.. Apache-2
BatchFlow (🥉22 · ⭐ 200) - BatchFlow helps you conveniently work with random or sequential.. Apache-2
Botflow (🥉16 · ⭐ 1.2K) - Python Fast Dataflow programming framework for Data pipeline work( Web.. BSD-3
Show 21 hidden projects...
- dbt (🥈38 · ⭐ 13K) - dbt enables data analysts and engineers to transform their data using.. ❗Unlicensed
- rq (🥈38 · ⭐ 11K) - Simple job queues for Python. ❗Unlicensed
- faust (🥉29 · ⭐ 6.8K · 💀) - Python Stream Processing. BSD-3
- PyFunctional (🥉28 · ⭐ 2.5K · 💀) - Python library for creating data pipelines with chain.. MIT
- whylogs (🥉27 · ⭐ 2.8K · 💀) - Open standard for end-to-end data and ML monitoring for any.. Apache-2
- Pypeline (🥉25 · ⭐ 1.6K · 💀) - Concurrent data pipelines in Python. MIT
- Optimus (🥉23 · ⭐ 1.5K · 💀) - Agile Data Preparation Workflows made easy with Pandas,.. Apache-2 spark
- dpark (🥉22 · ⭐ 2.7K · 💀) - Python clone of Spark, a MapReduce alike framework in Python. BSD-3 spark
- bonobo (🥉22 · ⭐ 1.6K · 💀) - Extract Transform Load for Python 3.5+. Apache-2
- streamparse (🥉21 · ⭐ 1.5K · 💀) - Run Python in Apache Storm topologies. Pythonic API, CLI.. Apache-2
- pysparkling (🥉21 · ⭐ 270 · 💀) - A pure Python implementation of Apache Spark's RDD and.. ❗Unlicensed
- dbnd (🥉21 · ⭐ 270 · 💀) - DBND is an agile pipeline framework that helps data engineering teams.. Apache-2
- spark-deep-learning (🥉20 · ⭐ 2K · 💀) - Deep Learning Pipelines for Apache Spark. Apache-2 spark
- mrq (🥉20 · ⭐ 900 · 💀) - Mr. Queue - A distributed worker task queue in Python using Redis & gevent. MIT
- flupy (🥉19 · ⭐ 200 · 💤) - Fluent data pipelines for python and your shell. ❗Unlicensed
- Mara Pipelines (🥉16 · ⭐ 2.1K · 💀) - A lightweight opinionated ETL framework, halfway between.. MIT
- riko (🥉16 · ⭐ 1.6K · 💀) - A Python stream processing engine modeled after Yahoo! Pipes. MIT
- Databolt Flow (🥉16 · ⭐ 950 · 💀) - Python library for building highly effective data science.. MIT
- datajob (🥉14 · ⭐ 110 · 💀) - Build and deploy a serverless data pipeline on AWS with no effort. Apache-2
- bodywork-core (🥉13 · ⭐ 440 · 💀) - ML pipeline orchestration and model deployments on.. ❗️AGPL-3.0
- RasgoQL (🥉12 · ⭐ 270 · 💀) - Write python locally, execute SQL in your data warehouse. ❗️AGPL-3.0
XlsxWriter (🥉37 · ⭐ 3.9K) - A Python module for creating Excel XLSX files. BSD-2
typing_inspect (🥉20 · ⭐ 380) - Runtime inspection utilities for Python typing module. MIT
Show 3 hidden projects...
- deepdiff (🥇30 · ⭐ 2.5K) - DeepDiff: Deep Difference and search of any Python object/data... ❗Unlicensed
- importlib-resources (🥈25 · ⭐ 71) - Backport of the importlib.resources module. ❗Unlicensed
- entrypoints (🥉19 · ⭐ 77 · 💀) - Discover and load entry points from installed packages. MIT
more-itertools (🥇41 · ⭐ 4.1K · 📈) - More routines for operating on iterables, beyond itertools. MIT
python-dependency-injector (🥈36 · ⭐ 4.8K) - Dependency injection framework for Python. BSD-3
ubelt (🥉22 · ⭐ 740) - A Python utility library with a stdlib like feel and extra batteries... Apache-2
Show 5 hidden projects...
- toolz (🥈34 · ⭐ 5.1K) - A functional standard library for Python. ❗Unlicensed
- pinject (🥉24 · ⭐ 1.3K · 💀) - A pythonic dependency injection library. Apache-2
- pampy (🥉22 · ⭐ 3.5K · 💀) - Pampy: The Pattern Matching for Python you always dreamed of. MIT
- retrying (🥉22 · ⭐ 1.9K · 💀) - Retrying is an Apache 2.0 licensed general-purpose retrying.. Apache-2
- CommonRegex (🥉21 · ⭐ 1.6K · 💀) - A collection of common regular expressions bundled with an easy.. MIT
Show 6 hidden projects...
- cpython (🥇39 · ⭐ 72K · 📉) - The Python programming language. ❗Unlicensed
- micropython (🥈33 · ⭐ 22K) - MicroPython - a lean and efficient Python implementation for.. ❗Unlicensed
- pyston (🥈22 · ⭐ 2.5K · 💀) - (No longer maintained) A faster and highly-compatible.. Apache-2
- grumpy (🥉21 · ⭐ 11K · 💀) - Grumpy is a Python to Go source code transcompiler and runtime. Apache-2
- stackless (🥉17 · ⭐ 1.1K · 💀) - The Stackless Python programming language. ❗Unlicensed
- cl-python (🥉11 · ⭐ 400 · 💀) - An implementation of Python in Common Lisp. ❗Unlicensed
cookiecutter (🥈41 · ⭐ 25K) - A cross-platform command-line utility that creates projects from.. BSD-3
Send2Trash (🥉31 · ⭐ 310) - Python library to natively send files to Trash (or Recycle bin) on.. BSD-3
python-mss (🥉30 · ⭐ 1.3K) - An ultra fast cross-platform multiple screenshots module in pure.. MIT
Show 9 hidden projects...
- pycparser (🥈38 · ⭐ 3.5K) - Complete C99 parser in pure Python. ❗Unlicensed
- py4j (🥉31 · ⭐ 1.3K) - Py4J enables Python programs to dynamically access arbitrary Java.. ❗Unlicensed
- keyboard (🥉29 · ⭐ 4K · 💀) - Hook and simulate global keyboard events on Windows and Linux. MIT
- powerline-shell (🥉27 · ⭐ 6.3K · 💀) - A beautiful and useful prompt for your shell. MIT
- pyscreenshot (🥉26 · ⭐ 510 · 💀) - Python screenshot library, replacement for the Pillow.. BSD-2
- pluginbase (🥉25 · ⭐ 1.1K · 💀) - A simple but flexible plugin system for Python. BSD-3
- pyscaffold (🥉24 · ⭐ 2.3K · 💀) - Python project template generator with batteries included. ❗Unlicensed
- macropy (🥉19 · ⭐ 3.3K · 💀) - Macros in Python: quasiquotes, case classes, LINQ and more!. ❗Unlicensed
- openpyxl (🥉17 · ⭐ 14) - A Python library to read/write Excel 2010 xlsx/xlsm files. MIT
- Best-of lists: Discover other best-of lists with awesome open-source projects on all kinds of topics.
- best-of-ml-python: A ranked list of awesome machine learning Python libraries.
- best-of-web-python: A ranked list of awesome Python libraries for web development.
- best-of-python-dev: A ranked list of awesome Python developer tools and libraries.
- awesome-python: A curated list of awesome Python frameworks, libraries, software and resources.
Contributions are encouraged and always welcome! If you would like to add or update projects, choose one of the following ways:
- Open an issue by selecting one of the provided categories from the issue page and filling in the requested information.
- Modify the projects.yaml with your additions or changes, and submit a pull request. This can also be done directly via the GitHub UI.
If you would like to contribute to or share suggestions regarding the project metadata collection or markdown generation, please refer to the best-of-generator repository. If you would like to create your own best-of list, we recommend following this guide.
For more information on how to add or update projects, please read the contribution guidelines. By participating in this project, you agree to abide by its Code of Conduct.