This project is an image-based document classification system that automatically classifies document images into predefined categories using deep learning models such as EfficientNet, ResNet, and Vision Transformers (ViT).
These instructions will guide you through setting up the project on your local machine for development and testing purposes.
Before you begin, make sure you have the following installed:
- Git
  Git is required to clone the repository. Download Git, then verify the installation:
      git --version
- UV
  An extremely fast Python package and project manager, written in Rust. You can read the uv documentation. Verify the installation:
      uv --version
- Make
  Make is a build utility that simplifies building, testing, and packaging software. You can read the Make documentation. Run the following command to check that Make is installed:
      make --version
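The three checks above can also be run in one go. The following is a minimal sketch, not part of the project: the `missing_tools` helper and the `REQUIRED_TOOLS` tuple are hypothetical names, and the script only confirms that each executable is on your PATH, not that its version is suitable.

```python
import shutil

# Tool names taken from the prerequisites list above.
# REQUIRED_TOOLS and missing_tools are illustrative names, not part of the project.
REQUIRED_TOOLS = ("git", "uv", "make")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found on PATH.")
```

If the script reports a missing tool, install it and rerun the matching version command from the list above.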
Clone the project from GitHub:
    git clone https://github.com/fiqihfathor/financial_document_classification.git
    cd financial_document_classification
Install the project using the following command:
    uv sync
Run the tests using the following command:
    make test
Download the dataset:
    make dataset
Train the model:
    make train
You can change the configuration in config/config.yml.
Test the API by starting the server:
    make server
The API is then available at http://localhost:8000.
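Once the server is running, a quick reachability check can confirm that something is answering at http://localhost:8000. This is a minimal sketch using only the Python standard library. The `server_is_up` helper is a hypothetical name, and the project's actual endpoints are not assumed here: any HTTP response, even an error status, counts as "up".

```python
import urllib.error
import urllib.request

# Address from the instructions above; adjust if the server is configured differently.
BASE_URL = "http://localhost:8000"


def server_is_up(base_url=BASE_URL, timeout=3.0):
    """Return True if anything answers HTTP requests at base_url.

    server_is_up is an illustrative helper, not part of the project.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status (e.g. 404) still means a server is listening.
        return True
    except OSError:
        # Connection refused, timeout, DNS failure, and similar network errors.
        return False


if __name__ == "__main__":
    state = "reachable" if server_is_up() else "not reachable"
    print(f"{BASE_URL} is {state}")
```

FastAPI applications serve interactive documentation at /docs by default, so http://localhost:8000/docs is a reasonable next place to look for the available endpoints, assuming the project has not disabled it.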
The project uses the following tools:
- Python: The programming language the project is written in.
- PyTorch: Deep learning framework for building and training the models.
- FastAPI: Web framework that serves the API.
- UV: Fast Python package and project manager, used for dependency management.
- Make: Build utility that runs the project's test, dataset, train, and server tasks.
- Git: Version control system.
- MLflow: Open-source platform for managing and tracking machine learning experiments.
- Loguru: Logging library for Python.