Team Pakka Nerds

SecureParse

A secure file redaction service that automatically detects and redacts sensitive information from various file types.

Prerequisites

  • Python 3.8 or higher
  • Node.js 14 or higher
  • pip3
  • System dependencies (installed automatically by setup script):
    • tesseract-ocr
    • python3-dev

Manual Setup

If the setup script doesn't work for your system, you can install dependencies manually:

  1. Install system dependencies:

    • For Ubuntu/Debian: sudo apt-get install tesseract-ocr python3-dev
    • For Arch Linux: sudo pacman -S tesseract
    • For Fedora: sudo dnf install tesseract
    • For macOS: brew install tesseract
  2. Create and activate Python virtual environment:

python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

  3. Install Python dependencies:

pip install --upgrade pip
pip install -r redaction/requirements.txt
python -m spacy download en_core_web_lg

  4. Install Node.js dependencies:

cd server
npm install

  5. Start the server:

npm start

The server will be running at http://localhost:3000

Supported File Types

  • Images: PNG, JPEG
  • Documents: PDF, DOCX, PPTX, XLSX
  • Text: TXT, RTF, CSV, JSON, XML
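
As a rough illustration of how an upload could be routed by type, the sketch below maps a file name to one of the three categories listed above. The `SUPPORTED_TYPES` table and the `classify_file` helper are hypothetical names, not taken from the project, and treating both `.jpg` and `.jpeg` as JPEG is an assumption.

```python
from pathlib import Path

# Extension groups mirror the list above; the dictionary layout and the
# function name are illustrative assumptions, not the project's code.
SUPPORTED_TYPES = {
    "image": {".png", ".jpg", ".jpeg"},
    "document": {".pdf", ".docx", ".pptx", ".xlsx"},
    "text": {".txt", ".rtf", ".csv", ".json", ".xml"},
}


def classify_file(filename):
    """Return 'image', 'document', or 'text', or None if the type is unsupported."""
    suffix = Path(filename).suffix.lower()  # compare case-insensitively
    for category, extensions in SUPPORTED_TYPES.items():
        if suffix in extensions:
            return category
    return None
```

Checking only the extension is the simplest approach; a real service would probably also inspect the file contents, since an extension can be misleading.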

Features

  • Automatic detection and redaction of sensitive information
  • Support for multiple file types
  • Real-time processing
  • User-friendly interface

License

See the LICENSE file for details.

🌟 Features

  • Smart Detection: Identifies 20+ PII types (emails, phones, IDs, etc.) using Microsoft Presidio
  • Accurate Redaction: Maintains content structure after redaction
  • Web Interface: Simple drag-and-drop UI
  • Secure Processing: Files processed in-memory (never stored permanently)
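
To give a feel for structure-preserving redaction, here is a minimal sketch that masks matches with the same number of characters, so surrounding text keeps its position. It uses two simple regular expressions as a stand-in and does not use Microsoft Presidio; the `PATTERNS` table and the `redact` function are hypothetical and far less thorough than the real recognizers.

```python
import re

# Hypothetical, deliberately simple patterns used as a stand-in.
# The project relies on Microsoft Presidio, which covers many more PII types.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def redact(text, mask="*"):
    """Replace each match with a same-length mask so the text layout is unchanged."""
    for pattern in PATTERNS.values():
        text = pattern.sub(lambda m: mask * len(m.group()), text)
    return text
```

Because every match is replaced by a mask of equal length, the redacted text has the same length as the input, which is one simple way to keep offsets and formatting intact.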

🛠️ Tech Stack

Frontend:

  • HTML5/CSS3
  • JavaScript (ES6+)

Backend:

  • Node.js (Express)
  • Python 3.8+ (Flask)
  • Key Modules:
    • Microsoft Presidio (analysis)
    • Pytesseract (text extraction)

🚀 Installation

Prerequisites

  • Node.js v16+
  • Python 3.8+

# Clone repository
git clone https://github.com/nbdevanandan/hack-the-future

About

Our project for Hack The Future Hackathon conducted at Amrita Vishwa Vidyapeetham
