Skip to content

bergerache/credit-payment-analytics

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

29 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

๐Ÿ’ณ Credit Payment Analytics Pipeline

dbt BigQuery Python

๐Ÿ“Š Overview

A comprehensive dbt project that analyses credit card payment behaviour to identify customers eligible for improved credit terms. Built as a portfolio project demonstrating advanced data transformation and business logic implementation skills.

๐ŸŽฏ Business Context

Traditional credit scoring focuses on risk minimisation. This project takes a lender's perspective, identifying profitable customers (revolvers) who generate interest revenue whilst maintaining acceptable risk levels.

๐Ÿ’ฐ Key Business Insight

"Good customers pay eventually. Great customers pay forever (in interest)."

๐Ÿ’ก Key Discoveries from EDA

  • "Paid in full" โ‰  Zero balance: 90% of customers marked as paying in full still carry debt
  • Account in credit (status -2): 7.6% of these customers have overpaid - lowest risk but unprofitable
  • Minimum payment only (status 0): Most profitable customers when managed well
  • Payment status reflects statement balance, not total debt

๐Ÿ“ˆ Data Source

  • Dataset: UCI Credit Card Clients Dataset (2005)
  • Size: 30,000 customers
  • Period: 6 months of payment history
  • Location: Taiwan (amounts in NT$)

๐Ÿ—๏ธ Architecture

Data Flow

EDA Analysis          โ†’ Discovered true payment status meanings
  โ†“
seeds/               โ†’ Raw CSV data with corrected interpretations
  โ†“
staging/             โ†’ Decoded with business-accurate labels
  โ†“
intermediate/        โ†’ 5-tier scoring system + utilisation analysis
  โ†“
marts/              โ†’ Action-oriented decisions

Data Pipeline DAG

๐Ÿงฎ Key Business Logic

Customer Value Score (Discrete Values)

Prioritises profitable revolvers who generate interest revenue:

Score Customer Segment Criteria Business Value
๐ŸŸข 100 Premium Revolvers No late, 3+ minimum payments, >$1000 avg balance Highest profit potential
๐ŸŸข 80 Standard Good No late payments, >$500 avg balance Reliable revenue
๐ŸŸก 60 Acceptable Risk 1 late payment OR low balance Monitor for improvement
๐ŸŸ  40 Multiple Risks 2 late payments OR frequent overpayments Intervention needed
๐Ÿ”ด 30 Dormant Minimal activity, <$100 avg balance Re-engagement opportunity
๐Ÿ”ด 20 Unusual Pattern Edge cases Manual review required
๐Ÿ”ด 15 High Risk 3+ late payments Collections focus

๐ŸŽฏ Recommended Actions Logic

Action Criteria Business Rationale
INCREASE_LIMIT Score โ‰ฅ80 AND 50-75% utilisation Active, profitable, room to grow
MAINTAIN Score โ‰ฅ60 AND โ‰ค85% utilisation Stable, well-managed accounts
MONITOR Score โ‰ฅ80 with >85% utilisation OR other warning signs Risk emerging despite good history
REVIEW_ACCOUNT Score โ‰ค40 OR >95% utilisation Immediate intervention required

Risk Overrides

Even premium customers (score 80-100) are flagged for monitoring if:

  • Utilisation exceeds 85% (financial stress indicator)
  • Multiple months over credit limit (capacity issues)

๐Ÿ“‚ Models

Model Type Purpose Key Metrics
stg_credit_card_data View Transforms numeric codes to readable text Decoded statuses, demographics
int_payment_performance Table Calculates payment reliability and customer value 5-tier scoring system
int_utilisation_analysis Table Analyses credit usage patterns Corrected for negative balances
mart_customer_credit_profile Table Complete customer view for decisions Recommended actions
mart_credit_portfolio_summary Table Executive dashboard by risk segment Portfolio distribution

๐Ÿ“Š Key Findings

Portfolio Distribution (Actual Results)

Action % Customers Count Average Score Avg Utilisation
๐Ÿ“ˆ INCREASE_LIMIT 10.7% 3,204 89.5 62.3%
โœ… MAINTAIN 62.2% 18,673 75.2 41.8%
โš ๏ธ MONITOR 7.6% 2,270 71.3 124.5%
๐Ÿšจ REVIEW_ACCOUNT 19.5% 5,853 34.1 87.2%

Risk Concentration

  • 830 customers with 3+ late payments requiring immediate review
  • 2,270 customers exceeding 90% utilisation showing financial stress
  • 3,204 premium customers eligible for limit increases

๐Ÿš€ Setup & Usage

Prerequisites

  • โœ… Python 3.8+
  • โœ… dbt-core 1.10+
  • โœ… Google BigQuery account
  • โœ… Git

๐Ÿ“ฆ Installation

# Clone repository
git clone https://github.com/yourusername/credit-payment-analytics.git
cd credit-payment-analytics

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure BigQuery connection
# Update ~/.dbt/profiles.yml with your credentials

๐Ÿƒ Running the Pipeline

# Load raw data
dbt seed

# Run all transformations
dbt run

# Run tests
dbt test

# Generate and view documentation
dbt docs generate
dbt docs serve

๐Ÿงช Testing Strategy

Test Coverage

  • โœ… 27 tests implemented across all models
  • โœ… Business logic validation for recommended actions
  • โœ… Edge case handling for score boundaries (40, 60, 80, 100)
  • โœ… Utilisation calculation accuracy for negative balances
  • โœ… Custom macros for data quality checks

Test Results

Completed successfully
================== 
โœ“ 27 tests passed
โœ— 0 tests failed
โš  0 warnings

Key Test Validations

  • All INCREASE_LIMIT customers have score โ‰ฅ80 and 50-75% utilisation
  • No premium customers (score โ‰ฅ80) with >85% utilisation get MAINTAIN status
  • All score โ‰ค40 customers correctly flagged for REVIEW_ACCOUNT

๐Ÿ“Š Interactive Documentation

Full interactive documentation is available via dbt docs:

# Generate documentation
dbt docs generate

# View interactive documentation with DAG
dbt docs serve
# Opens at http://localhost:8080

The documentation includes:

  • Interactive DAG (Directed Acyclic Graph) of all models
  • Model and column descriptions
  • Test results and data lineage
  • Searchable interface for all project components

๐Ÿ”ฎ Future Enhancements

  • Add incremental loading for production scale
  • Implement Looker Studio dashboard
  • Add ML-based risk prediction
  • Include time-series trend analysis
  • Create API endpoints for real-time scoring
  • Expand to multi-product analysis (Term Loans, FlexiPay, Asset Finance)

๐Ÿ› ๏ธ Technologies Used

dbt BigQuery Python SQL Git

๐Ÿ“ Project Structure

credit_payment_analytics/
โ”œโ”€โ”€ ๐Ÿ“‚ models/
โ”‚   โ”œโ”€โ”€ ๐Ÿ“‚ staging/
โ”‚   โ”‚   โ””โ”€โ”€ ๐Ÿ“„ stg_credit_card_data.sql
โ”‚   โ”œโ”€โ”€ ๐Ÿ“‚ intermediate/
โ”‚   โ”‚   โ”œโ”€โ”€ ๐Ÿ“„ int_payment_performance.sql
โ”‚   โ”‚   โ””โ”€โ”€ ๐Ÿ“„ int_utilisation_analysis.sql
โ”‚   โ”œโ”€โ”€ ๐Ÿ“‚ marts/
โ”‚   โ”‚   โ”œโ”€โ”€ ๐Ÿ“„ mart_customer_credit_profile.sql
โ”‚   โ”‚   โ””โ”€โ”€ ๐Ÿ“„ mart_credit_portfolio_summary.sql
โ”‚   โ””โ”€โ”€ ๐Ÿ“„ schema.yml
โ”œโ”€โ”€ ๐Ÿ“‚ seeds/
โ”‚   โ””โ”€โ”€ ๐Ÿ“„ raw_credit_card_data.csv
โ”œโ”€โ”€ ๐Ÿ“‚ tests/
โ”‚   โ””โ”€โ”€ ๐Ÿ“„ assert_recommended_action_logic.sql
โ”œโ”€โ”€ ๐Ÿ“‚ macros/
โ”‚   โ””โ”€โ”€ ๐Ÿ“„ test_positive_values.sql
โ”œโ”€โ”€ ๐Ÿ“‚ analyses/
โ”‚   โ””โ”€โ”€ ๐Ÿ“„ EDA.ipynb
โ”œโ”€โ”€ ๐Ÿ“„ dbt_project.yml
โ”œโ”€โ”€ ๐Ÿ“„ requirements.txt
โ””โ”€โ”€ ๐Ÿ“„ README.md

๐Ÿ“ง Contact


Made with โค๏ธ using dbt

About

dbt analytics pipeline for credit risk assessment, identifying profitable customers using payment behaviour analysis

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors