A comprehensive dbt project that analyses credit card payment behaviour to identify customers eligible for improved credit terms. Built as a portfolio project demonstrating advanced data transformation and business logic implementation skills.
Traditional credit scoring focuses on risk minimisation. This project takes a lender's perspective, identifying profitable customers (revolvers) who generate interest revenue whilst maintaining acceptable risk levels.
"Good customers pay eventually. Great customers pay forever (in interest)."
- "Paid in full" โ Zero balance: 90% of customers marked as paying in full still carry debt
- Account in credit (status -2): 7.6% of these customers have overpaid - lowest risk but unprofitable
- Minimum payment only (status 0): Most profitable customers when managed well
- Payment status reflects statement balance, not total debt
- Dataset: UCI Credit Card Clients Dataset (2005)
- Size: 30,000 customers
- Period: 6 months of payment history
- Location: Taiwan (amounts in NT$)
EDA Analysis โ Discovered true payment status meanings
โ
seeds/ โ Raw CSV data with corrected interpretations
โ
staging/ โ Decoded with business-accurate labels
โ
intermediate/ โ 5-tier scoring system + utilisation analysis
โ
marts/ โ Action-oriented decisions
Prioritises profitable revolvers who generate interest revenue:
| Score | Customer Segment | Criteria | Business Value |
|---|---|---|---|
| ๐ข 100 | Premium Revolvers | No late, 3+ minimum payments, >$1000 avg balance | Highest profit potential |
| ๐ข 80 | Standard Good | No late payments, >$500 avg balance | Reliable revenue |
| ๐ก 60 | Acceptable Risk | 1 late payment OR low balance | Monitor for improvement |
| ๐ 40 | Multiple Risks | 2 late payments OR frequent overpayments | Intervention needed |
| ๐ด 30 | Dormant | Minimal activity, <$100 avg balance | Re-engagement opportunity |
| ๐ด 20 | Unusual Pattern | Edge cases | Manual review required |
| ๐ด 15 | High Risk | 3+ late payments | Collections focus |
| Action | Criteria | Business Rationale |
|---|---|---|
| INCREASE_LIMIT | Score โฅ80 AND 50-75% utilisation | Active, profitable, room to grow |
| MAINTAIN | Score โฅ60 AND โค85% utilisation | Stable, well-managed accounts |
| MONITOR | Score โฅ80 with >85% utilisation OR other warning signs | Risk emerging despite good history |
| REVIEW_ACCOUNT | Score โค40 OR >95% utilisation | Immediate intervention required |
Even premium customers (score 80-100) are flagged for monitoring if:
- Utilisation exceeds 85% (financial stress indicator)
- Multiple months over credit limit (capacity issues)
| Model | Type | Purpose | Key Metrics |
|---|---|---|---|
stg_credit_card_data |
View | Transforms numeric codes to readable text | Decoded statuses, demographics |
int_payment_performance |
Table | Calculates payment reliability and customer value | 5-tier scoring system |
int_utilisation_analysis |
Table | Analyses credit usage patterns | Corrected for negative balances |
mart_customer_credit_profile |
Table | Complete customer view for decisions | Recommended actions |
mart_credit_portfolio_summary |
Table | Executive dashboard by risk segment | Portfolio distribution |
| Action | % Customers | Count | Average Score | Avg Utilisation |
|---|---|---|---|---|
| ๐ INCREASE_LIMIT | 10.7% | 3,204 | 89.5 | 62.3% |
| โ MAINTAIN | 62.2% | 18,673 | 75.2 | 41.8% |
| 7.6% | 2,270 | 71.3 | 124.5% | |
| ๐จ REVIEW_ACCOUNT | 19.5% | 5,853 | 34.1 | 87.2% |
- 830 customers with 3+ late payments requiring immediate review
- 2,270 customers exceeding 90% utilisation showing financial stress
- 3,204 premium customers eligible for limit increases
- โ Python 3.8+
- โ dbt-core 1.10+
- โ Google BigQuery account
- โ Git
# Clone repository
git clone https://github.com/yourusername/credit-payment-analytics.git
cd credit-payment-analytics
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Configure BigQuery connection
# Update ~/.dbt/profiles.yml with your credentials# Load raw data
dbt seed
# Run all transformations
dbt run
# Run tests
dbt test
# Generate and view documentation
dbt docs generate
dbt docs serve- โ 27 tests implemented across all models
- โ Business logic validation for recommended actions
- โ Edge case handling for score boundaries (40, 60, 80, 100)
- โ Utilisation calculation accuracy for negative balances
- โ Custom macros for data quality checks
Completed successfully
==================
โ 27 tests passed
โ 0 tests failed
โ 0 warnings- All INCREASE_LIMIT customers have score โฅ80 and 50-75% utilisation
- No premium customers (score โฅ80) with >85% utilisation get MAINTAIN status
- All score โค40 customers correctly flagged for REVIEW_ACCOUNT
Full interactive documentation is available via dbt docs:
# Generate documentation
dbt docs generate
# View interactive documentation with DAG
dbt docs serve
# Opens at http://localhost:8080The documentation includes:
- Interactive DAG (Directed Acyclic Graph) of all models
- Model and column descriptions
- Test results and data lineage
- Searchable interface for all project components
- Add incremental loading for production scale
- Implement Looker Studio dashboard
- Add ML-based risk prediction
- Include time-series trend analysis
- Create API endpoints for real-time scoring
- Expand to multi-product analysis (Term Loans, FlexiPay, Asset Finance)
credit_payment_analytics/
โโโ ๐ models/
โ โโโ ๐ staging/
โ โ โโโ ๐ stg_credit_card_data.sql
โ โโโ ๐ intermediate/
โ โ โโโ ๐ int_payment_performance.sql
โ โ โโโ ๐ int_utilisation_analysis.sql
โ โโโ ๐ marts/
โ โ โโโ ๐ mart_customer_credit_profile.sql
โ โ โโโ ๐ mart_credit_portfolio_summary.sql
โ โโโ ๐ schema.yml
โโโ ๐ seeds/
โ โโโ ๐ raw_credit_card_data.csv
โโโ ๐ tests/
โ โโโ ๐ assert_recommended_action_logic.sql
โโโ ๐ macros/
โ โโโ ๐ test_positive_values.sql
โโโ ๐ analyses/
โ โโโ ๐ EDA.ipynb
โโโ ๐ dbt_project.yml
โโโ ๐ requirements.txt
โโโ ๐ README.md
- GitHub: bergerache
- LinkedIn: rachel-berger-data
- Email: bergerache@gmail.com
Made with โค๏ธ using dbt
