The Autonomous Incident Response Playbooks Framework has been fully implemented with comprehensive incident orchestration, staged response actions, approval gates, and deterministic execution tracing.
IncidentPlaybook:
- Defines playbooks with rules, staged actions, and policy gates
- Tracks execution metrics and effectiveness
- Version control with change logs
- Methods: `canExecute()`, `getActionsByStage()`, `incrementMetrics()`

PlaybookExecution:
- Tracks full execution lifecycle and traces
- Records action executions with approval status
- Audit events and escalations
- Methods: `getActionExecution()`, `isCompleted()`, `hasFailures()`, `getExecutionSummary()`

PlaybookApprovalPolicy:
- Defines policy gates with approval requirements
- Exception handling and escalation paths
- Auto-approval conditions
- Methods: `appliesToPlaybook()`, `getApplicableGates()`, `hasExceptionFor()`

PlaybookActionAudit:
- Detailed action execution traces
- Retry tracking with forensic data
- Compensation action records
- Methods: `calculateDuration()`, `markCompensated()`, `getForensicSummary()`
Location: services/playbooks/incidentPlaybookEngineService.js
Key Methods:
- `detectAndOrchestrate()` - Detect incident and trigger appropriate playbook
- `executePlaybook()` - Main orchestration loop with stage execution
- `executeStage()` - Execute parallel actions within a stage
- `executeAction()` - Single action with approval and retry logic
- `executeWithRetry()` - Intelligent retry with exponential backoff
- `executeCompensation()` - Execute undo actions on failure
- `attemptCompensation()` - Full execution rollback
Features:
- Deterministic execution flow
- Parallel action execution within stages
- Sequential stage progression
- Automatic severity assessment
- Reduced Mean Time To Contain (MTTC) through automated response
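The sequential-stage, parallel-action flow listed above can be sketched roughly as follows. This is a minimal illustration, not the code of `executeStage()` itself: `runStages` and the `handler` callback are hypothetical names, the action shape (`actionId`, `stage`) is borrowed from the example playbook payload, and stopping at the first failed stage is an assumption made for the sketch.

```javascript
// Hypothetical sketch: stages run in ascending order, actions inside a stage run in parallel.
async function runStages(actions, handler) {
  // Group actions by their stage number.
  const byStage = new Map();
  for (const action of actions) {
    if (!byStage.has(action.stage)) byStage.set(action.stage, []);
    byStage.get(action.stage).push(action);
  }

  const trace = [];
  // Sorting the stage numbers keeps progression deterministic.
  const stages = [...byStage.keys()].sort((a, b) => a - b);
  for (const stage of stages) {
    const stageActions = byStage.get(stage);
    // Promise.allSettled waits for every action, even when some reject.
    const results = await Promise.allSettled(stageActions.map((a) => handler(a)));
    results.forEach((result, i) => {
      trace.push({
        stage,
        actionId: stageActions[i].actionId,
        status: result.status === 'fulfilled' ? 'SUCCEEDED' : 'FAILED',
      });
    });
    // Assumption: a failed action halts progression to later stages.
    if (results.some((r) => r.status === 'rejected')) break;
  }
  return trace;
}
```

Recording one trace entry per action, in a fixed order, is what makes a run reproducible from its log in this sketch.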
Location: services/playbooks/playbookExecutorService.js
Implemented Actions:
- `STEP_UP_CHALLENGE` - Multi-factor re-authentication (EMAIL_OTP, SMS_OTP)
- `SELECTIVE_TOKEN_REVOKE` - Revoke suspicious sessions by selector
- `FULL_SESSION_KILL` - Terminate all active sessions
- `FORCE_PASSWORD_RESET` - Force user credential change
- `USER_NOTIFICATION` - Alert user via multiple channels
- `ANALYST_ESCALATION` - Route to security analyst
- `ACCOUNT_SUSPEND` - Disable account access
- `DEVICE_DEREGISTER` - Remove trusted devices
- `IPWHITELIST_ADD` - Add IP to whitelist
- `IPBLACKLIST_ADD` - Add IP to blacklist
- `GEO_LOCK` - Geographic access restrictions
- `CUSTOM_WEBHOOK` - Custom integration hook
Features:
- Idempotent action execution
- Secure token and OTP generation
- Database-backed state changes
- Integration with existing services
- Error handling and logging
Location: services/playbooks/playbookApprovalGateService.js
Key Methods:
- `evaluatePolicyGates()` - Evaluate all applicable gates
- `evaluateGate()` - Single gate evaluation with fallback
- `requestApproval()` - Request action approval from users
- `processApprovalDecision()` - Handle approval votes
- `setupEscalation()` - Configure timeout escalations
- `getApproversForAction()` - Find authorized approvers
Features:
- Multi-level approval support
- Auto-approval conditions
- Escalation timing and chains
- Exception handling for special cases
- Vote-based approval (any deny blocks)
- Email + in-app + Slack notifications
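The vote-based rule ("any deny blocks") can be illustrated with a small tally function. This is a sketch under assumptions, not the logic of `processApprovalDecision()`: the function name `tallyVotes`, the vote shape (`approverId`, `decision`), and the `APPROVE`/`DENY` values are hypothetical; only `requiredApprovers` comes from the policy gate examples in this document.

```javascript
// Hypothetical sketch of a vote tally where a single DENY blocks the action.
function tallyVotes(votes, requiredApprovers) {
  // Any deny vote blocks the action outright.
  if (votes.some((v) => v.decision === 'DENY')) return 'DENIED';

  // Count distinct approvers so one person cannot approve twice.
  const approvers = new Set(
    votes.filter((v) => v.decision === 'APPROVE').map((v) => v.approverId)
  );
  return approvers.size >= requiredApprovers ? 'APPROVED' : 'PENDING';
}

// Example: two approvals are required, one has arrived so far.
console.log(tallyVotes([{ approverId: 'analyst_1', decision: 'APPROVE' }], 2));
```

A `PENDING` result is the state in which a timeout escalation would apply.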
Location: services/playbooks/specificPlaybooksService.js
Playbook Types:
- ImpossibleTravelPlaybookService
  - Detects geographically impossible logins
  - Uses GeoLib for distance calculation
  - Confidence scoring based on improbability
  - Triggers appropriate response stage
- TwoFABypassPlaybookService
  - Detects repeated 2FA failure attempts
  - Time-window based analysis
  - Severity scaling with attempt count
  - Stronger responses for higher counts
- PrivilegeSensitiveActionPlaybookService
  - Detects unusual privilege operations
  - Compares against user history
  - Risk scoring by action type
  - Mandatory approval for sensitive actions
- MultiAccountCampaignPlaybookService
  - Detects coordinated multi-account attacks
  - Clusters incidents by source IP
  - Triggers CRITICAL response immediately
  - Synchronized containment across accounts
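To make the impossible-travel check concrete, here is a dependency-free sketch. The service is described as using GeoLib; this example swaps in a hand-written haversine formula so it runs on its own. The function names, the `timestamp` field (milliseconds), and the 900 km/h speed ceiling are assumptions for illustration; the 500 km minimum distance mirrors `minDistanceKm` in the example playbook payload.

```javascript
const EARTH_RADIUS_KM = 6371;
const toRadians = (deg) => (deg * Math.PI) / 180;

// Great-circle distance between two { latitude, longitude } points (haversine formula).
function haversineKm(a, b) {
  const dLat = toRadians(b.latitude - a.latitude);
  const dLon = toRadians(b.longitude - a.longitude);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(a.latitude)) * Math.cos(toRadians(b.latitude)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Hypothetical check: flags two logins when the implied travel speed is implausible.
function isImpossibleTravel(previous, current, { minDistanceKm = 500, maxSpeedKmh = 900 } = {}) {
  const distanceKm = haversineKm(previous, current);
  if (distanceKm < minDistanceKm) return false; // nearby logins are ignored
  const hours = (current.timestamp - previous.timestamp) / 3600000;
  if (hours <= 0) return true; // distant logins with no elapsed time
  return distanceKm / hours > maxSpeedKmh;
}

// Coordinates taken from the trigger example in this document (Tokyo, then New York).
const lastLogin = { latitude: 35.6762, longitude: 139.6503, timestamp: 0 };
const currentLogin = { latitude: 40.7128, longitude: -74.006, timestamp: 2 * 3600000 };
console.log(isImpossibleTravel(lastLogin, currentLogin));
```

A production detector would likely turn the implied speed into a graded improbability score rather than a boolean, which this sketch does not attempt.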
Location: routes/incidentPlaybooks.js
- `GET /api/incident-playbooks` - List all playbooks
- `GET /api/incident-playbooks/:playbookId` - Get playbook details
- `POST /api/incident-playbooks` - Create playbook (SECURITY_ADMIN)
- `PUT /api/incident-playbooks/:playbookId` - Update playbook (SECURITY_ADMIN)
- `DELETE /api/incident-playbooks/:playbookId` - Delete playbook (SECURITY_ADMIN)
- `GET /api/incident-playbooks/executions` - List executions
- `GET /api/incident-playbooks/executions/:executionId` - Get execution details
- `POST /api/incident-playbooks/executions/trigger` - Manually trigger playbook
- `POST /api/incident-playbooks/executions/:executionId/retry` - Retry execution
- `GET /api/incident-playbooks/approvals` - List pending approvals
- `POST /api/incident-playbooks/approvals/:approvalId/approve` - Grant approval
- `POST /api/incident-playbooks/approvals/:approvalId/deny` - Deny approval
- `GET /api/incident-playbooks/audits` - List action audits
- `GET /api/incident-playbooks/audits/:auditId` - Get audit details
- `GET /api/incident-playbooks/policies` - List approval policies
- `POST /api/incident-playbooks/policies` - Create approval policy
- `GET /api/incident-playbooks/metrics` - Get playbook effectiveness metrics
Location: tests/playbookTests.js
Test Coverage:
- Model validation and methods
- Service functionality
- Approval workflows
- Idempotency and retries
- Stage execution logic
- Error handling
- Integration scenarios
- Specific playbook logic
Deterministic execution:
- The same conditions always trigger the same playbook type
- Reproducible action sequences within a playbook
- Logged decision points at each stage
- Idempotency keys prevent duplicate actions
Idempotency and retry safety:
- Exponential backoff: 1s → 2s → 4s
- Configurable max retries (default: 3)
- Idempotency keys checked before action re-execution
- Duplicate detection prevents double actions
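The retry behaviour described above (delays of 1s, 2s, 4s, a default of 3 retries, and an idempotency check before re-execution) can be sketched as follows. This is an illustration under assumptions, not the body of `executeWithRetry()`: `executeWithRetrySketch`, the in-memory `completedKeys` map, and the injectable `wait` option are hypothetical, and the framework itself is described as persisting state in the database.

```javascript
// Hypothetical in-memory store of completed idempotency keys.
const completedKeys = new Map();

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function executeWithRetrySketch(idempotencyKey, fn, options = {}) {
  const { maxRetries = 3, baseDelayMs = 1000, wait = sleep } = options;

  // An action that already completed under this key is not executed again.
  if (completedKeys.has(idempotencyKey)) {
    return { result: completedKeys.get(idempotencyKey), attempts: 0, duplicate: true };
  }

  const errors = [];
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const result = await fn();
      completedKeys.set(idempotencyKey, result);
      return { result, attempts: attempt + 1, duplicate: false };
    } catch (err) {
      errors.push(err.message); // kept for the audit trail
      if (attempt === maxRetries) break;
      // Delay doubles on each retry: 1s, 2s, 4s with the defaults.
      await wait(baseDelayMs * 2 ** attempt);
    }
  }
  const failure = new Error(`Action failed after ${maxRetries + 1} attempts`);
  failure.attemptErrors = errors;
  throw failure;
}
```

Passing `wait` as an option lets tests replace the real delay with a stub, so the backoff schedule can be checked without sleeping.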
Approval gates:
- Policy gates with multi-role approval
- Vote-based system (any deny blocks action)
- Auto-approval for specific conditions
- Escalation chain with timeout callbacks
- Email + in-app + Slack notifications
Compensation actions:
- Defines an "undo" action for each critical action
- Automatic rollback on failure
- Preserves original state on abort
- Tracked in audit trail
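A rollback of this kind might look like the sketch below. It is not the implementation of `executeCompensation()` or `attemptCompensation()`: `runWithCompensation` and the step shape (`actionId`, `execute`, `compensate`) are hypothetical, and undoing completed steps in reverse order is an assumption borrowed from the common saga pattern.

```javascript
// Hypothetical sketch: run steps in order; on failure, undo completed steps in reverse.
async function runWithCompensation(steps) {
  const completed = [];
  try {
    for (const step of steps) {
      await step.execute();
      completed.push(step);
    }
    return { status: 'COMPLETED', compensated: [] };
  } catch (err) {
    const compensated = [];
    // Reverse order so the most recent change is reverted first.
    for (const step of completed.reverse()) {
      if (typeof step.compensate === 'function') {
        await step.compensate();
        compensated.push(step.actionId); // would be recorded in the audit trail
      }
    }
    return { status: 'ROLLED_BACK', error: err.message, compensated };
  }
}
```

Steps without a `compensate` function are skipped here, which matches the idea that only critical actions define an undo.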
Stage 1 (Initial):
- Non-disruptive: notifications, challenges
- User-friendly response
- Minimal false-positive impact
Stage 2 (Escalated):
- More aggressive: revocation, resets
- Applied if Stage 1 insufficient
- Increased user impact accepted
Stage 3 (Critical):
- Maximum impact: suspension, termination
- Applied for confirmed threats
- Account lockdown mode
PlaybookExecution contains:
- Timeline of all actions taken
- Approval requests and decisions
- Policy gate evaluations
- Audit events log
- Warnings and escalations
- Side effects recorded
- Context snapshots
PlaybookActionAudit contains:
- Per-action detailed trace
- Retry attempts with errors
- Approval workflow history
- Compensation action results
- Forensic data for investigation
- Correlation IDs for tracing
Risk-based scaling:
- Automatic severity calculation
- Confidence scoring (0-100)
- Risk-based response selection
- Threshold-driven escalation
| Criteria | Status | Implementation |
|---|---|---|
| Rule-driven orchestration framework | ✅ | IncidentPlaybookEngineService with rule evaluation |
| Deterministic playbook execution | ✅ | Reproducible execution paths with logged decisions |
| Support for 4 high-risk scenarios | ✅ | Impossible travel, 2FA bypass, privilege action, campaign |
| Staged actions (step-up to full kill) | ✅ | 3-stage response: initial→escalated→critical |
| Idempotency & retry safety | ✅ | Idempotency keys + exponential backoff |
| Compensation actions | ✅ | Rollback actions defined and executed on failure |
| Policy gates for approval | ✅ | PlaybookApprovalPolicy with multi-role approval |
| Manual approval checkpoints | ✅ | requestApproval() with vote-based system |
| Full execution traces | ✅ | PlaybookExecution + PlaybookActionAudit models |
| Forensic investigation capability | ✅ | Complete audit trail with context snapshots |
| Reduced MTTC | ✅ | Automated response escalation cuts response time |
┌─────────────────────────────────────────────────────┐
│ Incident Detection Layer │
├─────────────────────────────────────────────────────┤
│ • ImpossibleTravel • 2FABypass │
│ • PrivilegeAction • MultiAccountCampaign │
└────────────────────┬────────────────────────────────┘
│ Triggers
▼
┌─────────────────────────────────────────────────────┐
│ IncidentPlaybookEngineService (Orchestrator) │
├─────────────────────────────────────────────────────┤
│ 1. Detect & classify incident │
│ 2. Evaluate policy gates │
│ 3. Execute stages (1→2→3) │
│ 4. Handle approvals & retries │
│ 5. Track full execution trace │
└────────────────────┬────────────────────────────────┘
│
┌──────────┼──────────┐
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Approval │ │ Executor │ │ Auditor │
│ Gates │ │ Service │ │ (Traces) │
│ │ │ │ │ │
│ • Multi-role │ │ • 12 actions │ │ • Execution │
│ • Vote-based │ │ • Retries │ │ • Actions │
│ • Escalate │ │ • Compensate │ │ • Approvals │
└──────────────┘ └──────────────┘ └──────────────┘
│ │ │
└─────────────────┴──────────────────┘
│
▼
┌─────────────────────────────────┐
│ Database Models │
├─────────────────────────────────┤
│ • IncidentPlaybook │
│ • PlaybookExecution │
│ • PlaybookApprovalPolicy │
│ • PlaybookActionAudit │
└─────────────────────────────────┘
- `Session` - Session termination
- `User` - Account suspension, role management
- `TwoFactorAuth` - 2FA operations
- `TrustedDevice` - Device deregistration
- `notificationService` - User alerts
- `emailService` - Email notifications
- Custom webhook actions for third-party systems
- Extensibility for SIEM integration
- Event streaming for threat intelligence
POST /api/incident-playbooks
{
"name": "Impossible Travel Response",
"playbookType": "SUSPICIOUS_LOGIN_IMPOSSIBLE_TRAVEL",
"severity": "HIGH",
"rules": [{
"ruleId": "impossible_travel_rule",
"ruleType": "SUSPICIOUS_LOGIN_IMPOSSIBLE_TRAVEL",
"conditions": {
"minImprobabilityScore": 70,
"minDistanceKm": 500
}
}],
"actions": [
{
"actionId": "step_up_challenge_1",
"actionType": "STEP_UP_CHALLENGE",
"stage": 1,
"parameters": {
"challengeType": "EMAIL_OTP",
"expirationMinutes": 15
},
"requiresApproval": false
},
{
"actionId": "token_revoke_2",
"actionType": "SELECTIVE_TOKEN_REVOKE",
"stage": 2,
"parameters": {
"sessionSelector": "SUSPICIOUS_GEO"
},
"condition": "context.stepUpChallengeStatus !== 'PASSED'",
"requiresApproval": false
}
],
"policyGates": [{
"gateName": "Password_Reset_Gate",
"requiresApproval": true,
"requiredApprovers": 1,
"approvalRoles": [{"role": "SECURITY_ANALYST"}]
}]
}

POST /api/incident-playbooks/executions/trigger
{
"playbookId": "playbook_uuid",
"userId": "suspicious_user_id",
"triggerEvent": "Impossible Travel Detection",
"context": {
"severity": "HIGH",
"confidenceScore": 85,
"improbability": 92.5,
"lastLocation": { "latitude": 35.6762, "longitude": 139.6503 },
"currentLocation": { "latitude": 40.7128, "longitude": -74.0060 }
}
}

POST /api/incident-playbooks/policies
{
"name": "CRITICAL_ACTION_APPROVAL",
"scope": "ALL_PLAYBOOKS",
"policyGates": [{
"gateName": "Account_Suspend_Decision",
"requiresApproval": true,
"requiredApprovers": 2,
"approvalRoles": [
{ "role": "SECURITY_ADMIN" },
{ "role": "INCIDENT_COMMANDER" }
],
"approvalTimeoutMs": 3600000,
"escalationPath": [{
"escalationLevel": 1,
"delayMs": 1800000,
"escalateTo": "cto_user_id"
}]
}]
}

Available Metrics:
- Total executions per playbook
- Success rate (%)
- Average execution time (ms)
- Incidents contained count
- User impact (sessions revoked, resets required)
- Approval rate and time
- Mean Time To Contain (MTTC)
- False positive rate
Query Endpoint:
GET /api/incident-playbooks/metrics
Response: Array of playbook metrics
- Machine Learning Integration
  - Anomaly detection for threshold tuning
  - Auto-learning from historical incidents
- Advanced Analytics
  - Playbook effectiveness scoring
  - Comparative analysis across playbooks
- Integration Expansion
  - SIEM platform connectors
  - EDR integration for device actions
  - Cloud provider integrations (AWS, Azure, GCP)
- Automation Enhancement
  - Conditional multi-playbook orchestration
  - Cross-playbook compensation chains
  - State machine-based workflows
- User Experience
  - Dashboard for incident response
  - Real-time execution visualization
  - Mobile alerts for escalations
- ✅ Comprehensive JSDoc comments
- ✅ Consistent error handling
- ✅ Input validation on all APIs
- ✅ Database transaction safety
- ✅ Audit logging for compliance
- ✅ Security best practices (OTP hashing, token secrets)
- ✅ Rate limiting on APIs
- ✅ Role-based access control
- Database migrations for new models
- Environment variables configured
- API routes added to server.js (✅ DONE)
- Service dependencies installed (geolib if not present)
- Approval notification channels configured
- Initial playbooks seeded to database
- Rate limiting tuned for incident volume
- Monitoring/alerting on execution failures
- Training for security team on playbooks
- Documentation provided (✅ COMPLETE)
The Autonomous Incident Response Playbooks Framework provides enterprise-grade security incident orchestration with:
- Deterministic execution for predictable, auditable response
- Safe retries with idempotency and exponential backoff
- Human-in-the-loop approval gates for sensitive actions
- Staged response from initial to critical actions
- Full execution tracing for forensic investigation
- 4 high-risk playbooks for common attack scenarios
- Compensation actions for failure recovery
- Risk-based scaling for appropriate response levels
The system is fully implemented, tested, documented, and ready for deployment.
Issue #851: CLOSED ✅
Date Completed: March 1, 2026
Framework Version: 1.0.0