# 🔐 Security Architecture Decision Records

## ADR-001: JWT + HMAC Dual Authentication

### Decision
Use JWT for client authentication and HMAC for request integrity verification.

### Context
- A JWT alone is vulnerable to token theft (XSS, interception)
- HMAC ensures the request wasn't tampered with in transit
- The combined approach provides defense-in-depth

### Solution
```
Request Headers:
├─ Authorization: Bearer <jwt_token>    # WHO: Authenticate user
├─ X-Signature: HMAC_SHA256(...)        # WHAT: Verify content
├─ X-Timestamp: unixtime                # WHEN: Prevent replay
└─ X-Client-Id: telegram_bot            # WHERE: Track source
```

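The `X-Signature` and `X-Timestamp` headers above could be produced and checked roughly as follows. This is a minimal sketch, not the project's implementation: the canonical string (method, path, timestamp, and body joined by newlines), the 300-second freshness window, and the names `sign_request` and `verify_request` are all assumptions made for illustration.

```python
import hashlib
import hmac
import time


def sign_request(secret: bytes, method: str, path: str, timestamp: int, body: bytes) -> str:
    """Return a hex HMAC-SHA256 over an assumed canonical request string."""
    message = b"\n".join(
        [method.upper().encode(), path.encode(), str(timestamp).encode(), body]
    )
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret, method, path, timestamp, body, signature, now=None, max_skew=300):
    """Reject stale timestamps, then compare signatures in constant time."""
    now = int(time.time()) if now is None else now
    if abs(now - timestamp) > max_skew:  # assumed freshness window, in seconds
        return False
    expected = sign_request(secret, method, path, timestamp, body)
    return hmac.compare_digest(expected, signature)
```

Because the timestamp is part of the signed message, an attacker cannot refresh it without the secret. A timestamp window on its own only narrows the replay window; rejecting repeats inside that window needs additional state, such as remembering recently seen signatures.
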
### Trade-offs
| Pros | Cons |
|------|------|
| More secure | Slight performance overhead |
| Covers multiple attack vectors | More complex debugging |
| MVP ready | Requires client cooperation |
| Can be disabled in MVP | More header management |

### Status
✅ **IMPLEMENTED**

---

## ADR-002: Redis Streams for Event Bus (vs RabbitMQ)

### Decision
Use Redis Streams instead of RabbitMQ for event-driven notifications.

### Context
- Already using Redis for caching/sessions
- Simpler setup for MVP
- Don't need RabbitMQ's clustering (yet)
- Redis Streams has built-in message ordering

### Solution
```
Event Stream: "events"
├─ transaction.created
├─ transaction.executed
├─ budget.alert
├─ goal.completed
└─ member.invited

Consumer Groups:
├─ telegram_bot (consumes all)
├─ notification_worker (consumes alerts)
└─ audit_logger (consumes all)
```

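Since this ADR is designed but not yet implemented, the following is only a sketch of how a producer and the per-group filtering might look. It assumes a redis-py style client exposing `xadd`, a `type`/`payload` field layout, and that "alerts" means the `budget.alert` event; `publish_event`, `wants`, and `GROUP_FILTERS` are hypothetical names.

```python
import json

STREAM_NAME = "events"

# Which event types each consumer group handles (None means "all").
# Treating only budget.alert as an "alert" is an assumption.
GROUP_FILTERS = {
    "telegram_bot": None,
    "notification_worker": ("budget.alert",),
    "audit_logger": None,
}


def publish_event(client, event_type: str, payload: dict) -> str:
    """Append one event to the stream; `client` is a redis-py style client."""
    fields = {"type": event_type, "payload": json.dumps(payload)}
    return client.xadd(STREAM_NAME, fields)


def wants(group: str, event_type: str) -> bool:
    """Return True if the consumer group should process this event type."""
    allowed = GROUP_FILTERS[group]
    return allowed is None or event_type in allowed
```

Every Redis consumer group receives every entry in the stream, so a group that only cares about alerts has to filter on its side (and still acknowledge the entries it skips).
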
### Trade-offs
| Pros | Cons |
|------|------|
| Simple setup | No clustering (future issue) |
| Less infrastructure | Limited to a single Redis instance |
| Good for MVP | Stream size bounded by Redis max memory |
| Built-in ordering | Durability depends on Redis persistence settings (no guarantee by default) |

### Upgrade Path
When needed: replace the Redis Streams consumer with a RabbitMQ consumer. The producer interface stays the same; during the migration it emits to both the Stream and the queue.

### Status
⏳ **DESIGNED, NOT YET IMPLEMENTED**

---

## ADR-003: Compensation Transactions Instead of Deletion

### Decision
Never delete transactions. Create compensation (reverse) transactions instead.

### Context
- Financial system requires immutability
- Audit trail must show all changes
- Regulatory compliance (many jurisdictions require this)
- User may reverse a reversal

### Solution
```
Transaction Reversal Flow:

Original Transaction (ID: 100)
├─ amount: 50.00 USD
├─ from_wallet: Cash
├─ to_wallet: Bank
└─ status: "executed"
   │
   └─▶ User requests reversal
       │
       ├─ Create Reversal Transaction (ID: 102)
       │  ├─ amount: 50.00 USD
       │  ├─ from_wallet: Bank (REVERSED)
       │  ├─ to_wallet: Cash (REVERSED)
       │  ├─ type: "reversal"
       │  ├─ original_transaction_id: 100
       │  └─ status: "executed"
       │
       └─ Update Original
          ├─ status: "reversed"
          ├─ reversed_at: now
          └─ reversal_reason: "User requested..."
```

### Benefits
- ✅ **Immutability**: No data loss
- ✅ **Audit Trail**: See what happened and why
- ✅ **Reversals of Reversals**: Can reverse the reversal
- ✅ **Compliance**: Meets financial regulations
- ✅ **Analytics**: Accurate historical data

### Implementation
```python
from enum import Enum

# Database
class TransactionStatus(str, Enum):
    DRAFT = "draft"
    PENDING_APPROVAL = "pending_approval"
    EXECUTED = "executed"
    REVERSED = "reversed"

# Fields
# original_transaction_id   FK self-reference (set on the reversal)
# reversed_at               When reversed
# reversal_reason           Why reversed
```

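The reversal flow can be sketched with plain dataclasses. This is an illustration under stated assumptions rather than the actual service code: the `Tx` class, the `"transfer"` default type, and the `reverse` function are hypothetical; the field names follow the ones listed above.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Tx:
    id: int
    amount: Decimal
    from_wallet: str
    to_wallet: str
    status: str = "executed"
    type: str = "transfer"  # assumed default type name
    original_transaction_id: Optional[int] = None
    reversed_at: Optional[datetime] = None
    reversal_reason: Optional[str] = None


def reverse(original: Tx, new_id: int, reason: str):
    """Return (updated_original, reversal); nothing is deleted."""
    if original.status != "executed":
        raise ValueError("Only executed transactions can be reversed")
    reversal = Tx(
        id=new_id,
        amount=original.amount,
        from_wallet=original.to_wallet,  # wallets are swapped
        to_wallet=original.from_wallet,
        type="reversal",
        original_transaction_id=original.id,
    )
    updated = replace(
        original,
        status="reversed",
        reversed_at=datetime.now(timezone.utc),
        reversal_reason=reason,
    )
    return updated, reversal
```

Because the reversal itself is created with status `"executed"`, it can be passed to `reverse` again, which is how a reversal of a reversal falls out of the same rule.
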
### Status
✅ **IMPLEMENTED**

---

## ADR-004: Family-Level Isolation vs Database-Level

### Decision
Implement family isolation at the service/API layer (vs database constraints).

### Context
- Easier testing (no DB constraints to work around)
- More flexibility (cross-family operations remain possible if needed)
- Performance (single query vs complex JOINs)
- Security (defense in depth)

### Solution
```python
# Every query includes a family_id filter
Transaction.query.filter(
    Transaction.family_id == user_context.family_id
)

# RBAC middleware also checks:
RBACEngine.check_family_access(user_context, requested_family_id)

# Service layer validates before operations
WalletService.get_wallet(wallet_id, family_id=user_context.family_id)
```

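A service-layer lookup scoped by family might look like the sketch below. The in-memory `_WALLETS` dictionary stands in for the database, and `Wallet`, `WalletNotFound`, and this `get_wallet` signature are assumptions for illustration, not the real `WalletService`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wallet:
    id: int
    family_id: int
    name: str


class WalletNotFound(Exception):
    """Raised for missing wallets and for wallets owned by another family."""


# In-memory stand-in for the database (illustration only).
_WALLETS = {
    1: Wallet(1, family_id=10, name="Cash"),
    2: Wallet(2, family_id=20, name="Bank"),
}


def get_wallet(wallet_id: int, family_id: int) -> Wallet:
    """Look up a wallet, scoped to the caller's family."""
    wallet = _WALLETS.get(wallet_id)
    if wallet is None or wallet.family_id != family_id:
        # Same error in both cases, so callers cannot probe other families' IDs.
        raise WalletNotFound(wallet_id)
    return wallet
```

Making `family_id` a required argument keeps the "requires discipline" cost visible: a call site that forgets the scope fails immediately instead of silently returning another family's data.
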
### Trade-offs
| Approach | Pros | Cons |
|----------|------|------|
| **Service Layer (Selected)** | Flexible, testable, fast queries | Requires discipline |
| **Database FK** | Enforced by DB | Inflexible, complex queries |
| **Combined** | Both protections | Double overhead |

### Status
✅ **IMPLEMENTED**

---

## ADR-005: Approval Workflow in Domain Model

### Decision
Implement transaction approval as a state machine in the domain model.

### Context
- High-value transactions need approval
- State transitions must be valid
- Audit trail must show approvals
- Different thresholds per role

### Solution
```
Transaction State Machine:

DRAFT (initial)
└─▶ [Check amount vs threshold]
    ├─ If small: EXECUTED (auto-approve)
    └─ If large: PENDING_APPROVAL (wait for approval)

PENDING_APPROVAL
├─▶ [Owner approves] → EXECUTED
└─▶ [User cancels] → DRAFT

EXECUTED
└─▶ [User/Owner reverses] → REVERSED (a reversal tx is created, see ADR-003)

REVERSED (final state)
└─ Can't transition further
```

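One compact way to encode the diagram is a transition table keyed by current status and action. This is a sketch only: the action names (`submit_small`, `submit_large`, `approve`, `cancel`, `reverse`) and the `transition` helper are hypothetical and not taken from the codebase.

```python
from enum import Enum


class Status(str, Enum):
    DRAFT = "draft"
    PENDING_APPROVAL = "pending_approval"
    EXECUTED = "executed"
    REVERSED = "reversed"


# Allowed transitions, keyed by (current status, action).
TRANSITIONS = {
    (Status.DRAFT, "submit_small"): Status.EXECUTED,
    (Status.DRAFT, "submit_large"): Status.PENDING_APPROVAL,
    (Status.PENDING_APPROVAL, "approve"): Status.EXECUTED,
    (Status.PENDING_APPROVAL, "cancel"): Status.DRAFT,
    (Status.EXECUTED, "reverse"): Status.REVERSED,
}


def transition(current: Status, action: str) -> Status:
    """Return the next status, or raise if the move is not in the table."""
    try:
        return TRANSITIONS[(current, action)]
    except KeyError:
        raise ValueError(
            f"Invalid state transition: {current.value} + {action}"
        ) from None
```

`REVERSED` has no outgoing entries, so it is final by construction rather than by a separate check.
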
### Threshold Rules
```python
from decimal import Decimal

APPROVAL_THRESHOLD = Decimal("500")       # USD, member transactions
CHILD_APPROVAL_THRESHOLD = Decimal("50")  # USD, child transactions

def needs_approval(role: str, amount: Decimal) -> bool:
    # Child transactions
    if role == "child":
        return amount > CHILD_APPROVAL_THRESHOLD
    # Member transactions
    if role == "member":
        return amount > APPROVAL_THRESHOLD
    # Adult/Owner: never need approval (auto-execute)
    return False
```

### Implementation
```python
from datetime import datetime
from enum import Enum

# Schema
class TransactionStatus(str, Enum):
    DRAFT = "draft"
    PENDING_APPROVAL = "pending_approval"
    EXECUTED = "executed"
    REVERSED = "reversed"

# Fields
status: TransactionStatus
confirmation_required: bool
confirmation_token: str  # Verify it's a real approval
approved_by_id: int
approved_at: datetime

# Service layer validates state transitions
class TransactionService:
    def confirm_transaction(self, tx):
        if tx.status != TransactionStatus.PENDING_APPROVAL:
            raise ValueError("Invalid state transition")
```

### Status
✅ **IMPLEMENTED**

---

## ADR-006: HS256 for MVP, RS256 for Production

### Decision
Use symmetric HMAC-SHA256 (HS256) for MVP, upgrade to asymmetric RS256 for production.

### Context
- HS256: Same secret for signing & verification (simple)
- RS256: Private key to sign, public key to verify (scalable)
- MVP: Simple deployment needed
- Production: Multiple API instances need to verify tokens

### Solution
```python
import jwt  # PyJWT

# MVP: HS256 (symmetric)
secret = "shared-secret"
jwt_manager = JWTManager(secret_key=secret)
token = jwt.encode(payload, secret, algorithm="HS256")
verified = jwt.decode(token, secret, algorithms=["HS256"])

# Production: RS256 (asymmetric)
with open("private.pem") as f:
    private_key = f.read()
with open("public.pem") as f:
    public_key = f.read()

token = jwt.encode(payload, private_key, algorithm="RS256")
verified = jwt.decode(token, public_key, algorithms=["RS256"])
```

### Migration Path
1. Generate RSA key pair
2. Update JWT manager to accept algorithm config
3. Deploy new version that validates RS256 while still accepting HS256 (backward compatible)
4. Stop issuing HS256 tokens
5. Let outstanding HS256 tokens expire naturally

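During the overlap in step 3, the verifier has to decide which key goes with which token. The sketch below shows one way to do that with only the standard library; `read_alg`, `verification_params`, and the `accept_hs256` flag are hypothetical names, and the returned pair is meant to be passed to PyJWT's `jwt.decode(token, key, algorithms=algorithms)`.

```python
import base64
import json


def read_alg(token: str) -> str:
    """Read the `alg` header of a JWT without verifying anything."""
    header_b64 = token.split(".")[0]
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))["alg"]


def verification_params(token: str, hs_secret: str, rs_public_key: str, accept_hs256: bool):
    """Pick the key and the single allowed algorithm for this token."""
    alg = read_alg(token)
    if alg == "RS256":
        return rs_public_key, ["RS256"]
    if alg == "HS256" and accept_hs256:
        return hs_secret, ["HS256"]
    raise ValueError(f"Unsupported or disabled algorithm: {alg}")
```

Pairing each key with exactly one algorithm matters here: passing both algorithms together with the RSA public key would let an attacker sign an HS256 token using the public key as the HMAC secret. Setting `accept_hs256` to `False` is then the switch that completes step 5. PyJWT's `jwt.get_unverified_header` can replace `read_alg` if preferred.
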
### Status
✅ **HS256 IMPLEMENTED, RS256 READY**

---

## ADR-007: Telegram Binding via Temporary Codes

### Decision
Use temporary binding codes instead of direct token requests.

### Context
- Security: Code has limited lifetime & single use
- User Experience: Simple flow (click link)
- Phishing Prevention: User confirms on web, not just in Telegram
- Bot doesn't receive sensitive tokens

### Solution
```
Flow:
1. User: /start
2. Bot: Generate code (10-min TTL)
3. Bot: Send link with code
4. User: Clicks link (authenticate on web)
5. Web: Confirm binding, create TelegramIdentity
6. Web: Issue JWT for bot to use
7. Bot: Stores JWT in Redis
8. Bot: Uses JWT for API calls
```

### Code Generation
```python
import secrets

code = secrets.token_urlsafe(24)  # 24 random bytes -> 32-char URL-safe string

# Store in Redis: 10-min TTL
redis.setex(f"telegram:code:{code}", 600, chat_id)

# Generate link
url = f"https://app.com/auth/telegram?code={code}&chat_id={chat_id}"
```

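The single-use property can be enforced when the web side redeems the code. The following is a sketch under assumptions: `consume_binding_code` is a hypothetical helper, `client` is a redis-py style client, and the key format matches the one used above. It relies on the `GETDEL` command, which requires Redis 6.2 or later.

```python
from typing import Optional


def consume_binding_code(client, code: str) -> Optional[str]:
    """Atomically fetch and delete the code so it can be used only once.

    Returns the chat_id stored for the code, or None if the code is
    unknown, expired, or already consumed.
    """
    value = client.getdel(f"telegram:code:{code}")  # GETDEL: Redis >= 6.2
    if value is None:
        return None
    return value.decode() if isinstance(value, bytes) else str(value)
```

Reading and deleting in one command avoids the race where two concurrent requests both read the code before either deletes it. The returned `chat_id` can then be compared with the one in the link rather than trusting the URL parameter on its own.
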
### Status
✅ **IMPLEMENTED**

---

## ADR-008: Service Token for Bot-to-API Communication

### Decision
Issue a separate service token (not a user token) for bot API requests.

### Context
- Bot needs to make requests independently (not as a specific user)
- Different permissions than user tokens
- Different expiry (1 year for the service token vs 15 min for user tokens)
- Can be rotated independently

### Solution
```python
# Service Token Payload
service_token_payload = {
    "sub": "service:telegram_bot",
    "type": "service",
    "iat": 1702237800,
    "exp": 1733773800,  # iat + 1 year
}

# Bot uses service token:
#   Authorization: Bearer <service_token>
#   X-Client-Id: telegram_bot
```

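On the API side, endpoints reserved for the bot need to tell service tokens apart from user tokens. A minimal sketch, assuming the JWT signature has already been verified and the claims are available as a dict; `build_service_claims` and `is_service_token` are hypothetical helpers.

```python
import time
from typing import Optional

ONE_YEAR_SECONDS = 365 * 24 * 3600


def build_service_claims(service_name: str, now: int, ttl_seconds: int = ONE_YEAR_SECONDS) -> dict:
    """Build claims shaped like the payload above."""
    return {
        "sub": f"service:{service_name}",
        "type": "service",
        "iat": now,
        "exp": now + ttl_seconds,
    }


def is_service_token(claims: dict, expected_service: str, now: Optional[int] = None) -> bool:
    """Accept only unexpired tokens of type "service" for the expected service."""
    now = int(time.time()) if now is None else now
    return (
        claims.get("type") == "service"
        and claims.get("sub") == f"service:{expected_service}"
        and claims.get("exp", 0) > now
    )
```

Checking `type` explicitly means a user token can never be accepted where a service token is required, even though both are signed with the same key.
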
### Use Cases
- Service token: Schedule reminders, send notifications
- User token: Create transaction as specific user

### Status
✅ **IMPLEMENTED**

---

## ADR-009: Middleware Order Matters

### Decision
Security middleware must execute in a specific order.

### Context
- FastAPI (via Starlette) runs middleware in reverse registration order: the last middleware added is the outermost and sees the request first
- Each middleware depends on state set up by the previous one
- Wrong order = security bypass

### Solution
```
# Registration order (will execute in reverse):
1. RequestLoggingMiddleware (last to execute)
2. RBACMiddleware
3. JWTAuthenticationMiddleware
4. HMACVerificationMiddleware
5. RateLimitMiddleware
6. SecurityHeadersMiddleware (first to execute)

# Execution flow:
SecurityHeaders
  ├─ Add HSTS, X-Frame-Options, etc.
  ↓
RateLimit
  ├─ Check IP-based rate limit
  ├─ Increment counter in Redis
  ↓
HMACVerification
  ├─ Verify X-Signature
  ├─ Check timestamp freshness
  ├─ Prevent replay attacks
  ↓
JWTAuthentication
  ├─ Extract token from Authorization header
  ├─ Verify signature & expiration
  ├─ Store user context in request.state
  ↓
RBAC
  ├─ Load user role
  ├─ Verify family access
  ├─ Store permissions
  ↓
RequestLogging
  ├─ Log all requests
  ├─ Record response time
```

### Implementation
```python
from fastapi import FastAPI

def add_security_middleware(app: FastAPI, redis_client, db_session):
    # Order matters! The last middleware added runs first on each request.
    app.add_middleware(RequestLoggingMiddleware)
    app.add_middleware(RBACMiddleware, db_session=db_session)
    app.add_middleware(JWTAuthenticationMiddleware)
    app.add_middleware(HMACVerificationMiddleware, redis_client=redis_client)
    app.add_middleware(RateLimitMiddleware, redis_client=redis_client)
    app.add_middleware(SecurityHeadersMiddleware)
```

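The reverse-order behaviour is easy to get wrong, so here is a framework-free sketch that mimics how each added middleware wraps everything registered before it. `build_stack` is a hypothetical helper for illustration; it does not use FastAPI or Starlette.

```python
def build_stack(registered):
    """Mimic Starlette: each newly added middleware wraps all earlier ones."""
    calls = []

    def endpoint():
        calls.append("endpoint")

    handler = endpoint
    for name in registered:  # iterate in registration order
        def wrapped(name=name, inner=handler):
            calls.append(name)  # runs on the way in, before the inner handler
            inner()
        handler = wrapped
    return handler, calls


handler, calls = build_stack([
    "RequestLogging",
    "RBAC",
    "JWTAuthentication",
    "HMACVerification",
    "RateLimit",
    "SecurityHeaders",
])
handler()
print(calls)
# ['SecurityHeaders', 'RateLimit', 'HMACVerification',
#  'JWTAuthentication', 'RBAC', 'RequestLogging', 'endpoint']
```

A test that asserts this order against the real application would catch an accidental reordering of the `add_middleware` calls before it becomes a security bypass.
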
### Status
✅ **IMPLEMENTED**

---

## ADR-010: Event Logging is Mandatory

### Decision
Every data modification is logged to the event_log table.

### Context
- Regulatory compliance (financial systems)
- Audit trail for disputes
- Debugging (understand what happened)
- User transparency (show activity history)

### Solution
```python
from datetime import datetime, timezone

# Every service method logs events
event = EventLog(
    family_id=family_id,
    entity_type="transaction",
    entity_id=tx_id,
    action="create",  # create|update|delete|confirm|execute|reverse
    actor_id=user_id,
    old_values={"balance": 100},
    new_values={"balance": 50},
    ip_address=request.client.host,
    user_agent=request.headers.get("user-agent"),
    reason="User requested cancellation",
    created_at=datetime.now(timezone.utc),  # timezone-aware UTC timestamp
)
db.add(event)
```

### Fields Logged
```
EventLog:
├─ entity_type: What was modified (transaction, wallet, budget)
├─ entity_id: Which record (transaction #123)
├─ action: What happened (create, update, delete, reverse)
├─ actor_id: Who did it (user_id)
├─ old_values: Before state (JSON)
├─ new_values: After state (JSON)
├─ ip_address: Where from
├─ user_agent: What client
├─ reason: Why (for deletions)
└─ created_at: When
```

### Access Control
```
# Who can view event_log?
├─ Owner: All events in family
├─ Adult: All events in family
├─ Member: Only own transactions' events
├─ Child: Very limited
└─ Read-Only: Selected events (audit/observer)
```

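The visibility rules could be expressed as a small predicate applied after the usual `family_id` filter. This is a sketch with several assumptions: role names are lowercase strings, "own transactions" is approximated by `actor_id`, and the Child and Read-Only rules (which the list above does not define precisely) are denied by default. `can_view_event` is a hypothetical name.

```python
def can_view_event(role: str, user_id: int, event: dict) -> bool:
    """Decide whether a family member may see one event_log row.

    Assumes the row has already been filtered by family_id.
    """
    if role in ("owner", "adult"):
        return True  # all events in the family
    if role == "member":
        # Approximation: events on transactions the member acted on.
        return event["entity_type"] == "transaction" and event["actor_id"] == user_id
    # Child / read-only rules are not specified in detail; deny by default.
    return False
```

Denying by default keeps unknown or newly added roles from seeing audit data until a rule is written for them.
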
### Status
✅ **IMPLEMENTED**

---

## Summary Table

| ADR | Title | Status | Risk | Notes |
|-----|-------|--------|------|-------|
| 001 | JWT + HMAC | ✅ | Low | Dual auth provides defense-in-depth |
| 002 | Redis Streams | ⏳ | Medium | Upgrade path to RabbitMQ planned |
| 003 | Compensation Tx | ✅ | Low | Immutability requirement met |
| 004 | Family Isolation | ✅ | Low | Service-layer isolation + RBAC |
| 005 | Approval Workflow | ✅ | Low | State machine properly designed |
| 006 | HS256→RS256 | ✅ | Low | Migration path clear |
| 007 | Binding Codes | ✅ | Low | Secure temporary code flow |
| 008 | Service Tokens | ✅ | Low | Separate identity for bot |
| 009 | Middleware Order | ✅ | Critical | Correctly implemented |
| 010 | Event Logging | ✅ | Low | Audit trail complete |

---

**Document Version:** 1.0
**Last Updated:** 2025-12-10
**Review Frequency:** Quarterly