🔐 Security Architecture Decision Records
ADR-001: JWT + HMAC Dual Authentication
Decision
Use JWT for client authentication + HMAC for request integrity verification.
Context
- A JWT alone is vulnerable to token theft (XSS, interception)
- An HMAC signature shows the request was not tampered with in transit
- Combined approach provides defense-in-depth
Solution
Request Headers:
├─ Authorization: Bearer <jwt_token> # WHO: Authenticate user
├─ X-Signature: HMAC_SHA256(...) # WHAT: Verify content
├─ X-Timestamp: unixtime # WHEN: Prevent replay
└─ X-Client-Id: telegram_bot # WHERE: Track source
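A minimal sketch of how the `X-Signature` and `X-Timestamp` headers could be produced and checked, using only the standard library. The canonical string (method, path, timestamp, body hash), the 300-second window, and the function names are assumptions for illustration; the ADR does not specify them.

```python
import hashlib
import hmac
import time

MAX_SKEW_SECONDS = 300  # assumed freshness window, not taken from the ADR


def sign_request(secret: bytes, method: str, path: str, body: bytes, timestamp: int) -> str:
    """Return a hex HMAC-SHA256 over an assumed canonical string."""
    body_hash = hashlib.sha256(body).hexdigest()
    message = f"{method.upper()}\n{path}\n{timestamp}\n{body_hash}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, method: str, path: str, body: bytes,
                   timestamp: int, signature: str, now: int = None) -> bool:
    """Reject stale timestamps first, then compare signatures in constant time."""
    now = int(time.time()) if now is None else now
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return False
    expected = sign_request(secret, method, path, body, timestamp)
    return hmac.compare_digest(expected, signature)
```

A timestamp check alone only narrows the replay window. Rejecting a signature that was already seen inside that window (for example by remembering it in Redis) would be needed to close it, and that part is not shown here.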
Trade-offs
| Pros | Cons |
|---|---|
| More secure | Slight performance overhead |
| Covers multiple attack vectors | More complex debugging |
| MVP ready | Requires client cooperation |
| Can be disabled in MVP | More header management |
Status
✅ IMPLEMENTED
ADR-002: Redis Streams for Event Bus (vs RabbitMQ)
Decision
Use Redis Streams instead of RabbitMQ for event-driven notifications.
Context
- Already using Redis for caching/sessions
- Simpler setup for MVP
- Don't need RabbitMQ's clustering (yet)
- Redis Streams has built-in message ordering
Solution
Event Stream: "events"
├─ transaction.created
├─ transaction.executed
├─ budget.alert
├─ goal.completed
└─ member.invited
Consumer Groups:
├─ telegram_bot (consumes all)
├─ notification_worker (consumes alerts)
└─ audit_logger (consumes all)
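The layout above could be wired up roughly as follows. This is a sketch, not the project's code: `client` is assumed to be a redis-py client (only its `xadd` method is used), and the `GROUP_FILTERS` table, the field names, and the `maxlen` default are hypothetical. Each consumer group on a stream receives every entry, so "consumes alerts" has to be enforced by the consumer itself.

```python
import json

STREAM = "events"

# Hypothetical routing table; the ADR only says "all" or "alerts" per group.
GROUP_FILTERS = {
    "telegram_bot": None,                    # None means every event type
    "notification_worker": {"budget.alert"},
    "audit_logger": None,
}


def emit_event(client, event_type: str, payload: dict, maxlen: int = 10_000) -> str:
    """Append one event; the approximate MAXLEN cap bounds the stream's memory use."""
    fields = {"type": event_type, "payload": json.dumps(payload)}
    return client.xadd(STREAM, fields, maxlen=maxlen, approximate=True)


def wants(group: str, event_type: str) -> bool:
    """Decide in the consumer whether an entry is relevant to this group."""
    allowed = GROUP_FILTERS[group]
    return allowed is None or event_type in allowed
```

A consumer that skips an entry because `wants` returns `False` should still acknowledge it, otherwise it stays in the group's pending list.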
Trade-offs
| Pros | Cons |
|---|---|
| Simple setup | No clustering (future issue) |
| Less infrastructure | Limited to single Redis |
| Good for MVP | Stream size bounded by Redis memory |
| Built-in ordering | Durability depends on Redis persistence settings (AOF/RDB) |
Upgrade Path
When needed: replace the Redis Streams consumers with RabbitMQ consumers. The producer interface stays the same; during the migration it emits to both the Stream and the queue.
Status
⏳ DESIGNED, NOT YET IMPLEMENTED
ADR-003: Compensation Transactions Instead of Deletion
Decision
Never delete transactions. Create compensation (reverse) transactions instead.
Context
- Financial system requires immutability
- Audit trail must show all changes
- Regulatory compliance (many jurisdictions require this)
- User may reverse a reversal
Solution
Transaction Reversal Flow:
Original Transaction (ID: 100)
├─ amount: 50.00 USD
├─ from_wallet: Cash
├─ to_wallet: Bank
└─ status: "executed"
│
└─▶ User requests reversal
│
├─ Create Reversal Transaction (ID: 102)
│ ├─ amount: 50.00 USD
│ ├─ from_wallet: Bank (REVERSED)
│ ├─ to_wallet: Cash (REVERSED)
│ ├─ type: "reversal"
│ ├─ original_tx_id: 100
│ └─ status: "executed"
│
└─ Update Original
├─ status: "reversed"
├─ reversed_at: now
└─ reversal_reason: "User requested..."
Benefits
✅ Immutability: No data loss
✅ Audit Trail: See what happened and why
✅ Reversals of Reversals: Can reverse the reversal
✅ Compliance: Meets financial regulations
✅ Analytics: Accurate historical data
Implementation
# Database
TransactionStatus: draft | pending_approval | executed | reversed
# Fields
original_transaction_id # FK self-reference
reversed_at # When reversed
reversal_reason # Why reversed
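The reversal flow can be sketched with plain dataclasses. The `Transaction` class, the `"transfer"` default type, and `reverse_transaction` are hypothetical stand-ins for the real model and service; the field names follow the ones listed above.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from decimal import Decimal
from typing import Optional, Tuple


@dataclass(frozen=True)
class Transaction:
    id: int
    amount: Decimal
    from_wallet: str
    to_wallet: str
    status: str = "executed"
    type: str = "transfer"  # assumed name for a regular transaction
    original_transaction_id: Optional[int] = None
    reversed_at: Optional[datetime] = None
    reversal_reason: Optional[str] = None


def reverse_transaction(original: Transaction, new_id: int,
                        reason: str) -> Tuple[Transaction, Transaction]:
    """Return (updated_original, reversal). Nothing is deleted."""
    if original.status != "executed":
        raise ValueError("Only executed transactions can be reversed")
    reversal = Transaction(
        id=new_id,
        amount=original.amount,
        from_wallet=original.to_wallet,   # wallets are swapped
        to_wallet=original.from_wallet,
        type="reversal",
        original_transaction_id=original.id,
    )
    updated = replace(
        original,
        status="reversed",
        reversed_at=datetime.now(timezone.utc),
        reversal_reason=reason,
    )
    return updated, reversal
```

Because the reversal is itself an executed transaction, passing it back into `reverse_transaction` yields a reversal of the reversal, which restores the original direction of the transfer.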
Status
✅ IMPLEMENTED
ADR-004: Family-Level Isolation vs Database-Level
Decision
Implement family isolation at the service/API layer (vs database constraints).
Context
- Easier testing (no DB constraints to work around)
- More flexibility (can support cross-family operations if needed)
- Performance (single query vs complex JOINs)
- Security (defense in depth)
Solution
# Every query includes family_id filter
Transaction.query.filter(
    Transaction.family_id == user_context.family_id
)
# RBAC middleware also checks:
RBACEngine.check_family_access(user_context, requested_family_id)
# Service layer validates before operations
WalletService.get_wallet(wallet_id, family_id=context.family_id)
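To illustrate the "every lookup is scoped by family" rule without a database, here is a small in-memory sketch. `Wallet` and `WalletRepository` are hypothetical and only stand in for the ORM-backed service.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Wallet:
    id: int
    family_id: int
    name: str


class WalletRepository:
    """In-memory stand-in for the ORM, for illustration only."""

    def __init__(self, wallets: Iterable[Wallet]):
        self._by_id: Dict[int, Wallet] = {w.id: w for w in wallets}

    def get(self, wallet_id: int, family_id: int) -> Optional[Wallet]:
        wallet = self._by_id.get(wallet_id)
        # A wallet from another family is treated exactly like a missing one,
        # so callers cannot probe which IDs exist in other families.
        if wallet is None or wallet.family_id != family_id:
            return None
        return wallet
```

Making `family_id` a required argument, rather than an optional filter, is one way to reduce the "requires discipline" cost noted in the trade-offs: a call that forgets it fails immediately.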
Trade-offs
| Approach | Pros | Cons |
|---|---|---|
| Service Layer (Selected) | Flexible, testable, fast queries | Requires discipline |
| Database FK | Enforced by DB | Inflexible, complex queries |
| Combined | Both protections | Double overhead |
Status
✅ IMPLEMENTED
ADR-005: Approval Workflow in Domain Model
Decision
Implement transaction approval as a state machine in the domain model.
Context
- High-value transactions need approval
- State transitions must be valid
- Audit trail must show approvals
- Different thresholds per role
Solution
Transaction State Machine:
DRAFT (initial)
└─▶ [Check amount vs threshold]
├─ If small: EXECUTED (auto-approve)
└─ If large: PENDING_APPROVAL (wait for approval)
PENDING_APPROVAL
├─▶ [Owner approves] → EXECUTED
└─▶ [User cancels] → DRAFT
EXECUTED
└─▶ [User/Owner reverses] → REVERSED (a separate reversal tx is created, see ADR-003)
REVERSED (final state)
└─ Can't transition further
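The diagram can be encoded as a transition table. This is a sketch with hypothetical names (`ALLOWED`, `transition`); the states and edges are read directly from the diagram.

```python
from enum import Enum


class TransactionStatus(str, Enum):
    DRAFT = "draft"
    PENDING_APPROVAL = "pending_approval"
    EXECUTED = "executed"
    REVERSED = "reversed"


# Allowed target states for each current state.
ALLOWED = {
    TransactionStatus.DRAFT: {TransactionStatus.EXECUTED, TransactionStatus.PENDING_APPROVAL},
    TransactionStatus.PENDING_APPROVAL: {TransactionStatus.EXECUTED, TransactionStatus.DRAFT},
    TransactionStatus.EXECUTED: {TransactionStatus.REVERSED},
    TransactionStatus.REVERSED: set(),  # final state
}


def transition(current: TransactionStatus, target: TransactionStatus) -> TransactionStatus:
    """Return the new status, or raise if the edge is not in the table."""
    if target not in ALLOWED[current]:
        raise ValueError(f"Invalid state transition: {current.value} -> {target.value}")
    return target
```

Keeping the edges in one table means a new state or edge is a data change, and every service method can share the same check.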
Threshold Rules
APPROVAL_THRESHOLD = 500  # USD
CHILD_APPROVAL_THRESHOLD = 50  # USD

# Child transactions
if role == CHILD and amount > CHILD_APPROVAL_THRESHOLD:
    status = PENDING_APPROVAL
# Member transactions
elif role == MEMBER and amount > APPROVAL_THRESHOLD:
    status = PENDING_APPROVAL
# Adult/Owner: never need approval (auto-execute)
Implementation
# Schema
TransactionStatus = Enum("TransactionStatus", ["draft", "pending_approval", "executed", "reversed"])
# Fields
status: TransactionStatus
confirmation_required: bool
confirmation_token: str  # Verifies that the approval is genuine
approved_by_id: int
approved_at: datetime
# Service layer validates state transitions
class TransactionService:
    def confirm_transaction(self, tx):
        if tx.status != TransactionStatus.pending_approval:
            raise ValueError("Invalid state transition")
Status
✅ IMPLEMENTED
ADR-006: HS256 for MVP, RS256 for Production
Decision
Use symmetric HMAC-SHA256 (HS256) for MVP, upgrade to asymmetric RS256 for production.
Context
- HS256: Same secret for signing & verification (simple)
- RS256: Private key to sign, public key to verify (scalable)
- MVP: Simple deployment needed
- Production: Multiple API instances need to verify tokens
Solution
import jwt  # assuming PyJWT

# MVP: HS256 (symmetric)
secret = "shared-secret"
jwt_manager = JWTManager(secret_key=secret)
token = jwt.encode(payload, secret, algorithm="HS256")
verified = jwt.decode(token, secret, algorithms=["HS256"])
# Production: RS256 (asymmetric)
with open("private.pem") as f:
    private_key = f.read()
with open("public.pem") as f:
    public_key = f.read()
token = jwt.encode(payload, private_key, algorithm="RS256")
verified = jwt.decode(token, public_key, algorithms=["RS256"])
Migration Path
1. Generate an RSA key pair
2. Update the JWT manager to accept an algorithm config
3. Deploy a new version that validates RS256 and still accepts HS256 (backward compatible)
4. Stop issuing HS256 tokens
5. Let the remaining HS256 tokens expire naturally
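During the overlap period the verifier has to accept both algorithms. One way to do that safely is to pick the key from the token's declared algorithm and pin the allowed algorithm to that key, so an HS256 token can never be checked against the RSA public key. The sketch below uses only the standard library to read the header; `select_key` and its parameters are hypothetical names.

```python
import base64
import json
from typing import List, Tuple


def unverified_alg(token: str) -> str:
    """Read the `alg` field from the JWT header without verifying anything."""
    header_b64 = token.split(".")[0]
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    header = json.loads(base64.urlsafe_b64decode(padded))
    return header.get("alg", "")


def select_key(token: str, hs_secret: str, rs_public_key: str) -> Tuple[str, List[str]]:
    """Return (key, algorithms) to hand to the JWT library for this token."""
    alg = unverified_alg(token)
    if alg == "RS256":
        return rs_public_key, ["RS256"]
    if alg == "HS256":
        return hs_secret, ["HS256"]
    raise ValueError(f"Unsupported JWT algorithm: {alg!r}")
```

With PyJWT the result would be used as `key, algs = select_key(token, secret, public_key)` followed by `jwt.decode(token, key, algorithms=algs)`. PyJWT's `jwt.get_unverified_header(token)` could replace `unverified_alg`. Once the last HS256 token has expired, the HS256 branch can be removed.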
Status
✅ HS256 IMPLEMENTED, RS256 READY
ADR-007: Telegram Binding via Temporary Codes
Decision
Use temporary binding codes instead of direct token requests.
Context
- Security: Code has limited lifetime & single use
- User Experience: Simple flow (click link)
- Phishing Prevention: User confirms on web, not just in Telegram
- Bot doesn't receive sensitive tokens
Solution
Flow:
1. User: /start
2. Bot: Generate code (10-min TTL)
3. Bot: Send link with code
4. User: Clicks link (authenticate on web)
5. Web: Confirm binding, create TelegramIdentity
6. Web: Issue JWT for bot to use
7. Bot: Stores JWT in Redis
8. Bot: Uses JWT for API calls
Code Generation
code = secrets.token_urlsafe(24) # 32-char random string
# Store in Redis: 10-min TTL
redis.setex(f"telegram:code:{code}", 600, chat_id)
# Generate link
url = f"https://app.com/auth/telegram?code={code}&chat_id={chat_id}"
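The "single use" property depends on how the web side redeems the code. A sketch under stated assumptions: `InMemoryCodeStore` is a hypothetical stand-in that mirrors the two redis-py calls used (`setex` and `getdel`), and `issue_code` / `redeem_code` are illustrative names. Reading and deleting the key in one step means a second redemption attempt finds nothing.

```python
import secrets
import time
from typing import Optional


class InMemoryCodeStore:
    """Stand-in exposing setex/getdel with the same argument order as redis-py."""

    def __init__(self):
        self._data = {}

    def setex(self, key: str, ttl_seconds: int, value: str) -> None:
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def getdel(self, key: str) -> Optional[str]:
        value, expires_at = self._data.pop(key, (None, 0.0))
        return value if time.monotonic() < expires_at else None


def issue_code(store, chat_id: int, ttl_seconds: int = 600) -> str:
    """Create a random code and bind it to the chat for a limited time."""
    code = secrets.token_urlsafe(24)
    store.setex(f"telegram:code:{code}", ttl_seconds, str(chat_id))
    return code


def redeem_code(store, code: str) -> Optional[str]:
    """Return the bound chat_id once; repeated or expired attempts get None."""
    return store.getdel(f"telegram:code:{code}")
```

Against a real Redis, `GETDEL` requires Redis 6.2 or later, and values come back as bytes unless the client is created with `decode_responses=True`. Taking the `chat_id` from the stored value rather than from the URL query string means a tampered `chat_id` parameter cannot redirect the binding.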
Status
✅ IMPLEMENTED
ADR-008: Service Token for Bot-to-API Communication
Decision
Issue a separate service token (not a user token) for bot API requests.
Context
- Bot needs to make requests independently (not as a specific user)
- Different permissions than user tokens
- Different expiry (1 year vs 15 min)
- Can be rotated independently
Solution
# Service Token Payload
{
"sub": "service:telegram_bot",
"type": "service",
"iat": 1702237800,
"exp": 1733773800, # 1 year
}
# Bot uses service token:
Authorization: Bearer <service_token>
X-Client-Id: telegram_bot
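A sketch of how the payload above could be built and how an endpoint could insist on the right token type. `build_service_claims` and `require_token_type` are hypothetical helpers; signing the claims is left to the JWT library.

```python
import time
from typing import Optional

SERVICE_TOKEN_TTL = 365 * 24 * 60 * 60  # 1 year in seconds, matching the payload above


def build_service_claims(service_name: str, now: Optional[int] = None) -> dict:
    """Build the claim set for a long-lived service token."""
    issued_at = int(time.time()) if now is None else now
    return {
        "sub": f"service:{service_name}",
        "type": "service",
        "iat": issued_at,
        "exp": issued_at + SERVICE_TOKEN_TTL,
    }


def require_token_type(claims: dict, expected: str) -> None:
    """Reject a token that is presented for the wrong purpose."""
    if claims.get("type") != expected:
        raise PermissionError(f"Expected a {expected!r} token")
```

Checking the `type` claim on every endpoint keeps a long-lived service token from being accepted where a short-lived user token is required, such as creating a transaction as a specific user.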
Use Cases
- Service token: Schedule reminders, send notifications
- User token: Create transaction as specific user
Status
✅ IMPLEMENTED
ADR-009: Middleware Order Matters
Decision
Security middleware must execute in a specific order.
Context
- FastAPI (via Starlette) runs middleware in reverse registration order: the last one added executes first
- Each middleware depends on previous setup
- Wrong order = security bypass
Solution
# Registration order (will execute in reverse):
1. RequestLoggingMiddleware (last to execute)
2. RBACMiddleware
3. JWTAuthenticationMiddleware
4. HMACVerificationMiddleware
5. RateLimitMiddleware
6. SecurityHeadersMiddleware (first to execute)
# Execution flow:
SecurityHeaders
├─ Add HSTS, X-Frame-Options, etc.
↓
RateLimit
├─ Check IP-based rate limit
├─ Increment counter in Redis
↓
HMACVerification
├─ Verify X-Signature
├─ Check timestamp freshness
├─ Prevent replay attacks
↓
JWTAuthentication
├─ Extract token from Authorization header
├─ Verify signature & expiration
├─ Store user context in request.state
↓
RBAC
├─ Load user role
├─ Verify family access
├─ Store permissions
↓
RequestLogging
├─ Log all requests
├─ Record response time
Implementation
from fastapi import FastAPI

def add_security_middleware(app: FastAPI, redis_client, db_session):
    # Order matters: the last middleware added runs first on each request.
    app.add_middleware(RequestLoggingMiddleware)
    app.add_middleware(RBACMiddleware, db_session=db_session)
    app.add_middleware(JWTAuthenticationMiddleware)
    app.add_middleware(HMACVerificationMiddleware, redis_client=redis_client)
    app.add_middleware(RateLimitMiddleware, redis_client=redis_client)
    app.add_middleware(SecurityHeadersMiddleware)
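Why registration order is reversed can be seen in a toy model: each added middleware wraps whatever was there before, so the last one added becomes the outermost layer. This is a pure-Python illustration with hypothetical names, not Starlette's actual implementation.

```python
from typing import Callable, List

Handler = Callable[[List[str]], None]


def endpoint(trace: List[str]) -> None:
    trace.append("endpoint")


def make_middleware(name: str) -> Callable[[Handler], Handler]:
    """Return a wrapper that records its name, then calls the inner handler."""
    def wrap(inner: Handler) -> Handler:
        def handler(trace: List[str]) -> None:
            trace.append(name)  # runs on the way in
            inner(trace)
        return handler
    return wrap


def build_stack(registered: List[str]) -> Handler:
    """Wrap in registration order, so the last name registered ends up outermost."""
    app: Handler = endpoint
    for name in registered:
        app = make_middleware(name)(app)
    return app
```

Feeding it the registration order from `add_security_middleware` (RequestLogging first, SecurityHeaders last) produces a trace that starts with SecurityHeaders and ends with RequestLogging followed by the endpoint. A test of this shape against the real app would be a cheap guard against an accidental reordering.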
Status
✅ IMPLEMENTED
ADR-010: Event Logging is Mandatory
Decision
Every data modification is logged to the event_log table.
Context
- Regulatory compliance (financial systems)
- Audit trail for disputes
- Debugging (understand what happened)
- User transparency (show activity history)
Solution
# Every service method logs events
event = EventLog(
family_id=family_id,
entity_type="transaction",
entity_id=tx_id,
action="create", # create|update|delete|confirm|execute|reverse
actor_id=user_id,
old_values={"balance": 100},
new_values={"balance": 50},
ip_address=request.client.host,
user_agent=request.headers.get("user-agent"),
reason="User requested cancellation",
created_at=datetime.now(timezone.utc),  # utcnow() is deprecated since Python 3.12
)
db.add(event)
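Populating `old_values` and `new_values` by hand is easy to get wrong. A small helper can derive both from before/after snapshots so that only changed fields are logged. `changed_values` is a hypothetical helper, not part of the described codebase.

```python
from typing import Any, Dict, Tuple

_MISSING = object()  # distinguishes "key absent" from "value is None"


def changed_values(before: Dict[str, Any], after: Dict[str, Any]) -> Tuple[dict, dict]:
    """Return (old_values, new_values) restricted to keys whose value changed."""
    keys = sorted(
        k for k in before.keys() | after.keys()
        if before.get(k, _MISSING) != after.get(k, _MISSING)
    )
    old_values = {k: before[k] for k in keys if k in before}
    new_values = {k: after[k] for k in keys if k in after}
    return old_values, new_values
```

For a create action `before` would be an empty dict, so `old_values` comes out empty and `new_values` holds the full record.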
Fields Logged
EventLog:
├─ entity_type: What was modified (transaction, wallet, budget)
├─ entity_id: Which record (transaction #123)
├─ action: What happened (create, update, delete, reverse)
├─ actor_id: Who did it (user_id)
├─ old_values: Before state (JSON)
├─ new_values: After state (JSON)
├─ ip_address: Where from
├─ user_agent: What client
├─ reason: Why (for deletions)
└─ created_at: When
Access Control
# Who can view event_log?
├─ Owner: All events in family
├─ Adult: All events in family
├─ Member: Only own transactions' events
├─ Child: Very limited
└─ Read-Only: Selected events (audit/observer)
Status
✅ IMPLEMENTED
Summary Table
| ADR | Title | Status | Risk | Notes |
|---|---|---|---|---|
| 001 | JWT + HMAC | ✅ | Low | Dual auth provides defense-in-depth |
| 002 | Redis Streams | ⏳ | Medium | Upgrade path to RabbitMQ planned |
| 003 | Compensation Tx | ✅ | Low | Immutability requirement met |
| 004 | Family Isolation | ✅ | Low | Service-layer isolation + RBAC |
| 005 | Approval Workflow | ✅ | Low | State machine properly designed |
| 006 | HS256→RS256 | ✅ | Low | Migration path clear |
| 007 | Binding Codes | ✅ | Low | Secure temporary code flow |
| 008 | Service Tokens | ✅ | Low | Separate identity for bot |
| 009 | Middleware Order | ✅ | Critical | Correctly implemented |
| 010 | Event Logging | ✅ | Low | Audit trail complete |
Document Version: 1.0
Last Updated: 2025-12-10
Review Frequency: Quarterly